Writing a TCP/IP port scanner is not so hard. This article shows you how to develop a simple but effective port scanner for Win32, using self-written socket classes (not the
CSocket class which comes with MFC) and a small framework based on the
The scanner also includes a module for testing connections (the Connect page) and lets you manage the local database of services (the
What you should know
The sample application presented here does not require a deep knowledge of the TCP/IP protocol. If you have some notions of MFC and the Winsock API, you are on the right track.
How a scanner works
Each time you send or receive data through the Internet, your mail (or web, chat, or whatever) program must connect to a remote port of a remote host. Some of the services which run over the Internet are summarized in the following table:
|ftp||21||File Transfer Protocol|
|ssh||22||SSH Remote Login Protocol|
|smtp||25||Simple Mail Transfer|
|nameserver||42||Host Name Server|
|domain||53||Domain Name Server|
|http||80||World Wide Web HTTP|
|pop3||110||Post Office Protocol|
|netbios-ns||137||NETBIOS Name Service|
|netbios-dgm||138||NETBIOS Datagram Service|
|netbios-ssn||139||NETBIOS Session Service|
In fact, the list of services/ports is much longer than the one above and spans three different port ranges:
|0 - 1023||well known ports, which include the most common services, like SMTP, POP3, FTP, etc.|
|1024 - 49151||registered ports, which are assigned by the IANA organization|
|49152 - 65535||dynamic and/or private ports, which can be freely used|
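The three ranges above translate directly into a small helper. What follows is only a sketch: `PortRange` and `ClassifyPort` are hypothetical names that do not appear in the sample project.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the article's sample project.
enum class PortRange { WellKnown, Registered, Dynamic };

// Maps a port number to one of the three ranges listed in the table above.
constexpr PortRange ClassifyPort(std::uint16_t port)
{
    return port <= 1023  ? PortRange::WellKnown
         : port <= 49151 ? PortRange::Registered
                         : PortRange::Dynamic;
}

// Human-readable label, e.g. for a column of a list control.
inline const char* PortRangeName(PortRange range)
{
    switch (range) {
    case PortRange::WellKnown:  return "well known";
    case PortRange::Registered: return "registered";
    default:                    return "dynamic/private";
    }
}
```

A scanner could use such a helper to label each port it reports, for example `PortRangeName(ClassifyPort(80))`.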
As the RFCs say (RFC793 for TCP and RFC768 for UDP), the above ports can be used for both TCP and UDP connections, so it's not unusual to see duplicated entries for the same service, one for each protocol.
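In a services file, duplicated entries of this kind usually appear as lines such as `domain 53/tcp` and `domain 53/udp`. The sketch below parses one such line. It assumes the common `name port/protocol` layout; `ServiceEntry` and `ParseServiceLine` are hypothetical names, not the classes the scanner uses for its services database.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record for one line of a services file.
struct ServiceEntry {
    std::string name;
    unsigned    port = 0;
    std::string protocol;   // typically "tcp" or "udp"
};

// Parses a line such as "domain 53/udp". Returns false for comments
// and malformed lines; 'out' is only modified on success.
bool ParseServiceLine(const std::string& line, ServiceEntry& out)
{
    std::istringstream in(line);
    std::string name, portProto;
    if (!(in >> name >> portProto) || name[0] == '#')
        return false;

    const std::string::size_type slash = portProto.find('/');
    if (slash == std::string::npos || slash == 0 || slash + 1 == portProto.size())
        return false;

    unsigned port = 0;
    for (std::string::size_type i = 0; i < slash; ++i) {
        const char c = portProto[i];
        if (c < '0' || c > '9')
            return false;
        port = port * 10 + static_cast<unsigned>(c - '0');
        if (port > 65535)
            return false;   // not a valid 16-bit port number
    }
    assert(port <= 65535);

    out.name = name;
    out.port = port;
    out.protocol = portProto.substr(slash + 1);
    return true;
}
```

Parsing `domain 53/tcp` and `domain 53/udp` with this function yields two entries with the same name and port that differ only in the protocol field.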
The TCP/IP protocol suite is commonly described in terms of the OSI (Open Systems Interconnection) model, developed between 1977 and 1984. That model grew out of a proposal of the ISO (International Organization for Standardization), so it is also known as the ISO/OSI model. The ISO/OSI architecture divides the network into seven layers (application, presentation, session, transport, network, data-link and physical).
Think of a layer as a floor of a building. At the lowest level is your network card, used to exchange raw data between computers, while at the highest level is your preferred client, sending mail or browsing the web. In fact, TCP/IP is not a single protocol but a suite of protocols, composed of:
|IP (Internet Protocol)||the protocol used to exchange raw data between remote hosts|
|TCP (Transmission Control Protocol)||the protocol used to exchange data between applications|
|UDP (User Datagram Protocol)||used like TCP to exchange data between applications, but simpler and less reliable than TCP|
|ICMP (Internet Control Message Protocol)||used at the network layer for error messaging|
The correspondence between the ISO/OSI model and the suite of the TCP/IP protocols can be represented by the following table:
|ISO/OSI model||TCP/IP suite|
|application layer||client program|
|presentation layer||client program|
|session layer||client program|
The IP and TCP protocols are the foundation of data transfer. The IP protocol, which works at the network layer, handles the transfer of raw data between computers. At this level, each packet contains the data to be transferred and the IP addresses of the sender and the receiver. The TCP protocol works at the transport layer, but the principle is the same: TCP adds the port concept on top of the IP address concept. Each transferred packet contains the data and the port number, where the port number is associated with a service instead of a computer.
In other words, the IP protocol moves the raw data from one computer to another, using the IP address to identify each computer on the network. But when the data arrives at the destination computer, where should it be delivered? Are you browsing the web? If so, the TCP protocol directs the incoming data through the port used by the HTTP service (80). Are you using an FTP client? The TCP protocol directs the incoming data through the port used by that service (21), and so on.
As you can see, the port and service concepts are the basic principles of data transfer through the Internet. Now you can see what the main purpose of a TCP/IP scanner is: if you want to know which services are currently running on a remote host, you must scan its ports.
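The idea of scanning can be reduced to a few lines: try to connect to each port and note which attempts succeed. The sketch below uses plain blocking sockets, which is not the asynchronous approach of the sample project, and the names `IsPortOpen` and `ScanRange` are hypothetical. On Win32 it assumes that `WSAStartup()` has already been called and that the program links against ws2_32.lib.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

#ifdef _WIN32
  #include <winsock2.h>
  #include <ws2tcpip.h>
  typedef SOCKET socket_t;
  static const socket_t kInvalidSocket = INVALID_SOCKET;
  static void CloseSocket(socket_t s) { closesocket(s); }
#else
  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <sys/socket.h>
  #include <unistd.h>
  typedef int socket_t;
  static const socket_t kInvalidSocket = -1;
  static void CloseSocket(socket_t s) { close(s); }
#endif

// Returns true if a TCP connection to ip:port is accepted.
// 'ip' must be a dotted IPv4 address; anything else yields false.
bool IsPortOpen(const std::string& ip, std::uint16_t port)
{
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(port);          // ports travel in network byte order
    if (inet_pton(AF_INET, ip.c_str(), &addr.sin_addr) != 1)
        return false;

    const socket_t s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == kInvalidSocket)
        return false;

    // Blocking connect: returns 0 only if the remote side accepts.
    const bool open =
        connect(s, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0;
    CloseSocket(s);
    return open;
}

// Probes every port in [first, last] and returns the ones that accepted.
std::vector<std::uint16_t> ScanRange(const std::string& ip,
                                     std::uint16_t first, std::uint16_t last)
{
    assert(first <= last);
    std::vector<std::uint16_t> openPorts;
    for (std::uint32_t p = first; p <= last; ++p)   // 32-bit counter avoids wrap at 65535
        if (IsPortOpen(ip, static_cast<std::uint16_t>(p)))
            openPorts.push_back(static_cast<std::uint16_t>(p));
    return openPorts;
}
```

Because each `connect()` call blocks until it succeeds, is refused or times out, a loop like this would freeze a user interface, which is one reason a real scanner needs threads or an asynchronous mode.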
The sample code presented here is only an example of Winsock programming. Do not regard it as the best approach to the scanning topic. As you know, the Winsock API can be used in three different modes: blocking, non-blocking and asynchronous. Our scanner uses the third (asynchronous) mode.
Probably this is not the best mode for this topic (using the blocking mode in a separate thread may well give a better result), but that is not the point. Commercial scanners use many threads to scan the largest possible number of ports at the same time, as fast as possible. This is not our purpose. Our scanner must scan one port at a time without blocking the UI until the work is done, and the asynchronous mode fits that requirement well; it is used here to practice with the Winsock API. Of course, you are free to modify the proposed code to apply the Winsock mode you prefer.
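The "one port at a time, without blocking the UI" design can be pictured as a small state machine driven by completion events. In Winsock's asynchronous mode those events would typically arrive as window messages requested with `WSAAsyncSelect()` for `FD_CONNECT`, with the error code extracted by `WSAGETSELECTERROR()`. The sketch below leaves the Winsock part out and shows only the sequencing; `SequentialScan` and its members are hypothetical names, not classes from the sample project.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: advances through a port range one connect
// attempt at a time, driven by "connect completed" notifications.
class SequentialScan {
public:
    SequentialScan(std::uint16_t first, std::uint16_t last)
        : m_current(first), m_last(last), m_done(false)
    {
        assert(first <= last);
    }

    // Port whose connect attempt is currently outstanding.
    std::uint16_t Current() const { return m_current; }

    // True once every port in the range has been reported.
    bool Done() const { return m_done; }

    // To be called from the message handler when the attempt for
    // Current() completes; errorCode == 0 means the port accepted.
    void OnConnectResult(int errorCode)
    {
        if (m_done)
            return;
        if (errorCode == 0)
            m_open.push_back(m_current);
        if (m_current == m_last)
            m_done = true;      // nothing left to probe
        else
            ++m_current;        // caller starts the next connect here
    }

    const std::vector<std::uint16_t>& OpenPorts() const { return m_open; }

private:
    std::uint16_t              m_current;
    std::uint16_t              m_last;
    bool                       m_done;
    std::vector<std::uint16_t> m_open;
};
```

Between two calls to `OnConnectResult()` the message loop keeps running, so the UI stays responsive even though only one connection is outstanding at any moment.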
All the code contained in the sample project can be divided into two sections: the simple framework used to build a CPropertySheet based MFC application, and the classes used for the Winsock interface.
The CPropertySheetDialog.cpp/.h and CPropertyPageDialog.cpp/.h files contain all the classes required to use the property sheet paradigm as a base for our MFC application.
The TcpPropertySheet.cpp/.h files contain the code for the main app, while all the Tcp[...]Page.cpp/.h files contain the code for the pages of the sheet.
All the code used to interface with the Winsock API is contained in the CWinsock.cpp/.h and CSock.cpp/.h files. The CAsyncSock.cpp/.h files contain the classes used to access the Winsock API in asynchronous mode, while the rest of the code is used internally to handle the list control (CListCtrlEx.cpp/.h), the program's configuration, etc.
Note that the base class for the Winsock interface is the CWinsock class, not CSock. As you can see by looking at the code, the CWinsock class is used to map all the Winsock services either to the real library or to a dummy implementation, depending on whether the _DEBUGSOCKET macro is defined. This is because the CWinsock class contains the code to handle a minimal SMTP/POP3 server locally, which you can use to exercise the SMTP/POP3 protocols without an active Internet connection.
As usual for an MFC app, you must derive your own class (in this case, the CTcpScanApp class) from CWinApp. Our scanner is not dialog-based but property sheet-based, so you must override the default InitInstance() implementation to create the property sheet and all the related pages:
CPropertyPageList* pPropertyPageList = new CPropertyPageList();
p = new PROPERTYPAGE(IDD_PAGE_SCAN,&ScanPage,RUNTIME_CLASS(CScanPage));
p = new PROPERTYPAGE(IDD_PAGE_CONNECT,
p = new PROPERTYPAGE(IDD_PAGE_SERVICES,
= new CTcpScanPropertySheet(NULL,pPropertyPageList);
m_pMainWnd = pPropertySheetDialog;
The property sheet dialog is based on the CPropertyPageDialog classes, which handle all the required plumbing. The Apply and Help buttons are removed by the framework, so each page of the property sheet has only the OK and Cancel buttons. Each page of the sheet modifies the text of the OK button according to its needs.
As you can see, each class used as a page of the sheet defines the following handlers:
If you decide to use the CPropertyPageDialog classes in your own code, do not forget to do the same.
As for the features of our scanner, note that the Connect page lets you test only the connections which use the same channel for commands and data. This means you can use that page to send and receive data with protocols like SMTP, POP3, HTTP, etc., but not with protocols like FTP, which uses two different channels (one for the commands and another for the data).
When using the Connect page, remember that you are entering raw data, so if you want to get an HTML page through the HTTP protocol, you must write the HTTP request as the standard says (in this case, ending the request with the pair of CR/LF characters).
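To make the CR/LF requirement concrete, here is a minimal sketch that builds the raw text of a simple HTTP request. `BuildHttpGet` is a hypothetical helper, not part of the sample project, and it assumes an HTTP/1.0 GET with a single Host header.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the raw bytes of a minimal HTTP GET request.
// Every header line ends with CR/LF, and an empty line (a lone CR/LF)
// marks the end of the request.
std::string BuildHttpGet(const std::string& host, const std::string& path)
{
    assert(!host.empty());
    assert(!path.empty() && path[0] == '/');

    std::string request = "GET " + path + " HTTP/1.0\r\n";
    request += "Host: " + host + "\r\n";
    request += "\r\n";              // blank line terminates the request
    return request;
}
```

Calling `BuildHttpGet("example.com", "/")` gives the three lines you would otherwise type by hand: the request line, the Host header and the terminating empty line.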
Ideas, suggestions and enhancements are always welcome.