Email Protocols: SMTP and POP3
(This is an edited version of an article that originally appeared in PC Network Advisor. It is the fourth part of a series. Some of the terms used in this article were explained in the first, second or third articles.)
A striking point about many of the application layer protocols is how simple they are. The protocols based on TCP mostly use commands and responses in plain ASCII text, making them easier for a user to understand and for a programmer to implement.
For further illustration we shall look at the two protocols that you may use every day to send and receive Internet email: SMTP and POP3.
Simple Mail Transfer Protocol: SMTP
Simple Mail Transfer Protocol (SMTP) is one of the most venerable of the Internet protocols. Designed in the early 1980s, its function is purely and simply to transfer electronic mail across and between networks and other transport systems. As such, its use need not be restricted to systems that use TCP/IP. Any communications system capable of handling lines of up to 1,000 7-bit ASCII characters could be used to carry messages using SMTP. On a TCP/IP network, however, TCP provides the transport mechanism.
In SMTP the sender is the client, but a client may communicate with many different servers. Mail can be sent directly from the sending host to the receiving host, requiring a separate TCP connection to be made for each copy of each message. However, few mail recipients run their own SMTP servers.
It is more usual for the destination of an SMTP message to be a server that serves a group of users such as an Internet domain. The server receives all mail intended for its users and then allows them to collect it using POP3 (Post Office Protocol version 3) or some other mail protocol. Similarly, most SMTP clients send messages to a single "smart host" server, whose job it is to relay those messages on to their eventual recipients.
An SMTP transaction begins when the sender client opens a TCP connection with the receiver using the well-known port number 25. The server acknowledges the connection by sending back a message of the form “220 SMTP Server Ready”. SMTP replies use a format similar to that of FTP, which we looked at previously. The three-digit code is all the client software needs to tell whether everything is going OK. The text is there to help the humans who might be troubleshooting a problem by analyzing a log of the transaction. The box “Application Protocol Reply Codes” provides more information about message reply codes.
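As a rough illustration of how client software might use the three-digit code, here is a minimal Python sketch. The function names parse_reply and command_accepted are made up for this example. It relies on the convention that a first digit of 2 or 3 signals a positive reply, while 4 and 5 signal temporary and permanent failures.

```python
def parse_reply(line: str) -> tuple[int, str]:
    """Split a reply line such as '220 SMTP Server Ready' into (code, text)."""
    line = line.rstrip("\r\n")
    code, text = line[:3], line[4:]
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"not a reply line: {line!r}")
    return int(code), text


def command_accepted(code: int) -> bool:
    """2xx and 3xx replies are positive; 4xx and 5xx report a failure."""
    return 200 <= code < 400
```

The text part is returned only so that it can be logged; the decision is made on the number alone.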
An SMTP relay server might refuse a connection by sending back a message with a “421 Service not available” reply code. For example, an Internet Service Provider’s SMTP server provided for use by its subscribers to relay outgoing mail might refuse a connection from a host whose IP address indicates that it is not a subscriber to that ISP. The basic SMTP protocol has no form of access control - the way it can be used to relay messages would make this impractical - so this is about the only way ISPs can prevent non-subscribers such as spammers from using their mail servers to send out messages.
Having received the correct acknowledgement the sender signs on to the server by sending the string “HELO hostname”. HELO is the sign-on command and hostname is the name of the sending host. As we will see, the hostname is used in the Received: header which the server adds to the message when it sends it on its way. This information allows the recipient to trace the path taken by the message.
Once the sender gets a “250 OK” acknowledgement it can start sending messages. The protocol is extremely simple. All the sender has to do is say who the message is from, who it is to, and supply the contents of the message.
Who a message is from is specified with the command “MAIL FROM: <address>”. This command also tells the receiver that it is about to receive a new message, so it knows to clear out its list of recipients. The address in the angle brackets (which are required) is the return path for the message. The return path is the address that any error report - such as would be generated if the message is undeliverable - is sent to.
It is valid for the return path to be null, as in “MAIL FROM: <>”. This is typically used when sending an error report. A null return path means that no delivery failure report is required. Its main purpose is to avoid getting into the situation in which delivery failure messages continually shuttle back and forth because both sender and recipient addresses are unreachable.
The recipients of a message are defined using the command “RCPT TO: <address>”. Each address is enclosed in angle brackets. A message may have many recipients, and an RCPT TO: command is sent for each one. It is the RCPT TO: command, not anything in the message headers, that results in a message arriving at its destination. In the case of blind carbon copies or list server messages the recipient address will not appear in the headers at all.
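The difference between the envelope (the MAIL FROM: and RCPT TO: commands) and the message headers can be sketched in a few lines of Python. The function name envelope_and_headers and the example addresses are assumptions made for illustration. The commands are written without a space after the colon, which is the form the SMTP specification gives; in practice many servers appear to tolerate the spaced form as well.

```python
def envelope_and_headers(sender: str, to: list[str], cc: list[str],
                         bcc: list[str]) -> tuple[list[str], list[str]]:
    """Build the SMTP envelope commands and the visible address headers."""
    # Every recipient, including blind copies, gets its own RCPT TO: command.
    commands = [f"MAIL FROM:<{sender}>"]
    commands += [f"RCPT TO:<{addr}>" for addr in to + cc + bcc]

    # The headers list only the recipients the sender wants to be visible.
    headers = [f"From: {sender}", f"To: {', '.join(to)}"]
    if cc:
        headers.append(f"Cc: {', '.join(cc)}")
    return commands, headers
```

A blind-copy recipient therefore appears in the command list but nowhere in the headers, which is exactly why delivery cannot depend on the headers.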
Each recipient is acknowledged with a “250 OK” reply. A recipient may also be rejected using a reply with a 550 reply code. This depends on how the server has been configured. Dial-up ISP SMTP relay servers may accept every RCPT TO: command, even if the address specified is invalid, because the server doesn’t know that the address is invalid until it does a DNS lookup on it. However, a mail server intended to receive messages for local users or a specific domain would reject mail for addresses that are not at that domain.
Other replies may be received in response to RCPT TO: commands as a result of the SMTP server being helpful. If an address is incorrect but the server knows the correct address it could respond with “251 User not local; will forward to <address>” or “551 User not local; please try <address>”. Note the different reply codes signifying whether the server has routed the message or not. These replies aren’t common, and a mail client may simply treat the 551 response as an error, rather than try to parse the alternative address out of the reply text.
For the sake of completeness it should be pointed out that RCPT TO: commands may specify routes, not merely addresses. A route would be expressed in the form “RCPT TO: <server1,server2:someone@server3>”. Today this capability is rarely needed.
Once all the recipients have been specified, all that remains is for the sender to send the message itself. First it sends the command “DATA”, and then waits for a reply like: “354 Start mail input; end with <CRLF>.<CRLF>”. The message is then sent as a succession of lines of text. No acknowledgement is received for each line, though the sender needs to watch for a reply that indicates an error condition.
The end of the message is, as indicated by the reply shown above, a period (full stop) on a line of its own. Thus, one of the simplest but most essential things that a mail client must do is ensure that a line containing a single period does not appear in the actual text. The end of the message is acknowledged with “250 OK”. It’s worth noting that SMTP isn’t in the least bit interested in the content of the message. It could be absolutely anything, though strictly speaking it should not contain any characters with ASCII values in the range 128 to 255, and lines of text may not exceed 1,000 characters. There is no requirement for the headers to show the same sender and recipient addresses that were used in the SMTP commands, which makes it easy to make a message appear to have come from someone other than the true sender.
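The usual way to keep a lone period out of the transmitted text, as set out in the SMTP specification, is for the sender to add an extra period to the front of every line that begins with one; the receiver removes it again. A minimal Python sketch follows. The function name dot_stuff is an assumption for this example, and the check of line length and 7-bit content is left out.

```python
def dot_stuff(body: str) -> str:
    """Prepare message text for sending after the DATA command."""
    out = []
    for line in body.splitlines():
        # A line starting with a period gets one more, so that a line
        # consisting of a single period can only ever mean end-of-message.
        if line.startswith("."):
            line = "." + line
        out.append(line + "\r\n")
    # Terminate the message with a period on a line of its own.
    return "".join(out) + ".\r\n"
```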
When a message is relayed by the server it inserts a “Received:” header at the start of the message showing the identity of the host that sent the message, its own host name, and a time stamp. Each SMTP server that a message passes through adds its own “Received:” header. Thus it is possible to track the path taken by a message. Although this information doesn't identify the sender it may shed some light on where the message came from.
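Because each server puts its header at the start of the message, the most recent hop is at the top and the path is read from the bottom up. The following sketch uses the email package from the Python standard library to list the hops in the order they happened; the function name received_path is an assumption for this example.

```python
from email import message_from_string


def received_path(raw_message: str) -> list[str]:
    """Return the Received: header values, oldest hop first."""
    msg = message_from_string(raw_message)
    hops = msg.get_all("Received") or []
    # Headers are inserted at the top, so reverse them to get the
    # order in which the message actually travelled.
    return list(reversed(hops))
```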
After the “250 OK” that acknowledges the end of the message, the sender can start again with a new message by sending a new “MAIL FROM:” command, or it can sign off from the server using “QUIT”. A 221 reply will be received in response to the QUIT command.
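The whole exchange described so far can be sketched as one short Python function. It is only an illustration: the name smtp_session and the send and recv callables (which send one line and read one reply line over an already open connection) are assumptions, multi-line replies are not handled, and the commands use the unspaced form from the SMTP specification. Real programs would normally use the smtplib module in the Python standard library instead.

```python
from typing import Callable, Iterable


def smtp_session(send: Callable[[str], None], recv: Callable[[], str],
                 hostname: str, return_path: str,
                 recipients: Iterable[str], body_lines: Iterable[str]) -> None:
    """Send one message, checking the reply code after each step."""
    def expect(code: str) -> None:
        reply = recv()
        if not reply.startswith(code):
            raise RuntimeError(f"expected {code}, got {reply!r}")

    expect("220")                        # server greeting
    send(f"HELO {hostname}")
    expect("250")
    send(f"MAIL FROM:<{return_path}>")   # starts a new message
    expect("250")
    for rcpt in recipients:
        send(f"RCPT TO:<{rcpt}>")        # one command per recipient
        expect("250")
    send("DATA")
    expect("354")
    for line in body_lines:
        # Double a leading period so the text cannot end the message early.
        send("." + line if line.startswith(".") else line)
    send(".")                            # end of message
    expect("250")
    send("QUIT")
    expect("221")
```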
SMTP servers should support two further commands for a minimum implementation. NOOP does nothing, but should provoke a “250 OK” reply. RSET aborts the current message transaction. Other commands such as HELP are really only of interest to those trying to communicate with SMTP servers interactively.
Post Office Protocol 3: POP3
SMTP is capable of delivering mail direct to the recipient’s desktop, but in practice it isn’t the ideal protocol for this. If an SMTP relay is unable to deliver a message to the next (or final) host in the chain, it will try at ever-lengthening intervals over a period of a few days before giving up and sending a delivery failure notification to the return path address.
SMTP offers no way for the recipient to prompt a server into sending mail that it is trying to deliver. If a recipient connects to the Internet infrequently - for example, using a dial-up connection - the server holding their mail may never make a delivery attempt during the periods that they are online. In this case the mail will eventually bounce. SMTP is rather like a courier delivery service. If you aren’t in when it calls then, after a couple of re-delivery attempts, the letter is returned to the sender.
Post Office Protocol version 3 (POP3) - as the name suggests - lets you have your mail held at the post office so you can collect it at a time of your own choosing. POP3 is another TCP application, and uses the well-known port number 110. As with the other text-based application protocols you can connect to a POP3 server using a Telnet terminal emulator and interact with it using POP3 commands. This can sometimes be useful, for example to manually delete a corrupt message that crashes a mail client whenever it is downloaded.
On connecting to the server, the server should respond with the message “+OK POP3 server ready”. POP3 uses “+OK” and “-ERR” at the start of replies to indicate acceptance or rejection of commands. This is simpler than the numeric codes used by SMTP and other protocols: software need only check the first character for a plus or a minus. The text that may appear after a “+OK” is a prompt for what to do next. After “-ERR” it is an error description. The exact content of the text may vary between server implementations.
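A client-side check of a POP3 status line might look like the following Python sketch. The function name pop3_status is an assumption for this example.

```python
def pop3_status(reply: str) -> tuple[bool, str]:
    """Return (accepted, text) for a POP3 status line."""
    reply = reply.rstrip("\r\n")
    if reply.startswith("+OK"):
        return True, reply[3:].strip()
    if reply.startswith("-ERR"):
        return False, reply[4:].strip()
    raise ValueError(f"not a POP3 status line: {reply!r}")
```

Since the wording after the status indicator varies between servers, the text is best kept for display or logging and not used for decisions.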
A POP3 server holds people’s personal mail, so unsurprisingly you need to enter a user name and a matching password before you can gain access to it. To log in you must send “USER username”. A “+OK” response may show that the user name is valid, though this would give useful information to a hacker, so most modern POP3 servers send “+OK” to any user name, valid or not, and only reject the authentication attempt when the password is entered.
The password is sent using “PASS password”. If the username and password combination is correct, you will receive another positive acknowledgement in a reply like “+OK username has two message(s) (914 octets)”. The reply “-ERR” is received if the user name is not known, the password is incorrect or the server is for some reason unable to open a user’s mailbox.
Once a client is successfully logged in, it can issue several different commands which allow it to find out how many messages are waiting and how big they are, and to download the messages and delete them from the server. The “STAT” command returns the number of messages waiting <mw> and their total size in bytes <sb>, as a response in the form “+OK <mw> <sb>”. Note that this is the same information given in the login acknowledgement, but in a form (two numbers separated by a single space) that is easier for the client software to process.
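Parsing the STAT reply then takes only a few lines. The sketch below is illustrative, and the function name parse_stat is an assumption.

```python
def parse_stat(reply: str) -> tuple[int, int]:
    """Return (message_count, total_size) from a reply like '+OK 2 914'."""
    parts = reply.split()
    if len(parts) < 3 or parts[0] != "+OK":
        raise ValueError(f"unexpected STAT reply: {reply!r}")
    return int(parts[1]), int(parts[2])
```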
The command “LIST” can be used to determine the size of each message. After the “+OK” the server sends, on separate lines, the message numbers <mn> and the message sizes <ms> separated by spaces. Waiting messages are numbered sequentially, starting from 1. The command “LIST <mn>” can be used to find out the size of a specific message. The LIST command is typically used by mail clients that implement a user-defined restriction on the size of messages that will be downloaded, or those that want to display a progress indicator that shows how much of each message has been downloaded.
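A client that enforces a size limit could process the LIST output along the following lines. This is a sketch under stated assumptions: the function names parse_list and within_limit are invented for the example, and the input is taken to be the lines that follow the “+OK” status line, already stripped of their line endings and ending with the usual line containing a single period.

```python
def parse_list(lines: list[str]) -> dict[int, int]:
    """Map message number to size from the lines following '+OK'."""
    sizes = {}
    for line in lines:
        if line == ".":          # end of the listing
            break
        number, size = line.split()[:2]
        sizes[int(number)] = int(size)
    return sizes


def within_limit(sizes: dict[int, int], max_bytes: int) -> list[int]:
    """Message numbers small enough to download under a user-set limit."""
    return [n for n, size in sizes.items() if size <= max_bytes]
```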
POP3 provides no commands that enable a client to find out the subject of a message or who it is from. However, the TOP command lets the client download a message’s headers and a specified number of lines from the message body, from which this information may be obtained. TOP is an optional POP3 command, but its implementation is strongly recommended. The format of the command is “TOP <mn> <nl>” where <mn> is the message number and <nl> the number of lines required. The response is “+OK” (if <mn> is valid) followed by a partial download of the message. The end of the download is indicated by a line containing a single period (full stop).
The command “RETR <mn>” is used to retrieve messages from the server. The command must include a message number <mn>. After a “+OK” acknowledgement the server sends the whole message. Again, the end of the message is indicated by a line containing just a period.
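Reading a multi-line response such as the output of RETR or TOP can be sketched as a small Python generator. The name read_multiline is an assumption, and the input is taken to be the lines that arrive after the “+OK” status line. One detail comes from the POP3 specification and not from the description above: the server adds an extra period to the front of any message line that starts with a period, and the client has to strip it again.

```python
from typing import Iterable, Iterator


def read_multiline(lines: Iterable[str]) -> Iterator[str]:
    """Yield message lines until the line containing a single period."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line == ".":
            return                      # end of the response
        if line.startswith("."):
            line = line[1:]             # undo the server's extra period
        yield line
    # Running out of input without seeing the terminator is an error.
    raise ConnectionError("response ended before the terminating period")
```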
The command “DELE <mn>” is used to delete a message. In fact, the DELE command only marks messages for deletion. Any messages marked for deletion during a session may be undeleted by issuing an “RSET” command. The messages are only deleted once the client has closed the POP3 session by issuing a “QUIT” command. If a client never gets to close a session properly because the connection is lost or times out then you may find some messages being downloaded a second time the next time the mail client connects to the server.
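The way deletions only take effect at QUIT can be modelled with a toy Python class. This is not server code, just an illustration of the behaviour described above; the class name PendingDeletes and its method names are assumptions.

```python
class PendingDeletes:
    """Toy model of how a POP3 server treats DELE, RSET and QUIT."""

    def __init__(self, message_numbers):
        self.messages = set(message_numbers)   # messages held in the mailbox
        self.marked = set()                    # marked for deletion this session

    def dele(self, number: int) -> None:
        if number not in self.messages or number in self.marked:
            raise KeyError(f"no such message: {number}")
        self.marked.add(number)                # only a mark, nothing is removed

    def rset(self) -> None:
        self.marked.clear()                    # undelete everything marked

    def quit(self) -> None:
        self.messages -= self.marked           # deletions happen only here
        self.marked.clear()

    def drop_connection(self) -> None:
        self.marked.clear()                    # session lost: marks are forgotten
```

After drop_connection the mailbox still holds every message, which is why a client may see the same messages again at its next login.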
In order to avoid downloading messages twice, a POP3 client can use the command “UIDL” or “UIDL <mn>” to obtain a unique, server-generated ID for each message. By storing the IDs of downloaded messages in a file, a client can easily determine whether a message on the server has been previously retrieved or not. Implementation of the UIDL command is optional, but most POP3 servers seem to support it and most mail clients use it if it is available.
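The bookkeeping this involves can be sketched as follows. The function names parse_uidl and new_messages are assumptions, the input is taken to be the lines that follow the “+OK” status line with their line endings removed, and saving the set of seen IDs to a file between sessions is left out for brevity.

```python
def parse_uidl(lines: list[str]) -> dict[int, str]:
    """Map message number to unique ID from the lines following '+OK'."""
    ids = {}
    for line in lines:
        if line == ".":          # end of the listing
            break
        number, unique_id = line.split()[:2]
        ids[int(number)] = unique_id
    return ids


def new_messages(ids: dict[int, str], seen: set[str]) -> list[int]:
    """Message numbers whose unique ID has not been downloaded before."""
    return [n for n, unique_id in ids.items() if unique_id not in seen]
```

After a successful download the client would add the new IDs to the seen set, so that a session that ends before QUIT does not lead to duplicates next time.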
SMTP and POP3 are two of the most commonly-used Internet protocols. Their text-based nature, which makes it possible to send and receive messages by communicating with a server interactively using a simple Telnet client, also makes it easy to write client software using just about any programming language that can send and receive text using TCP.
In this series of articles it has only been possible to give an overview of the most important protocols used on the Internet. The full specifications of these and other Internet protocols can be found in Requests For Comments (RFCs) published by the Network Working Group. RFCs are freely available for download from the Internet. Anyone interested in finding out more about TCP/IP, and particularly in implementing their own TCP/IP applications, should obtain and study the RFCs for the protocols concerned. However, even if you never have to write your own Internet software it is hoped that these articles have piqued your interest and contributed to a better understanding of how TCP/IP and the Internet really work.