Clients and Servers

At the endpoints of the internet, you find two different kinds of computers: servers and clients. The servers own resources like web pages and databases. Clients request these resources.

If a resource is popular, then a server must be very fast in order to keep up with the requests. The requests may come in from all across the world at any time of day. The software that listens for these requests often runs in an infinite loop and has no graphical interface. Many servers do not even have a monitor or keyboard attached, as no human is expected to operate the computers directly. Instead, servers reside in data centers that have very fast connections to the internet. Developers log in to the servers remotely to manage them.

Not every resource requires such fanfare. You can turn the machine that you are currently using into a server just by writing code. First you open a socket, which is a communication channel managed by the operating system. Then you write an infinite loop that awaits client requests and serves out the resources.
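
The two steps above can be sketched in Python. This is only a minimal illustration, not a production server: the names make_server_socket and serve, the resources dictionary, and the max_requests parameter are all assumptions made for this example. The loop runs forever when max_requests is left as None.

```python
import socket


def make_server_socket(port=0, host="127.0.0.1"):
    """Open a listening TCP socket. Port 0 asks the OS to pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


def serve(sock, resources, max_requests=None):
    """Await client requests and serve out the named resources.

    Each client sends the name of a resource (hypothetical one-line protocol).
    The server replies with the matching text, or "not found".
    With max_requests=None the loop never ends.
    """
    handled = 0
    while max_requests is None or handled < max_requests:
        connection, _client_address = sock.accept()  # blocks until a client connects
        with connection:
            name = connection.recv(1024).decode("utf-8").strip()
            body = resources.get(name, "not found")
            connection.sendall(body.encode("utf-8"))
        handled += 1
```

Calling serve(make_server_socket(8000), {"greeting": "hello"}) would, under these assumptions, keep your machine answering requests on port 8000 until you stop the program.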

The internet is a general purpose utility, and many different kinds of traffic run across it. A server must therefore be able to manage many different kinds of requests. Imagine all the businesses in your city having a single storefront with a single door. The poor customer who walks through that door would likely have to explain their needs several times before getting the proper help.

Instead of having all requests handled by a single door, a server has many doors. Each service that the server provides has its own infinite loop and socket. When a client makes a request, it uses an integer port number to identify the door, or socket, through which it wishes to communicate. Port numbers are 16-bit unsigned integers, so they range from 0 to 65535.
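
As a rough sketch of the many-doors idea, the Python below starts two services on one machine, each with its own socket and its own loop running in a background thread. The function names start_service and ask and the reply strings are hypothetical. Each socket binds to port 0 so that the operating system picks a free port, which sidesteps the conventional port numbers a real service would use.

```python
import socket
import threading


def start_service(reply, host="127.0.0.1"):
    """Open a socket on a free port and answer every client with `reply`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0: let the OS choose an unused port
    sock.listen()

    def loop():
        # This service's own infinite loop, separate from every other service.
        while True:
            connection, _client_address = sock.accept()
            with connection:
                connection.sendall(reply.encode("utf-8"))

    # A daemon thread ends when the main program exits.
    threading.Thread(target=loop, daemon=True).start()
    return sock


def ask(port, host="127.0.0.1"):
    """Connect to the door identified by `port` and read the full reply."""
    with socket.create_connection((host, port), timeout=5) as connection:
        chunks = []
        while True:
            data = connection.recv(1024)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

A client that calls ask with the port of the first service gets that service's reply, and a client that uses the second port gets the other. The port number alone decides which loop handles the request.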

Conventions govern how port numbers are associated with different services. Messages sent across port 123, for instance, are expected to conform to the Network Time Protocol (NTP). Computers use NTP messages to synchronize their clocks. Consider these other common ports:

Port 25 is used to send email via the Simple Mail Transfer Protocol (SMTP).
Port 143 is used to receive email via the Internet Message Access Protocol (IMAP).
Port 194 is registered for exchanging Internet Relay Chat (IRC) messages, though in practice most IRC servers listen on port 6667.
Port 119 is used for sharing posts on Usenet, which uses the Network News Transfer Protocol (NNTP). Usenet was accessed directly via this port in the early days of the internet. Later, many people reached it through web gateways such as Google Groups.
Port 80 is used for requesting web pages using the Hypertext Transfer Protocol (HTTP).
Port 443 is like port 80, but the messages are encrypted so that eavesdroppers can't read them. The traffic through this port follows HTTPS, the secure version of HTTP.
Port 25565 is the default port that Minecraft clients use to send requests to a Minecraft server.

The port number identifies a service. The client must also have a way of identifying the server with which it wants to communicate. That problem was solved in the 1980s by the Internet Protocol (IP). When a computer joins a network, it is assigned an IP address. In version 4 of IP, an IP address is made up of four bytes. Inside the machine, this is a 32-bit integer. For human audiences, an IP address is often displayed in dotted quad notation, as in 172.16.254.1. The client opens a socket and specifies both the IP address and the port number, the combination of which uniquely identifies a door on the internet.
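
To see how the four bytes and the 32-bit integer relate, consider this small Python sketch. The function names dotted_quad_to_int and int_to_dotted_quad are made up for this example. Python's standard ipaddress module offers the same conversion through int(ipaddress.IPv4Address(text)).

```python
def dotted_quad_to_int(text):
    """Convert an IPv4 address like "172.16.254.1" to its 32-bit integer."""
    parts = [int(piece) for piece in text.split(".")]
    if len(parts) != 4 or any(not 0 <= part <= 255 for part in parts):
        raise ValueError(f"not a dotted quad: {text!r}")
    # The first number is the most significant byte.
    return int.from_bytes(bytes(parts), "big")


def int_to_dotted_quad(number):
    """Convert a 32-bit integer back to dotted quad notation."""
    return ".".join(str(byte) for byte in number.to_bytes(4, "big"))


print(dotted_quad_to_int("172.16.254.1"))  # prints 2886794753
print(int_to_dotted_quad(2886794753))      # prints 172.16.254.1
```

The integer is 172 × 2^24 + 16 × 2^16 + 254 × 2^8 + 1, which is why each of the four numbers in the dotted quad must fit in a single byte.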

IP addresses are hard on users because they have no mnemonic value. To remedy this, a service was built on top of them. Name servers on the network maintain a distributed directory that maps human-readable names to IP addresses. The human-readable names are called domain names. Before a client sends a request to the domain name jmu.edu, it asks a name server to look up the name in the directory and receives an IP address such as 134.126.10.50 in return. The client then addresses its request to that IP address. This directory lookup is provided by the Domain Name System (DNS).
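
A program can ask for this lookup itself. The sketch below uses Python's socket.getaddrinfo, which hands the name to the operating system's resolver. The function name resolve is an assumption for this example, and the addresses it returns depend on the directory's contents at the moment you ask.

```python
import socket


def resolve(name):
    """Return the sorted IPv4 addresses that the system resolver reports for name."""
    results = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, sockaddr),
    # and for IPv4 the sockaddr is an (address, port) pair.
    return sorted({sockaddr[0] for *_rest, sockaddr in results})
```

Calling resolve("jmu.edu") on a machine with internet access returns the address or addresses currently registered for that name. If the name is already a dotted quad, the resolver hands it back unchanged without consulting the directory.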