At the endpoints of the internet, you find two different kinds of computers: servers and clients. The servers own resources like web pages and databases. Clients request these resources.
If a resource is popular, then a server must be very fast in order to keep up with the requests. The requests may come in from all across the world at any time of day. The software that listens for these requests often runs in an infinite loop and has no graphical interface. Many servers do not even have a monitor or keyboard attached, as no human is expected to operate the computers directly. Instead, servers reside in data centers that have very fast connections to the internet. Developers log in to the servers remotely to manage them.
Not every resource requires such fanfare. You can turn the machine that you are currently using into a server just by writing code. First you open a socket, which is a communication channel managed by the operating system. Then you write an infinite loop that awaits client requests and serves out the resources.
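The two steps just described can be sketched in Python. This is a minimal sketch rather than a production server. The names `open_listener`, `serve`, `fetch`, and `demo` are hypothetical, the server binds only to the loopback address 127.0.0.1, and the `max_requests` parameter is an assumption added so that the otherwise infinite loop can be stopped in a demonstration.

```python
import socket
import threading


def open_listener(port=0, host="127.0.0.1"):
    """Open a listening TCP socket. Port 0 asks the OS to pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    return listener


def serve(listener, resource, max_requests=None):
    """Send the same bytes to every client that connects."""
    handled = 0
    # With max_requests=None this loop never ends, as a real server's would.
    while max_requests is None or handled < max_requests:
        connection, _address = listener.accept()  # blocks until a client arrives
        with connection:
            connection.sendall(resource)
        handled += 1


def fetch(host, port):
    """Act as a client: connect, read until the server closes, return the bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as client:
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo():
    """Run the server in a background thread and request the resource once."""
    listener = open_listener()
    port = listener.getsockname()[1]
    worker = threading.Thread(
        target=serve, args=(listener, b"hello", 1), daemon=True
    )
    worker.start()
    try:
        return fetch("127.0.0.1", port)
    finally:
        worker.join(timeout=5)
        listener.close()
```

Calling `serve(open_listener(8000), b"hello")` would answer every connection on port 8000 until the program is interrupted. Port 8000 is an arbitrary choice. A second Python process could then play the client with `fetch("127.0.0.1", 8000)`.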
The internet is a general purpose utility, and many different kinds of traffic run across it. A server must therefore be able to manage many different kinds of requests. Imagine all the businesses in your city having a single storefront with a single door. The poor customer who walks through that door would likely have to explain their needs several times before getting the proper help.
Instead of having all requests handled by a single door, a server has many doors. Each service that the server provides has its own infinite loop and socket. When a client makes a request, it identifies the door or socket through which it wishes to communicate using an integer port number.
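The many-doors idea can be sketched by giving each service its own socket, its own port, and its own loop running in a separate thread. The function names, the two example services, and the use of threads are all assumptions made for illustration. Real servers often run each service as a separate program.

```python
import socket
import threading


def run_service(listener, make_reply):
    """One service: loop forever on this service's own socket."""
    while True:
        connection, _address = listener.accept()
        with connection:
            connection.sendall(make_reply())


def start_services(services, host="127.0.0.1"):
    """Start every service on its own port and return {name: port}."""
    ports = {}
    for name, make_reply in services.items():
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.bind((host, 0))  # port 0: let the OS choose a free port
        listener.listen()
        # Daemon threads end when the main program exits.
        threading.Thread(
            target=run_service, args=(listener, make_reply), daemon=True
        ).start()
        ports[name] = listener.getsockname()[1]
    return ports


def ask(port, host="127.0.0.1"):
    """Client side: pick a door by port number and read the whole reply."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as client:
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

With `ports = start_services({"greeting": lambda: b"hello", "farewell": lambda: b"goodbye"})`, a client that calls `ask(ports["greeting"])` reaches one service and a client that calls `ask(ports["farewell"])` reaches the other, even though both run on the same machine.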
Conventions govern how port numbers are associated with different services. Messages sent across port 123, for instance, are expected to conform to the Network Time Protocol (NTP). Computers use NTP messages to synchronize their clocks. Consider these other common ports:
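A few conventional assignments from the IANA registry can be written as a small lookup table. The selection below is an illustrative assumption rather than a complete list, and `WELL_KNOWN_PORTS` and `protocol_for` are hypothetical names.

```python
# A hand-picked sample of conventional port assignments (not exhaustive).
WELL_KNOWN_PORTS = {
    22: "SSH",     # remote login
    25: "SMTP",    # mail transfer
    53: "DNS",     # name lookup
    80: "HTTP",    # web pages
    123: "NTP",    # clock synchronization
    443: "HTTPS",  # web pages over an encrypted connection
}


def protocol_for(port):
    """Return the conventional protocol for a port, or None if not in the sample."""
    return WELL_KNOWN_PORTS.get(port)
```

These assignments are conventions only. Nothing prevents a program from listening on port 80 and speaking some other protocol, but clients that expect HTTP there will not understand it.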
The port number identifies a service. The client must also have a way of identifying the server with which it wants to communicate. That problem was solved in the 1980s by the Internet Protocol (IP). When a computer joins a network, it is assigned an IP address. In version 4 of IP, an IP address is made up of four bytes. Inside the machine, this is a 32-bit integer. For human audiences, an IP address is often displayed in dotted quad notation, as in 172.16.254.1. The client opens a socket and specifies both the IP address and the port number, the combination of which uniquely identifies a door on the internet.
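The relationship between the four bytes of a dotted quad and the 32-bit integer can be made concrete with a little bit shifting. The sketch below uses hypothetical function names and cross-checks the result against Python's standard `ipaddress` module.

```python
import ipaddress


def dotted_quad_to_int(text):
    """Pack the four decimal bytes of an IPv4 address into one 32-bit integer."""
    parts = [int(piece) for piece in text.split(".")]
    if len(parts) != 4 or any(not 0 <= byte <= 255 for byte in parts):
        raise ValueError(f"not a dotted quad: {text!r}")
    value = 0
    for byte in parts:
        value = (value << 8) | byte  # shift earlier bytes left, append this one
    return value


def int_to_dotted_quad(value):
    """Unpack a 32-bit integer into dotted quad notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


address = "172.16.254.1"
as_int = dotted_quad_to_int(address)  # 2886794753, or 0xAC10FE01 in hexadecimal
# The standard library agrees with the hand-rolled conversion.
assert as_int == int(ipaddress.IPv4Address(address))
assert int_to_dotted_quad(as_int) == address
```

Each of the four numbers occupies one byte, which is why none of them can exceed 255.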
IP addresses are relatively hostile to the user experience in that they have no mnemonic value. To remedy this, a service was built on top of them. Name servers in the network maintain a directory that defines a mapping from human-readable names to IP addresses. The human-readable names are called domain names. When a client makes a request using the domain name jmu.edu, it first asks a name server to look up the domain name in the directory, and the name server answers with the IP address 134.126.10.50. The client then connects to that address. The system that performs this directory lookup is called the Domain Name System (DNS).
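A program can ask the operating system to perform this lookup on its behalf. The sketch below wraps the standard `socket.getaddrinfo` call. The function name `resolve` is hypothetical, and the sketch is restricted to version 4 addresses to match the discussion above.

```python
import socket


def resolve(name):
    """Return the IPv4 addresses that the system resolver reports for a name."""
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry ends with a (host, port) pair; keep the unique host strings.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("jmu.edu")` requires a network connection. The addresses it returns may differ from the one quoted above, because the directory can be updated whenever a domain's owner moves a service to a different machine.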