Computer Networking Chapter 2 (Layer 3-4)
Categories: Computer Networking
Tags: Computer Networking
Important Words to Remember:
Node: It is the connection point in a network. Examples include hubs, switches, computers etc.
Interface: It is the port on the node in a network. They can either be physical (like NICs which make up an ethernet port), or virtual (like loopback or VPN).
PDU (Protocol data unit): The small data units used to slice chunks into smaller/more specialized chunks. Segments, packets, datagrams, and frames are all PDU.
Fragmentation: The process exclusively in OSI layer 3 that involves slicing packets into smaller pieces.
Segmentation: The process exclusively in the OSI layer 4 for the TCP protocol that involves the slicing of data stream into smaller pieces.
Encapsulation: Adding a new header/info to a PDU.
Decapsulation: Removing a header/info to a PDU.
Packets: The PDU of the Internet layer in the TCP/IP model.
Segments: PDU for OSI layer 4 (transport layer) for TCP related tasks.
Datagram: PDU for OSI layer 4 (transport layer) for UDP related tasks.
TCP (Transmission Control Protocol):
- An OSI layer 4 protocol for loading a webpage, exchanging messages, email, and anything else that involves reliability instead of speed.
- It uses segmentation before being transmitted.
- Unicast uses TCP only.
UDP (User Datagram Protocol):
- An OSI layer 4 protocol for game servers, VoIP (Viber, WhatsApp, Telegram etc.) and other time-sensitive applications.
- It does NOT use segmentation before being transmitted.
- Multicast and broadcast usually use UDP.
IP: An OSI layer 3 protocol that defines how to address and route packets for delivery.
Understanding TCP/IP Protocol Suite
Recalling from the previous chapter, TCP/IP suite is a collection of protocols that allows connectivity between different local or remote networks. It covers layer 3 to layer 7 of the OSI model.
Here is a table to illustrate:
TCP/IP Layer | OSI Equivalent Layer(s) | Common Protocols | PDUs (Protocol Data Units) at this layer |
---|---|---|---|
Application | Application, Presentation, Session | HTTP, HTTPS, FTP, SMTP, POP3, IMAP, DNS, SSH, Telnet, SNMP | Data, Messages, Streams |
Transport | Transport | TCP, UDP | Segments (TCP), Datagrams (UDP) |
Internet | Network | IP (IPv4, IPv6), ICMP, ARP, IGMP | Packets (or Datagrams) |
Network Access | Data Link, Physical | Ethernet, Wi-Fi (802.11), PPP, Frame Relay, ATM | Frames (at Data Link), Bits (at Physical) |
They function together and are collevtively named due to both TCP and IP being ubiquitous protocols.
Note: The reason why it is not "UDP/IP" is because TCP is older than UDP. Moreover, most TCP/IP suite applications require TCP to function because it is much more dominant than UDP.
Ok, but what do protocols have in common?
They are inherently categorized into numerous separate categories. Here are four well-known ones:
- Connection-Oriented Protocol: Involves two hosts making assurances (ex: "handshake process") and establish an agreement prior to any data being exchanged.
- Ex: TCP, SCTP, ATM
Note: Protocols are often reliant on other protocols to function. Some of the most common protocols that we use like SMTP or IMAP for mails, FTP for file transfers, and SSH for remote computer access need TCP in order to function. This is why unlike the SCTP and ATM, TCP stands out because it has better support and is the most common.
-
Connectionless Protocol: Involves a host releasing data to another without any prior acknowledgement or any pre-arranged preparations. It spews out data without any verifications.
- Ex: UDP, IP, ARP
-
Stateful protocols: Keeps track of each packet during a session in order to quickly retransmit data in case of a network failure.
- Ex: FTP, TCP, SSH
-
Stateless protocols: Does not remember or track packets during a session. Data is picked up where they were left off once a connection gets reestablished.
- Ex: IP, HTTP, DNS
Understanding PDUs
It is important to note that TCP/IP deals with packets and segments/datagrams, not frames.
Here is an image to illustrate what a packet looks like. It is essentially data wrapped around lower-level data:

Here are the PDUs that you need to learn:
- Frame: PDU of OSI Layer 2
- Packet:
- Generic meaning: Any data that are grouped together as a bundle.
- Specific meaning: PDU OSI of Layer 3
- Segment: TCP PDU of OSI Layer 4
- Datagram: UDP PDU of OSI Layer 4
In a colloquial setting, all of the PDUs mentioned above can be referred to as "packets" unless otherwise noted.
Ok, but what about OSI layer 1?
OSI Layer 1's PDU is just "bits" or "symbol sequences". They are just raw streams of 1s and 0s. They bear no logical structure and only bring signals (electrical/light impulses or radio waves) across the physical medium.
TCP/IP in Action
When PDUs traverse across the network, information gets added as they leave the client to a remote destination. They go through several layers in the OSI model before they are disassembled when they arrive at their destination.
Encapsulation to Decapsulation:
In order to understand how either encapsulation or decapsulation work, it is better to provide a real life example of a packet being assembled and sent over from one host to the other.
Let's say PC1 wants to send a file to PC2 via an Ethernet cable or wirelessly using an SSH session with the scp
command. The concise steps to achieve this are as follows, starting with TCP/IP's Application Layer:
- Application Layer:
- As the SSH session begins, PC1 must resolve PC2's hostname to an IP address. Only one of the following scenarios happen:
- If PC2's IP address is not found, PC1's Domain Name System (DNS) resolver will ask the DNS server (often the home router) to obtain it.
- If found, it is done by checking though PC1's hosts file or DNS cache in memory.
- When the client generates the data necessary to peform a particular action, it will be formatted and ecrypted through SFTP.
- Once data is ready, it gets passed down to the Transport layer.
- Transport Layer:
-
Because the Application Layer's SSH request requires a connection-oriented protocol, TCP will be used.
-
Since TCP requires proper ordering to prevent packet loss or misordering, it divides data into little chunks of manageable PDUs ("segmentation") called "segments".
-
After segmentation, new information gets added or encapsulated into segments. This forms the segment header that contains the source and destination port numbers. This is done in order to identify the application process on both hosts (the destination port number will be 22)
Note: There are two things to be aware of:
- Usually when a header is added, it is not just soure/destination fields. There are many more fields that are added to the header that they are beyond the scope of this chapter. For now, it is important to know the most notable fields are the source/destination ones.
- The most common confused Transport Layer aspect is how the source port and the destination port are not equal to each other: We go through port numbers here.
-
Once the data is segmented and the Transport Layer header is fully assembled, the segment will be passed down to the Internet Layer.
- Internet Layer:
- Once the Transport Layer segment is received, Internet Protocol (IP) will be used to prepare routing.
- TCP segment is now encapsulated within the IP header to form a new PDU called "packet". If the size of the packet exceeds the Maximum Transmission Unit (MTU), it will go through a process called "fragmentation" in order to achieve proper processing of data by the destination (PC2).
- The packet now holds both the source and the destination's IP addresses that are initially retrieved and carried over from the Application Layer. It finally has the information needed for the location to be sent based on the destination address alone.
- Network Access Layer:
- As the IP packet arrives at this layer, one final encapsulation is required before a transmission takes place.
- The IP packet is now encapsulated into an Ethernet frame by adding an Ethernet header (which includes MAC addresses) and a trailer (used for error detection).
- The frame now heads off to the physical layer for delivery across the network.
As the frame arrives at its destination, one of the two scenarios happen:
- If unsuccessful: The Time to Live (TTL) field in the IP header decrements by one count with every router hop. For this case, the IP packet is not discarded nor the TTL is decremented because both hosts belong in the same network and are at a short distance between each other.
- If successful: The IP packet is in tact and the PC2 will reverse the adding of data through the processes of decapsulation. The headers are removed until data gets processed by the Application Layer.
The copied file is now accessible in the current working directory.
OSI Layer 3 and 4
Layer 3 (network layer) is the layer where routing takes place. It is where former frames (now called packets), move from one network to another. It forms the basis of the internet.
OSI Layer 3
IP (Internet protocol) is the most important protocol out of all the OSI model. Without IP routing, there will be no internet:
- It implements the logical aspect of networking:
- It relies on IP addresses to enable internetwork routing.
- It is a connectionless protocol:
- IP relies on other protocols for verification and to establish a secure connection.
- IP is responsible to help route packets between different networks:
- It implements the routing aspect of the network but it does not oversee them. It is only for identification purposes.
IP comes with IP addresses. They sit on top of networking interfaces on any computer. They are either routers, switches, PCs, etc. They all connect together to perform the tasks needed to form a network.
The most popular devices that belong in layer 3 are routers and switches (not to be confused with layer 2 switches).
IP addresses sit on top of router or switch interfaces (i.e. gigabitEthernet0/0/0, fastEthernet0/0/0, etc.) to make IP reliable.
Explaining IP Addresses
Recall that layer 2 relies on MAC address. Layer 3 however, relies on IP addresses.
IP (internet protocol) address: It identifies a device on a local network and operates at level 3 of the OSI model. Unlike MAC addresses that are permanently embedded in NICs, they are configured in any layer 3-supported device's operating system (i.e. PCs, switches, routers).
All IP addresses consist of two parts:
- Host ID: Identifies the host portion of the IP address.
- Network ID: Identifies the network portion of the IP address.
There are two popular types of IP addresses, IPv4 (32-bits) and IPv6 (128-bits). They can all be either public, private, static, or dynamic.
-
Static IP address: IP addresses that do not change over time.
-
Dynamic IP address: IP addresses that can change over time.
-
Public/external IP address: IP addresses that are acccessible outside the local network. It is no different to a home addresses.
-
Private IP address: IP addresses that are only accessible to hosts connected on the same network.
Recall:
A "bit" is one digit that is either one or zero. Therefore, an octet is 8 "bits" of ones and zeroes.
In computer science, the word "byte" is used more commonly instead of "octet". Because "byte" did not historically always mean 8 bits, the word "octet" better fits the description.
Note:
- In computer networking terminology:
- "octet" == "8 bits"
- "hextet" = "16 bits"
Let's discuss the history then dive deeper into IPv4 and IPv6 respectively:
History of IP
- The Experimentation Age (1980-1981): Initially, it was decided to only have the first 8-bits (octet) of an IPv4 address to define the network of a given host. This made the internet to only have 254 networks (2 addresses were reserved from 256). As you may well know, the internet has far more than 254 networks today.
- Here is RFC 762 to see the various highest networking numbers with internet access back in 1980.
- Networking numbers: Back in the days before regional internet registries, IPv4 address blocks were called "network numbers". They had to be manually allocated across educational institutions, government bodies, scientific labs etc. They were all overseen by the famous Jon Postel.
-
The Classful Age (1981-1993): It was later decided to divide IPv4 addresses into fixed-size classes to accommodate different network sizes. By the time IPv4 became widely available in 1983, it was still not a sufficient way to resolve IPv4's rapid growth in the long-term.
- Although the following table is a bit outdated for practical use, it is often shown in textbooks as to provide better context of IPv4:
Note:
- 255.255.255.255 is exlucded in this table as it is exclusively used for broadcasting the entire local network.
- IPv4 addresses that start with 127 is exlucded because they are now reserved for loopback addresses.
First Decimal Value Addresses Hosts per Network ID Class A 1-126 1.0.0.0 - 126.255.255.255 16,777,214 Class B 128-191 128.0.0.0 - 191.255.255.255 65,534 Class C 192-223 192.0.0.0 - 223.255.255.255 254 Class D 224-239 224.0.0.0 - 239.255.255.255 Multicast Class E 240-255 240.0.0.0 - 254.255.255.255 Reserved
- The Classless Age (1993-present): Classless addressing (Class Inter-Domain Routing or CIDR) was introduced to resolve IPv4 waste. This was to enable companies divide networks to their own liking instead of being limited to five IPv4 classes. For example, if a company has 67,000 hosts, it will waste over a thousand addresses. This is unimaginable for today's standards.
Overall, IP has had 4 upgrades since the 80s:
- Private addressing: It is a remnant of IPv4 classes that allow IPv4 address chunks to be reusable for private networking use.
- CIDR/classless addressing: Allows network admins to lessen IPv4 exhaustion by dividing networks into smaller, scalable networks (aka "subnetting").
- NAT: Allows multiple private networks to be represented as one Public IP address.
- Loopback address: It is the internal mock networking address of the system and is used to troubleshoot TCP/IP. It is also known as "localhost".
- IPv6: A successor of IPv4 that is in use since 2012. Both IPv4 and IPv6 are widely used.
- I recommend watching Vint Cerf's interview on this. Here, he discusses on what lead to IPv6.
IPv4
IPv4 uses a dotted decimal notation which consists of four sets of numbers that are separated by a period (.). Each set is 8 bits (one octet) that ranges from 0 to 255 inclusive. In total, it uses 32-bits (or 4 octets).
This is what an IPv4 address looks like:
192.168.18.146
Ok, but how did this all come to be?
Since all computation and data allocation are done in binary, a fixed number of bits were required to designate an IP address. For DARPA back in the 1970s, it was originally decided that 32 bits would be sufficient.
To "feel" the sense IP addresses, I highly recommend watching this video by Carl Oliver. It perfectly captures the context of binary and decimal conversion for IP addresses:
Here is the conversion done in reverse (binary to decimal):
As computers only understand binary, this is what they would see (based on the last example):
11000000101010000001001010010010
In order to make this more legible, we add in a dotted decimal notation:
*1 octet* *1 octet* *1 octet* *1 octet*
| | | |
V V V V
11000000.10101000.00010010.10010010
To convert it back to decimal, each octet must be within the range 0-255 inclusive. Therefore, we need to perform a conversion one by one.
Let's take the first octet 11000000...
Binary Digit | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
---|---|---|---|---|---|---|---|---|
Power of 2 | $2^7$ | $2^6$ | $2^5$ | $2^4$ | $2^3$ | $2^2$ | $2^1$ | $2^0$ |
Decimal Value | 128 | 64 | 0 | 0 | 0 | 0 | 0 | 0 |
Total: 192 |
We repeat this for every octet to get back to this:
192.168.18.146
You've mentioned that all IP addresses have two parts, the Network ID and the Host ID. How do we determine them for IPv4 addresses?
We use a subnet mask to determine the Host ID and the Network ID. It will be impossible for an IPv4 address to be reachable without it.
If we take the same address from last time:
192.168.18.146
And have our subnet mask be assigned like this:
255.255.255.0
We can tell the Network ID and the Host ID by the two portions of 255s and 0(s) respectively. In this case:
- Network ID: 192.168.18.0
- Host ID: 0.0.0.146
Let's take another IPv4 address:
10.5.20.75
And if our subnet mask is:
255.255.0.0
Then our Network ID and our Host ID are as follows:
- Network ID: 10.5.0.0
- Host ID: 0.0.20.75
How can a subnet mask be concisely represented alongside an IPv4 address?
We use the CIDR notation for this precise use.
Let's bring back our first IPv4 address and its subnet mask as an example:
192.168.18.146
255.255.255.0
The shorthand for this is:
192.168.18.146/24
The number '24' represents the Network ID with the number of bits assigned to 1. Here is a representation of the subnet mask in binary and its decimal form:
11111111.11111111.11111111.00000000
255.255.255.0
The '1's pertain to the Network ID while the '0's pertain to the Host ID. By counting the number of '1's in the Network ID portion, we get 24.
What were the cons of IPv4?
It lead to the exhaustion of IPv4. The consequences if ignored were tremendous:
- There was a need for better IPv4 allocation for large corporations and government institutions.
- Increase demand of networking infrastructure as it was becoming cheaper to build.
- Routers could not handle large routing tables, leading to performance issues.
Private addressing is defined in RFC 1918. This became a standard for routers to never use one of the three IP ranges below for public routing:
Private IP Range | CIDR Notation | Number of Addresses | Common Use Cases |
---|---|---|---|
10.0.0.0 – 10.255.255.255 | 10.0.0.0/8 | 16,777,216 | Large internal enterprise LANs |
172.16.0.0 – 172.31.255.255 | 172.16.0.0/12 | 1,048,576 | Medium-scale networks |
192.168.0.0 – 192.168.255.255 | 192.168.0.0/16 | 65,536 | Home and small office LANs |
Within each row in the table above, we see a block of private IP addresses and CIDR notations used as a shorthand to represent the network id and the host id.
Note:
- Just because there is a private network starting with 10.x.x.x, it doesn't always mean the CIDR notation is always /8. It can be /24 through subnetting (dividing a local network into smaller sections). Only a portion of the "172" and "192" address ranges are designated for private use (hence /12 and /16 respectively).
All these implementations were not enough to permanently resolve IPv4's problems. Over time, an entirely new IP address was needed to ensure proper routing alongside IPv4.
IPv6
IPv6 is the most recent version of IP and is an improved version of IPv4. Unlike IPv4 using 32-bits, it uses 128-bits.
IPv6 addresses are separated by a colon (:) into eight sections of 16 bits (one hextet). Each segment uses a hexadecimal notation ranging from 0 to ffff.
This is what an IPv6 looks like:
2001:db8:3333:4444:5555:6666:7777:8888
In a home environment, it is the responsbility of the Internet Service Provider (ISP) to configure dynamic IP addresses automatically through a protocol called DHCP (Dynamic Host Control Protocol). They also provide other parameters like subnet masks and default gateways automatically.
For more information about IPv4, IPv6 and IP addressing, there will be a dedicated article in the works.
OSI Layer 4
OSI layer 4 helps coordinate packets to their destination. It is the "how" for layer 3.
Clarifying the "how"
The most common protocols for Layer 4 are TCP and UDP. Each protocol has different applications but serve the same purpose as they are run on top of IP.
So if Layer 2 has MAC addresses, Layer 3 has IP addresses, what does Layer 4 have?
Port numbers
As the packet gets delivered to a destination with a particular IP address, the sending application will still need to provide clarification on what port number will be used. This is because the receiving end's operating system (application layer) needs to know which specific service the data is intended for.
Here is a table of some of the most port numbers, all corresponding to an application layer protocol that utilizes both TCP and UDP:
Port Number | Service/Application | Protocol |
---|---|---|
20 | FTP (Data) | TCP |
21 | FTP (Control) | TCP |
22 | SSH (Secure Shell) | TCP |
23 | Telnet | TCP |
25 | SMTP (Simple Mail Transfer Protocol) | TCP |
53 | DNS (Domain Name System) | Both TCP and UDP |
67 | DHCP (Server) | UDP |
68 | DHCP (Client) | UDP |
80 | HTTP (Hypertext Transfer Protocol) | TCP |
110 | POP3 (Post Office Protocol version 3) | TCP |
123 | NTP (Network Time Protocol) | UDP |
143 | IMAP (Internet Message Access Protocol) | TCP |
161 | SNMP (Simple Network Networking Protocol) | UDP |
443 | HTTPS (Hypertext Transfer Protocol Secure) | TCP |
If both hosts are getting the data they need to process, wouldn't they use the same port number?
No. Let's take an example:
If one browser tab were to use source port 443 while connecting to a website that uses the same number as its destination port, a second tab making the same request cannot use the same number again as its source. Each outbound port must be unique on the client side.
This is done by setting the source port to a random temporary number. It is called "ephemeral ports" (also known as private or dynamic ports).
Ephemeral Port: They are temporary ports in OSI layer 4 used by client applications when initiating a connection to a server. Unlike reserved ports from the table above, they are allocated from a specific range defined by the operating system for the duration of the communication session.
Here is a table to show the ranges for an ephemeral port specified by the operating system:
Category | Recommended Range (IANA) | Common Linux Default Range | Common Windows Default Range (Vista and later) | Common macOS Default Range |
---|---|---|---|---|
Ephemeral Ports | 49152-65535 | 32768-60999 (can vary) | 49152-65535 | 49152-65535 |
TCP
Transmission Control Protocol (TCP) is a set of rules to establish or terminate connectivity between two different hosts.
-
It enables host-to-host communication as two computers establish a reliable connection through a process called "three-way handshaking" before a transmission is made.
-
When a TCP session ends, a "four-way handshake" will start when both hosts agree to terminate the session.
Here is an example of how a TCP conversation starter would look like from a Layman's perspective:
PC1: "Are you there?" (Synchronizing... [SYN])
PC2: "Yes, I am here. Are you ready?" (Acknowledging and Synchronizing... [SYN-ACK])
PC1: "Ok, I am here and ready." (Acknowledging... [ACK])
So:
- The sender initiates the connection by sending a packet with the SYN bit set to 1.
- The receiver responds with a packet that has both the SYN bit and the ACK bit set to 1.
- The sender completes the handshake by sending a final packet with its ACK bit set to 1.
What are SYN, SYN-ACK, and ACK mean exactly?
They are control bits/flags explicitly defined in TCP headers to control establishing connections, connection abortion, flow control, mode of transfer etc.
From here on, the SEQ and the ACK control bits will be used to keep track of the session.
Here is a more detailed description of the succeeding conversation from a high-level perspective:
PC1: I want to send you some data. It will be in 12 packets, each 256 bytes long.
PC2: I am ready to receive packet 1 of 12.
PC1: I am sending packet 1 of 12, it is 256 bytes long.
PC2: I have received packet 1 of 12, it was 256 bytes long. I am ready to receive packet 2 of 12.
PC1: I am sending packet 2 of 12, it is 256 bytes long.
PD2: I have received packet 2 of 12, but it was only 183 bytes long. Please resend.
PC1: I am sending packet 2 of 12, it is 256 bytes long.
PC2: I have received packet 2 of 12, it was 256 bytes long. I am ready to receive packet 3 of 12.
As computers send data to each other, each TCP/IP layer will either strip (de-encapsulation) or join chunks of data together (segmentation or reassembly) as they move towards or away from the application layer.
If the receiver were to transmit data, TCP would slice the data into segments and wrap each segment with a header (containing all the necessary information) before sending it to the other end.
UDP
- The application layer has to strip data on its own before being transmitted.
- Packets will be sent and can be dropped in a different order than they were transmitted.
- They are sent to the receiver directly without authentication (no three-way handshaking).