Internet Protocols Explained

Internet protocols are the rules that different applications follow to exchange data over the internet. In layman’s terms, they are like languages. Just as language is used as a medium of communication, a standardised protocol is necessary to communicate across different computers using different hardware and software. This first article in a two-part series focuses on the internet protocols used up to the network layer.

The internet protocol (IP) address is a unique number allocated to a domain or device by which other devices identify it in a network. There are two versions of this IP address:

  • Addresses in IPv4 are 32 bits long, giving 2^32 possible addresses; example: 12.244.233.165
  • Addresses in IPv6 are 128 bits long, giving 2^128 possible addresses; example: 2001:0db8:0000:0000:0000:ff00:0042:7879
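
The difference between the two versions can be checked with a few lines of Python. This is a minimal sketch using the standard ipaddress module and the two example addresses from the list above; the variable names are only illustrative.

```python
import ipaddress

# Example addresses taken from the list above.
v4 = ipaddress.ip_address("12.244.233.165")
v6 = ipaddress.ip_address("2001:0db8:0000:0000:0000:ff00:0042:7879")

# max_prefixlen is the address length in bits for each version.
v4_bits = v4.max_prefixlen
v6_bits = v6.max_prefixlen

# Total number of distinct addresses each version can express.
v4_total = 2 ** v4_bits
v6_total = 2 ** v6_bits

# An IPv6 address is usually written in a shortened form.
v6_short = v6.compressed

print(v4.version, v4_bits, v4_total)   # 4 32 4294967296
print(v6.version, v6_bits, v6_short)   # 6 128 2001:db8::ff00:42:7879
```

The shortened form drops leading zeros in each group and replaces one run of all-zero groups with a double colon.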

You may ask: if both the IP address and the MAC address serve as unique identities, can we use the MAC address itself? Well, we can, but it would take a very long time to search and find the destination, because a MAC address says nothing about where a device sits in the network. In the case of an IP address, the Internet Assigned Numbers Authority (IANA) assigns both the IPv4 and IPv6 addresses in a hierarchical manner. IANA assigns blocks of IP addresses to regional internet registries. The regional registries in turn assign smaller blocks to national registries, and so on, with blocks eventually being assigned to individual internet service providers (ISPs). It’s the ISPs that assign specific IP addresses to individual devices, and there are a couple of ways they can do this. Because of this tree-like allocation, it’s easy to locate an IP address in the hierarchy and estimate its geo location.

Figure 1: Open system interconnection model (Source: https://www.cloudflare.com/en-in/learning/ddos/glossary/open-systems-interconnection-model-osi/)

Ports

Each application that communicates over the network uses a port. Ports allow computers to differentiate between the different kinds of traffic such as web, email, file, voice, etc. Ports are represented by 16-bit numbers. There are 2^16, i.e., 65,536 port numbers (0 to 65535), and they come in three ranges.

Well-known port numbers: 0 to 1023 are well-known port numbers as they are used by well-known protocol services. These are allocated to server services by the IANA, which is a division of ICANN (Internet Corporation for Assigned Names and Numbers). ICANN is a non-profit organisation that was established in the United States in 1998 to help maintain the security of the internet and allow it to be usable by all.

Registered port numbers: 1024 to 49151 are registered port numbers, i.e., these can be registered to specific protocols by software corporations.

Dynamic port numbers: 49152 to 65535 are dynamic port numbers and they can be used by anyone.
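
The three ranges can be expressed as a small lookup function. This is only a sketch: the function name classify_port is made up for illustration, and it assumes the highest valid port is 65535, the largest value a 16-bit number can hold.

```python
def classify_port(port: int) -> str:
    """Return the name of the range a 16-bit port number falls in."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port {port} is outside the 16-bit range 0-65535")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic"


for port in (80, 8080, 50000):
    print(port, classify_port(port))
```

Port 80 (HTTP) falls in the well-known range, 8080 in the registered range, and 50000 in the dynamic range.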

Let’s try and understand these terms in a simple way. If I want to send a message to Han who understands only Japanese, Japanese is my protocol to communicate with him, his home address is the IP address, and the letter drop box number where he collects all the post is the port number.

Figure 2: Data segmentation at each level (Source: https://www.javatpoint.com/ip)

OSI model

The OSI (open system interconnection) model provides a standard for different computer systems to communicate with each other. Figure 1 explains each layer.

Let’s take a simple example and see what happens in each layer when you hit google.com.

  • In the application layer, the HTTP protocol is used and messages are formed. This is the only layer that interacts with the user.
  • In the presentation layer, the data is prepared for transmission. This layer makes sure the data goes through the required transformations before it is passed to the next layer, such as compression/decompression, encryption/decryption, and translation (for example, from ASCII to EBCDIC).
  • The session layer is responsible for the communication channel. If we have 1GB of data to transfer, it takes control and makes sure the channel is open until the whole message is transferred.
  • If we are using HTTP/2, which runs on top of TCP, the transport layer splits the actual message into chunks called segments. It controls the flow and manages errors. Flow is controlled by sending the traffic at a rate the receiver can handle. If segments are lost or corrupted on the way to the receiver, the layer sends them again, thus managing errors. This layer is said to be the heart of the model. Above it are all the software layers and below it are hardware layers.
  • When we have a message to be transported across a network, we need the network layer. Since google.com is not in our network, in this layer the segments are wrapped into packets and the best path to the destination network is found. This is commonly known as routing.
  • In the data link layer, the data is broken down further from packets to frames. The main function of this layer is to make sure data transfer is error-free from one node to another. When a packet arrives in a network, this layer is responsible for transmitting it to the host using its MAC address.
  • The physical layer is responsible for the actual physical connection between the devices. Here the data gets converted into a bit stream, which is a string of 1s and 0s.

Just the reverse takes place when the message hits the Google server. Here, the received signal is converted into a bit stream, then from frames to packets to segments and, finally, the message is delivered to the application layer. Figure 2 gives a glimpse of how the data is segmented at each level and how a header is added to it.
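
The wrapping on the way down and the unwrapping on the way up can be imitated with a toy example. The sketch below is not real networking code: the bracketed header strings are made-up placeholders, and only three of the layers are modelled.

```python
# Layer names follow the walkthrough above; the header strings are
# made-up placeholders, not real protocol headers.
LAYERS = [
    ("transport", "TCP"),
    ("network", "IP"),
    ("data link", "ETH"),
]


def encapsulate(message: str) -> str:
    """Wrap the message in one header per layer, top layer first."""
    data = message
    for _layer, header in LAYERS:
        data = f"[{header}]{data}"
    return data


def decapsulate(frame: str) -> str:
    """Strip the headers in reverse order, as the receiver does."""
    data = frame
    for _layer, header in reversed(LAYERS):
        prefix = f"[{header}]"
        if not data.startswith(prefix):
            raise ValueError(f"expected {prefix} header")
        data = data[len(prefix):]
    return data


frame = encapsulate("GET / HTTP/1.1")
print(frame)               # [ETH][IP][TCP]GET / HTTP/1.1
print(decapsulate(frame))  # GET / HTTP/1.1
```

The outermost header belongs to the lowest layer, which is why the receiver removes the headers in the opposite order to the one in which the sender added them.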

What is internet protocol?

The internet protocol or IP is what makes the internet possible. It is responsible for routing and addressing packets of data so that they can travel across networks and arrive at the correct destination. In this layer, data is divided into packets and also reassembled for incoming packets.

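IP itself is a connectionless service: it delivers each packet independently and makes no guarantee of delivery. How the data is handled once it reaches the destination depends on the transport layer protocol paired with it, usually TCP (connection-oriented) or UDP (connectionless). The combinations are commonly called TCP/IP and UDP/IP, respectively.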

Figure 3: Network layer (Source: https://www.ipxo.com/blog/network-routing/)

What is an IP packet?

IP packets are created by adding an IP header to each packet of data before it is sent on its way. Each packet contains two parts — the header and the payload. A payload is the data that is transmitted. An IP header records several pieces of information about the packet, such as:

  • Source IP address
  • Destination IP address
  • Header length
  • Packet length
  • Time to live (TTL), or the number of network hops a packet can make before it is discarded
  • Header checksum, for detecting errors in the received header
  • Transport protocol being used (TCP, UDP)

There are 14 fields of information in IPv4 headers, although one of them is optional. The method of adding a header to the payload is called encapsulation.
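
To make encapsulation concrete, here is a minimal sketch that builds a 20-byte IPv4 header (without the optional field) in front of a payload and then reads the fields back. The function names, the addresses (taken from ranges reserved for documentation) and the default values are assumptions made for illustration; real packets are built by the operating system, not by application code like this.

```python
import socket
import struct

# Fixed 20-byte IPv4 header (no options), in network byte order.
HEADER_FORMAT = "!BBHHHBBH4s4s"


def checksum(header: bytes) -> int:
    """Ones' complement sum of the header's 16-bit words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src: str, dst: str, payload_len: int,
                 ttl: int = 64, protocol: int = 6) -> bytes:
    """Build a header; protocol 6 is TCP, 17 is UDP."""
    fields = [
        (4 << 4) | 5,        # version 4, header length 5 x 32-bit words
        0,                   # type of service
        20 + payload_len,    # total packet length in bytes
        0,                   # identification
        0,                   # flags and fragment offset
        ttl,
        protocol,
        0,                   # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    ]
    fields[7] = checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)


def parse_header(raw: bytes) -> dict:
    """Decode the first 20 bytes of an IPv4 packet."""
    (ver_ihl, _tos, length, _ident, _frag,
     ttl, protocol, _csum, src, dst) = struct.unpack(HEADER_FORMAT, raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_length": (ver_ihl & 0x0F) * 4,
        "packet_length": length,
        "ttl": ttl,
        "protocol": protocol,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
        # A header with a correct checksum sums to zero.
        "checksum_ok": checksum(raw[:20]) == 0,
    }


payload = b"hello"
packet = build_header("192.0.2.1", "198.51.100.7", len(payload)) + payload
print(parse_header(packet))
```

The checksum here covers only the header, so the check in parse_header says nothing about whether the payload arrived intact.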

How does IP routing work?

IP routing is the process of determining the best path for data to travel from source to destination. The data packets follow the path based on the routing table configuration along with routing algorithms. Routing algorithms weigh factors such as the number of hops and the cost of each link to determine the most efficient route for the data from its source to the destination. Packets travel from router to router, with each router consulting its routing table to find the next hop’s address, until they reach the targeted IP address. A variety of routing protocols help route packets based on their destination IP addresses. The ‘traceroute’ command helps to understand the route a packet takes to reach the destination.
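
A routing table lookup can be sketched in a few lines. The example below assumes the common longest-prefix-match rule, in which the most specific matching route wins; the article does not name this rule, and the table entries and next-hop addresses are invented for illustration.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
ROUTES = [
    ("0.0.0.0/0", "203.0.113.1"),   # default route
    ("10.0.0.0/8", "10.0.0.1"),
    ("10.1.0.0/16", "10.1.0.1"),
]


def next_hop(destination: str, routes=ROUTES):
    """Return the next hop of the most specific matching route."""
    address = ipaddress.ip_address(destination)
    best_network = None
    best_hop = None
    for prefix, hop in routes:
        network = ipaddress.ip_network(prefix)
        if address in network and (
            best_network is None
            or network.prefixlen > best_network.prefixlen
        ):
            best_network, best_hop = network, hop
    return best_hop


print(next_hop("10.1.2.3"))   # 10.1.0.1
print(next_hop("10.2.0.5"))   # 10.0.0.1
print(next_hop("8.8.8.8"))    # 203.0.113.1
```

Each router on the path repeats this kind of lookup for the packet's destination address, which is how the packet moves hop by hop.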

Figure 4: IP packet structure (Source: https://web.stanford.edu/class/msande91si/www-spr04/readings/week1/InternetWhitepaper.htm)

How do IP addresses work?

An IP address is a unique identifier assigned to a computer that is connected to the internet. IP needs to add the IP addresses of the source and the destination to the header.

The source IP address can be one of two types.

Public IP address: The scope of the public address is global, which means that we can communicate outside the network. This address is assigned by the ISP (internet service provider). It is not available free of cost.

The ‘dig +short myip.opendns.com @resolver1.opendns.com’ command will show the public address attached to the system. It could also be Googled by asking ‘What is my IP?’

Private IP address: The scope of this address is local, as we can communicate within the network only. It is generally used for creating a local area network and is available free of cost. The ‘ifconfig’ command will show the private IP address attached to the system.
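
Whether a given address is public or private can also be checked in code. The sketch below relies on the is_private property of Python's ipaddress module; the helper name address_scope is made up, and treating every address that is not private as public is a simplification.

```python
import ipaddress


def address_scope(address: str) -> str:
    """Return 'private' for local-only ranges, otherwise 'public'."""
    if ipaddress.ip_address(address).is_private:
        return "private"
    return "public"


for addr in ("192.168.1.10", "10.0.0.5", "172.16.0.1", "74.6.231.20"):
    print(addr, address_scope(addr))
```

The first three addresses belong to ranges reserved for private networks, while the last one is a public address taken from the Yahoo lookup in this article.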

Two types of private IP addresses are assigned.

  • Static: Here, the assigned IP address does not change unless it is changed as part of network administration. This makes it easy to find the device it’s mapped to.
  • Dynamic: In this case, the IP address can change over time; a dynamic host configuration protocol (DHCP) server assigns the IP address. A DHCP server needs to be configured with an appropriate pool of IP addresses, which it will assign to client devices as they join the network, ensuring that the addresses are unique. DHCP is very useful when the network has many host devices and reduces the chance of errors in assigning addresses.

The next question is: what is the source IP address? A packet sent over the internet uses the public IP address as its source; within a LAN, however, a private IP address is used.
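
The way a DHCP server hands out unique addresses from a pool, as described above, can be imitated with a toy class. This is a sketch of the idea only: the class and method names are invented, and real DHCP adds lease times and a message exchange between client and server that are left out here.

```python
import ipaddress


class DhcpPool:
    """Toy address pool keyed by the client's MAC address."""

    def __init__(self, first: str, last: str):
        start = int(ipaddress.ip_address(first))
        end = int(ipaddress.ip_address(last))
        self._free = [str(ipaddress.ip_address(n))
                      for n in range(start, end + 1)]
        self._leases = {}

    def request(self, mac: str) -> str:
        """Return the client's current address or assign a free one."""
        if mac in self._leases:
            return self._leases[mac]
        if not self._free:
            raise RuntimeError("address pool exhausted")
        address = self._free.pop(0)
        self._leases[mac] = address
        return address

    def release(self, mac: str) -> None:
        """Give the client's address back to the pool."""
        address = self._leases.pop(mac, None)
        if address is not None:
            self._free.append(address)


pool = DhcpPool("192.168.1.100", "192.168.1.102")
print(pool.request("aa:bb:cc:00:00:01"))  # 192.168.1.100
print(pool.request("aa:bb:cc:00:00:02"))  # 192.168.1.101
```

Because an address leaves the free list as soon as it is assigned, two clients can never hold the same address at the same time.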

We can use a private IP address in our private network for communication. But if the destination is outside the network and the device needs to access the internet, at least one public IP address is needed.

Network address translation (NAT) is a process in which one or more local IP addresses are translated into one or more global IP addresses and vice versa in order to provide internet access to local hosts. NAT masks the port number of the host with another port number in the packet that is routed to the destination. It then makes the corresponding entries of IP address and port number in the NAT table. NAT generally operates in a router or firewall.
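
The NAT table described above can be sketched as a pair of dictionaries. The class below is a simplified illustration with invented names and addresses; real NAT devices also track the protocol and the remote endpoint, and they expire entries that are no longer in use.

```python
class NatTable:
    """Toy NAT: maps private (ip, port) pairs to public ports."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._outbound = {}   # (private_ip, private_port) -> public_port
        self._inbound = {}    # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip: str, private_port: int):
        """Return the public (ip, port) used for an outgoing packet."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            self._outbound[key] = self._next_port
            self._inbound[self._next_port] = key
            self._next_port += 1
        return self.public_ip, self._outbound[key]

    def translate_inbound(self, public_port: int):
        """Return the private (ip, port) for a reply, or None."""
        return self._inbound.get(public_port)


nat = NatTable("203.0.113.10")
print(nat.translate_outbound("192.168.1.10", 51000))  # ('203.0.113.10', 40000)
print(nat.translate_outbound("192.168.1.11", 51000))  # ('203.0.113.10', 40001)
print(nat.translate_inbound(40001))                   # ('192.168.1.11', 51000)
```

Both local hosts appear on the internet with the same public IP address, and the port number is what lets the router send each reply back to the right host.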

Figure 5: IP security (Source: http://www.sharetechnote.com/image/IP_Security_IPSec_ESP_01.png )

Let’s now talk about the destination IP address. Assume we want to connect to yahoo.com. It is hard to remember the IP address, and the IP can change anytime. Also, there will be more than one IP address for a site for load balancing reasons. The example given below returns six IP addresses for Yahoo and the browser needs to choose one.

% dig yahoo.com
;; ANSWER SECTION:
yahoo.com. 600 IN A 74.6.231.20
yahoo.com. 600 IN A 74.6.143.26
yahoo.com. 600 IN A 74.6.231.21
yahoo.com. 600 IN A 98.137.11.163
yahoo.com. 600 IN A 74.6.143.25
yahoo.com. 600 IN A 98.137.11.164

This is why we have domain names. The DNS or domain name system is a hierarchical and decentralised naming system that works just like a phone book. Before a packet is sent, DNS is used to look up the destination IP address for the domain name. An excellent explanation of this is given at https://www.open.edu/openlearncreate/mod/oucontent/view.php?id=129584&printable=1
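
The same lookup can be done from a program. In the sketch below, parse_a_records pulls the addresses out of the dig answer shown above, and resolve_ipv4 asks the system resolver through socket.getaddrinfo. Both function names are illustrative, and resolve_ipv4 needs network access, so it is defined here but not called.

```python
import socket

# Answer lines copied from the dig output shown above.
DIG_ANSWER = """\
yahoo.com. 600 IN A 74.6.231.20
yahoo.com. 600 IN A 74.6.143.26
yahoo.com. 600 IN A 74.6.231.21
yahoo.com. 600 IN A 98.137.11.163
yahoo.com. 600 IN A 74.6.143.25
yahoo.com. 600 IN A 98.137.11.164
"""


def parse_a_records(answer: str) -> list:
    """Extract the IPv4 addresses from dig-style answer lines."""
    addresses = []
    for line in answer.splitlines():
        parts = line.split()
        # Expected columns: name, TTL, class, record type, address.
        if len(parts) == 5 and parts[2] == "IN" and parts[3] == "A":
            addresses.append(parts[4])
    return addresses


def resolve_ipv4(hostname: str) -> list:
    """Ask the system resolver for the host's IPv4 addresses."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


addresses = parse_a_records(DIG_ANSWER)
print(len(addresses), addresses[0])   # 6 74.6.231.20
```

Calling resolve_ipv4("yahoo.com") on a connected machine returns whatever addresses the resolver reports at that moment, which may differ from the six listed here.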

IPSec

Internet protocol security (IPSec) is a network protocol suite that authenticates and encrypts each packet of data to provide secure encrypted communication between two computers over an IP network.

IPSec is a connection-oriented suite of protocols that is used to establish mutual authentication between computers at the beginning of a communications session and to negotiate cryptographic keys during the session. It uses the keys to encrypt the data for communication.

This protocol is commonly used in virtual private networks (VPNs), helping data transfer over a public network but with encryption of the IP packets and also with authentication of the source of the packets. The key exchange for IPSec (IKE) usually uses UDP port 500.

IPSec ensures the following.

  • Confidentiality: Protects data from being accessed by unauthorised persons. Only the sender and receiver can read the data, as it uses encryption.
  • Integrity: Ensures data is not tampered with, intentionally or unintentionally. The hash value of the data from the sender is matched with that of the received data for data integrity.
  • Authentication: Confirms that data is received only from the permitted endpoint.
  • Anti-replay: Even if a packet is encrypted and authenticated, an attacker could try to capture these packets and send them again. By using sequence numbers, IPSec ensures duplicate packets are detected and discarded.

Figure 6: IPSec modes (Source: https://www.twingate.com/blog/ipsec-tunnel-mode/)
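
The anti-replay check mentioned in the list above can be sketched with sequence numbers and a sliding window. The class below is a simplified illustration, not the exact procedure an IPSec implementation follows; the class name and the default window size are assumptions.

```python
class ReplayWindow:
    """Accept each sequence number once, within a sliding window."""

    def __init__(self, size: int = 64):
        self.size = size
        self.highest = 0
        self.seen = set()

    def accept(self, seq: int) -> bool:
        """Return True if the packet is new, False if it must be dropped."""
        if seq <= 0:
            return False
        if seq > self.highest:
            # A newer packet moves the window forward.
            self.highest = seq
            self.seen.add(seq)
            lower = self.highest - self.size
            self.seen = {s for s in self.seen if s > lower}
            return True
        if seq <= self.highest - self.size:
            return False      # too old to be checked, so it is dropped
        if seq in self.seen:
            return False      # duplicate, i.e. a replayed packet
        self.seen.add(seq)    # late but new packet inside the window
        return True


window = ReplayWindow()
print(window.accept(1))  # True
print(window.accept(2))  # True
print(window.accept(1))  # False
```

Keeping only a window of recent numbers lets the receiver tolerate packets that arrive slightly out of order without remembering every number it has ever seen.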

Protocols used by IPSec

IPSec is not a single protocol; rather, it’s built from a suite of protocols. Please note that additional headers are added to the packet, which increases its size. In general, the following protocols make up the IPSec suite.

Authentication header (AH): Authentication headers ensure the data packet is from a trusted source (authentication) and the data is not tampered with (data integrity) but do not provide encryption. The data is hashed, and the host that receives the packet can use this hash to ensure that the payload hasn’t been modified in transit. The protocol field of the IP header for AH is 51.
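
The hash check that AH relies on can be illustrated with a keyed hash (HMAC) from Python's standard library. This sketch shows only the idea of verifying integrity with a shared key; it does not reproduce the AH packet format, and the key, the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
import hmac


def sign(key: bytes, payload: bytes) -> bytes:
    """Compute a keyed hash of the payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Recompute the hash and compare it with the received one."""
    return hmac.compare_digest(sign(key, payload), tag)


shared_key = b"example-shared-key"   # illustrative key only
message = b"transfer 100 to account 42"
tag = sign(shared_key, message)

print(verify(shared_key, message, tag))                        # True
print(verify(shared_key, b"transfer 900 to account 42", tag))  # False
```

Because the hash depends on a key known only to the two endpoints, a matching value shows both that the data is unchanged and that it came from someone holding the key.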

Encapsulating security payload (ESP): This provides data integrity, encryption, and authentication for the payload. The encryption takes place based on the operating mode. ESP adds its own header and trailer to each packet. It also gives each packet a unique sequence number, which allows the receiving device to determine whether a packet is a duplicate. Packets that fail these checks are discarded and not given to the receiver. Newer versions of ESP also incorporate AH functionality. The protocol field of the IP header for ESP is 50.

Internet key exchange (IKE): A security association (SA) is the set of encryption keys and algorithms that two hosts agree on for protecting their traffic. SAs are primarily negotiated using the IKE protocol. IKE is used to negotiate the cryptographic keys and algorithms that will be used in a session and provides message content protection. The beauty of this protocol, which has been in use for the past 20+ years, is that there is a wide range of authentication and encryption methods to choose from, such as PKI.

IPSec operating modes

There are two modes used by IPSec to treat packet headers.

Tunnel mode: Here, the original IP packet, including its headers, is encapsulated inside a new IP packet with an additional IP header. This mode is generally used for connections between gateways that sit at the edge of private networks. A packet is encrypted as it leaves one network and put inside a new packet, whose destination is the gateway for the target network.

Once it arrives at the gateway, it’s decrypted and removed from the encapsulating packet, and sent to the target host on the internal network. The header data about the topology of the private networks is thus never exposed while the packet traverses the public internet.

Transport mode: This mode retains the original packet IP header and encrypts the payload alone. It ensures fast and secure end-to-end communications.
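
The difference between the two modes is easiest to see in the shape of the resulting packet. In the sketch below, packets are plain dictionaries and protect is a placeholder that only marks data as hidden; it performs no real encryption. All names and addresses are invented for illustration.

```python
def protect(data):
    """Placeholder for ESP encryption; it only marks data as hidden."""
    return {"encrypted": data}


def transport_mode(packet: dict) -> dict:
    """Keep the original IP header and protect only the payload."""
    return {"header": packet["header"],
            "payload": protect(packet["payload"])}


def tunnel_mode(packet: dict, local_gateway: str,
                remote_gateway: str) -> dict:
    """Protect the whole packet and wrap it in a new IP header."""
    outer_header = {"src": local_gateway, "dst": remote_gateway}
    return {"header": outer_header, "payload": protect(packet)}


original = {"header": {"src": "10.0.1.5", "dst": "10.0.2.9"},
            "payload": "hello"}

print(transport_mode(original)["header"])
print(tunnel_mode(original, "203.0.113.1", "203.0.113.2")["header"])
```

In transport mode the private addresses stay visible in the header, whereas in tunnel mode an observer on the public internet sees only the two gateway addresses.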

IPSec flow

Detailed information on this subject is given at https://networklessons.com/cisco/ccie-routing-switching/ipsec-internet-protocol-security.

IPSec is complex and can be implemented in different ways. IKE is used to establish the connection in two phases.

IKE phase I: In this phase, a session is established to negotiate the encryption, authentication and hashing protocols. This collection of protocols is called a security association or SA, as mentioned earlier. This phase is used only for management traffic.

Here there are two modes. The main mode uses six messages to establish a connection. This mode is more secure but slower, as the identification is encrypted. The second mode is the aggressive mode, which uses only three messages to establish a connection. It’s faster but less secure, as the identification is sent in clear text.

IKE phase II: In this phase, IKE is used to build the connection that carries user data, but IKE itself does not encrypt the user data. AH or ESP is used to authenticate data and ensure its integrity, and ESP can also encrypt it. Both AH and ESP can also be used at the same time. The data is prepared based on the operating mode. The connection is terminated if there is no data transfer for a while.

IPSec passthrough

As we have seen, NAT allows devices to share the same public IP address. IPSec protocols encrypt the data packets to establish secure connections. This prevents NAT from reading and updating the information it needs, like the source IP address and port numbers, and so the packets get blocked.

IPSec passthrough allows IPSec tunnels to pass through the router. It safely maintains IP connections over routers that require NAT.

We have learnt about the secure transfer of data until the network layer pretty much in depth. In the next article in this two-part series, we will focus on transport and application layer security.
