Introduction to Computer Networking: Transport and Application Layers

Transport Layer Overview

The Transport Layer provides logical communication between application processes running on different hosts. The main transport layer protocols are TCP (Transmission Control Protocol) and UDP (User Datagram Protocol).

  • Send Side: Breaks application messages into segments and passes them to the network layer.
  • Receive Side: Reassembles segments into messages and passes them to the application layer.

Comparison with Network Layer

  • Network Layer: Provides logical communication between hosts.
  • Transport Layer: Provides logical communication between processes (i.e., it works above the network layer and relies on its services).

Household Analogy

  • Hosts = Houses
  • Processes = Kids
  • Messages = Letters in envelopes
  • Transport Protocol = Ann and Bill (each collects outgoing letters from, and hands incoming letters to, the kids in their own house)
  • Network Layer Protocol = Postal Service

Transport Layer Services

The transport layer provides several services to applications, including:

  1. Multiplexing and Demultiplexing

    • Multiplexing at Sender: Handles data from multiple sockets and adds transport headers.
    • Demultiplexing at Receiver: Uses header information (like port numbers) to deliver the received segments to the correct socket.
  2. Reliable Data Transfer

    • Ensures all data is delivered correctly and in order.
  3. Flow Control

    • Prevents the sender from overwhelming the receiver by managing the rate of data transmission.
  4. Congestion Control

    • Manages the flow of packets to prevent network congestion.

Internet Transport Layer Protocols

  1. TCP (Transmission Control Protocol)

    • Reliable, In-Order Delivery: Uses acknowledgments (ACKs) and retransmissions to ensure all data is delivered correctly and in sequence.
    • Flow Control and Congestion Control: Uses mechanisms like sliding windows to ensure the sender does not overload the receiver or the network.
    • Connection Setup: Uses a three-way handshake for connection establishment.
  2. UDP (User Datagram Protocol)

    • Unreliable, Unordered Delivery: Provides a simple, connectionless transport service that adds little beyond the “best-effort” delivery of IP.
    • No Connection Setup: Segments are handled independently, which reduces latency.
    • Examples of Use: Streaming multimedia applications (loss-tolerant, rate-sensitive) and DNS.

Multiplexing and Demultiplexing

  • How Demultiplexing Works:
    • The host receives IP datagrams, each containing a transport-layer segment.
    • Each segment has a source and destination port number.
    • The host uses IP addresses and port numbers to direct the segment to the appropriate socket.

Connectionless Demultiplexing (UDP)

  • When a UDP socket is created, it is assigned a host-local port number.
  • The destination port in an incoming segment is used to direct it to the correct socket.
  • UDP segments with the same destination port but different source IPs or ports will be directed to the same socket.

Connection-Oriented Demultiplexing (TCP)

  • A TCP socket is identified by a 4-tuple: source IP address, source port number, destination IP address, destination port number.
  • The receiver uses all four values to determine which socket to direct the segment to.
  • Example: A web server can have different sockets for each connecting client, even if they connect to the same destination IP and port.
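
The two demultiplexing rules above can be sketched with plain dictionaries standing in for the operating system's socket table. This is only a toy model: the function names, addresses, and socket labels are hypothetical, and the UDP lookup follows the notes in keying on the destination port alone.

```python
# Toy model of transport-layer demultiplexing (not how a real OS stores sockets).
# All addresses, ports, and socket labels below are made up for illustration.

udp_sockets = {}   # key: destination port only
tcp_sockets = {}   # key: (source IP, source port, destination IP, destination port)


def udp_demux(src_ip, src_port, dst_ip, dst_port):
    """UDP: the destination port alone selects the socket."""
    return udp_sockets.get(dst_port)


def tcp_demux(src_ip, src_port, dst_ip, dst_port):
    """TCP: the full 4-tuple selects the socket."""
    return tcp_sockets.get((src_ip, src_port, dst_ip, dst_port))


udp_sockets[53] = "udp-socket-A"
tcp_sockets[("10.0.0.1", 5000, "10.0.0.9", 80)] = "tcp-socket-client1"
tcp_sockets[("10.0.0.2", 5000, "10.0.0.9", 80)] = "tcp-socket-client2"

# Two different senders, same destination port: both reach the same UDP socket.
same_a = udp_demux("10.0.0.1", 4000, "10.0.0.9", 53)
same_b = udp_demux("10.0.0.2", 4001, "10.0.0.9", 53)

# Two different clients, same server IP and port: each has its own TCP socket.
conn_1 = tcp_demux("10.0.0.1", 5000, "10.0.0.9", 80)
conn_2 = tcp_demux("10.0.0.2", 5000, "10.0.0.9", 80)

print(same_a, same_b, conn_1, conn_2)
```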

UDP: User Datagram Protocol (RFC 768)

  • Connectionless Protocol: No handshaking between sender and receiver; each UDP segment is handled independently.
  • Uses:
    • Streaming multimedia applications (e.g., VoIP).
    • DNS requests, where simplicity and low delay are desired.
  • Reliable Transfer over UDP: Applications can add reliability at the application layer.

UDP Segment Structure

  • Source Port Number: Port from which the segment is sent.
  • Destination Port Number: Port to which the segment is addressed.
  • Length: Total length of the UDP segment, including header and data.
  • Checksum: Used for error detection in transmitted segments.

UDP Checksum Calculation

  • Treat the segment (including header fields) as a sequence of 16-bit integers.
  • Compute the checksum by adding these integers (with wraparound of any carry) and taking the 1’s complement of the result.
  • The receiver checks for errors by calculating the checksum again and comparing it to the value in the checksum field.

Example of Checksum Calculation

  • Adding two 16-bit integers:
    • If a carryout occurs from the most significant bit, it is added back to the result (known as wraparound).
    • This ensures that the checksum is always a 16-bit value.
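
The addition with wraparound can be sketched as follows. This is a minimal illustration over a list of 16-bit words; the function names and the two sample words are assumptions, and the sketch leaves out the pseudo-header that a real UDP checksum also covers.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, folding any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # wraparound of the carry
    return total


def checksum16(words):
    """Checksum = 1's complement of the wrapped 16-bit sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


# Two sample 16-bit words (illustrative values only).
words = [0b1110011001100110, 0b1101010101010101]

wrapped_sum = ones_complement_sum16(words)
checksum = checksum16(words)
print(f"sum      = {wrapped_sum:016b}")
print(f"checksum = {checksum:016b}")

# Receiver-side check: adding every word plus the checksum gives all ones
# when no bit errors are detected.
receiver_sum = ones_complement_sum16(words + [checksum])
print(f"receiver = {receiver_sum:016b}")
```

A receiver can either recompute the checksum and compare it with the field, or add all words including the checksum and check for all ones. The two checks are equivalent.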

TCP: Transmission Control Protocol (RFCs 793, 1122, 1323, 2018, 2581)

  • Reliable, In-Order Byte Stream: Data is delivered to the receiving application as a stream of bytes, in the order it was sent, with no message boundaries.
  • Connection-Oriented: Uses a handshake (exchange of control messages) to initialize sender and receiver states before data exchange.
  • Flow Controlled: Sender will not overwhelm the receiver.

TCP Segment Structure

  • Sequence Number: Byte stream number of the first byte in the segment’s data.
  • Acknowledgment Number: Indicates the next byte expected from the other side (cumulative ACK).
  • Receive Window (rwnd): Number of bytes the receiver is willing to accept, used for flow control.
  • Checksum: Same error-checking mechanism as UDP.
  • Flags: Control bits such as SYN, ACK, FIN for connection setup, acknowledgment, and teardown.

Network Layer Service Models

  • Datagram Network (Connectionless Service):
    • No call setup at the network layer. Routers do not store any state information about end-to-end connections.
    • Packets are forwarded using destination host addresses, and there is no concept of a “connection.”
  • Virtual-Circuit Network (Connection Service):
    • More complex inside the network, involving maintaining states for each connection.
    • Requires setup and tear-down phases, similar to a telephone network.

Forwarding vs. Routing

  • Forwarding:
    • The process of moving packets from a router’s input to the appropriate router output.
    • It’s a local decision made by each router based on a forwarding table that maps destination addresses to output links.
  • Routing:
    • The process of determining the route (or path) taken by packets from source to destination.
    • Uses routing algorithms to determine the best path through the network.
  • Analogy:
    • Routing is like planning a trip from source to destination.
    • Forwarding is like getting through a single interchange along that route.

How a Router Works

  • Router Functions:
    • Routing Processor: Runs routing protocols (such as RIP, OSPF, and BGP) and updates the forwarding table.
    • Forwarding Data Plane (Hardware): Handles packet forwarding from incoming to outgoing links at high speed.
  • Input Ports:
    • Line Termination: Handles bit-level reception.
    • Link Layer Protocol: Receives and processes incoming frames (e.g., Ethernet).
    • Forwarding: Looks up the output port in the forwarding table to determine where to send the datagram.
  • Switching Fabric:
    • Transfers packets from input to output buffers.
    • Switching Rate: The rate at which packets are transferred from inputs to outputs. Higher switching rates (relative to input/output line rate) are more desirable.
  • Output Ports:
    • Buffering: Used when datagrams arrive from the switching fabric faster than they can be transmitted, leading to queuing and potential loss if the buffer overflows.

Routing (Path Selection)

  • Routing Algorithms:
    • Global Information-Based Routing (Link State Routing): Each node has a complete view of the network and uses link costs to determine the best route.
    • Distance Vector Routing: Nodes learn about the network by exchanging information with their neighbors.
  • Forwarding Table: Uses the routing algorithm output to determine how to forward packets based on destination addresses.
  • Longest Prefix Match: When multiple routes match a destination address, the router selects the route with the longest matching prefix.
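
Longest prefix matching can be sketched with Python's standard `ipaddress` module. The forwarding table, its prefixes, and the link numbers below are hypothetical. A real router would use a specialized data structure (a trie or TCAM hardware) rather than a linear scan.

```python
import ipaddress

# Hypothetical forwarding table: prefix -> output link number.
FORWARDING_TABLE = {
    ipaddress.ip_network("200.23.16.0/21"): 0,
    ipaddress.ip_network("200.23.24.0/24"): 1,
    ipaddress.ip_network("200.23.24.0/21"): 2,
    ipaddress.ip_network("0.0.0.0/0"): 3,      # default route, matches everything
}


def lookup(destination):
    """Return the output link of the longest prefix that contains destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in FORWARDING_TABLE if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return FORWARDING_TABLE[best]


# 200.23.24.170 matches the /24, the second /21, and the default route;
# the /24 wins because it is the longest match.
print(lookup("200.23.24.170"))
print(lookup("200.23.25.1"))
print(lookup("8.8.8.8"))
```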

Addressing on the Network Layer

  • IP Address:
    • A 32-bit identifier (in IPv4) for each host or router interface.
    • Dotted-Decimal Notation: E.g., 223.1.1.1 is represented in binary as 11011111 00000001 00000001 00000001.
    • Each interface (connection between a router or host and the physical link) is assigned a unique IP address.
  • Classes of IP Addresses:
    • Class A (0): NetID = 7 bits, HostID = 24 bits (128 networks, about 16 million addresses per network).
    • Class B (10): NetID = 14 bits, HostID = 16 bits (16,384 networks, 65,536 addresses per network).
    • Class C (110): NetID = 21 bits, HostID = 8 bits (about 2 million networks, 256 addresses per network).
    • Class D (1110): Used for multicast addresses.
    • Class E (1111): Reserved for future use.
  • CIDR (Classless InterDomain Routing):
    • Allows for a variable-length subnet mask.
    • Address format: a.b.c.d/x, where x represents the number of bits in the subnet portion of the address.
  • Subnets:
    • A subnet is a set of device interfaces with the same subnet portion of an IP address, and these devices can physically reach each other without an intervening router.
  • DHCP (Dynamic Host Configuration Protocol):
    • Allows a host to dynamically obtain an IP address from a DHCP server.
    • The process includes the following steps:
      • DHCP Discover: Host broadcasts a message to find a DHCP server.
      • DHCP Offer: Server offers an IP address.
      • DHCP Request: Host requests the offered IP address.
      • DHCP Acknowledgment: Server assigns the IP address to the host.
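
The dotted-decimal, classful, and CIDR items above can be checked with a short sketch based on the standard `ipaddress` module. The helper `address_class` and the sample addresses other than 223.1.1.1 are assumptions chosen for illustration.

```python
import ipaddress

# Dotted-decimal to binary: each of the four octets becomes 8 bits.
addr = ipaddress.ip_address("223.1.1.1")
binary = " ".join(f"{octet:08b}" for octet in addr.packed)
print(binary)


def address_class(dotted):
    """Classify an IPv4 address by the leading bits of its first octet."""
    first = int(dotted.split(".")[0])
    if first < 128:      # leading bit  0
        return "A"
    if first < 192:      # leading bits 10
        return "B"
    if first < 224:      # leading bits 110
        return "C"
    if first < 240:      # leading bits 1110
        return "D"
    return "E"           # leading bits 1111


# CIDR: in a.b.c.d/x the first x bits form the subnet portion.
iface = ipaddress.ip_interface("223.1.1.1/24")
subnet = iface.network
in_same_subnet = ipaddress.ip_address("223.1.1.4") in subnet
in_other_subnet = ipaddress.ip_address("223.1.2.1") in subnet

print(address_class("223.1.1.1"), subnet, in_same_subnet, in_other_subnet)
```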

Additional Network Layer Concepts

  1. Datagram Forwarding Table

    • Each router has a forwarding table that is used to determine the output link for incoming datagrams.
    • The table contains address ranges that map to specific output interfaces.
  2. IP Datagram Format

    • Fields in the IP Datagram Header

      • Version: Identifies the IP version.
      • Header Length: Specifies the length of the header.
      • Type of Service: Indicates the quality of service.
      • Total Length: Indicates the length of the entire datagram.
      • Time to Live (TTL): Specifies the maximum number of hops a packet can take before being discarded.
      • Fragmentation and Reassembly

        • IP Fragmentation occurs when datagrams are split into smaller fragments to fit the MTU (Maximum Transmission Unit) of the network links.
        • Reassembly occurs at the destination.
  3. Router Queueing and Buffering

    • Input Port Queueing

      • Head-of-the-Line (HOL) Blocking: A queued datagram at the front of the queue may prevent others from moving forward, resulting in delays.
    • Output Port Queueing

      • Buffering is required when the arrival rate exceeds the output line speed, leading to potential delays and packet loss due to buffer overflow.
  4. Hierarchical Addressing and Route Aggregation

    • Allows efficient advertisement of routing information.
    • Uses hierarchical structure to aggregate routes, reducing the size of routing tables and improving scalability.
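
Route aggregation can be illustrated with `ipaddress.collapse_addresses` from the Python standard library. The eight /23 blocks below are hypothetical customer allocations. The point of the sketch is only that contiguous blocks can be advertised as a single shorter prefix.

```python
import ipaddress

# Eight hypothetical customer blocks, each a /23, allocated contiguously:
# 200.23.16.0/23, 200.23.18.0/23, ..., 200.23.30.0/23
customer_blocks = [
    ipaddress.ip_network(f"200.23.{16 + 2 * i}.0/23") for i in range(8)
]

# The provider can advertise one aggregate prefix instead of eight routes.
aggregated = list(ipaddress.collapse_addresses(customer_blocks))

print(len(customer_blocks), "routes before aggregation")
print(aggregated)
```

If one block in the range were missing, `collapse_addresses` would return several prefixes instead of one. This mirrors how a gap in an allocation prevents full aggregation.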

Application Layer Concepts

  1. Application-Level Protocols

    • HTTP (HyperText Transfer Protocol): A client-server protocol used by web browsers and servers to communicate, using port 80 by default. It is a stateless protocol, meaning the server does not retain information about past client requests.
    • FTP (File Transfer Protocol): Used for transferring files between a client and server. The server listens on port 21 for control commands, and a separate data connection is established for transferring files.
    • SMTP (Simple Mail Transfer Protocol): An email protocol for sending messages between servers. It uses port 25.
    • POP3 (Post Office Protocol, version 3): A mail access protocol used by clients to retrieve email from a server using port 110. It is simple, providing download-and-delete or download-and-keep functionality.
    • IMAP (Internet Message Access Protocol): A more sophisticated email retrieval protocol that allows manipulation of messages on the server, using port 143.
    • DNS (Domain Name System): An application-layer protocol that resolves domain names into IP addresses. It uses port 53, normally over UDP, with TCP used for large responses and zone transfers.
  2. Creating Network Applications

    • Applications are written to run on end systems (e.g., a web server or client software) and communicate over a network.
    • There is no need to write software for network-core devices (e.g., routers).
  3. Application Architectures

    • Client-Server Architecture:
      • Server: Always-on host with a permanent IP address. Servers can be scaled using data centers.
      • Clients: Communicate with the server and are often intermittently connected, possibly with dynamic IP addresses. Clients do not communicate directly with each other.
    • Peer-to-Peer (P2P) Architecture:
      • No Always-On Server: End systems (peers) directly communicate with each other.
      • Peers request services from other peers and provide services in return. IP addresses may change as peers connect intermittently.
  4. Processes Communicating

    • A process is a program running on a host.
    • Client Process: Initiates communication.
    • Server Process: Waits to be contacted.
    • Processes on different hosts communicate by exchanging messages.
    • Sockets: A socket is an interface between the process and the transport layer. It allows a process to send/receive messages. The socket is controlled by the application developer, while transport protocols are managed by the operating system.
  5. Addressing Processes

    • A process must have an identifier to receive messages, consisting of an IP address and a port number.
    • Examples:
      • HTTP server: Port 80
      • Mail server: Port 25
  6. HTTP: HyperText Transfer Protocol

    • Based on TCP and uses port 80.
    • It is a stateless protocol.
    • Procedure:
      1. The client initiates a TCP connection to the server.
      2. The client sends an HTTP request message, and the server responds.
      3. With non-persistent HTTP, the server closes the connection after responding; persistent HTTP keeps the connection open for further requests.
    • HTTP Request Methods:
      • GET: Requests the resource identified by the URL.
      • HEAD: Requests header meta-information only.
      • POST: Submits data to be processed to a specified resource (e.g., form data submission).
  7. HTTP Response Time

    • RTT (Round Trip Time): Time for a packet to travel from the client to the server and back.
    • HTTP response time = 2 RTTs (one to set up the TCP connection, one for the request and the start of the response) + file transmission time.
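
The response-time formula can be worked through numerically. The function name and the sample RTT, file size, and link rate are assumptions chosen to keep the arithmetic easy. The sketch ignores queueing and processing delays.

```python
def http_response_time(rtt_s, file_bits, link_bps):
    """2 RTTs (TCP setup + request/response) plus the file transmission time."""
    transmission_time = file_bits / link_bps
    return 2 * rtt_s + transmission_time


# Assumed values: 50 ms RTT, 1 Mbit file, 10 Mbit/s link.
total = http_response_time(rtt_s=0.05, file_bits=1_000_000, link_bps=10_000_000)
print(f"{total:.3f} s")   # 0.1 s for the two RTTs plus 0.1 s of transmission
```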

Application Layer Concepts (continued)

  1. Cookies

    • Small pieces of data that a web server asks the browser to store on the client side and send back with later requests.
    • Used for authorization, shopping carts, and personalization.
    • Cookies can raise privacy concerns as they allow tracking of user activity.
  2. Web Technologies

    • HTML (HyperText Markup Language): Used to define the structure of web pages using tags (e.g., <p>, <a>).
    • CSS (Cascading Style Sheets): Defines the style of HTML elements (e.g., fonts, colors).
    • JavaScript: Used to implement dynamic behavior on web pages and can interact with the DOM (Document Object Model) to change content, style, and even access cookies.
  3. HTTPS

    • HyperText Transfer Protocol Secure: Encrypted communication using TLS (Transport Layer Security). Uses port 443 by default.
    • Provides authentication and ensures the confidentiality and integrity of data in transit.
  4. DNS (Domain Name System)

    • A distributed, hierarchical system that maps domain names to IP addresses.
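    • Root Name Servers: The top of the hierarchy. A local DNS server with no cached answer starts resolution by querying a root server, which refers it to the appropriate top-level domain (TLD) servers.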
    • Recursive Queries: A server takes responsibility for resolving the query.
    • Iterative Queries: A server returns the address of another server that might have the information.
  5. SMTP, POP3, and IMAP

    • SMTP is used for sending emails between servers.
    • POP3 and IMAP are used by clients for retrieving emails from a server.
    • SMTP Commands and Responses: The client sends commands such as HELO, MAIL FROM, RCPT TO, and DATA. The server replies with status codes like 250 OK.
  6. FTP (File Transfer Protocol)

    • A client-server model used for transferring files. The client initiates a connection, and the server responds.
    • Control Connection (Port 21) for commands and Data Connection for file transfer.
  7. Socket Programming

    • A socket is an endpoint for sending and receiving data across the network.
    • UDP Socket Programming: The client attaches the destination IP address and port number to each packet. The server extracts the sender's address and port from each received packet so that it can reply. UDP provides unreliable data transfer.
    • TCP Socket Programming: The server must first be running, and it waits for incoming connection requests. TCP provides reliable, in-order byte-stream transfer.
  8. Client/Server Socket Interaction

    • For UDP: The server and client use the recvfrom() and sendto() methods.
    • For TCP: The server uses accept() to accept a connection, and the client establishes the connection using connect().
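
The socket calls named above can be tried with a small loopback sketch using Python's standard `socket` module. To stay self-contained it runs both ends in one process on 127.0.0.1, which is an assumption made for the demo. A real client and server would be separate programs, and the upper-casing "service" is purely illustrative.

```python
import socket

TIMEOUT_S = 2.0   # avoid hanging forever if something goes wrong

# --- UDP: no connection; each datagram carries the destination address ---
udp_server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_server.settimeout(TIMEOUT_S)
udp_server.bind(("127.0.0.1", 0))              # port 0: let the OS pick a free port
udp_server_addr = udp_server.getsockname()

udp_client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_client.settimeout(TIMEOUT_S)
udp_client.sendto(b"hello", udp_server_addr)

data, client_addr = udp_server.recvfrom(2048)  # server learns the sender's address
udp_server.sendto(data.upper(), client_addr)
udp_reply, _ = udp_client.recvfrom(2048)

udp_client.close()
udp_server.close()

# --- TCP: the server listens first, then the client connects ---
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.settimeout(TIMEOUT_S)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
tcp_server_addr = listener.getsockname()

tcp_client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_client.settimeout(TIMEOUT_S)
tcp_client.connect(tcp_server_addr)            # handshake is done by the OS
conn, peer_addr = listener.accept()            # new socket dedicated to this client
conn.settimeout(TIMEOUT_S)

tcp_client.sendall(b"hello")
request = conn.recv(2048)
conn.sendall(request.upper())
tcp_reply = tcp_client.recv(2048)

conn.close()
tcp_client.close()
listener.close()

print(udp_reply, tcp_reply)
```

Note the asymmetry: the UDP server replies using the address returned by `recvfrom()`, while the TCP server replies on the connected socket returned by `accept()`.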

TCP Sequence Numbers and Acknowledgments

  • Sequence Numbers: Identify bytes of data sent over the network.
  • Acknowledgments: Indicate successful receipt of data; acknowledgments are cumulative.
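
A cumulative acknowledgment can be computed with a short sketch. The function name and the segment sizes are hypothetical. Segments are modeled as (sequence number, length) pairs, and sequence-number wraparound is ignored.

```python
def cumulative_ack(initial_seq, received_segments):
    """Return the next byte expected, given (seq, length) pairs received so far.

    The ACK number only advances over contiguous data; a gap stops it.
    """
    expected = initial_seq
    for seq, length in sorted(received_segments):
        if seq > expected:          # gap: an earlier segment is still missing
            break
        expected = max(expected, seq + length)
    return expected


# Segments of 1000 bytes each; the one starting at byte 2000 has not arrived.
received = [(0, 1000), (1000, 1000), (3000, 1000)]
ack_with_gap = cumulative_ack(0, received)

# Once the missing segment arrives, the ACK jumps past all buffered data.
received.append((2000, 1000))
ack_after_fill = cumulative_ack(0, received)

print(ack_with_gap, ack_after_fill)
```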

TCP Retransmission Scenarios

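  • Lost ACK: The data arrived but its ACK was lost. The sender's timer expires and it retransmits the segment; the receiver discards the duplicate and sends the ACK again.
  • Timeout: The sender retransmits when its timer expires before an ACK arrives. If the timeout was premature (the ACK was only delayed), the retransmission is redundant.
  • Cumulative ACK: A later ACK covers all earlier bytes, so a lost ACK causes no retransmission if a later cumulative ACK arrives before the timer expires.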

TCP Fast Retransmit

  • If a sender receives three duplicate ACKs for the same data, it retransmits the missing segment immediately, without waiting for a timeout.
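
The duplicate-ACK rule can be sketched as a counter over the stream of ACK numbers arriving at the sender. The function name and the sample ACK sequences are hypothetical. Note that three duplicates means four ACKs with the same number in total: the original plus three repeats.

```python
def fast_retransmit_triggers(acks, dup_threshold=3):
    """Return the ACK numbers at which a fast retransmit would be triggered."""
    triggers = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                triggers.append(ack)    # retransmit the segment starting at `ack`
        else:
            last_ack = ack              # new data acknowledged: reset the counter
            dup_count = 0
    return triggers


# ACK 200 arrives once, then three more times: fast retransmit of byte 200.
with_loss = fast_retransmit_triggers([100, 200, 200, 200, 200, 300])

# Only one duplicate: not enough evidence of loss, so nothing is triggered.
no_loss = fast_retransmit_triggers([100, 200, 200, 300])

print(with_loss, no_loss)
```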

Flow Control

  • The receiver uses the rwnd value to advertise free buffer space.
  • The sender limits the amount of unacknowledged data (“in-flight” data) to the receiver’s advertised rwnd value, ensuring the receiver’s buffer does not overflow.
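
The window bookkeeping on both sides can be written out as two small formulas. The variable names follow common textbook usage but are assumptions here, as are the sample byte counts.

```python
def advertised_rwnd(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: free buffer space = buffer size minus unread data."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def usable_window(rwnd, last_byte_sent, last_byte_acked):
    """Sender side: how many more bytes may be sent without exceeding rwnd."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, rwnd - in_flight)


# Receiver: 4096-byte buffer, 3000 bytes received, application has read 1000.
rwnd = advertised_rwnd(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)

# Sender: 1000 bytes are in flight, so only rwnd - 1000 more may be sent.
can_send = usable_window(rwnd, last_byte_sent=5000, last_byte_acked=4000)

print(rwnd, can_send)
```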

TCP Connection Management

Three-Way Handshake

  1. SYN: The client sends a segment with the SYN bit set to establish a connection.
  2. SYN-ACK: The server responds with a SYN-ACK segment.
  3. ACK: The client sends an ACK segment to acknowledge the SYN-ACK, and the connection is established.
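
The three segments can be modeled as dictionaries to show how the sequence and acknowledgment numbers relate. The initial sequence numbers are arbitrary sample values (real stacks choose them unpredictably), the function name is hypothetical, and 32-bit wraparound is ignored.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as simple dictionaries."""
    syn = {"flags": {"SYN"}, "seq": client_isn}
    syn_ack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": syn["seq"] + 1,        # the SYN consumes one sequence number
    }
    ack = {
        "flags": {"ACK"},
        "seq": client_isn + 1,
        "ack": syn_ack["seq"] + 1,    # acknowledges the server's SYN
    }
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
for segment in segments:
    print(sorted(segment["flags"]), segment["seq"], segment.get("ack"))
```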

Closing a Connection

  • Both client and server must send a FIN segment to close their side of the connection.
  • Upon receiving a FIN, the receiver sends an ACK and may also send its own FIN.

TCP Congestion Control

Congestion occurs when too much data is sent too quickly for the network to handle, resulting in packet loss and increased delays.

Approaches to Congestion Control

  1. End-to-End Congestion Control

    • No explicit feedback from the network; congestion is inferred from packet loss or delay.
    • This approach is used by TCP.
  2. Network-Assisted Congestion Control

    • Routers provide explicit feedback to end systems, such as a single bit indicating congestion or an explicit rate for the sender.

TCP Congestion Control Mechanisms

  1. Additive Increase, Multiplicative Decrease (AIMD)

    • Additive Increase: Increase the congestion window (cwnd) by one MSS (Maximum Segment Size) every RTT until loss is detected.
    • Multiplicative Decrease: Cut cwnd in half when loss is detected.
  2. Slow Start

    • Initially, cwnd is set to 1 MSS.
    • cwnd is doubled every RTT until a loss is detected or cwnd reaches ssthresh, allowing the sender to quickly ramp up the transmission rate.
  3. Threshold Switching

    • After a loss, the ssthresh (slow start threshold) is set to half of the cwnd value before the loss.
    • Once cwnd reaches ssthresh, the increase becomes linear instead of exponential.
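
The interaction of slow start, the threshold, and AIMD can be traced with a simplified simulation in units of MSS. It follows the rules as stated in these notes (halve cwnd on loss). Real TCP variants differ in detail; for example, Tahoe and Reno reset cwnd to 1 MSS after a timeout, which is not modeled here. The function name and parameter values are assumptions.

```python
def simulate_cwnd(rounds, ssthresh, loss_rounds):
    """Return cwnd (in MSS) at the start of each RTT round."""
    cwnd = 1                                   # slow start begins at 1 MSS
    history = []
    for rnd in range(rounds):
        history.append(cwnd)
        if rnd in loss_rounds:
            ssthresh = max(cwnd // 2, 1)       # threshold = half of cwnd at loss
            cwnd = ssthresh                    # multiplicative decrease
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)     # slow start: double per RTT
        else:
            cwnd += 1                          # additive increase: +1 MSS per RTT
    return history


# Initial ssthresh of 8 MSS, with a loss detected during round 6.
trace = simulate_cwnd(rounds=10, ssthresh=8, loss_rounds={6})
print(trace)
```

The trace shows exponential growth up to the threshold, linear growth afterwards, and a halving at the loss, which gives the familiar sawtooth shape.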

Layering and Protocol Stack

Internet Protocol Stack

  • Application Layer: Supports network applications (e.g., HTTP, FTP, SMTP).
  • Transport Layer: Manages process-to-process data transfer (TCP, UDP).
  • Network Layer: Handles routing of datagrams (IP).
  • Link Layer: Manages data transfer between neighboring network elements (e.g., Ethernet).
  • Physical Layer: Manages the transmission of raw bits.

OSI Model

  • Application, Presentation, Session, Transport, Network, Data Link, Physical.
  • The Internet stack does not include the Presentation and Session layers, and their functionalities are often integrated into the Application Layer.

Protocol Concepts

  • Protocols define the format and sequence of messages exchanged and the actions taken on message receipt.
  • Examples of human and network protocols:
    • Human: “What’s the time?” (request-response).
    • Network: HTTP request and response.

Encapsulation

  • On the sending side, each layer adds its own header to the data it receives from the layer above.
  • The data is passed down the protocol stack in this way; at the destination, each layer strips its header and passes the rest back up the stack.
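
Encapsulation can be pictured with a toy sketch in which each layer's header is just a text tag. The tags and function names are hypothetical placeholders. Real headers are binary structures with many fields.

```python
# Placeholder "headers", outermost first as they appear on the wire.
HEADERS = [b"[ETH]", b"[IP]", b"[TCP]"]


def encapsulate(message):
    """Sender: each layer prepends its header on the way down the stack."""
    segment = b"[TCP]" + message       # transport layer
    datagram = b"[IP]" + segment       # network layer
    frame = b"[ETH]" + datagram        # link layer
    return frame


def decapsulate(frame):
    """Receiver: each layer strips its own header on the way up the stack."""
    data = frame
    for header in HEADERS:
        if not data.startswith(header):
            raise ValueError(f"expected header {header!r}")
        data = data[len(header):]
    return data


frame = encapsulate(b"GET /index.html")
recovered = decapsulate(frame)
print(frame)
print(recovered)
```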

Network Security

  • Focuses on protecting against attacks like:
    • Denial of Service (DoS): Overwhelming resources with bogus traffic.
    • Traffic Sniffing: Reading unprotected traffic.
    • Address Spoofing: Sending packets with a false source address.
  • Security Concepts:
    • Confidentiality: Prevent unauthorized access to data.
    • Integrity: Ensure data has not been altered.
    • Authentication: Verify the source of the data.
    • Non-Repudiation: Prove the source of data/actions.

Packet vs. Circuit Switching

  • Packet Switching: Messages are broken into packets, which are sent independently and reassembled at the destination.
    • Queueing Delay: Occurs when packets arrive faster than they can be transmitted.
    • Loss: Packets are dropped if buffer capacity is exceeded.
  • Circuit Switching: Resources are reserved end-to-end for the duration of the connection (e.g., traditional telephone networks).

Internet Basics

  • The Internet is a network of networks that enables logical communication between devices.
  • Protocols control how messages are sent and received (e.g., TCP, IP, HTTP).
  • Internet Standards: Defined through RFCs (Requests for Comments), published by the IETF (Internet Engineering Task Force).

Structure of the Internet

  • Network Edge: Clients and servers (e.g., end-user devices and data centers).
  • Access Networks: How devices connect to the edge router (e.g., DSL, cable, Ethernet).
  • Network Core: Routers and Internet Service Providers (ISPs).