Backend Communication Design Patterns: Protocols (Day 1, Section 3)
1. Introduction to Protocols
- A communication protocol is a set of rules that allows two parties to communicate.
- If both parties follow the same protocol, they can exchange information effectively.
- Protocols are designed with specific properties based on their purpose and the problem they solve.
2. Why Are Protocols Important?
- Every protocol is created to address a specific problem.
- Example: TCP was designed in the 1970s, when networks were slow and low-bandwidth.
- Modern data centers push TCP to its limits, which has motivated newer protocols such as Homa (first published in 2018).
- Different protocols are used in various contexts:
- Classic Protocols: DHCP, UDP, HTTP, gRPC, FTP, SMTP, etc.
- Emerging Protocols: Homa (described in research papers, not yet standardized as an RFC).
3. Key Properties of Communication Protocols
A. Data Format
Defines how data is structured for transmission. Two main types:
- Text-Based Protocols (Human-readable)
- Examples: plain text, JSON, XML, HTTP/1.1
- Can be read directly from the wire (unless encrypted).
- Binary Protocols (Machine-optimized)
- Examples: Protocol Buffers, gRPC, HTTP/2, RESP (the Redis serialization protocol, which is binary-safe although its framing is largely readable text)
- More efficient but not human-readable.
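To make the trade-off concrete, here is a minimal sketch that encodes the same record once as JSON text and once with a fixed binary layout. The record fields (sensor_id, temperature, ok) and the layout are made up for illustration and are not part of any real protocol.

```python
import json
import struct

# Hypothetical record, invented for this example.
reading = {"sensor_id": 7, "temperature": 21.5, "ok": True}

# Text-based encoding: readable on the wire, but larger.
text_payload = json.dumps(reading).encode("utf-8")

# Binary encoding: both sides must agree on the layout in advance.
# "!" = network byte order, H = uint16, f = float32, ? = bool.
LAYOUT = struct.Struct("!Hf?")
binary_payload = LAYOUT.pack(
    reading["sensor_id"], reading["temperature"], reading["ok"]
)

print(text_payload)                 # readable field names and values
print(binary_payload.hex())         # opaque without knowing LAYOUT
print(len(text_payload), len(binary_payload))
```

The binary form is smaller because field names and punctuation are not sent, but a reader needs the layout to make sense of the bytes.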
B. Transfer Mode
Determines how data is sent over the network. Two main types:
- Message-Based Protocols (Discrete messages)
- Each message has a clear start and end.
- Examples: HTTP (each request and response is a delimited message), UDP (each datagram is carried in an IP packet).
- Stream-Based Protocols (Continuous data flow)
- No fixed start or end, just a stream of bytes.
- Examples: TCP, RTSP (used in video streaming).
- Challenge: The receiver must parse the byte stream to find message boundaries, because TCP does not preserve them.
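One common way to recover message boundaries from a byte stream is to prefix every message with its length. The sketch below assumes a 4-byte big-endian length header. That is a convention chosen for this example, and the names frame and FrameDecoder are hypothetical.

```python
import struct

# Assumed convention: every message starts with a 4-byte big-endian length.
HEADER = struct.Struct("!I")

def frame(message: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(message)) + message

class FrameDecoder:
    """Reassembles whole messages from arbitrarily split stream chunks."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes and return every message completed so far."""
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # the rest of this message has not arrived yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages

# Two messages arrive split at an arbitrary point, as TCP may deliver them.
stream = frame(b"hello") + frame(b"world")
decoder = FrameDecoder()
first = decoder.feed(stream[:6])    # header plus part of "hello"
second = decoder.feed(stream[6:])   # the remainder of both messages
```

Here first is empty because no full message has arrived, and second contains both messages. Delimiter-based framing (for example, newline-terminated messages) is another option with different trade-offs.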
C. Addressing System
How a protocol identifies source and destination.
- Domain Name System (DNS): Converts domain names to IP addresses.
- IP Addressing (Layer 3): Identifies hosts across networks, including the internet.
- MAC Address (Layer 2): Identifies network interfaces within a local network.
- ARP (Address Resolution Protocol): Resolves an IP address to a MAC address on the local network.
D. Directionality
Defines data flow:
- Unidirectional: Data flows in one direction.
- Bidirectional: Two-way communication, either full-duplex or half-duplex.
- Full-Duplex: Both parties can send data simultaneously.
- Half-Duplex: Devices take turns sending data (e.g., Wi-Fi, where devices share one channel).
E. State Management
- Stateful Protocols (Maintain session information)
- Examples: TCP, gRPC, Apache Thrift
- Stateless Protocols (Each request is independent)
- Examples: UDP, HTTP (stateless at the application layer, even though it usually runs over stateful TCP)
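The difference can be illustrated with a toy, in-process sketch (no networking involved). StatefulCounter and stateless_handle are hypothetical names: the first remembers each client between requests, while the second relies on the client to send everything needed each time.

```python
class StatefulCounter:
    """Stateful: the server remembers each session between requests."""

    def __init__(self):
        self._sessions = {}  # session id -> number of requests seen

    def handle(self, session_id: str) -> int:
        self._sessions[session_id] = self._sessions.get(session_id, 0) + 1
        return self._sessions[session_id]


def stateless_handle(request: dict) -> int:
    """Stateless: the request itself carries the current count."""
    return request["count"] + 1


server = StatefulCounter()
server.handle("client-a")           # server now remembers client-a
print(server.handle("client-a"))    # depends on the earlier request
print(stateless_handle({"count": 1}))  # depends only on this request
```

If the stateful server restarts, its sessions are lost. The stateless handler can run on any server because nothing is kept between requests.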
F. Routing & Proxying
How a protocol interacts with networks and proxies.
- Example: Requesting a page from google.com
- DNS finds the IP → TCP connection is established → HTTP request is sent.
- If a proxy is used, the connection first reaches the proxy, which forwards the request to the destination server.
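The sequence above can be sketched with the standard socket module. This is a simplified illustration for plain HTTP only (no TLS, no redirects). The function names resolve, build_request, and fetch are hypothetical, and the proxy argument is assumed to be a (host, port) pair for a plain HTTP forward proxy.

```python
import socket

def resolve(host: str, port: int) -> tuple:
    """Step 1: turn a host name into an (ip, port) pair via the resolver."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return infos[0][4]  # socket address of the first result

def build_request(host: str, path: str = "/", port: int = 80,
                  via_proxy: bool = False) -> bytes:
    """Step 3: build an HTTP/1.1 GET request.

    When talking to a forward proxy, the request line carries the
    absolute URL so the proxy knows where to forward the request.
    """
    authority = host if port == 80 else f"{host}:{port}"
    target = f"http://{authority}{path}" if via_proxy else path
    return (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {authority}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")

def fetch(host: str, port: int = 80, path: str = "/", proxy=None) -> bytes:
    """Resolve, connect, send the request, and read the raw response."""
    # With a proxy, the TCP connection goes to the proxy, not the origin.
    connect_to = proxy if proxy else (host, port)
    address = resolve(*connect_to)
    # Step 2: establish the TCP connection.
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(build_request(host, path, port, proxy is not None))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("example.com") requires network access and returns the raw response bytes, status line and headers included.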
G. Reliability & Congestion Control
- Reliable Protocols: Ensure correct delivery with retransmissions.
- Example: TCP (has flow control, congestion control, and retransmissions).
- Unreliable Protocols: Best-effort delivery, no guarantees.
- Example: UDP (no retransmission or flow control).
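The contrast can be simulated without a network. The sketch below is a deliberately simplified stop-and-wait model with a made-up lossy channel: it shows retransmission only, and does not model lost acknowledgements, flow control, or congestion control. The names send_reliably and send_best_effort are hypothetical.

```python
import random

def send_reliably(packets, loss_rate=0.3, max_attempts=10, seed=42):
    """Resend each packet until it gets through (stop-and-wait style)."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    delivered = []
    transmissions = 0
    for seq, payload in enumerate(packets):
        for _ in range(max_attempts):
            transmissions += 1
            if rng.random() >= loss_rate:
                # Packet arrived and was acknowledged.
                delivered.append((seq, payload))
                break
            # Otherwise no ACK arrived before the timeout: send again.
        else:
            raise TimeoutError(f"packet {seq} was never acknowledged")
    return delivered, transmissions

def send_best_effort(packets, loss_rate=0.3, seed=42):
    """Send each packet once; whatever is lost stays lost."""
    rng = random.Random(seed)
    return [
        (seq, payload)
        for seq, payload in enumerate(packets)
        if rng.random() >= loss_rate
    ]

delivered, transmissions = send_reliably(["a", "b", "c"])
received = send_best_effort(["a", "b", "c"])
```

The reliable sender delivers every packet in order at the cost of extra transmissions, while the best-effort sender may deliver fewer packets than were sent.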
H. Error Handling
- Defines how errors are managed:
- Standard error codes (e.g., HTTP status codes).
- Timeouts & Retries: How long to wait for a response, and whether to resend a request that received none.
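A retry policy can be sketched as a small wrapper. The example below assumes the operation signals a timeout by raising TimeoutError and uses exponential backoff between attempts. The names call_with_retries and make_flaky, and the delay values, are illustrative choices rather than a standard API.

```python
import time

def call_with_retries(operation, max_attempts=3, base_delay=0.01):
    """Call operation(), retrying on TimeoutError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Wait longer after each failure: base, 2x base, 4x base, ...
            time.sleep(base_delay * 2 ** (attempt - 1))

def make_flaky(failures):
    """Build a fake operation that times out a fixed number of times."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError("simulated timeout")
        return "ok"

    return operation

result = call_with_retries(make_flaky(2))  # fails twice, then succeeds
```

Retrying is only safe when repeating the request cannot cause harm (for example, an idempotent read). A request that was processed but whose response was lost would otherwise be applied twice.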
4. Conclusion
- Protocols define how data is structured, transmitted, and managed.
- Different protocols have different strengths based on their design choices.
- Understanding these properties helps in choosing or designing the right protocol for a use case.