Pub-Sub Pattern
In 2014, WhatsApp was processing 50 billion messages per day with only 32 engineers. Their secret architecture: Fan-out. When you send a message to a group of 256 members, WhatsApp doesn’t send 256 individual HTTP calls to delivery servers. Instead, it publishes one event to a Topic. 256 subscriber queues each receive a copy, and 256 delivery workers process their slice in parallel. Without Pub-Sub, a 256-member group message would require a sequential chain of 256 network calls taking ~25 seconds. With Pub-Sub fan-out, it takes ~250ms (parallel). The Pub-Sub pattern is the architectural reason most modern messaging services can feel “instant” at billion-user scale.
[!IMPORTANT] In this lesson, you will master:
- Broadcast vs. Point-to-Point: Understanding when to broadcast events to N services vs. assigning a specific task to one worker.
- Exchange Topologies: Comparing Direct, Fanout, and Topic routing mechanics for architectural flexibility.
- The Fan-out Tax: Quantifying the hardware impact (CPU, RAM, Bandwidth) of duplicating one message for 1,000 subscribers.
[!TIP] Interview Communication: When explaining Pub-Sub, explicitly state why you are decoupling services. “I’m using a Pub-Sub model here so that if the Marketing team wants to add a new ‘Promo Service’ tomorrow, they can simply subscribe to the `UserSignedUp` topic without requiring any code changes to the core User Service.” This shows you understand the organizational and long-term maintenance benefits, not just the technical mechanics.
1. Beyond 1-to-1 Messaging
In a standard Message Queue (Point-to-Point), one message is processed by one consumer. This is great for work distribution (e.g., “Resize this image”).
But what if multiple services need to know about an event?
- User Signs Up:
  - Email Service needs to send a welcome email.
  - Analytics Service needs to log the event.
  - Fraud Service needs to check the IP.
If we use a standard queue, only one of them will get the message. We need Pub-Sub.
2. The Pub-Sub Model
Publish-Subscribe decouples the sender (Publisher) from the receivers (Subscribers). Think of it like a Radio Station: The DJ (Publisher) broadcasts, and anyone tuned in (Subscribers) hears it.
Core Components
- Publisher: The service that emits the event. It does not know who is listening.
- Topic (Exchange): The logical channel where the event is sent.
- Subscriber (Queue): The service listening to a specific Topic.
- Fanout: The broker copies the message to ALL subscribers.
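The components above can be sketched as a minimal in-memory broker. This is a toy model, not how RabbitMQ or Kafka is actually implemented — the `Broker` class, topic name, and subscriber callbacks are all hypothetical illustrations of the fan-out idea:

```python
from collections import defaultdict

class Broker:
    """Toy in-memory pub-sub broker: one publish fans out to every subscriber."""
    def __init__(self):
        self.topics = defaultdict(list)  # topic name -> list of subscriber callbacks

    def subscribe(self, topic, callback):
        self.topics[topic].append(callback)

    def publish(self, topic, event):
        # Fanout: every subscriber bound to the topic gets a copy of the event.
        # The publisher never learns who (or how many) received it.
        for callback in self.topics[topic]:
            callback(event)

broker = Broker()
received = []
broker.subscribe("user.signed_up", lambda e: received.append(("email", e)))
broker.subscribe("user.signed_up", lambda e: received.append(("analytics", e)))
broker.subscribe("user.signed_up", lambda e: received.append(("fraud", e)))

broker.publish("user.signed_up", {"user_id": 42})
# All three subscribers received their own copy of the single published event.
```

Note that the publisher calls `publish` exactly once; adding a fourth subscriber requires zero changes to the publishing code, which is the loose coupling discussed below.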
[!NOTE] Hardware-First Intuition: The “Pointer Duplication” Efficiency. In high-performance brokers like RabbitMQ or Kafka, a “Fan-out” doesn’t usually copy the physical bytes 100 times in memory. Instead, it creates 100 Pointers (References) to the same block of binary data in the OS Page Cache. The hardware tax only hits when the data is transmitted over the Network Interface Card (NIC). However, if the subscribers are slow and the messages pile up, the broker’s RAM consumption grows linearly with the number of subscribers, as it must track the read-offset for every single consumer.
3. Exchange Types Deep Dive
Not all Pub-Sub is the same. In AMQP (RabbitMQ), the “Topic” is actually called an Exchange. There are 4 main types:
| Exchange Type | Routing Logic | Use Case | Complexity |
|---|---|---|---|
| Direct | Exact Match (key == binding) | Unicast routing (e.g., `error.log` → ErrorQueue). | O(1) (Fast Hash) |
| Fanout | Broadcasts to ALL queues. Ignores key. | “Mass Notification” (e.g., New User Signup). | O(1) (Blind Copy) |
| Topic | Pattern Match (`*`, `#`). | Multicast routing (e.g., `payment.us.*`). | O(N) (String Parsing) |
| Headers | Matches metadata headers. | Complex routing logic beyond string keys. | O(N) (Header Check) |
[!TIP] Performance: Fanout is the fastest because it blindly copies messages. Topic exchanges are slower because they must parse the string pattern against the routing table (Trie data structure).
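The complexity difference in the table can be sketched in a few lines — a Direct exchange is essentially a hash-map lookup, while a Fanout exchange ignores the routing key entirely. The binding table and queue names here are hypothetical:

```python
# Direct exchange: exact-match routing via a hash map -> O(1) lookup.
direct_bindings = {"error.log": ["ErrorQueue"], "info.log": ["InfoQueue"]}

def route_direct(routing_key):
    # Unmatched keys return no queues: the message is dropped at the broker.
    return direct_bindings.get(routing_key, [])

# Fanout exchange: copies to every bound queue, never inspecting the key.
fanout_queues = ["EmailQueue", "AnalyticsQueue", "FraudQueue"]

def route_fanout(_routing_key):
    return list(fanout_queues)

print(route_direct("error.log"))   # ['ErrorQueue']
print(route_fanout("anything"))    # all three queues, key ignored
```

A Topic exchange sits between these: it must parse and pattern-match the key (see the wildcard matcher sketch later in this lesson), which is where the O(N) string-parsing cost comes from.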
4. Interactive Demo: Topic Exchange & Wildcards
Visualize how messages are routed using Routing Keys.
[!TIP] Try it yourself: Click different event buttons. Watch how the Wildcard (`*`, `#`) subscribers filter the traffic. The demo has three subscriber bindings:
- `#` (All)
- `payment.us.*`
- `*.error`
5. Why is this powerful?
1. Loose Coupling (Plug & Play)
If we need to add a new Recommendation Service next month, we don’t change the Payment Gateway code.
We just add a new subscriber. The Publisher doesn’t know (or care) who is listening.
2. Parallel Processing
All subscribers receive the message simultaneously (or near simultaneously).
- Email sent: 200ms
- Analytics logged: 50ms
- Fraud check: 100ms Total Time: Max(200, 50, 100) = 200ms (Parallel), instead of 350ms (Sequential).
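The Max-vs-Sum math above can be demonstrated with a thread pool, using `time.sleep` as a stand-in for the real work (the subscriber names and durations are taken from the list above; the helper names are hypothetical):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle(name, seconds):
    time.sleep(seconds)  # stand-in for real work: SMTP call, DB write, API check
    return name

subscribers = [("email", 0.200), ("analytics", 0.050), ("fraud", 0.100)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(lambda s: handle(*s), subscribers))
elapsed = time.perf_counter() - start

# Wall-clock time ~= max(0.200, 0.050, 0.100) = 0.200s,
# not the sequential sum of 0.350s.
print(f"{elapsed:.3f}s")
```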
3. Fanout Cost Analysis
Be careful with Fanout. If you have 1 event and 100 subscribers, the broker must create 100 copies.
- CPU Cost: Low (it’s just a pointer copy).
- Network Cost: High (100x bandwidth). Every byte transmitted is a physical electron through the NIC.
- Storage Cost: High (if durable queues are used).
6. Hardware-First Intuition: The NIC Saturation Tax
When you use Fan-out, you are essentially multiplying your Network Bandwidth Consumption by the number of subscribers.
- Direct Memory Access (DMA): A smart broker uses DMA to move data from memory directly to the Network Interface Card (NIC) without the CPU touching every byte.
- NIC Bottleneck: If you publish a 1MB message (e.g., a high-res image) to 1,000 subscribers, the broker must push 1GB of data across the physical wire. On a 10Gbps link, this single event consumes 80% of your total hardware capacity for one second.
- RAM Pressure: If one subscriber is slow (e.g., over a high-latency satellite link), the broker must keep that 1MB message in RAM until the consumer acknowledges it. 1,000 slow consumers = 1GB of “pinned” RAM that cannot be garbage collected.
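The 1MB × 1,000-subscriber claim is simple arithmetic, shown here with decimal units (1 MB = 10^6 bytes, as NIC vendors rate links):

```python
message_mb = 1            # 1 MB payload (e.g., a high-res image)
subscribers = 1_000
link_gbps = 10            # 10 Gbps NIC

# bytes -> bits, then multiply by the fan-out factor
total_bits = message_mb * 1_000_000 * 8 * subscribers     # 8 billion bits
seconds_on_wire = total_bits / (link_gbps * 1_000_000_000)

print(seconds_on_wire)    # 0.8 -> one event occupies the 10Gbps link for 0.8s
```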
[!NOTE] War Story: The “Reply-All” Outage In 2019, a massive fan-out issue caused a severe incident at a major tech company. A user triggered a seemingly harmless event that sent a notification to an entity with millions of followers. The Pub-Sub broker blindly duplicated the message, instantly saturating the network uplinks of the broker nodes and causing subsequent critical system events to be dropped. They solved this by introducing Batching for large fan-outs and capping the maximum number of concurrent message deliveries per event.
[!TIP] Staff Engineer Tip: The Prefix-Tree Performance Penalty. Direct exchanges are $O(1)$—the broker uses a fast Hash Map to find the target queue. Topic exchanges with wildcards (`*`, `#`) are $O(N)$—the broker must walk a Trie (Prefix Tree) to match the string pattern. At 1,000,000 messages per second, the extra CPU instructions required to traverse the Trie can become the primary cause of tail latency. If you don’t need wildcards, always use Direct exchanges to stay on the hardware’s “fast path.”
7. Filtering Strategies
Where should the filtering happen? This is a massive trade-off.
A. Broker-Side Filtering (Efficient)
The broker (Exchange) decides which queue gets the message.
- Mechanism: The Exchange checks the Routing Key against Binding Keys (using a Trie).
- Example: RabbitMQ Topic Exchange.
- Logic: If no one binds to `sys.debug`, the message is discarded immediately at the broker.
- Pros: Saves network bandwidth and consumer CPU.
- Cons: Broker works harder (higher CPU usage on the MQ server).
B. Consumer-Side Filtering (Wasteful)
The consumer receives everything and filters it in code.
- Mechanism: All services listen to a “Firehose”.
- Logic: `if (msg.type !== 'error') return;`
- Pros: Dumb broker (fast, high throughput).
- Cons: Massive waste of bandwidth. The consumer wakes up, deserializes JSON, and discards it.
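The two strategies can be contrasted in a few lines. This is a schematic comparison — the firehose contents and the `bindings` set are hypothetical, and real brokers match routing keys rather than message fields:

```python
firehose = [
    {"type": "error", "msg": "disk full"},
    {"type": "info",  "msg": "heartbeat"},
    {"type": "error", "msg": "oom"},
]

# Consumer-side filtering: all 3 messages cross the wire and are deserialized;
# the consumer then discards what it doesn't need.
sent_to_consumer = len(firehose)                              # 3 over the network
kept_by_consumer = [m for m in firehose if m["type"] == "error"]

# Broker-side filtering: the exchange checks bindings first, so unbound
# messages never leave the broker.
bindings = {"error"}
sent_broker_side = [m for m in firehose if m["type"] in bindings]  # only 2 cross

print(sent_to_consumer, len(sent_broker_side))
```

Both approaches deliver the same final messages; the difference is purely how many bytes crossed the network and how much consumer CPU was wasted deserializing throwaways.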
[!TIP] Always prefer Broker-Side Filtering for high-volume systems to save bandwidth. Only send data to services that actually need it.
8. Understanding Wildcards
Sometimes you don’t want all messages. You want to subscribe to a subset.
Wildcards (RabbitMQ Example)
- Topic format: `service.region.status`
- Wildcard `*` (Star): Matches exactly one word. `payment.us.*` matches `payment.us.success` but NOT `payment.us.db.error`.
- Wildcard `#` (Hash): Matches zero or more words. `payment.#` matches `payment.error`, `payment.us.success`, and `payment.debug.level.1`.
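These rules can be captured in a small matcher. This is a simplified sketch — real brokers such as RabbitMQ compile all bindings into a Trie once, rather than recursively matching every pattern per message:

```python
def topic_match(pattern: str, routing_key: str) -> bool:
    """AMQP-style matching: '*' = exactly one word, '#' = zero or more words."""
    p, k = pattern.split("."), routing_key.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(k)          # pattern exhausted: key must be too
        if p[i] == "#":
            # '#' may absorb zero or more remaining words.
            return any(match(i + 1, j2) for j2 in range(j, len(k) + 1))
        if j < len(k) and (p[i] == "*" or p[i] == k[j]):
            return match(i + 1, j + 1)  # literal or single-word wildcard match
        return False

    return match(0, 0)

# The examples from the rules above:
assert topic_match("payment.us.*", "payment.us.success")
assert not topic_match("payment.us.*", "payment.us.db.error")
assert topic_match("payment.#", "payment.debug.level.1")
```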
9. Delivery Semantics
When doing Pub-Sub, you must decide your guarantee level:
| Semantics | Description | Pros | Cons |
|---|---|---|---|
| At-Most-Once | Fire and Forget. Message might be lost. | Fastest. No state tracking. | Data loss possible. |
| At-Least-Once | Retries until ack received. | No data loss. | Duplicates (Requires Idempotency). |
| Exactly-Once | Hardest to achieve. Transactional. | Perfect consistency. | High latency, complex. |
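At-Least-Once plus an idempotent consumer is the usual production pairing: the broker may redeliver, but a duplicate has no effect. A minimal sketch — the message shape is hypothetical, and a real system would persist `processed_ids` (e.g., in Redis or the database) rather than keep it in memory:

```python
processed_ids = set()          # dedup store; would be durable in production
inventory = {"sku-1": 10}

def handle(message: dict) -> None:
    """Idempotent consumer: a redelivered message is acked but not re-applied."""
    if message["id"] in processed_ids:
        return                 # duplicate delivery: skip the side effect
    inventory[message["sku"]] -= message["qty"]
    processed_ids.add(message["id"])

msg = {"id": "order-7", "sku": "sku-1", "qty": 3}
handle(msg)
handle(msg)                    # broker retried because the first ack was lost

print(inventory["sku-1"])      # 7, not 4 -- the duplicate had no effect
```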
10. Summary
- Queues for 1-to-1 work distribution.
- Pub-Sub for 1-to-Many notifications.
- Topic Exchanges allow for powerful routing logic using Wildcards.
- Prefer Broker-Side Filtering to avoid flooding consumers with irrelevant data.
Staff Engineer Tip: Choose your exchange type based on the Routing Trie Overhead. A Fanout exchange is an O(1) operation—the broker doesn’t even look at the routing key. A Topic exchange requires the broker to walk a Trie (Prefix Tree) in memory to match wildcards like # and *. At 1,000,000 messages per second, the CPU instruction count for Topic matching can become your primary bottleneck, potentially doubling the latency compared to a simple Fanout broadcast.
Mnemonic — “Queue = Task, Pub-Sub = Event”: Queue (RabbitMQ) = Point-to-Point: One message, one worker. Task is deleted after processing (“Mailbox”). Pub-Sub (SNS/Kafka Topic) = Broadcast: One event, N subscribers. All receive a copy (“Radio Station”). Pick Queue when you want work assigned. Pick Pub-Sub when you want everyone informed. Fan-out Tax: 1 event × 1000 subscribers = 1000× network bandwidth. Use Broker-Side Filtering (Routing Keys) to avoid sending irrelevant data to consumers.