Replication: Master-Slave vs Multi-Master
In October 2018, GitHub suffered its longest outage — 24 hours and 11 minutes. The root cause: routine network maintenance caused a brief 43-second disconnection between its MySQL master database in US-East and its replication slave in US-West. During those 43 seconds, both regions accepted writes — a Split Brain event. When connectivity was restored, GitHub’s orchestration tool (Orchestrator) automatically promoted the US-West slave to master. But the 43 seconds of writes to US-East were never replicated. GitHub had to write a custom tool to detect and manually resolve 946 conflicting rows — live, in production. The incident redefined how the industry thinks about automatic failover: promotion without verification is its own disaster.
[!IMPORTANT] In this lesson, you will master:
- Topology Choices: Deciding between Single-Leader, Multi-Leader, and Leaderless architectures based on write-availability requirements.
- Hardware Durability: Understanding how the Write-Ahead Log (WAL) ensures safety on disk before replication begins.
- Consistency Trade-offs: Quantifying the performance cost of Synchronous vs. Asynchronous network handshakes.
1. Why Replicate?
Replication means keeping a copy of the same data on multiple machines that are connected via a network. Why do we do this?
- Availability: If one node goes down, the data is still available on another.
- Scalability: You can distribute read queries across multiple replicas.
- Latency: You can place replicas geographically closer to users (e.g., US, EU, Asia).
2. Strategy 1: Master-Slave (Single Leader)
Architecture:
- One Master (Leader): Handles ALL Writes. Can also handle Reads.
- Multiple Slaves (Followers): Replicate data from the Master. Handle ONLY Reads.
Flow:
- Client writes `x = 5` to the Master.
- Master writes to its disk (WAL).
- Master sends the change to the Slaves (Async or Sync).
- Slaves update their disk.
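The flow above can be sketched in a few lines of Python. This is a toy model, not a real database driver — the class and method names are illustrative, and replication here is synchronous and in-process:

```python
class Slave:
    """Follower: applies changes streamed from the master; serves reads only."""
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value          # step 4: follower updates its own copy

class Master:
    """Leader: the only node allowed to accept writes."""
    def __init__(self, slaves):
        self.wal = []                   # Write-Ahead Log: durability before replication
        self.data = {}
        self.slaves = slaves

    def write(self, key, value):
        self.wal.append((key, value))   # step 2: append to WAL first
        self.data[key] = value          # apply to the master's own state
        for s in self.slaves:           # step 3: stream the change to followers
            s.apply(key, value)         # (synchronous here; real systems are often async)

slaves = [Slave(), Slave()]
master = Master(slaves)
master.write("x", 5)                    # step 1: client writes to the Master
print(slaves[0].data["x"])              # → 5
```

Note that the WAL append happens before anything else — if the master crashes mid-write, the log can be replayed on restart.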
Pros: Simple. No write conflicts (because only one Master). Cons: Master is a Write Bottleneck. If Master dies, you need a failover process.
Analogy: Think of Master-Slave like a restaurant kitchen. The Head Chef (Master) is the only one allowed to cook (write) new dishes. The waiters (Slaves) can only serve (read) what the Head Chef has already prepared.
[!NOTE] Hardware-First Intuition: The “Binary Log” Throughput. In a Master-Slave setup, the Master doesn’t just write to its database; it also streams every change to a Binary Log (Binlog) or WAL. A single physical disk or SSD controller has a limited amount of Sequential Write Bandwidth. If you have 50 slaves all fetching the Binlog simultaneously, you might saturate the Master’s Network Transmit (TX) Bandwidth or its storage controller’s ability to read the log for the followers. This is why high-scale systems use Intermediate Slaves to relay logs to others, spreading the hardware load.
[!WARNING] Replication Lag: If you use Asynchronous replication (common for performance), a user might write `x = 5`, then immediately read from a Slave and see `x = 4`. The Slave hasn’t updated yet! This is Eventual Consistency.
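The stale-read race can be modeled with a follower that queues replicated writes instead of applying them immediately. This is a minimal sketch (the `LaggySlave` class is invented for illustration):

```python
from collections import deque

class LaggySlave:
    """Follower that applies replicated writes only when poll() runs,
    modeling asynchronous replication lag."""
    def __init__(self):
        self.data = {"x": 4}                # stale pre-existing value
        self.pending = deque()

    def replicate(self, key, value):
        self.pending.append((key, value))   # queued on the network, not yet applied

    def poll(self):
        while self.pending:                 # lag catches up
            key, value = self.pending.popleft()
            self.data[key] = value

slave = LaggySlave()
slave.replicate("x", 5)     # master accepts the write and ships it asynchronously
print(slave.data["x"])      # → 4  (a read racing the replication stream is stale!)
slave.poll()
print(slave.data["x"])      # → 5  (eventually consistent)
```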
Timeline: The Danger of Async Replication
3. Strategy 2: Multi-Master (Multi-Leader)
Architecture:
- Multiple Masters: Any Master can accept Writes.
- Masters replicate changes to each other.
Use Case: Global apps. A user in the US writes to the US Master. A user in EU writes to the EU Master.
Pros: High Write Availability. Local write latency.
Cons: Write Conflicts. What if US User sets x = 5 and EU User sets x = 10 at the exact same time? You need conflict resolution (Last Write Wins, Vector Clocks).
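Last Write Wins can be sketched as a pure merge function over timestamped writes. This toy version (the function and tie-break rule are illustrative) also shows LWW’s hidden cost — the losing write is silently discarded:

```python
def last_write_wins(a, b):
    """Resolve a conflict by keeping the write with the later timestamp.
    Each write is a (value, timestamp) pair; ties broken deterministically."""
    return a if (a[1], a[0]) >= (b[1], b[0]) else b

us_write = (5, 1000.001)    # US Master: x = 5  at t = 1000.001
eu_write = (10, 1000.002)   # EU Master: x = 10 at t = 1000.002

value, ts = last_write_wins(us_write, eu_write)
print(value)  # → 10  (the US write is lost forever — LWW trades data for simplicity)
```

Vector clocks avoid this silent data loss by detecting that the two writes are concurrent and surfacing both versions, at the cost of more bookkeeping.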
Analogy: Think of Multi-Master as collaborating on a Google Doc without an internet connection. You both edit the same paragraph offline. When you reconnect, the system has to merge both changes—often resulting in a conflict requiring resolution.
4. Strategy 3: Leaderless (Dynamo-style)
Architecture:
- No Masters. All nodes are equal.
- Client sends Writes to multiple nodes (e.g., 3 out of 5).
- Client reads from multiple nodes to detect conflicts.
Examples: Cassandra, DynamoDB. Pros: Zero downtime. No single point of failure. Cons: Complex consistency logic (Read Repair, Quorums).
Analogy: Think of Leaderless replication like a jury trial. No single juror (node) is the ultimate authority. When making a decision, you must poll a majority (quorum) of jurors. If 3 out of 5 jurors agree on the verdict, you accept it as the truth.
Example: In a 5-node cluster, if you write to 3 nodes (W=3) and read from 3 nodes (R=3), you are mathematically guaranteed to contact at least one node that saw the latest write (W + R > N). By comparing version numbers across the responses, the reader can return the freshest value without a dedicated leader.
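The overlap guarantee can be verified exhaustively: with W=3 fresh replicas out of N=5, every possible choice of R=3 read targets intersects the write set. A small sketch (node state is simplified to a version/value pair):

```python
from itertools import combinations

def quorum_read(responses):
    """Return the value carried by the highest version among the replies."""
    version, value = max(responses)     # tuples compare by version first
    return value

N, W, R = 5, 3, 3
assert W + R > N                        # the quorum overlap condition

# A write of version 2 reached W=3 of the N=5 nodes; 2 still hold version 1:
nodes = [(2, "new")] * W + [(1, "old")] * (N - W)

# EVERY possible choice of R nodes overlaps the write set:
for subset in combinations(nodes, R):
    assert quorum_read(subset) == "new"
print("every possible 3-node read saw the latest write")
```

If W + R ≤ N (say W=2, R=2, N=5), there exist read sets that miss the write entirely — which is exactly the stale-read risk quorums are designed to rule out.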
5. Interactive Demo: Replication Lag & Split Brain
Simulate network conditions and see their impact.
- Lag Slider: Increase latency. Write to Master. Read from Slave immediately. Do you see old data?
- Cut Network: Creates a Partition. Slaves stop receiving updates.
- Promote Slave: Causes Split Brain (Two Masters).
6. Deep Dive: Split Brain
What happens in a Master-Slave system if the Master loses network connectivity but doesn’t crash?
- The Slaves think the Master is dead.
- They elect a New Master (Slave 1).
- The Old Master comes back online. It still thinks it is the Master.
- Now you have Two Masters accepting writes. This is Split Brain.
The Solution: Fencing Tokens
To solve this, we use an Epoch Number (Generation ID) as a Fencing Token.
- Old Master was Epoch 1.
- New Master is elected as Epoch 2.
- If Old Master tries to write to the storage (or send a request), the storage sees “Epoch 1” and rejects it because it has already seen “Epoch 2”.
- This effectively “fences off” the Zombie Master.
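The fencing check lives in the storage layer, not the masters — the zombie cannot bypass it. A minimal sketch (the `FencedStorage` class is invented for illustration):

```python
class FencedStorage:
    """Storage layer that rejects any request carrying a stale epoch."""
    def __init__(self):
        self.highest_epoch = 0
        self.data = {}

    def write(self, epoch, key, value):
        if epoch < self.highest_epoch:
            raise PermissionError(f"fenced: epoch {epoch} < {self.highest_epoch}")
        self.highest_epoch = epoch      # remember the newest epoch ever seen
        self.data[key] = value

storage = FencedStorage()
storage.write(epoch=1, key="x", value=5)      # old Master (Epoch 1): accepted
storage.write(epoch=2, key="x", value=7)      # new Master (Epoch 2): accepted

try:
    storage.write(epoch=1, key="x", value=9)  # zombie old Master: rejected
except PermissionError as e:
    print(e)  # → fenced: epoch 1 < 2
```

The key property: once the storage has seen Epoch 2, no Epoch-1 write can ever succeed again, no matter what the old master believes about itself.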
7. Advanced: Chain Replication
Variants of this idea appear in real systems — MongoDB, for example, lets secondaries chain their replication from other secondaries rather than all pulling from the primary.
Instead of Master sending to all Slaves (Star topology), it forms a chain:
Master → Slave 1 → Slave 2.
- Write: Goes to Head (Master).
- Propagate: Head → S1 → S2.
- Commit: Tail (S2) confirms to Client.
- Pros: Strong Consistency (all nodes have data before success).
- Cons: Higher Latency (Wait for longest path).
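The head-to-tail propagation can be sketched as linked nodes, where the commit acknowledgment only comes from the tail (a toy model; node names and the ack string are illustrative):

```python
class ChainNode:
    """One link in the chain; forwards each write to its successor."""
    def __init__(self, name, successor=None):
        self.name, self.successor, self.data = name, successor, {}

    def write(self, key, value):
        self.data[key] = value
        if self.successor:                        # not the tail: keep forwarding
            return self.successor.write(key, value)
        return f"committed by tail {self.name}"   # only the tail acks the client

# Build the chain Master → S1 → S2 (tail):
tail = ChainNode("S2")
s1 = ChainNode("S1", successor=tail)
head = ChainNode("Master", successor=s1)

ack = head.write("x", 5)
print(ack)  # → committed by tail S2

# By the time the client sees the ack, EVERY node already holds the value:
assert all(node.data["x"] == 5 for node in (head, s1, tail))
```

This is why chain replication trades latency (the full chain traversal) for the guarantee that a successful write is on every replica.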
Interview Tip: Always clarify if the system needs Strong Consistency (Bank) or Eventual Consistency (Social Media). This dictates whether you use Sync Replication (Slow, Safe) or Async Replication (Fast, Risky).
Staff Engineer Tip: Semi-Synchronous Replication. Purely synchronous replication is too slow (one slow slave stalls all writes). Purely asynchronous is too risky (data loss on crash). Most “Elite” systems (like MySQL 5.7+ or Postgres with remote_write) use Semi-Synchronous Replication. The Master waits for at least one slave to acknowledge it has received the data in its relay log, but doesn’t wait for it to be fully committed to disk. This balances durability and performance.
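The semi-synchronous trade-off can be modeled with threads: the master commits as soon as the first slave acks, so one slow slave no longer stalls the write path. A sketch under simplified assumptions (sleeps stand in for network latency; function names are invented):

```python
import queue
import threading
import time

def replicate_to_slave(delay, acks):
    """Ship the change to one slave; ack once it lands in the relay log."""
    time.sleep(delay)           # simulated network + relay-log write latency
    acks.put(delay)

def semi_sync_write(slave_delays):
    """Commit as soon as ANY one slave acknowledges receipt,
    instead of waiting for all (sync) or for none (async)."""
    acks = queue.Queue()
    for d in slave_delays:
        threading.Thread(target=replicate_to_slave,
                         args=(d, acks), daemon=True).start()
    return acks.get(timeout=5)  # block only until the fastest slave acks

# One fast slave and one very slow one: the slow slave does NOT stall the commit.
fastest = semi_sync_write([0.01, 0.5])
print(f"committed after the slave with {fastest}s latency acked")
```

Fully synchronous replication would wait 0.5s here (the slowest slave); semi-sync commits after 0.01s while still guaranteeing at least one surviving copy of the write.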
Mnemonic — “Single Leader=Simple, Multi-Leader=Conflict, Leaderless=Complex”: Master-Slave = simple failover, one write path, replication lag risk. Multi-Master = low-latency global writes, conflict resolution required. Leaderless (Cassandra/Dynamo) = zero SPOF, requires Quorum logic. GitHub’s outage lesson: never auto-promote without fencing tokens — always fence the zombie master.