BASE & Eventual Consistency: The Speed Trade-off
BASE & Eventual Consistency: The Speed Trade-off
If ACID is the “Strict Bank Teller”, BASE is the “Social Media Feed”. It doesn’t matter if your friend sees your post 2 seconds later than you do. What matters is that the app loads instantly.
1. The Philosophy: Optimism
- ACID (SQL): Pessimistic. “I will lock this row until I am 100% sure the data is safe everywhere.”
- BASE (NoSQL): Optimistic. “I will accept this write immediately and figure out the synchronization details later.”
The Acronym
- BA (Basically Available): The system guarantees availability. If a node goes down, the system still replies (possibly with stale data).
- S (Soft State): The state of the system may change over time, even without new input (due to background syncing).
- E (Eventual Consistency): If writes stop, all replicas will eventually agree on the same value.
2. Distributed Consistency Mechanics
How do we actually achieve this? We use Replication.
Quorums (N, R, W)
In a distributed system (like Cassandra or DynamoDB), we don’t just write to one hard drive. We write to N replicas.
- N: Replication Factor (Total copies). Usually 3.
- W: Write Quorum. How many nodes must confirm the write before we say “Success” to the user?
- R: Read Quorum. How many nodes must we ask to get data?
The Golden Rule: R + W > N
If this equation holds, you are guaranteed Strong Consistency (you will always read the latest write).
If R + W <= N, you risk Eventual Consistency (reading old data).
Anti-Entropy (Fixing the Mess)
Since nodes can be out of sync, how do they agree?
- Read Repair: When you read data, the DB asks multiple nodes. If Node A says “v1” and Node B says “v2” (newer), the DB returns “v2” and silently updates Node A.
- Hinted Handoff: If a node is down, the DB writes the data to a neighbor with a note: “Give this to Node A when it comes back online.”
- Merkle Trees: Background processes that compare massive data structures to find differences efficiently without sending the whole dataset.
3. Interactive Demo: Quorum Configurator (N, R, W)
Visualize how tuning Read (R) and Write (W) quorums affects consistency.
- Strong Consistency: When
R + W > N, a read is guaranteed to see the latest write. - Eventual Consistency: When
R + W <= N, you might read stale data.
4. Deep Dive: Conflict Resolution (Vector Clocks)
What happens if two users update the same data at the exact same time on different nodes? User A adds “Apple” to cart. User B adds “Banana” to cart (same account). Who wins?
A. Last Write Wins (LWW)
The lazy approach. We look at the timestamp.
- User A: 12:00:01 PM
- User B: 12:00:02 PM
- Winner: User B.
- Result: Cart has “Banana”. “Apple” is lost. Data Loss!
B. Vector Clocks (Amazon Dynamo Style)
The smart approach. We track causality, not just wall-clock time.
Every piece of data carries a version history: [NodeA: 1, NodeB: 0].
The Shopping Cart Scenario:
- Initial:
[A:0, B:0]Cart: {} - Node A adds Apple:
[A:1, B:0]Cart: {Apple} - Node B adds Banana:
[A:0, B:1]Cart: {Banana} (Note: Node B hasn’t seen Node A’s update yet). - Sync: The DB compares
[1, 0]and[0, 1].- Is
1 > 0AND0 > 1? No. - Neither version is “newer”. This is a Conflict.
- Is
- Resolution: The DB saves both versions:
{Apple}AND{Banana}. - Client Repair: The next time the user reads the cart, the app gets both versions. It merges them to
{Apple, Banana}and writes back[A:1, B:1].
5. Case Study: Social Media Likes
Let’s apply BASE to a real feature: Instagram Likes.
Requirement
- Millions of users liking posts simultaneously.
- Latency: Must be < 100ms.
- Consistency: If I like a post, it’s okay if my friend in Japan sees the count update 5 seconds later.
Architecture
- Write Path:
- User taps “Like”.
- App sends request to closest Edge Server.
- Server writes to local Redis/Cassandra node.
- Returns “Success” immediately. (Basically Available).
- Background Sync:
- The local node asynchronously pushes the update to other data centers. (Eventual Consistency).
- Read Path:
- Users read the Like Count from their local replica.
- The count might be 100 in US and 95 in EU. This is Soft State.
Conflict Resolution
What if two people like at the exact same millisecond?
- Actually, “Likes” are Commutative.
Count = Count + 1.- Order doesn’t matter.
1 + 1 = 2. - We use CRDTs (Conflict-free Replicated Data Types) to merge counters automatically without Vector Clocks.
6. Summary
- BASE prioritizes Availability over immediate Consistency.
- Quorums (
R + W > N) allow you to tune the trade-off. - Conflict Resolution: Use LWW for simplicity, Vector Clocks for correctness in complex merges (Shopping Carts).
- Anti-Entropy: Merkle Trees and Read Repair keep nodes in sync eventually.