Redis vs Memcached: The Titans
[!TIP] Interview Insight: Don’t just list features. Explain architectural differences. “Redis is single-threaded, which avoids lock contention. Memcached is multi-threaded, which scales vertically on big CPUs.”
1. Memcached (The Simple Giant)
Memcached is a pure, in-memory key-value store. It is “dumb” by design.
- Architecture: Multi-threaded. It uses a master thread to accept connections and worker threads to handle requests.
- Memory Management: Uses Slab Allocation to reduce memory fragmentation. It pre-allocates chunks of memory (slabs) for different data sizes.
- Data Types: Only Strings (Blobs).
- Replication: None. If a node dies, data is gone.
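The slab idea can be sketched in a few lines. This is a simplified model, not Memcached's actual allocator: the 96-byte minimum chunk, the 1.25 growth factor, the 1 MB ceiling, and both function names are assumptions chosen for illustration.

```python
def build_slab_classes(min_chunk=96, factor=1.25, max_chunk=1024 * 1024):
    """Return increasing chunk sizes, one per slab class (illustrative values)."""
    sizes = []
    size = min_chunk
    while size < max_chunk:
        sizes.append(size)
        # Grow geometrically, rounded up to an 8-byte boundary.
        size = (int(size * factor) + 7) // 8 * 8
    sizes.append(max_chunk)
    return sizes


def pick_slab_class(item_size, classes):
    """Pick the smallest chunk that fits the item."""
    for chunk in classes:
        if item_size <= chunk:
            return chunk
    raise ValueError("item larger than the largest chunk")


classes = build_slab_classes()
chunk = pick_slab_class(100, classes)
print(chunk, "byte chunk,", chunk - 100, "bytes unused")
```

The unused tail of each chunk is the price paid for never having to compact memory: every item in a class has the same chunk size, so freed chunks can be reused directly.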
2. Redis (The Smart Swiss Army Knife)
Redis (Remote Dictionary Server) is a data structure server.
A. The Single-Threaded Architecture (Why?)
You might think: “It’s 2024. Why use a single thread?” Analogy: Imagine a busy restaurant.
- Multi-threaded (Apache/Memcached): You have 100 waiters. They bump into each other, fight over the single kitchen door (Locks), and waste time coordinating.
- Single-threaded (Redis/Node.js): You have ONE super-fast waiter (The Event Loop). He never stops moving. He takes an order, passes it to the kitchen (Kernel/Network), and takes the next order immediately. No collisions. No locks.
Deep Dive: IO Multiplexing (The Secret Sauce)
How can one thread handle 100,000 requests per second?
It uses Non-blocking IO and IO Multiplexing (via epoll on Linux or kqueue on BSD).
- Blocking IO (Standard): The thread calls `read()` and sleeps until data arrives. It does nothing else.
- IO Multiplexing (Redis): The thread asks the OS Kernel: “Here is a list of 10,000 connections. Wake me up if ANY of them have data.”
- The Kernel wakes the thread only when there is work. The thread processes the data, sends the reply, and goes back to waiting.
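The same pattern can be tried in miniature with Python's standard `selectors` module, which picks epoll or kqueue where available. This is only a sketch of the idea, not how Redis is implemented: the in-process socket pairs stand in for client connections, and the `PING`/`+PONG` exchange is hard-coded.

```python
import selectors
import socket

sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS

# Three in-process socket pairs stand in for three client connections.
pairs = [socket.socketpair() for _ in range(3)]
for i, (client, server) in enumerate(pairs):
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ, data=i)

# Only "client 1" sends a command; the other two stay idle.
pairs[1][0].sendall(b"PING\r\n")

# One call watches every connection; the kernel reports only the ready ones.
handled = []
for key, _events in sel.select(timeout=1):
    request = key.fileobj.recv(1024)
    key.fileobj.sendall(b"+PONG\r\n")
    handled.append((key.data, request))

reply = pairs[1][0].recv(1024)
print(handled, reply)

for client, server in pairs:
    sel.unregister(server)
    client.close()
    server.close()
sel.close()
```

A single thread never blocks on an idle connection here: it only touches the sockets that `select()` reported as readable.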
Benefit:
- Minimal Context Switching: Commands execute on a single thread, so the CPU does not waste cycles switching between worker threads.
- Cache Locality: The hot data stays in the CPU’s L1/L2 cache because the same thread is always running.
- No Locks: You never need `mutex` locks for data access because only one command runs at a time.
B. Data Structures
- Lists: For Queues.
- Sets: For unique friends.
- Sorted Sets (ZSET): For Leaderboards.
- HyperLogLog: For counting unique items (DAU) with about 99% accuracy (standard error 0.81%) in at most 12 KB of memory per counter.
C. Persistence (RDB vs AOF)
Redis is in-memory, but it saves to disk. How?
- RDB (Snapshot): Forks a child process to dump RAM to disk every X minutes. Fast startup, but potential data loss.
- AOF (Append Only File): Logs every write. Slower startup, but durable.
Deep Dive: How Redis Survives Crashes
RDB (Redis Database Snapshot):
- Trigger: `save 900 1` (every 900s if at least 1 key changed) or manual `BGSAVE`
- Mechanism: `fork()` creates a child process with Copy-On-Write memory
- Child writes entire dataset to `/dump.rdb` on disk
- Atomicity: Rename temp file to `dump.rdb` only when complete
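The fork-then-rename mechanism can be imitated on any Unix system. The sketch below is an illustration only: `background_save` is a hypothetical helper, JSON stands in for the real RDB binary format, and the file name `dump.json` is made up.

```python
import json
import os
import tempfile


def background_save(dataset, path):
    """Fork a child that writes a point-in-time snapshot; return the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child: sees the dataset exactly as it was at fork() time.
        status = 1
        try:
            tmp = f"{path}.tmp-{os.getpid()}"
            with open(tmp, "w") as f:
                json.dump(dataset, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)  # atomic rename: readers never see a partial file
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    return pid


path = os.path.join(tempfile.mkdtemp(), "dump.json")
dataset = {"user:1": "Alice"}
child = background_save(dataset, path)
dataset["user:1"] = "Bob"  # the parent keeps accepting writes after the fork
os.waitpid(child, 0)

with open(path) as f:
    snapshot = json.load(f)
print(snapshot)
```

The write that happens in the parent after `fork()` does not appear in the snapshot, which is exactly the data loss window described in the trade-offs.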
Trade-offs:
- ✅ Fast recovery: Loading binary is fast (1M keys/sec)
- ✅ Small file size: Compressed binary format
- ❌ Data loss window: If Redis crashes 10min after last save, you lose 10min of writes
- ❌ Fork cost: On 10GB dataset, `fork()` can pause Redis for 1 second
AOF (Append-Only File):
- Logs: Every `SET user:1 Alice` is appended to `/appendonly.aof`
- Sync Policy:
  - `appendfsync always`: Sync after every write (slow, but ZERO data loss)
  - `appendfsync everysec`: Sync every 1s (default, balances speed vs safety)
  - `appendfsync no`: Let OS decide (fast, but risky)
- Rewrite: AOF grows huge. Redis periodically rewrites it to only include final state
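To see why a rewrite shrinks the log, here is a toy compaction. It is a sketch under simplifying assumptions: only `SET` and `DEL` are modelled, commands are plain tuples, and the hypothetical `rewrite_aof` replays the old log, whereas Redis builds the new file from the dataset it already holds in memory.

```python
def rewrite_aof(log):
    """Compact a command log into the minimal commands that rebuild the final state."""
    state = {}
    for command, key, *args in log:
        if command == "SET":
            state[key] = args[0]
        elif command == "DEL":
            state.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {command}")
    return [("SET", key, value) for key, value in state.items()]


log = [
    ("SET", "user:1", "Alice"),
    ("SET", "user:1", "Bob"),    # overwrites the first entry
    ("SET", "user:2", "Carol"),
    ("DEL", "user:2"),           # cancels the previous entry
]
compacted = rewrite_aof(log)
print(len(log), "entries ->", len(compacted), "entry:", compacted)
```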
Example AOF:
*3
$3
SET
$6
user:1
$5
Alice
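The entry above is the command in the Redis wire protocol (RESP): an array header (`*3`), then each argument as a length-prefixed string, with every line terminated by CRLF. A minimal encoder, with the hypothetical name `encode_command`, reproduces it:

```python
def encode_command(*parts):
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        raw = part.encode("utf-8")
        # Each argument: "$<byte length>", CRLF, the bytes, CRLF.
        out.append(f"${len(raw)}\r\n".encode() + raw + b"\r\n")
    return b"".join(out)


entry = encode_command("SET", "user:1", "Alice")
for line in entry.decode().splitlines():
    print(line)
```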
Trade-offs:
- ✅ Durable: Max 1s of data loss (with `everysec`)
- ✅ Readable: Text format, can edit manually
- ❌ Slow recovery: Must replay every command (slower than RDB)
- ❌ Larger files: Text is bigger than binary
Hybrid Persistence (Redis 4.0+)
Combines RDB + AOF:
- Use RDB for base snapshot (fast recovery)
- Use AOF for changes since last snapshot (low data loss)
- On restart: Load RDB, then replay AOF delta
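The restart sequence amounts to "load the base, then apply the tail". A minimal sketch, assuming the snapshot is a plain dict and the AOF delta is a list of `SET`/`DEL` tuples (both representations and the name `recover` are illustrative, not Redis internals):

```python
def recover(snapshot, aof_tail):
    """Rebuild state: load the base snapshot, then replay commands logged after it."""
    state = dict(snapshot)                 # step 1: fast bulk load
    for command, key, *args in aof_tail:   # step 2: replay only the delta
        if command == "SET":
            state[key] = args[0]
        elif command == "DEL":
            state.pop(key, None)
    return state


snapshot = {"user:1": "Alice", "user:2": "Carol"}
aof_tail = [("SET", "user:1", "Bob"), ("DEL", "user:2"), ("SET", "user:3", "Dave")]
state = recover(snapshot, aof_tail)
print(state)
```

Recovery time is dominated by the snapshot load because the replayed tail only covers writes since the last snapshot.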
Config:
# Enable both
# RDB snapshot every 15min if at least 1 key changed
save 900 1
# AOF enabled
appendonly yes
# Hybrid mode
aof-use-rdb-preamble yes
Recovery Scenarios
| Scenario | RDB Only | AOF Only | Hybrid (RDB+AOF) |
|---|---|---|---|
| Normal Restart | Fast (load binary) | Slow (replay log) | Fast (RDB) + Fast (small AOF) |
| Crash (10min since save) | ❌ Lost 10min | ✅ Lost 1s | ✅ Lost 1s |
| Corrupted AOF | ✅ Still have RDB | ❌ Cannot start | ✅ Fallback to RDB |
| No Disk Space | ⚠️ BGSAVE fails; writes are rejected by default (`stop-writes-on-bgsave-error yes`) | ⚠️ AOF sync fails | ⚠️ Both fail |
Interview Insight: Netflix uses AOF with everysec. They accept 1s of data loss for session caches, but need better durability than RDB snapshots alone.
Persistence IO Patterns
The two mechanisms load the disk differently:
- AOF (Log): Every write adds a line. Continuous IO.
- RDB (Snapshot): Periodically dumps the entire state. Burst IO.
3. Distributed Caching: Sentinel vs Cluster
How do you scale Redis when one machine isn’t enough?
A. Redis Sentinel (High Availability)
- Goal: Keep the system alive if Master dies.
- Setup: 1 Master (Read/Write) + N Slaves (Read Only).
- Mechanism: Sentinels watch the Master. If it dies, they vote to promote a Slave to Master.
- Limitation: You cannot write more data than fits on one machine.
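The voting step can be sketched as two checks: enough Sentinels must agree that the Master is down, and a majority must back the promotion. This is a deliberately simplified model. Real Sentinels first elect a leader Sentinel that then picks the replica, and the function names and the `quorum` value of 2 are assumptions for illustration.

```python
from collections import Counter


def objectively_down(reports, quorum):
    """True when at least `quorum` Sentinels report the master unreachable."""
    return sum(reports.values()) >= quorum


def elect_replica(votes, total_sentinels):
    """Return the replica backed by a strict majority of Sentinels, else None."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count > total_sentinels // 2 else None


reports = {"s1": True, "s2": True, "s3": False}  # s3 can still reach the master
new_master = None
if objectively_down(reports, quorum=2):
    votes = {"s1": "replica-a", "s2": "replica-a", "s3": "replica-b"}
    new_master = elect_replica(votes, total_sentinels=3)
print(new_master)
```

Requiring a majority is what prevents two halves of a partitioned network from each promoting their own Master.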
B. Redis Cluster (Sharding)
- Goal: Store MORE data than fits on one machine.
- Mechanism: Data is split into 16,384 Hash Slots.
- Distribution: `CRC16(key) % 16384`. Each node owns a range of slots.
- Smart Clients: Clients know which node holds which slot and connect directly.
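The slot calculation is small enough to reproduce. Python's `binascii.crc_hqx` with an initial value of 0 computes the CRC16 variant (XMODEM) that Redis Cluster uses. The three-node slot ranges below are an assumed example layout, `key_hash_slot` and `owner_of` are hypothetical names, and hash tags (`{...}` in key names) are left out of this sketch.

```python
import binascii

TOTAL_SLOTS = 16384

# Assumed example layout: three nodes, each owning a contiguous slot range.
SLOT_RANGES = {
    "A": range(0, 5461),
    "B": range(5461, 10923),
    "C": range(10923, 16384),
}


def key_hash_slot(key):
    """CRC16(key) % 16384, ignoring hash tags."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % TOTAL_SLOTS


def owner_of(slot):
    """Look up which node owns a slot, as a smart client's slot map would."""
    for node, slots in SLOT_RANGES.items():
        if slot in slots:
            return node
    raise ValueError(f"slot out of range: {slot}")


slot = key_hash_slot("foo")
print("foo ->", slot, "-> node", owner_of(slot))
```

Because the key-to-slot mapping never changes, rebalancing only changes the slot-to-node table, which is the part smart clients cache and refresh.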
C. Consistent Hashing (The Old Way - Memcached)
Before Redis Cluster, we used Consistent Hashing (e.g., in Memcached clients).
- Concept: Map both Nodes and Keys to a Hash Ring (0 to 2^32 - 1).
- Placement: A key belongs to the first Node found moving clockwise on the ring.
- Virtual Nodes: To balance load, each physical node is hashed hundreds of times (Node A_1, Node A_2…).
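A minimal ring with virtual nodes fits in a few lines. This is a sketch, not any particular Memcached client's algorithm: MD5 truncated to 32 bits as the ring hash, 100 virtual nodes per server, and the `HashRing` class itself are all assumptions for illustration.

```python
import bisect
import hashlib


def ring_position(label):
    """Map a string to a point on the ring, 0 .. 2**32 - 1."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node appears on the ring many times (node_0, node_1, ...).
        for i in range(self.vnodes):
            bisect.insort(self._ring, (ring_position(f"{node}_{i}"), node))

    def node_for(self, key):
        # First virtual node clockwise from the key, wrapping past the end.
        idx = bisect.bisect_left(self._ring, (ring_position(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["A", "B", "C"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("D")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved, all to", {after[k] for k in moved})
```

Every key that moves lands on the new node, and keys that stay are untouched. With a plain `hash(key) % N` scheme, changing N would remap most keys instead.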
Difference:
- Consistent Hashing: Dynamic. Good for systems where nodes join/leave frequently (P2P, Cassandra).
- Hash Slots (Redis): Fixed 16k slots. Deterministic. Better for controlled cluster management.
D. Gossip Protocol
How do Redis Cluster nodes know about each other? They Gossip. Every node connects to a few other random nodes and exchanges information (“I am alive”, “Node B is down”). This allows the cluster to detect failures without a central “Master” server.
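A toy simulation shows how quickly random pairwise exchanges spread knowledge. It is not the Redis Cluster bus protocol: the fanout of 2, the 8 nodes, the fixed random seed, and the "set of known nodes" view are all simplifying assumptions.

```python
import random


def gossip_round(views, rng, fanout=2):
    """Each node exchanges its view with `fanout` random peers."""
    nodes = list(views)
    for node in nodes:
        peers = rng.sample([n for n in nodes if n != node], fanout)
        for peer in peers:
            merged = views[node] | views[peer]
            views[node] = set(merged)
            views[peer] = set(merged)


rng = random.Random(42)  # fixed seed so runs are repeatable
nodes = [f"node-{i}" for i in range(8)]
views = {n: {n} for n in nodes}  # at first, each node knows only itself

rounds = 0
while any(len(v) < len(nodes) for v in views.values()) and rounds < 100:
    gossip_round(views, rng)
    rounds += 1
print("every node knows the full membership after", rounds, "round(s)")
```

No node ever talks to all the others, yet information reaches everyone within a few rounds, which is why the cluster needs no central coordinator for failure detection.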
Adding a Node: Consistent Hashing vs Hash Slots
- Hash Slots: Fixed buckets. Adding a node requires moving specific buckets.
- Consistent Hashing: A continuous ring. Adding a node only affects its immediate neighbor.
4. Hash Slot Sharding in Practice
- Every key maps to exactly one Hash Slot (0-16383), and every slot is owned by exactly one node.
- Adding a node: some slots (and the keys in them) are migrated from existing nodes to the new node until the cluster is rebalanced.
5. Performance Trick: Pipelining vs Lua Scripting
Normally, every Redis command is a separate network Round Trip (RTT).
A. Pipelining (Batching)
- Concept: Send 100 commands at once. Read 100 replies at once.
- Benefit: Saves 99 RTTs. Massive throughput increase.
- Limitation: Not Atomic. If command 50 fails, command 51 still runs.
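The saving is simple arithmetic. The sketch below uses assumed numbers (a 1 ms round trip and 10 µs of server time per command) and a hypothetical helper name. It models latency only and does not talk to Redis.

```python
def total_latency_us(commands, rtt_us, server_us_per_command, pipelined):
    """Estimate wall-clock time for a batch, in microseconds."""
    round_trips = 1 if pipelined else commands
    return round_trips * rtt_us + commands * server_us_per_command


one_by_one = total_latency_us(100, rtt_us=1000, server_us_per_command=10, pipelined=False)
batched = total_latency_us(100, rtt_us=1000, server_us_per_command=10, pipelined=True)
print(one_by_one, "us vs", batched, "us")
```

With these assumptions the network, not Redis, dominates the unpipelined case: 100 ms of the 101 ms is spent waiting on round trips.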
B. Lua Scripting (Atomicity)
- Concept: Send a small script (code) to Redis. Redis executes it as a single transaction.
- Benefit: Atomic. No other command runs while the script is running. Great for complex logic (e.g., Rate Limiting “Check-and-Set”).
- Warning: If your script is slow (infinite loop), you block the entire server (Single Threaded!).
6. Summary
| Feature | Memcached | Redis |
|---|---|---|
| Concurrency | Multi-threaded (Good for scaling up on 1 node) | Single-threaded (No locks, avoids context switching) |
| Data Types | Strings only | String, List, Set, Hash, ZSet, Bitmap, Geo |
| Persistence | ❌ None (Lost on restart) | ✅ RDB (Snapshots) & AOF (Logs) |
| Cluster Mode | Client-side hashing only | Native Redis Cluster (Hash Slots) |