RDB Snapshotting

Imagine you are working on a massive, complex jigsaw puzzle on your living room table. If an earthquake hits (or your cat jumps on the table), all your progress is instantly lost. To protect yourself, you decide to take a high-resolution photograph of the table every hour. If the puzzle is destroyed, you can use the photograph to recreate it exactly as it was.

In Redis, this “photograph” is called RDB (Redis Database) Snapshotting. Since Redis keeps all its data in volatile RAM, RDB is one of the primary mechanisms to ensure your data survives a server crash or restart by periodically writing a point-in-time snapshot to the hard drive.

1. What is an RDB Snapshot?

RDB persistence performs point-in-time snapshots of your dataset at specified intervals.

  • Compact & Binary: RDB is a single compact binary file (dump.rdb by default) representing the state of Redis at a specific moment. String values are LZF-compressed by default (rdbcompression yes).
  • Fast Recovery: Because the file is a compact serialization of the dataset itself, restarting a Redis node from an RDB file is typically much faster than replaying a log of commands. Redis loads the file back into memory without re-executing individual writes.
  • Low Performance Impact: Creating the RDB file is handed off to a background child process, so the main Redis thread keeps serving client requests while the file is written. The fork() call itself briefly pauses the main thread, and that pause grows with the size of the dataset.

2. Triggering a Snapshot

There are three ways an RDB snapshot is created:

  1. SAVE (Synchronous - Avoid in Production): This command blocks the main Redis thread while the snapshot is created. During this time, Redis cannot serve any client requests; they wait until the save finishes. It should only be used in specific administrative scenarios (e.g., shutting down a node gracefully).
  2. BGSAVE (Asynchronous): This command tells Redis to fork a child process to handle writing the snapshot in the background. The main thread continues serving traffic.
  3. Config-Based (Automatic): You can configure Redis in redis.conf to automatically trigger a BGSAVE based on the number of changes over a period of time.
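
As a rough illustration of the BGSAVE path, the sketch below starts a background save and polls INFO persistence until the child finishes. It assumes a redis-py style client object exposing bgsave() and info(); the function name snapshot_and_wait and its timeout parameters are hypothetical, chosen for this example only.

```python
import time


def snapshot_and_wait(client, timeout=30.0, poll_interval=0.1):
    """Start a BGSAVE and wait for it to finish.

    `client` is assumed to behave like a redis-py `Redis` instance:
    it must provide bgsave() and info(section).
    Returns True if the background save finished with status "ok",
    False if it failed or the timeout expired first.
    """
    client.bgsave()  # returns immediately; a forked child does the writing
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        info = client.info("persistence")
        # rdb_bgsave_in_progress is 1 while the child is still writing
        if not info["rdb_bgsave_in_progress"]:
            return info["rdb_last_bgsave_status"] == "ok"
        time.sleep(poll_interval)
    return False


# Usage against a real server (assumes redis-py is installed and a
# server is listening on localhost:6379):
#   import redis
#   ok = snapshot_and_wait(redis.Redis(host="localhost", port=6379))
```

Polling the persistence section rather than sleeping for a fixed time keeps the caller from assuming how long the snapshot takes, which depends heavily on dataset size and disk speed.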

The redis.conf Configuration

# save <seconds> <changes>
save 3600 1      # Save after 1 hour if at least 1 change was made
save 300 100     # Save after 5 minutes if at least 100 changes were made
save 60 10000    # Save after 60 seconds if at least 10,000 changes were made
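
The rules above are independent: a snapshot is triggered as soon as any one of them is satisfied. The following is a simplified model of that check, not Redis's actual implementation; the names SAVE_RULES and should_snapshot are made up for illustration, and the exact boundary comparison Redis uses may differ.

```python
# (seconds, changes) pairs mirroring the `save` lines above.
SAVE_RULES = [(3600, 1), (300, 100), (60, 10000)]


def should_snapshot(rules, seconds_since_last_save, changes_since_last_save):
    """Return True if any save rule is satisfied.

    A rule (seconds, changes) is satisfied when at least `seconds` have
    passed since the last successful save AND at least `changes` writes
    have accumulated since then.
    """
    return any(
        seconds_since_last_save >= seconds
        and changes_since_last_save >= changes
        for seconds, changes in rules
    )


# A busy minute trips the third rule; a quiet hour with a single
# write trips the first; no writes at all never triggers a save.
busy = should_snapshot(SAVE_RULES, 61, 10_000)
quiet = should_snapshot(SAVE_RULES, 3600, 1)
idle = should_snapshot(SAVE_RULES, 7200, 0)
```

Stacking a slow rule with a low threshold and a fast rule with a high threshold means write-heavy periods are snapshotted often, while idle periods do not waste I/O.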

3. The fork() Mechanism (Under the Hood)

When a BGSAVE is triggered, Redis relies on the operating system’s fork() system call to create a perfect clone of the Redis process.

  1. The Fork: The OS creates a child process. Initially, the child shares the exact same memory pages as the parent process.
  2. Copy-on-Write (COW): As the main parent process continues to handle new write requests, the OS duplicates only the specific memory pages being modified. The child process retains a static, point-in-time view of the original memory.
  3. Writing to Disk: The child process sequentially writes its static view of the data to a temporary RDB file.
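  4. Atomic Swap: Once the write is complete, the child process renames the temporary file over the old dump.rdb and exits. Because a rename is atomic, a reader sees either the complete old file or the complete new one, never a half-written snapshot.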
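
The four steps above can be imitated in a few lines on a POSIX system. This is only an analogy: it snapshots a Python dict to JSON rather than writing the RDB format, and the names background_save, demo, and dump.json are invented for the example. What it does show is that writes made by the parent after fork() never appear in the child's snapshot.

```python
import json
import os
import tempfile


def background_save(data, path):
    """Fork a child that writes `data` to `path`; return the child's PID."""
    pid = os.fork()  # POSIX only; not available on Windows
    if pid == 0:
        # Child: its view of `data` is frozen at the moment of the fork.
        status = 1
        try:
            tmp_path = f"{path}.tmp-{os.getpid()}"
            with open(tmp_path, "w") as f:
                json.dump(data, f)
                f.flush()
                os.fsync(f.fileno())  # push the bytes to disk before the swap
            os.replace(tmp_path, path)  # atomic rename over the old file
            status = 0
        finally:
            os._exit(status)  # leave without running the parent's cleanup
    return pid


def demo():
    data = {"puzzle:pieces": 1000}
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "dump.json")
        pid = background_save(data, path)

        # Parent keeps "serving writes" while the child saves.
        data["puzzle:pieces"] = 999
        data["new:key"] = 1

        _, wait_status = os.waitpid(pid, 0)
        with open(path) as f:
            snapshot = json.load(f)
    return data, snapshot, os.WEXITSTATUS(wait_status)


live, snapshot, exit_code = demo()
# `snapshot` holds the state at fork time; `live` holds the later writes.
```

The child uses os._exit() rather than a normal return so that it terminates immediately instead of continuing to run the parent's code path after the save.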

[!WARNING] The Thundering Herd of RAM: While fork() is efficient thanks to Copy-on-Write, it can be dangerous if your Redis instance is running near maximum memory capacity. If your 64GB Redis node is using 50GB of RAM, and a heavy write load occurs during the BGSAVE, Copy-on-Write will duplicate memory pages. This could cause Redis to exceed the 64GB physical memory limit, leading the OS to aggressively swap to disk or invoke the Out-Of-Memory (OOM) killer to terminate Redis.
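
To put numbers on the warning, here is a back-of-the-envelope estimate using the 50GB and 64GB figures from the example. It is deliberately crude: it assumes every modified page is copied exactly once and ignores memory used by the OS, client buffers, and the child itself. The function names are hypothetical.

```python
def worst_case_peak_gb(dataset_gb, dirty_fraction):
    """Estimate peak memory while a BGSAVE is running.

    `dirty_fraction` is the share of the dataset's pages that the
    parent modifies (and the OS therefore copies) during the save.
    """
    if not 0.0 <= dirty_fraction <= 1.0:
        raise ValueError("dirty_fraction must be between 0 and 1")
    return dataset_gb * (1.0 + dirty_fraction)


def max_safe_dirty_fraction(dataset_gb, physical_gb):
    """Largest share of pages that can be rewritten before RAM runs out."""
    headroom_gb = physical_gb - dataset_gb
    return max(0.0, min(1.0, headroom_gb / dataset_gb))


# 50GB dataset on a 64GB node: only 14GB of headroom, so rewriting
# more than about 28% of the pages during one save exhausts RAM.
safe_fraction = max_safe_dirty_fraction(50, 64)
# If every page is rewritten, the estimate reaches 100GB.
full_rewrite_peak = worst_case_peak_gb(50, 1.0)
```

In practice this is why it helps to leave generous free memory on a node that takes snapshots under write-heavy load.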


4. Summary: Pros and Cons

Feature: Disaster Recovery
  • Advantage: RDB files are compact and easily portable, which makes them well suited to off-site backups, for example copies sent to an S3 bucket.
  • Disadvantage (Data Loss Window): If Redis crashes, you lose all data modified since the last snapshot. If your only rule is save 300 100, you could lose up to 5 minutes of writes, and more if fewer than 100 changes occurred in that window.

Feature: Restart Speed
  • Advantage: Fast restarts. Redis loads the serialized dataset straight back into RAM.
  • Disadvantage (High CPU/Memory during Fork): The fork() call and the child's serialization work can be CPU intensive, and the Copy-on-Write mechanism can, in the worst case, double memory usage under heavy write loads.

RDB is fantastic for backups, but when you cannot afford to lose minutes of writes, RDB alone is not enough. Combine it with AOF (Append Only File), which logs every write and, depending on its fsync policy, narrows the loss window to about a second or less.