Data Deep Dive
Welcome to the Data Deep Dive module. This is where we go below the surface of “Use a Key-Value Store” and understand exactly how these systems work.
We will deconstruct the seminal papers and systems that built the modern internet. You will learn not just what to use, but why it was built that way.
Chapters
- Dynamo (Amazon): The system that taught us about Availability, Vector Clocks, and Consistent Hashing.
- Cassandra (Wide Column): The hybrid beast (Dynamo-style distribution, BigTable-style storage) that came out of Facebook and powers Netflix. Learn about LSM Trees and Tombstones.
- BigTable (Google): The massive distributed map behind Google Search. Learn about Tablets, Splitting, and SSTables.
- MapReduce: The paradigm shift of moving Code to Data.
- Bloom Filters: The probabilistic magic that saves millions of disk seeks.
- Snowflake Data Warehouse: Modern cloud data warehousing. Learn about Storage/Compute separation and Micro-partitions.
- Module Review: Flashcards and Cheat Sheets to cement your knowledge.
Why This Matters
In a System Design interview, saying “I’ll use Cassandra” is a junior-level (L1) answer. Explaining “I’ll use Cassandra because I need fast writes (LSM Tree) and multi-region active-active availability, and I can tolerate eventual consistency” is a senior-level (L5/L6) answer.
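To make the “fast writes (LSM Tree)” part of that answer concrete, here is a minimal in-memory sketch of the idea. It is not Cassandra's implementation: the `TinyLSM` class, its `memtable_limit` parameter, and the `TOMBSTONE` marker are hypothetical names chosen for illustration, and the "SSTables" are plain Python lists rather than files on disk. What it shows is that a write only touches an in-memory table, a delete is just another write (a tombstone), and a read has to check the memtable and then the flushed runs from newest to oldest.

```python
from bisect import bisect_left

# Hypothetical marker meaning "this key was deleted" (a tombstone).
TOMBSTONE = object()


class TinyLSM:
    """Toy LSM-style store: an in-memory memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}            # recent writes, mutable
        self.sstables = []            # flushed runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write is a single in-memory update; no existing run is modified.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is recorded as a write of a tombstone.
        self.put(key, TOMBSTONE)

    def flush(self):
        # Freeze the memtable into a sorted, immutable run.
        if not self.memtable:
            return
        items = sorted(self.memtable.items(), key=lambda kv: kv[0])
        keys = [k for k, _ in items]
        values = [v for _, v in items]
        self.sstables.append((keys, values))
        self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        if key in self.memtable:
            value = self.memtable[key]
            return None if value is TOMBSTONE else value
        for keys, values in reversed(self.sstables):
            i = bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                value = values[i]
                return None if value is TOMBSTONE else value
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)      # memtable is full, so it is flushed as the first run
db.put("a", 3)      # newer value for "a" shadows the flushed one
db.delete("b")      # tombstone for "b"; triggers a second flush
```

Reads are the price of this design: a lookup may have to search several runs, which is the cost that compaction and Bloom filters (both covered in later chapters) are meant to reduce.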
Let’s dive in.
Module Chapters
Chapter 01
Dynamo (Key-Value)
Chapter 02
Cassandra (Wide Column Store)
Chapter 03
Google BigTable
Chapter 04
MapReduce
Chapter 05
Bloom Filters
Chapter 06
Snowflake Architecture
Chapter 07
Review & Cheat Sheet