Infra Deep Dive
Scaling to millions of users depends on the underlying infrastructure that manages data, coordination, and orchestration. In this module, we explore the “Big Four” systems from Google, Yahoo, and LinkedIn (GFS, HDFS, Kafka, and Chubby/ZooKeeper) that defined the modern cloud, followed by a chapter on the Kubernetes control plane.
Chapters
| Chapter | Title | Key Concepts |
|---|---|---|
| 01 | GFS (Google File System) | Chunks, Master-Slave, Shadow Masters, Consistency. |
| 02 | HDFS (Hadoop Distributed FS) | DataNodes, NameNodes, Federation, Erasure Coding. |
| 03 | Apache Kafka | Log-structured storage, ISR, Zero-Copy, High Watermark. |
| 04 | Chubby & ZooKeeper | Consensus, Paxos/Raft, Fencing, Distributed Locks. |
| 05 | Kubernetes (Control Plane) [NEW] | Controller Loops, etcd, Scheduling, Kubelet. |
[!TIP] Why study these? The papers behind these systems are the blueprints for modern cloud architecture. If you understand GFS, you understand many of the ideas behind large-scale stores like S3; if you understand Kafka, you understand the log-based design shared by Pulsar and Kinesis.
Module Chapters
Chapter 01
Google File System (GFS)
Chapter 02
HDFS (Hadoop Distributed File System)
Chapter 03
Kafka (Wait-Free Architecture)
Chapter 04
Chubby (Distributed Lock Service)
Chapter 05
Design Kubernetes (Control Plane Architecture)
Chapter 06
Review & Cheat Sheet