Infra Deep Dive

Scaling to millions of users depends on the underlying infrastructure that manages data, coordination, and orchestration. This module covers, in five chapters, the foundational systems from Google, Yahoo, and LinkedIn that defined the modern cloud.

Chapters

| Chapter | Title | Key Concepts |
| --- | --- | --- |
| 01 | GFS (Google File System) | Chunks, Master-Slave, Shadow Masters, Consistency. |
| 02 | HDFS (Hadoop Distributed FS) | DataNodes, NameNodes, Federation, Erasure Coding. |
| 03 | Apache Kafka | Log-structured storage, ISR, Zero-Copy, High Watermark. |
| 04 | Chubby & ZooKeeper | Consensus, Paxos/Raft, Fencing, Distributed Locks. |
| 05 | Kubernetes (Control Plane) [NEW] | Controller Loops, etcd, Scheduling, Kubelet. |

[!TIP] Why study these? The papers behind these systems are the blueprints for modern cloud architecture. If you understand GFS, you have the foundation for understanding S3; if you understand Kafka, you have the foundation for understanding Pulsar and Kinesis.