Design an Ad Click Aggregator

[!IMPORTANT] In this lesson, you will master:

  1. High-Throughput Ingestion: Building intuition for buffering billions of clicks with Kafka to prevent systemic collapse during traffic spikes.
  2. Exactly-Once Processing: Understanding how Apache Flink guarantees precision through distributed checkpointing and two-phase commits.
  3. OLAP Analytics: Why traditional databases fail at scale and how columnar stores like ClickHouse enable real-time dashboarding.

1. Problem Statement

In the world of online advertising, every click counts—literally. An Ad Click Aggregator is a system that processes billions of click events from various websites and mobile apps to provide real-time reports for advertisers and billing services.

The Challenge:

  • Scale: Handling billions of clicks per day.
  • Precision: We cannot overcharge (double counting) or undercharge (missing clicks).
  • Latency: Advertisers want to see their budget consumption in near real-time (< 1 minute).

Real-World Examples:

  • Google Ads: Real-time bidding and budget tracking.
  • Facebook Ads: Aggregating impressions and clicks for campaign performance.

[!TIP] Analogy: The Toll Booth. Imagine a massive highway with 1,000 lanes. Every time a car passes, you need to count it to bill the driver. If you miss a car, you lose money. If you count a car twice because it swerved between lanes, the driver gets angry. This system is the digital version of that toll booth, but for billions of “cars” (clicks).


[!NOTE] War Story: The Super Bowl Ad Blackout. How Company X handled a Thundering Herd problem during a massive sports event: millions of users clicked an interactive ad at the exact same moment, flooding the ingest API. They survived by rapidly adding Kafka partitions and leaning heavily on asynchronous processing to buffer the spike, preventing database collapse.

2. Requirements & Goals (P - Problem & Requirements)

Functional Requirements

  1. Aggregate Clicks: Count clicks per ad_id over time windows (e.g., 1 minute, 1 hour).
  2. Filter Fraud: Detect and filter out bot clicks or duplicates.
  3. Queryable Results: Provide an API for internal dashboards to query aggregated stats.

Non-Functional Requirements

  1. Exactly-Once Processing: Every click must be counted exactly once, despite failures.
  2. High Throughput: Support up to 100,000 clicks/sec.
  3. Low Latency: End-to-end delay from click to dashboard should be < 5 seconds (comfortably inside the < 1 minute advertisers expect).
  4. Fault Tolerance: The system must recover from node crashes without data loss.

3. Capacity Estimation (E - Estimation)

Traffic Estimates

Assume 100,000 clicks per second.

  • Daily Clicks: 100,000 * 86,400 = 8.64 Billion clicks/day.

Storage Estimates

  • Raw Event Size: ~100 bytes.
  • Daily Storage (Raw): 8.64 Billion * 100 bytes = 864 GB/day.
  • Aggregated Storage: If we aggregate by (ad_id, minute), and we have 1 million active ads, that is 1 million rows per minute. 1M * 50 bytes = 50 MB/minute = 72 GB/day.
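These figures are simple multiplications; here is a quick sanity check of the arithmetic, with constants taken directly from the estimates above:

```java
public class CapacityEstimate {
    public static void main(String[] args) {
        long clicksPerSecond = 100_000L;
        long dailyClicks = clicksPerSecond * 86_400;         // seconds per day
        long rawBytesPerDay = dailyClicks * 100;             // ~100 B per raw event
        long aggBytesPerDay = 1_000_000L * 50 * 60 * 24;     // 1M ads * 50 B, every minute

        System.out.printf("Daily clicks: %,d%n", dailyClicks);                            // 8.64 billion
        System.out.printf("Raw storage:  %d GB/day%n", rawBytesPerDay / 1_000_000_000L);  // 864
        System.out.printf("Agg storage:  %d GB/day%n", aggBytesPerDay / 1_000_000_000L);  // 72
    }
}
```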

4. Data Model (D - Data Model)

We need to define the schema for the raw events coming in and the aggregated data stored in the OLAP database.

Raw Click Event (Kafka / Flink Ingest)

```json
{
  "click_id": "uuid-1234",
  "ad_id": "ad-987",
  "user_id": "u-456",
  "ip_address": "192.168.1.1",
  "timestamp": "2023-10-27T10:00:00Z"
}
```
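In code, this event maps to a small type. A minimal sketch as a Flink-style POJO (public fields and a no-arg constructor so Flink can serialize it; the millisecond timestamp field is an assumption for convenient event-time handling):

```java
// One raw click as emitted by the Log Ingestor into Kafka.
public class ClickEvent {
    public String clickId;      // unique per click, useful for deduplication
    public String adId;         // the aggregation key
    public String userId;
    public String ipAddress;
    public long   timestampMs;  // event time in epoch millis (parsed from the ISO string)

    public ClickEvent() {}      // required by Flink's POJO serializer
}
```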

Aggregated Ad Stats (ClickHouse)

We store pre-aggregated windows to reduce database load.

| Column | Type | Description |
| --- | --- | --- |
| ad_id | String | The unique identifier for the ad. |
| window_start | Timestamp | The start of the 1-minute aggregation window. |
| click_count | UInt64 | The number of valid clicks in this window. |
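As a concrete sketch, this table could be created in ClickHouse over JDBC as shown below. The connection URL is a placeholder, and ReplacingMergeTree is one reasonable engine choice because it keeps replayed writes idempotent (see the Interview Gauntlet later):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateAdStatsTable {
    public static void main(String[] args) throws Exception {
        // Requires the ClickHouse JDBC driver on the classpath.
        try (Connection conn = DriverManager.getConnection("jdbc:clickhouse://localhost:8123/ads");
             Statement stmt = conn.createStatement()) {
            stmt.execute("""
                CREATE TABLE IF NOT EXISTS ad_stats (
                    ad_id        String,
                    window_start DateTime,
                    click_count  UInt64
                )
                ENGINE = ReplacingMergeTree()      -- replays overwrite rather than double-count
                ORDER BY (ad_id, window_start)     -- sort key doubles as the dedup key
                """);
        }
    }
}
```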

5. High-Level Architecture (A - Architecture)

Interview-Friendly High-Level Diagram

This is the simplified version of the architecture you should draw on the whiteboard.

```mermaid
graph TD
    Click([Ad Click]) --> LB[Load Balancer]

    subgraph Ingestion
        LB --> API[Log Ingestor API]
        API --> Kafka[Kafka Queue]
    end

    subgraph Processing["Stream Processing"]
        Kafka --> Flink[Apache Flink]
        Flink -- Fraud Lookup --> Redis[(Redis TTL)]
    end

    subgraph Storage["Storage & Analytics"]
        Flink -- Batched Aggregates --> DB[(ClickHouse OLAP)]
        Dashboard([Advertiser Dashboard]) --> DB
    end
```

We use a Kappa Architecture: a single stream-processing path handles both real-time counts and historical re-processing, since re-processing is just replaying the Kafka log from an earlier offset.

Data Ingestion Pipeline

User Clicks → Log Ingestor (HTTP API) → Message Bus (Kafka) → Fraud Filter (Redis Lookup) → OLAP DB (ClickHouse)

6. Deep Dive: Exactly-Once Processing (L - Localized Details)

In a distributed system, network timeouts or crashes can cause a message to be processed twice or lost. We solve this using Kafka Transactions and Flink Checkpointing.

The Mechanism: Two-Phase Commit

  1. Source: Flink reads an event from Kafka and gets its offset.
  2. Processing: Flink updates the count in its state (memory).
  3. Sink: Flink prepares to write to the database (ClickHouse).
  4. Checkpoint: Flink triggers a snapshot of its state and current Kafka offsets to S3.
  5. Commit: Once the snapshot succeeds, Flink commits the Kafka offsets and signals the sink to finalize its prepared writes (the second phase of the commit).

[!IMPORTANT] If a node crashes at Step 3, Flink restarts from the last successful Checkpoint (Step 5), essentially “rewinding” the stream and reprocessing precisely from where it left off.
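In practice, you configure this commit cycle rather than hand-coding it. A minimal sketch of the checkpoint setup (the 10-second interval and S3 path are illustrative):

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetup {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Step 4: snapshot operator state and Kafka offsets every 10 seconds.
        // EXACTLY_ONCE aligns checkpoint barriers so each event belongs to exactly one snapshot.
        env.enableCheckpointing(10_000, CheckpointingMode.EXACTLY_ONCE);

        // Step 5: durable checkpoint storage so recovery survives node loss.
        env.getCheckpointConfig().setCheckpointStorage("s3://my-bucket/flink-checkpoints");
    }
}
```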

Flink Logic (Pseudo-code)

```java
// Source and watermark strategy are configured elsewhere.
DataStream<ClickEvent> clicks = env.fromSource(kafkaSource, watermarkStrategy, "ad-clicks");

clicks
  .keyBy(event -> event.adId)                              // partition the stream by ad
  .window(TumblingEventTimeWindows.of(Time.minutes(1)))    // 1-minute tumbling windows
  .aggregate(new CountAggregator())                        // count clicks per window
  .addSink(jdbcSink);                                      // committed on checkpoint
```
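For completeness, the `CountAggregator` referenced above could look like the sketch below. Flink keeps only the `Long` accumulator in checkpointed state, not the raw events:

```java
import org.apache.flink.api.common.functions.AggregateFunction;

public class CountAggregator implements AggregateFunction<ClickEvent, Long, Long> {
    @Override public Long createAccumulator()          { return 0L; }      // empty window
    @Override public Long add(ClickEvent e, Long acc)  { return acc + 1; } // one more click
    @Override public Long getResult(Long acc)          { return acc; }     // emit the count
    @Override public Long merge(Long a, Long b)        { return a + b; }   // merging windows
}
```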

7. Scaling the Data Store (S - Scale)

Standard relational databases (e.g., PostgreSQL) struggle here: row-oriented storage makes both the high-velocity append-only writes and the wide aggregate scans expensive at this volume.

| Store Type | Performance | Notes |
| --- | --- | --- |
| OLTP (MySQL) | Slow for aggregates | Requires heavy indexing; bottlenecked by row locks. |
| NoSQL (Redis) | Very fast | Great for real-time counters, but complex queries over historical trends are hard. |
| OLAP (ClickHouse) | The winner | Columnar storage in compressed chunks; ideal for `SELECT SUM(click_count) FROM clicks`. |
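To make the ClickHouse advantage concrete, a dashboard backend might issue a query like this over the pre-aggregated table from §4 (the URL and ad id are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class DashboardQuery {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:clickhouse://localhost:8123/ads");
             PreparedStatement stmt = conn.prepareStatement(
                 // Columnar scan over one ad's pre-aggregated 1-minute rows.
                 "SELECT toStartOfHour(window_start) AS hr, sum(click_count) AS clicks " +
                 "FROM ad_stats WHERE ad_id = ? AND window_start >= now() - INTERVAL 1 DAY " +
                 "GROUP BY hr ORDER BY hr")) {
            stmt.setString(1, "ad-987");
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("%s -> %d clicks%n", rs.getTimestamp("hr"), rs.getLong("clicks"));
                }
            }
        }
    }
}
```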

Optimization: Pre-Aggregation

Instead of writing 100,000 raw clicks per second to ClickHouse, the Flink job buffers counts in memory and writes one aggregated row per ad per minute. With the 1 million active ads from our estimates, that is roughly 1M rows per minute (~17,000 rows/s), delivered as a handful of large batched inserts per second rather than 100,000 individual writes (see the sink sketch below).
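With Flink's JDBC connector, the batched write path might look like the sketch below, assuming a hypothetical `AdStatsRow` output type with `adId`, `windowStart`, and `clickCount` fields (batch sizes are illustrative, not tuned numbers):

```java
import org.apache.flink.connector.jdbc.JdbcConnectionOptions;
import org.apache.flink.connector.jdbc.JdbcExecutionOptions;
import org.apache.flink.connector.jdbc.JdbcSink;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

// Buffers aggregated rows and flushes them as bulk INSERTs, so ClickHouse
// sees a few large batches instead of one write per click.
SinkFunction<AdStatsRow> clickHouseSink = JdbcSink.sink(
    "INSERT INTO ad_stats (ad_id, window_start, click_count) VALUES (?, ?, ?)",
    (stmt, row) -> {                         // fill one row of the batch
        stmt.setString(1, row.adId);
        stmt.setTimestamp(2, row.windowStart);
        stmt.setLong(3, row.clickCount);
    },
    JdbcExecutionOptions.builder()
        .withBatchSize(10_000)               // rows per bulk INSERT
        .withBatchIntervalMs(1_000)          // flush at least once per second
        .build(),
    new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
        .withUrl("jdbc:clickhouse://localhost:8123/ads")
        .build());
```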


8. Fraud Detection & Deduplication (S - Scale)

How do we prevent a user from clicking an ad 50 times in a row?

  1. Deduplication Layer:
    • Store a unique key derived from (user_id, ad_id, timestamp truncated to the minute) in Redis with a TTL of 1 minute.
    • If the key already exists, discard the click as a duplicate (see the sketch below this list).
  2. Rate Limiting:
    • Allow a maximum of 5 clicks per user per IP per minute.
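A minimal version of the deduplication check, using the Jedis client and Redis's atomic SET NX EX command (the host, key prefix, and TTL follow the assumptions above):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class ClickDeduplicator {
    private final Jedis jedis = new Jedis("localhost", 6379);

    /** Returns true if this is the first click for (user, ad) in this minute. */
    public boolean isFirstClick(String userId, String adId, long eventTimeMillis) {
        long minuteBucket = eventTimeMillis / 60_000;                  // truncate to the minute
        String key = "click:" + userId + ":" + adId + ":" + minuteBucket;
        // SET key "1" NX EX 60 — atomically set only if absent, expiring after 60s.
        String result = jedis.set(key, "1", SetParams.setParams().nx().ex(60));
        return "OK".equals(result);                                    // null means the key existed
    }
}
```

Because SET ... NX EX is a single atomic command, two concurrent duplicates cannot both succeed; exactly one caller sees "OK".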

9. Interview Gauntlet

  1. How do you handle late-arriving events?
    • Ans: Use Watermarks in Flink. We allow a “grace period” (e.g., 2 seconds). Events arriving later than that are either discarded or routed to a “Late Arrival” side output for auditing (see the sketch after this list).
  2. What happens if your Flink job fails?
    • Ans: The system recovers from the last Checkpoint stored in persistent storage (e.g., S3). It replays messages from Kafka from the saved offset.
  3. Why not just use MapReduce?
    • Ans: MapReduce is Batch Processing. It would take minutes or hours to get results. We need Stream Processing for real-time aggregation.
  4. How do you handle a sudden spike in traffic for a single viral ad (Data Skew)?
    • Ans: In Flink, group by (ad_id, random_salt) for local aggregation first, then perform a global aggregation by ad_id to prevent a single node from being overwhelmed by the hot key.
  5. How do you ensure advertisers are not overcharged if the system replays events after a crash?
    • Ans: Idempotent writes to ClickHouse. Instead of UPDATE count = count + X, we write immutable chunks or use ClickHouse’s ReplacingMergeTree to overwrite with the exact correct value based on the Flink checkpoint.
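Question 1 maps to concrete Flink API calls. A sketch of bounded-out-of-orderness watermarks with a side output for late clicks, reusing `ClickEvent` and `CountAggregator` from earlier (`clicks`, `auditSink`, the 2-second bound, and the tag name are assumptions):

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.OutputTag;

// Tag for clicks that arrive after the watermark has passed their window.
final OutputTag<ClickEvent> lateClicks = new OutputTag<ClickEvent>("late-clicks") {};

WatermarkStrategy<ClickEvent> watermarks = WatermarkStrategy
    .<ClickEvent>forBoundedOutOfOrderness(Duration.ofSeconds(2))   // 2s grace period
    .withTimestampAssigner((event, ts) -> event.timestampMs);

SingleOutputStreamOperator<Long> counts = clicks
    .assignTimestampsAndWatermarks(watermarks)
    .keyBy(event -> event.adId)
    .window(TumblingEventTimeWindows.of(Time.minutes(1)))
    .sideOutputLateData(lateClicks)        // divert late events instead of silently dropping
    .aggregate(new CountAggregator());

// Late events flow to an audit sink instead of corrupting closed windows.
counts.getSideOutput(lateClicks).addSink(auditSink);
```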

10. Summary

  • Ingest: Use Kafka to buffer billions of events.
  • Process: Use Apache Flink for windowed aggregation with Exactly-Once guarantees.
  • Store: Use a Columnar OLAP database like ClickHouse for fast historical queries.
  • Scale: Pre-aggregate data in-memory to minimize database write pressure.

11. Whiteboard Summary (4 Quadrant)

1. P - Core Requirements

  • Aggregate Clicks per ad_id
  • Exactly-Once Processing
  • High Throughput (100k/s)

2. A - Key Architecture

  • Ingest: Kafka
  • Process: Flink (Stream)
  • Store: ClickHouse (OLAP)

3. L - Deep Dives

  • Exactly-Once: Flink Checkpoints
  • Deduplication: Redis Hash (TTL)
  • Watermarks: Late Events

4. S - Scale & Bottlenecks

  • DB Writes: Pre-aggregate in Memory
  • Fraud: Rate Limiting / Blacklist