Design 24h Money-Moved Totals

Important

In this lesson, you will master:

  1. Time-Bucketing Logic: How to calculate sliding windows in O(1) time without expensive database scans.
  2. Event-Driven Aggregation: Moving from request-response to high-throughput streaming.
  3. The Physics of Memory: Why the 10⁶ latency gap between RAM and disk makes Redis the choice for real-time dashboards.

1. Scene Setting: The Public Metric Challenge

Wise often publishes “Money Moved” totals to demonstrate their scale. In a senior 60-minute interview, the challenge is simple to state but hard to execute:

“Design a system to display the total volume of transactions processed in the last 24 hours for 50 currencies on a public dashboard. The totals should update near-real-time (lag < 5s) and handle 50,000+ concurrent viewers.”

⚖️ Clarifying Questions (The “Senior” Probe)

Before you start drawing, narrow the scope:

  • Candidate: “Is the ‘Last 24 Hours’ a fixed window (e.g., Today vs. Yesterday) or a sliding window (e.g., now back to exactly 24h ago)?”
  • Interviewer: “Sliding window. As time moves, the ‘oldest’ minute must drop off and the ‘newest’ must be added.”
  • Candidate: “Do we need mathematically perfect accuracy, or is 99.9% aggregation drift acceptable for a public dashboard?”
  • Interviewer: “99.9% is okay for the UI, but we need a way to audit the drift later.”
  • Candidate: “Framework restriction? Can I use Flink or KSQL?”
  • Interviewer: “No. Build it from first principles using standard building blocks (Queues, Caches, DBs).”

2. Requirements & Constraints

2.1 Functional Requirements

  1. Sliding Window Aggregation: Display the total volume of transactions processed in the last 24 hours (not just a fixed calendar day).
  2. Multi-Currency Support: Support 50+ currencies (GBP, USD, EUR, etc.) with real-time conversion to a base currency (e.g., USD) for a global total.
  3. High-Traffic Dashboard: Handle 50,000+ concurrent viewers hitting the public metrics page.
  4. Low Latency: Totals must update with sub-5s lag from the moment a transaction settles.
  5. Auditability: While the dashboard can be slightly inconsistent, the system must provide a mechanism to verify and correct “drift” against the source-of-truth ledger.

2.2 Non-Functional Requirements

  1. Availability: 99.99% for the read-path. If the dashboard is down, it reflects poorly on Wise’s reliability.
  2. Scalability: Must handle 10x spikes during market volatility or product launches.
  3. Simplicity: Avoid heavy-weight streaming frameworks (Flink, Spark) unless strictly necessary. Build using standard building blocks (Redis, Kafka, Postgres).

2.3 Compliance & Accuracy

  • Regulatory Transparency: Totals must be defensible. If a regulator asks why the total changed, the system must link back to “Settled” events.
  • Precision: Use fixed-point arithmetic (minor units) to avoid floating-point rounding errors.

3. Capacity Planning & Estimation (Napkin Math)

Before designing, we must understand the “weight” of the data.

3.1 Throughput Analysis

  • Daily Volume: 5,000,000 transactions/day.
  • Average TPS: 5,000,000 / 86,400 ≈ 60 TPS.
  • Peak TPS: Financial traffic is bursty. We design for a 20x spike → 1,200 TPS.
  • Read Traffic: 50,000 concurrent viewers. If the UI polls every 5s → 10,000 QPS on the read-path.

3.2 Memory & Storage (Redis)

  • Aggregation Granularity: 1-minute buckets.
  • Sliding Window: 24 hours = 1,440 minutes.
  • Currencies: 50.
  • Total Data Points: 1,440 buckets × 50 currencies = 72,000 keys.
  • Data Size: Each key-value pair is small (~100 bytes).
    • Total Memory: 72,000 × 100 B ≈ 7.2 MB.
  • Conclusion: This fits in the L3 cache of a single modern server. The bottleneck is Network I/O, not storage capacity.

3.3 Network Bandwidth

  • Ingestion: 1,200 TPS × 1 KB payload ≈ 1.2 MB/s.
  • Read Path: 10,000 QPS × 200 Bytes (JSON total) ≈ 2 MB/s.
  • Conclusion: Trivial for a standard 10Gbps cloud network.
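The napkin math above can be expressed as checkable arithmetic. This is a sketch of the lesson's own assumptions (daily volume, payload sizes), not measured values:

```python
# Throughput (Section 3.1)
DAILY_TXNS = 5_000_000
avg_tps = DAILY_TXNS / 86_400          # ≈ 58; call it 60 for headroom
peak_tps = 60 * 20                     # 20x burst → 1,200 TPS design target
read_qps = 50_000 / 5                  # 50k viewers polling every 5s → 10,000 QPS

# Memory (Section 3.2)
buckets_per_ccy = 24 * 60              # 1,440 one-minute buckets
total_keys = buckets_per_ccy * 50      # 50 currencies → 72,000 keys
memory_mb = total_keys * 100 / 1_000_000   # ~100 B/key → ≈ 7.2 MB

# Bandwidth (Section 3.3)
ingest_mb_s = peak_tps * 1_000 / 1_000_000   # 1 KB payloads → 1.2 MB/s
read_mb_s = read_qps * 200 / 1_000_000       # 200 B responses → 2 MB/s
```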

4. The 4-Quadrant Whiteboard Layout

Quadrant 1: Reqs & Math

  • Throughput: ~60 TPS (avg); Peak: 1,200 TPS
  • Functional: 24h sliding totals, 50 currencies, lag < 5s
  • Non-Functional: no Flink/Spark, high read availability

Quadrant 2: High-Level Design

  Ledger (SQL) → [CDC / Kafka] → [Aggregator Svc] → [Redis Rings] → [API] → CDN

Quadrant 3: Data Models / API

  • Endpoint: GET /metrics/money-moved
  • Redis key format: metric:{ccy}:{timestamp_min}
  • Example: metric:EUR:29456210 → 1250000

Quadrant 4: Deep Dives

  • Atomic INCR in Redis
  • TTL management (25h)
  • Nightly reconciliation
  • Handling duplicates
  • Global total calculation

5. High-Level Architecture: The Event-Driven Pipeline

We decouple Transaction Processing from Metric Aggregation.

  Ledger DB (Postgres/SQL)  ← source of truth
        │ CDC
        ▼
  Kafka (topic: transaction.settled)  ← streaming layer
        │
        ▼
  Aggregator Worker (consumer group: metrics-viz)
        │
        ▼
  Redis buckets (state layer, RAM): 1,440 keys per currency,
  plus a pre-aggregated O(1) global total
        │
        ▼
  Public API → CDN → Public Dashboard (50k CCU)

The Ingestion Path

  1. Event Source: The Ledger emits a TransferSettled event to Kafka via Change Data Capture (CDC). This ensures that every committed transaction in the SQL DB is eventually propagated without putting load on the application layer.
  2. Aggregation: The Aggregator service consumes the event and identifies the currency and the 1-minute bucket (Unix timestamp / 60).
  3. Storage: It calls INCRBY metric:USD:29456210 500 in Redis.
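The three ingestion steps can be sketched as follows. A plain dict stands in for Redis so the snippet is self-contained; in production the mutation is a single atomic INCRBY plus an EXPIRE. The names `redis_store` and `handle_settled_event` are illustrative, not part of any real API:

```python
# In-memory stand-in for Redis; real deployment: INCRBY + EXPIRE.
redis_store: dict = {}

BUCKET_TTL_S = 25 * 3600  # 25h TTL: one hour of slack beyond the 24h window

def bucket_key(currency: str, settled_at_unix: int) -> str:
    # Event-time bucketing: the minute comes from the event payload,
    # never from the consumer's wall clock.
    return f"metric:{currency}:{settled_at_unix // 60}"

def handle_settled_event(event: dict) -> None:
    key = bucket_key(event["currency"], event["settled_at_unix"])
    # Real Redis: INCRBY key amount (atomic, and addition is commutative,
    # so concurrent consumers touching the same bucket are race-free),
    # then EXPIRE key BUCKET_TTL_S so old buckets evaporate on their own.
    redis_store[key] = redis_store.get(key, 0) + event["amount_minor_units"]

handle_settled_event({
    "currency": "USD",
    "settled_at_unix": 1_767_372_600,   # lands in minute bucket 29456210
    "amount_minor_units": 500,
})
```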


6. Deep Dive: Technical Implementation & Concurrency

6.1 The Write Path: Ingestion & State Generation

The goal of the write path is to decouple the analytical dashboard from the primary financial ledger, ensuring analytics never degrade core money-movement infrastructure.

  • Event-Driven CDC: The primary Postgres ledger emits a TransferSettled event to Kafka via Change Data Capture (CDC).
  • Partitioning for Scale (The Correction): The Kafka topic transaction.settled is partitioned by transfer_id (or user_id), not currency_id. This prevents “hot partitions” for high-volume currencies like USD and spreads load evenly across your worker pods.
  • Concurrency Handling: Because Redis commands (INCRBY) are atomic and addition is commutative, multiple consumers can process events for the same currency simultaneously without race conditions.
  • Event-Time Bucketing: The Kafka consumer extracts the timestamp_min directly from the event payload. Even if events arrive out of order or the consumer is lagging, each transaction is added to the mathematically correct historical minute, sidestepping clock skew between producers and consumers.

6.2 The Storage Layer: Redis Infrastructure

The state layer must be optimized for O(1) time and space complexity.

  • The O(1) Sliding Window: Instead of a massive, memory-heavy ZSET, state is stored in 1,440 discrete 1-minute buckets per currency using Redis Hashes or individual keys (e.g., metric:USD:29456210). A 25-hour TTL ensures old data naturally evaporates, capping the memory footprint at a trivial ~7.2 MB.
  • High Availability Topology: Avoid Redis Cluster Mode, as the dataset is tiny and read-throughput is shielded by the API. Deploy a Highly Available Standalone setup (e.g., AWS ElastiCache Primary with a Multi-AZ Standby Replica) managed by Sentinel for seamless failover.
  • Persistence & Recovery:
    • Warm Start: Enable Redis AOF (appendfsync everysec). If a container crashes, it reads from the local disk and recovers in seconds.
    • Cold Start (Disaster Recovery): If the primary disk is destroyed, use a Blue/Green deployment to spin up a fresh Redis node and replay the Kafka topic from offset - 24h. Never run a massive replay on the live environment to avoid a “replay storm.”

6.3 The Read Path: Caching & Real-Time FX Conversion

The read path must serve 50,000+ concurrent viewers with sub-5s latency without melting the database.

  • The API Protection Layer: The public API (e.g., a Spring Boot service deployed on Kubernetes) never allows user requests to hit Redis directly. It utilizes an in-memory cache (Caffeine/Guava) configured with a “Write-Through/Refresh-Ahead” pattern, polling Redis only once every 1-2 seconds.
  • Read-Time FX Conversion (Crucial Detail): Currency conversion must never happen on the write path, as rates fluctuate constantly.
    1. A background thread on the Spring Boot API continuously fetches the latest FX rates from an internal Exchange Rate Service and caches them in memory.
    2. When the cache refresh triggers, the API pulls the 1,440 un-converted buckets for all 50 currencies from Redis.
    3. The API applies the cached FX multiplier to each currency’s total in memory, sums them up, and serves the normalized global total (e.g., in USD) to the frontend.
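The three read-path steps above can be sketched under toy data. The names `fx_rates`, `bucket_totals`, and `global_total_usd_minor` are illustrative stand-ins for the cached FX table and the Redis pull, not a real service API:

```python
fx_rates = {"USD": 1.0, "EUR": 1.08, "GBP": 1.27}  # USD per 1 unit (toy rates)

# currency -> live 1-minute bucket totals in minor units
# (1,440 per currency in production; missing minutes simply contribute 0)
bucket_totals = {
    "USD": [100_00, 250_00],
    "EUR": [200_00],
    "GBP": [50_00],
}

def global_total_usd_minor() -> int:
    total = 0.0
    for ccy, minutes in bucket_totals.items():
        # Sums stay in fixed-point minor units until the FX multiply,
        # which happens once per currency rather than per transaction.
        total += sum(minutes) * fx_rates[ccy]
    return round(total)
```

Because the FX multiplier is applied at read time, a rate change is reflected in the global total on the next cache refresh without rewriting any stored buckets.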

7. Operational Excellence: Drift & Reconciliation

In distributed systems, eventual consistency means occasional data drift. The system must self-heal without jarring user experience.

  • Targeted Bucket Corrections (Avoiding “Ghost Corrections”): You cannot fix drift by applying a global + / - delta, because that delta will incorrectly persist after the missed transaction slides out of the 24h window. Corrections must target the specific minute that failed.
  • The Reconciliation Cron: A nightly (or hourly) batch job queries the Postgres ledger using GROUP BY date_trunc('minute', timestamp) and compares it to the Redis buckets.
  • The Healing Mechanism: If a discrepancy is found (e.g., Redis missed $50 at 10:05 AM), the cron job emits a specific BucketCorrection event to Kafka. The live consumer processes this delta (INCRBY metric:USD:29456210 50), surgically fixing the past. The correction naturally slides out of the 24h window at the exact right time.
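The reconciliation comparison can be sketched as a pure function. The ledger side stands in for the GROUP BY date_trunc('minute', …) query; `find_corrections` and the sample data are illustrative:

```python
def find_corrections(sql_by_minute: dict, redis_by_minute: dict) -> list:
    # Compare authoritative per-minute sums against the live Redis buckets
    # and emit one targeted delta per drifted minute, never a global delta.
    corrections = []
    for bucket_min, sql_total in sql_by_minute.items():
        delta = sql_total - redis_by_minute.get(bucket_min, 0)
        if delta != 0:
            # In production this becomes a BucketCorrection event on Kafka;
            # the live consumer applies it via INCRBY on that exact bucket.
            corrections.append({"bucket_timestamp_min": bucket_min, "delta": delta})
    return corrections

ledger = {29456210: 55_000, 29456211: 10_000}
redis_state = {29456210: 50_000, 29456211: 10_000}
# → a single correction of +5,000 targeted at bucket 29456210
```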

8. Core Entities and API Contracts

A. Ingestion Payload (Kafka transaction.settled)

{
  "transfer_id": "tx_987654321",
  "currency": "EUR",
  "amount_minor_units": 10000,
  "settled_at_utc": "2026-03-25T13:25:01Z"
}

B. State Layer (Redis)

  • Key Schema: metric:{currency}:{timestamp_minute} (e.g., metric:EUR:29456210)
  • Value: 10000 (Integer representing the sum in minor units)
  • TTL: 25 Hours

C. Public API (Read Path)

  • Endpoint: GET /v1/metrics/money-moved/24h?base_currency=USD
  • Response:

    {
      "time_window": "24h",
      "base_currency": "USD",
      "total_amount": 12504500.50,
      "data_as_of": "2026-03-25T13:28:00Z",
      "is_degraded": false  // flips to true if Kafka consumer lag triggers an alert
    }

D. Internal Admin API (The Missing Drift Endpoint)

Used by the reconciliation cron job or manually by Ops engineers to audit or force a correction without touching production databases directly.

  • Endpoint: POST /internal/v1/metrics/reconciliation/correct
  • Payload:

    {
      "currency": "USD",
      "bucket_timestamp_min": 29456210,
      "sql_authoritative_total": 55000,
      "redis_current_total": 50000,
      "delta_to_apply": 5000,
      "reason": "Automated hourly drift catch-up"
    }
  • Action: This endpoint authenticates the internal request and publishes the BucketCorrection event directly to the Kafka topic, ensuring the state fix flows through the exact same pipelines and audit logs as live traffic.

9. Operational Failure Modes (Playbooks)

Scenario A: Redis Cluster Wipe

  • Problem: A misconfiguration clears all Redis keys. The dashboard shows “0 Money Moved.”
  • Playbook:
    1. Trigger the Replay Aggregator.
    2. Seek the Kafka consumer back 24 hours.
    3. Re-process all TransferSettled events into empty buckets.
    4. With a 60 TPS average, the system can “re-hydrate” the last 24h in < 15 minutes.
      • The Math: 60 TPS × 86,400 seconds (24h) ≈ 5.18 million events. To re-process these in 15 minutes (900 seconds), our consumer needs to handle 5,184,000 / 900 ≈ 5,760 TPS. A standard Kafka consumer writing to Redis via pipeline can easily hit 50,000+ TPS, making a 15-minute recovery trivial.
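The recovery math and the batching pattern behind it can be sketched as below. `replay` and `flush` are illustrative names for the batching pattern, not a real kafka-python or redis-py API:

```python
# Scenario A math: how fast must the replay consumer run?
events_24h = 60 * 86_400                 # 5,184,000 events at 60 TPS average
required_tps = events_24h / (15 * 60)    # 15-minute target → 5,760 TPS

def replay(events, store, batch_size=1_000):
    # Re-aggregate a 24h slice of TransferSettled events into empty buckets.
    batch = []
    for ev in events:
        key = f"metric:{ev['currency']}:{ev['settled_at_unix'] // 60}"
        batch.append((key, ev["amount_minor_units"]))
        if len(batch) >= batch_size:
            flush(batch, store)
            batch.clear()
    flush(batch, store)

def flush(batch, store):
    # In production this is one Redis pipeline of INCRBYs per batch,
    # which is what makes 50,000+ TPS reachable from a single consumer.
    for key, amount in batch:
        store[key] = store.get(key, 0) + amount
```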

Scenario B: Consumer Lag

  • Problem: The Aggregator service is running slow; the dashboard is 10 minutes behind.
  • Playbook:
    1. Scale the Consumer Group horizontally.
    2. Partitioning Strategy: To scale consumers, we must partition the Kafka topic. As established, we partition by transfer_id (or user_id), not currency_id. This evenly distributes the massive load of high-volume currencies like USD across all consumers. Because Redis operations (INCRBY) are atomic and commutative, multiple consumers can update the same currency bucket safely.

Scenario C: Handling Out-of-Order Events in the Stream

  • Problem: Network latency causes a transfer that happened at 10:01 to arrive in Kafka after a transfer that happened at 10:02.
  • Playbook: Because we use Event-Time Bucketing (extracting the timestamp from the payload) rather than processing time, the aggregator simply increments the 10:01 bucket, even if it’s currently processing 10:02. Order in the stream does not affect the final mathematical sum.
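A tiny demonstration of why arrival order is irrelevant under event-time bucketing (toy events, timestamps in seconds):

```python
def aggregate(events):
    # Bucket strictly by the event's own timestamp; arrival order is ignored.
    state = {}
    for ev in events:
        minute = ev["ts"] // 60
        state[minute] = state.get(minute, 0) + ev["amount"]
    return state

e1 = {"ts": 36060, "amount": 100}   # happened at 10:01
e2 = {"ts": 36120, "amount": 200}   # happened at 10:02
# aggregate([e1, e2]) and aggregate([e2, e1]) produce identical state
```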

💡 Hardware-First Intuition: The 10⁶ Difference

Why did we choose Redis instead of PostgreSQL for the counters?

  1. Random Disk I/O: Incrementing a counter in SQL requires a read-modify-write cycle with row-level locks against disk-backed pages (~10 ms under contention).
  2. Memory Latency: Redis performs the INCR entirely in memory (~0.1 ms).
  3. Throughput: 1,000 TPS on SQL puts a significant load on the I/O subsystem; 1,000 TPS on Redis is “idling” (a single instance can handle 100k+ ops/sec).

10. Interview Pacing & Milestone Guide

  • 0-10m (Reqs & Estimates): Define “Money Moved” (Settled vs. Initiated). Calculate the 1,440-bucket math.
  • 10-25m (High-Level Design): Explain why we use Kafka to decouple from the Ledger. Draw the Redis sink.
  • 25-45m (Sliding Window Logic): Deep dive into bucket naming, TTL strategies, and the O(1) summing optimization.
  • 45-55m (Recovery & Scale): Discuss nightly reconciliation for accuracy and handling late events.
  • 55-60m (Summary): Final trade-offs: accuracy vs. freshness.

11. Summary: Senior Interview Checklist

  • Idempotency: Use the transfer_id to ensure an aggregator retry doesn’t double-count a transaction.
  • Precision Logic: When summing 1,440 buckets, ensure the API handles “missing” buckets (treat as 0) gracefully.
  • Latency Management: For global audiences, explain how you could run Regional Aggregators that roll up to a Global Redis HQ.
  • Observability: Define an alert for “Consumer Lag” — if the aggregator is 10 mins behind Kafka, the dashboard is lying.
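The idempotency point in the checklist can be sketched as dedupe-before-increment. The `seen` set stands in for a per-transfer Redis key (SET NX with a TTL in production); names and data are illustrative:

```python
seen: set = set()    # stand-in for per-transfer_id Redis keys (SET NX + TTL)
buckets: dict = {}   # minute bucket -> total in minor units

def apply_once(event: dict) -> bool:
    # Dedupe before increment: an at-least-once redelivery of the same
    # transfer_id becomes a no-op instead of a double count.
    if event["transfer_id"] in seen:
        return False
    seen.add(event["transfer_id"])
    minute = event["ts"] // 60
    buckets[minute] = buckets.get(minute, 0) + event["amount"]
    return True

ev = {"transfer_id": "tx_987654321", "ts": 36060, "amount": 500}
apply_once(ev)   # first delivery: applied
apply_once(ev)   # retry/redelivery: ignored
```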

12. Follow-up Interview Questions

  1. “What if events arrive out of order or extremely late?” Answer: Because we bucket by the event’s exact timestamp_min (extracted from the payload) rather than the processing time, out-of-order events naturally fall into their correct historical bucket. If an event is >24h late, it falls outside the dashboard’s queried range and is safely ignored by the real-time UI, but will be caught by the Nightly Reconciliation job for auditing.

  2. “How do you prevent the API from overloading Redis if there are 50,000 concurrent viewers?” Answer: The Dashboard UI does not query Redis directly. The API maintains an in-memory cache (e.g., Guava or Caffeine) of the 24h total that refreshes every 1-2 seconds. 50,000 viewers hit the API’s memory/CDN edge, resulting in just 1 request per second to Redis.

  3. “What happens if a Kafka broker dies mid-aggregation?” Answer: Kafka provides replication. As long as we use at least acks=all for producers and the consumer commits its offset after the Redis INCRBY succeeds (At-Least-Once delivery), we won’t lose data. We might process an event twice if the consumer dies after Redis but before offset commit, which is why idempotency (or relying on the <0.1% drift tolerance) is key.

  4. “Can we use a single key with a sliding window ZSET (Sorted Set) instead of 1,440 buckets?” Answer: A ZADD based on timestamp is possible (ZADD metric:USD <timestamp> <amount>), but space complexity grows linearly with every transaction rather than every minute. For 5M transactions, a ZSET is memory-heavy and ZREMRANGEBYSCORE becomes expensive. The 1,440 bucket approach provides constant O(1) space and time complexity regardless of transaction volume.

  5. “How do you ensure correct ordering and partition the Kafka stream for this aggregator?” Answer: We partition the transaction.settled topic by transfer_id (or user_id), not currency_id. Partitioning by currency_id causes severe hotspots (e.g., USD might take 80% of volume, maxing out one consumer). By partitioning by transfer_id, we evenly distribute load. The trade-off is that multiple consumers might update the same currency bucket in Redis at once, but since Redis INCRBY is atomic, this is perfectly safe.