# Design 24h Money-Moved Totals
> [!NOTE]
> This scenario is a favorite at Wise because it tests your ability to handle real-time data pipelines, time-window mathematics, and correctness in a public-facing dashboard.
## 1. The Problem: Public Transparency
Wise often publishes “Money Moved” totals to demonstrate their scale and transparency. In an interview, the prompt usually looks like this:
> "Design a system to display the total volume of transactions processed in the last 24 hours, for 50 currencies, on a public dashboard. The totals should update in near real time."
Why this is hard:
- Framework Restriction: You might be asked to design the windowing logic without using Flink or Spark.
- Accuracy vs. Freshness: Does “Money Moved” mean “Users clicked Send” or “Money was delivered”?
- The Rolling Window: A "Last 24h" total is a sliding window that moves every second. You can't just run a `SUM` on a DB with millions of rows every second.
## 2. Requirements & Goals

### Functional Requirements
- Rolling Totals: Display the sum of processed volume for the last 24 hours.
- Multi-Currency: Support at least 50 currencies (USD, EUR, GBP, etc.).
- Drill-down: (Optional) View by currency pair or region.
### Non-Functional Requirements
- Low Read Latency: The public dashboard must load in < 200ms.
- High Write Throughput: Ingest thousands of transaction events per second.
- Financial Accuracy: No double-counting; must match the ledger eventually.
- Availability: The dashboard should be readable even if the ingestion pipeline lags.
## 3. Capacity Estimation
- Transactions: 5 Million / day ≈ 60 TPS.
- Currencies: 50.
- Storage: If we store every transaction ID for 24h to dedupe, that’s 5M records.
- Read Volume: High. A public dashboard could have 1,000s of concurrent viewers.
## 4. API Design
The dashboard needs a read-optimized endpoint.
### Query Totals

```http
GET /v1/public/metrics/money-moved?window=24h
```
Response:
```json
{
  "window_start": "2026-03-11T10:00:00Z",
  "window_end": "2026-03-12T10:00:00Z",
  "aggregates": [
    { "currency": "EUR", "amount_minor": 125000000, "count": 4500 },
    { "currency": "USD", "amount_minor": 98000000, "count": 3200 }
  ],
  "global_total_usd": 245000000
}
```
## 5. High-Level Design: The Event-Driven Pipeline

We follow a classic producer-consumer pattern. We don't query the production transaction DB directly; that would degrade its performance. Instead, we listen to an event stream.
```mermaid
flowchart LR
    L[Ledger Service] -->|TransferSettled Event| SQ[(Message Queue / Kafka)]
    SQ --> AGG[Window Aggregator]
    AGG --> DB[(Time-Bucket Store)]
    API[Metrics API] --> DB
    DASH[Dashboard UI] --> API
```
### The Source of Truth

CRITICAL: In a Wise interview, ask: "When does the count start?"

- Bad: `TransferCreated` (the user might still cancel).
- Good: `TransferSettled` (the money has physically moved).
## 6. Detailed Design: Bucketed Counters (No-Framework)
If you are forbidden from using a streaming framework, you must implement Time Bucketing.
### The Concept
Instead of calculating a sliding window (continuous), we use small fixed buckets (e.g., 1 minute).
- To get the “Last 24 Hours”, we sum the last 1,440 buckets (24 hours * 60 minutes).
### Step-by-Step Logic

1. Ingest: A message arrives: `{ currency: "EUR", amount: 100, timestamp: 10:05:22 }`.
2. Bucketize: Truncate the timestamp down to the minute: `10:05:00`.
3. Update: `INCRBY 100` on the key `metrics:EUR:2026-03-12:10:05` in an atomic store (Redis).
4. Query: To get the 24h total, the API performs an `MGET` over the last 1,440 keys and sums them.
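The steps above can be sketched in a few lines of Python; a plain dict stands in for Redis, and the `bucket_key`, `ingest`, and `total_last_24h` names are illustrative, not from the original design:

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# In-memory stand-in for Redis; keys mirror the metrics:{currency}:{bucket} scheme.
buckets = defaultdict(int)

def bucket_key(currency: str, ts: datetime) -> str:
    # Truncate the timestamp down to the minute to form the bucket ID.
    minute = ts.replace(second=0, microsecond=0)
    return f"metrics:{currency}:{minute.isoformat()}"

def ingest(currency: str, amount_minor: int, occurred_at: datetime) -> None:
    # Equivalent to an atomic INCRBY in Redis; bucketed by event time,
    # so late-arriving events land in the correct historical bucket.
    buckets[bucket_key(currency, occurred_at)] += amount_minor

def total_last_24h(currency: str, now: datetime) -> int:
    # Sum the last 1,440 one-minute buckets (an MGET + sum in Redis).
    start = now.replace(second=0, microsecond=0)
    return sum(
        buckets.get(bucket_key(currency, start - timedelta(minutes=i)), 0)
        for i in range(1440)
    )
```

Note that amounts are kept in minor units (cents) as integers, matching the API's `amount_minor` field and avoiding floating-point rounding in financial sums.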
### Data Model (Redis)

We use a Hash or String with a TTL.

- Key: `metrics:{currency}:{bucket_timestamp}`
- Value: `sum_amount`
- TTL: 25 hours (to allow some buffer for late events).
> [!TIP]
> Optimization: To avoid 1,440 Redis calls per request, the API can cache the sum and only "slide" it by adding the newest bucket and subtracting the oldest one every minute.
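That sliding optimization can be sketched as a small in-process cache; the `SlidingTotal` class and its method names are hypothetical, not part of any real API:

```python
from collections import deque

class SlidingTotal:
    """Caches the 24h sum so dashboard reads are O(1) instead of summing 1,440 keys."""

    def __init__(self, window_minutes: int = 1440):
        self.window = deque([0] * window_minutes)  # oldest bucket on the left
        self.total = 0

    def close_minute(self, newest_bucket_sum: int) -> None:
        # Called once per minute: add the bucket that just closed, evict the oldest.
        self.total += newest_bucket_sum - self.window.popleft()
        self.window.append(newest_bucket_sum)

    def read(self) -> int:
        return self.total
```

The trade-off: reads become trivially cheap, but a late event that mutates an already-closed bucket forces a cache rebuild (or a correction pass), which is exactly the manual bookkeeping a framework like Flink would otherwise handle.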
## 7. Deep Dive: Handling Reliability & Scale

### A. Late-Arriving Events
Events can arrive out of order (e.g., a network delay causes a 10:05 event to reach the aggregator at 10:08).
- Solution: The aggregator should update the historical bucket based on the event's `occurred_at` timestamp, not the processing time.
### B. Scalability (Partitioning)
If Wise processes 10,000 TPS, a single Redis instance might become a bottleneck.
- Partitioning: Partition the event stream and the Redis store by Currency.
- Drift Check: Nightly, run a batch job on the Source-of-Truth DB to “correct” any drift in the Redis buckets.
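A minimal sketch of that nightly drift check, assuming the batch job can produce per-bucket sums from the ledger (the `reconcile` function and its argument names are illustrative):

```python
def reconcile(ledger_buckets: dict, cache_buckets: dict) -> dict:
    """Compare per-bucket sums from the source-of-truth ledger against the cache.

    Returns the corrections to write back: every bucket whose cached
    value has drifted from the ledger's value.
    """
    return {
        key: truth
        for key, truth in ledger_buckets.items()
        if cache_buckets.get(key, 0) != truth
    }
```

In production each correction would be written back with a `SET` (overwriting the drifted counter), and the number of corrected buckets is itself a useful health metric for the pipeline.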
### C. The "Framework Ban" Trade-off

| Approach | Pros | Cons |
| :--- | :--- | :--- |
| Framework (Flink) | Handles late data, watermarks, and state management automatically. | Higher infra complexity. |
| Bucketed Redis | Extremely low latency, simple to debug, cheap. | Manual logic for window sliding and backfills. |
## 8. Summary: The Senior Interview Checklist
When presenting this solution at Wise, ensure you cover:
- Idempotency: Use a `transfer_id` to ensure a retried event doesn't increment the bucket twice.
- Backfill Strategy: "What if the aggregator crashes for 3 hours?" (Explain how you'd replay the Kafka log.)
- Definition of Done: Clear distinction between `Settled` and `Initiated`.
- Observability: Monitor the lag between the ledger entry and the dashboard update.
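The idempotency point deserves a concrete shape, since Kafka's at-least-once delivery makes redeliveries routine. A minimal sketch, using in-memory structures as stand-ins (in production the dedupe marker would be a Redis key per `transfer_id` written with `SET NX` and a 24h TTL; `apply_event` is a hypothetical name):

```python
from collections import defaultdict

# In-memory stand-ins for the Redis dedupe set and the bucketed counters.
seen_transfer_ids: set = set()
bucket_totals: defaultdict = defaultdict(int)

def apply_event(transfer_id: str, bucket_key: str, amount_minor: int) -> bool:
    # At-least-once delivery means redeliveries happen: only the first copy
    # of a given transfer_id may increment a bucket.
    if transfer_id in seen_transfer_ids:
        return False
    seen_transfer_ids.add(transfer_id)
    bucket_totals[bucket_key] += amount_minor
    return True
```

This also makes the backfill story safe: replaying three hours of the Kafka log after a crash re-processes events that may already be counted, and the dedupe check ensures the replay converges to the correct totals instead of double-counting.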