# Design 24h Money-Moved Totals
> [!NOTE]
> This scenario is a favorite at Wise because it tests your ability to handle real-time data pipelines, time-window mathematics, and correctness in a public-facing dashboard.
## 1. The Problem: Public Transparency
Wise often publishes “Money Moved” totals to demonstrate their scale and transparency. In an interview, the prompt usually looks like this:
> "Design a system to display the total volume of transactions processed in the last 24 hours, for 50 currencies, on a public dashboard. The totals should update in near real time."
Why this is hard:
- Framework Restriction: You might be asked to design the windowing logic without using Flink or Spark.
- Accuracy vs. Freshness: Does “Money Moved” mean “Users clicked Send” or “Money was delivered”?
- The Rolling Window: A "Last 24h" total is a sliding window that moves every second. You can't just run a `SUM` on a DB with millions of rows every second.
## 2. Requirements & Goals

### Functional Requirements
- Rolling Totals: Display the sum of processed volume for the last 24 hours.
- Multi-Currency: Support at least 50 currencies (USD, EUR, GBP, etc.).
- Drill-down: (Optional) View by currency pair or region.
### Non-Functional Requirements
- Low Read Latency: The public dashboard must load in < 200ms.
- High Write Throughput: Ingest thousands of transaction events per second.
- Financial Accuracy: No double-counting; must match the ledger eventually.
- Availability: The dashboard should be readable even if the ingestion pipeline lags.
## 3. Capacity Estimation
- Transactions: 5 Million / day ≈ 60 TPS.
- Currencies: 50.
- Storage: If we store every transaction ID for 24h to dedupe, that’s 5M records.
- Read Volume: High. A public dashboard could have 1,000s of concurrent viewers.
## 4. API Design
The dashboard needs a read-optimized endpoint.
### Query Totals

```http
GET /v1/public/metrics/money-moved?window=24h
```
Response:
```json
{
  "window_start": "2026-03-11T10:00:00Z",
  "window_end": "2026-03-12T10:00:00Z",
  "aggregates": [
    { "currency": "EUR", "amount_minor": 125000000, "count": 4500 },
    { "currency": "USD", "amount_minor": 98000000, "count": 3200 }
  ],
  "global_total_usd": 245000000
}
```
## 5. High-Level Design: The Event-Driven Pipeline

We follow a classic producer-consumer pattern. We don't query the production transaction DB directly; that would degrade its performance. Instead, we listen to an event stream.
```mermaid
flowchart LR
    L[Ledger Service] -->|TransferSettled Event| SQ[(Message Queue / Kafka)]
    SQ --> AGG[Window Aggregator]
    AGG --> DB[(Time-Bucket Store)]
    API[Metrics API] --> DB
    DASH[Dashboard UI] --> API
```
### The Source of Truth

CRITICAL: In a Wise interview, ask: "When does the count start?"

- Bad: `TransferCreated` (the user might still cancel).
- Good: `TransferSettled` (the money has physically moved).
## 6. Detailed Design: Bucketed Counters (No-Framework)
If you are forbidden from using a streaming framework, you must implement Time Bucketing.
### The Concept
Instead of calculating a sliding window (continuous), we use small fixed buckets (e.g., 1 minute).
- To get the “Last 24 Hours”, we sum the last 1,440 buckets (24 hours * 60 minutes).
### Step-by-Step Logic

1. Ingest: A message arrives: `{ currency: "EUR", amount: 100, timestamp: 10:05:22 }`.
2. Bucketize: Truncate the timestamp down to the minute: `10:05:00`.
3. Update: `INCRBY 100` on the key `metrics:EUR:2026-03-12:10:05` in an atomic store (Redis).
4. Query: To get the 24h total, the API performs an `MGET` over the last 1,440 keys and sums them.
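The steps above can be sketched in a few lines of Python; a plain dict stands in for Redis, and the `bucket_key`, `ingest`, and `total_last_24h` names are illustrative, not from the original design:

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# In-memory stand-in for Redis; keys mirror the metrics:{currency}:{bucket} scheme.
buckets = defaultdict(int)

def bucket_key(currency: str, ts: datetime) -> str:
    # Truncate the timestamp down to the minute to form the bucket ID.
    minute = ts.replace(second=0, microsecond=0)
    return f"metrics:{currency}:{minute.isoformat()}"

def ingest(currency: str, amount_minor: int, occurred_at: datetime) -> None:
    # Equivalent to an atomic INCRBY in Redis; bucketed by event time,
    # so late-arriving events land in the correct historical bucket.
    buckets[bucket_key(currency, occurred_at)] += amount_minor

def total_last_24h(currency: str, now: datetime) -> int:
    # Sum the last 1,440 one-minute buckets (an MGET + sum in Redis).
    start = now.replace(second=0, microsecond=0)
    return sum(
        buckets.get(bucket_key(currency, start - timedelta(minutes=i)), 0)
        for i in range(1440)
    )
```

Note that amounts are kept in minor units (cents) as integers, matching the API's `amount_minor` field and avoiding floating-point rounding in financial sums.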
### Data Model (Redis)

We use a Hash or String with a TTL.

- Key: `metrics:{currency}:{bucket_timestamp}`
- Value: `sum_amount`
- TTL: 25 hours (to allow some buffer for late events).
> [!TIP]
> Optimization: To avoid 1,440 Redis calls per request, the API can cache the sum and only "slide" it by adding the newest bucket and subtracting the oldest one every minute.
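That sliding optimization can be sketched as a small in-process cache; the `SlidingTotal` class and its method names are hypothetical, not part of any real API:

```python
from collections import deque

class SlidingTotal:
    """Caches the 24h sum so dashboard reads are O(1) instead of summing 1,440 keys."""

    def __init__(self, window_minutes: int = 1440):
        self.window = deque([0] * window_minutes)  # oldest bucket on the left
        self.total = 0

    def close_minute(self, newest_bucket_sum: int) -> None:
        # Called once per minute: add the bucket that just closed, evict the oldest.
        self.total += newest_bucket_sum - self.window.popleft()
        self.window.append(newest_bucket_sum)

    def read(self) -> int:
        return self.total
```

The trade-off: reads become trivially cheap, but a late event that mutates an already-closed bucket forces a cache rebuild (or a correction pass), which is exactly the manual bookkeeping a framework like Flink would otherwise handle.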
## 7. Deep Dive: Handling Reliability & Scale

### A. Late-Arriving Events
Events can arrive out of order (e.g., a network delay causes a 10:05 event to reach the aggregator at 10:08).
- Solution: The aggregator should update the historical bucket based on the event's `occurred_at` timestamp, not the processing time.
### B. Scalability (Partitioning)
If Wise processes 10,000 TPS, a single Redis instance might become a bottleneck.
- Partitioning: Partition the event stream and the Redis store by Currency.
- Drift Check: Nightly, run a batch job on the Source-of-Truth DB to “correct” any drift in the Redis buckets.
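A minimal sketch of that nightly drift check, assuming the batch job can produce per-bucket sums from the ledger (the `reconcile` function and its argument names are illustrative):

```python
def reconcile(ledger_buckets: dict, cache_buckets: dict) -> dict:
    """Compare per-bucket sums from the source-of-truth ledger against the cache.

    Returns the corrections to write back: every bucket whose cached
    value has drifted from the ledger's value.
    """
    return {
        key: truth
        for key, truth in ledger_buckets.items()
        if cache_buckets.get(key, 0) != truth
    }
```

In production each correction would be written back with a `SET` (overwriting the drifted counter), and the number of corrected buckets is itself a useful health metric for the pipeline.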
### C. The "Framework Ban" Trade-off

| Approach | Pros | Cons |
| :--- | :--- | :--- |
| Framework (Flink) | Handles late data, watermarks, and state management automatically. | Higher infra complexity. |
| Bucketed Redis | Extremely low latency, simple to debug, cheap. | Manual logic for window sliding and backfills. |
## 8. Summary: The Senior Interview Checklist
When presenting this solution at Wise, ensure you cover:
- Idempotency: Use a `transfer_id` to ensure a retried event doesn't increment the bucket twice.
- Backfill Strategy: "What if the aggregator crashes for 3 hours?" (Explain how you'd replay the Kafka log.)
- Definition of Done: Clear distinction between `Settled` and `Initiated`.
- Observability: Monitor the lag between the ledger entry and the dashboard update.
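The idempotency point deserves a concrete shape, since Kafka's at-least-once delivery makes redeliveries routine. A minimal sketch, using in-memory structures as stand-ins (in production the dedupe marker would be a Redis key per `transfer_id` written with `SET NX` and a 24h TTL; `apply_event` is a hypothetical name):

```python
from collections import defaultdict

# In-memory stand-ins for the Redis dedupe set and the bucketed counters.
seen_transfer_ids: set = set()
bucket_totals: defaultdict = defaultdict(int)

def apply_event(transfer_id: str, bucket_key: str, amount_minor: int) -> bool:
    # At-least-once delivery means redeliveries happen: only the first copy
    # of a given transfer_id may increment a bucket.
    if transfer_id in seen_transfer_ids:
        return False
    seen_transfer_ids.add(transfer_id)
    bucket_totals[bucket_key] += amount_minor
    return True
```

This also makes the backfill story safe: replaying three hours of the Kafka log after a crash re-processes events that may already be counted, and the dedupe check ensures the replay converges to the correct totals instead of double-counting.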