Design an Exchange-Rate Service

Note: Exchange rates are the heartbeat of Wise. This system must be highly available, defensibly accurate, and audit-friendly.

1. The Problem: The “Fair” Rate

Wise prides itself on providing the “mid-market” rate. To do this, you must aggregate data from multiple providers (Reuters, Bloomberg, banks) in real time, filter out bad data, and serve the result from cache with sub-millisecond lookup latency.

The Interview Challenge:

  • “Design a service that provides the latest exchange rate for any currency pair (e.g., EUR/GBP).”
  • “Produce a 24-hour aggregation report (OHLC: Open, High, Low, Close) for these rates.”

2. Requirements & Goals

Functional Requirements

  1. Latest Rate: Return the current aggregated rate for a pair.
  2. Historical Report: Provide 24h metrics (Avg, Max, Min, OHLC).
  3. Provider Diversity: Ingest from 5+ external providers simultaneously.

Non-Functional Requirements

  1. Accuracy: Detect and ignore outlier rates from broken provider APIs.
  2. Low Latency: GET /v1/rates/latest should be served primarily from cache.
  3. Auditability: Every rate used in a transaction must be traceable to the raw provider observations.

3. Capacity Estimation

  • Currency Pairs: ~10,000 potential combinations (for example, about 100 currencies give 100 × 99 = 9,900 ordered pairs), though only 50-100 are “majors”.
  • Update Frequency: Providers might push updates as often as every 100 ms.
  • Query Volume: Millions of users checking rates, plus internal checkout flows.
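
A rough back-of-envelope calculation turns these figures into a write rate. The provider count comes from the functional requirements and the 100 ms interval from the list above. Restricting the estimate to 100 major pairs, and assuming every provider quotes every one of them at the maximum frequency, are assumptions made only for illustration.

```python
# Back-of-envelope ingest estimate. All inputs are illustrative assumptions.
PROVIDERS = 5             # "5+ external providers" from the requirements
MAJOR_PAIRS = 100         # upper end of the "50-100 majors" range
UPDATES_PER_SECOND = 10   # one update every 100 ms, per provider, per pair

observations_per_second = PROVIDERS * MAJOR_PAIRS * UPDATES_PER_SECOND
observations_per_day = observations_per_second * 24 * 60 * 60

print(f"{observations_per_second:,} observations/s")    # 5,000 observations/s
print(f"{observations_per_day:,} observations/day")     # 432,000,000 observations/day
```

Even this worst-case figure is a modest write load for a log-structured store, which suggests the harder problems are correctness and read fan-out rather than raw ingest volume.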

4. API Design

Get Latest Rate

GET /v1/rates/latest?base=EUR&quote=GBP

Response:

{
  "pair": "EURGBP",
  "mid_rate": 0.8571,
  "as_of": "2026-03-12T10:15:00Z",
  "contributing_providers": 4,
  "stale": false
}

Get 24h Report

GET /v1/rates/report?base=EUR&quote=GBP&window=24h
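
The report endpoint has no sample response above, so the following is only a sketch of how the 24h metrics could be computed from a list of aggregated rate points. The RatePoint type, the ohlc_report function, and the shape of the returned dictionary are hypothetical names chosen for illustration, not part of the original design.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class RatePoint:
    """One aggregated mid-rate at a point in time (hypothetical type)."""
    as_of: datetime
    mid_rate: Decimal


def ohlc_report(points: list[RatePoint], now: datetime,
                window: timedelta = timedelta(hours=24)) -> Optional[dict]:
    """Summarise the points that fall inside [now - window, now]."""
    in_window = sorted(
        (p for p in points if now - window <= p.as_of <= now),
        key=lambda p: p.as_of,
    )
    if not in_window:
        return None  # no data in the window; the caller decides how to respond
    rates = [p.mid_rate for p in in_window]
    return {
        "open": rates[0],       # earliest rate in the window
        "high": max(rates),
        "low": min(rates),
        "close": rates[-1],     # latest rate in the window
        "avg": sum(rates) / len(rates),
    }


now = datetime(2026, 3, 12, 10, 15, tzinfo=timezone.utc)
points = [
    RatePoint(now - timedelta(hours=30), Decimal("0.8400")),  # outside the window
    RatePoint(now - timedelta(hours=20), Decimal("0.8550")),
    RatePoint(now - timedelta(hours=10), Decimal("0.8600")),
    RatePoint(now - timedelta(hours=1), Decimal("0.8571")),
]
report = ohlc_report(points, now)
print(report)
```

In production this aggregation would more likely run inside the time-series store than in application code; the sketch only pins down what each field means.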

5. High-Level Design: Ingestion to API

The pipeline must normalize data from different formats (JSON, XML, FIX) into a unified internal model.

flowchart LR
  P1[Provider A] -->|Push/Pull| ING[Rate Ingestor]
  P2[Provider B] -->|Push/Pull| ING
  ING --> NORM[Normalization Svc]
  NORM --> LOG[(Immutable Rate Log)]
  LOG --> AGG[Aggregator Svc]
  AGG --> SNAP[(Latest Snapshots Cache)]
  LOG --> TS[(Time-Series Store)]
  API[Rates API] --> SNAP
  REP[Reporting Job] --> TS
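
To make the normalization step in the diagram concrete, here is a minimal sketch of a unified internal model and one adapter. The payload layout attributed to "Provider A" (symbol, bid, ask, ts_ms) is invented for illustration; real provider formats will differ, and XML or FIX feeds would each need their own adapter producing the same RateObservation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class RateObservation:
    """Unified internal model: one quote from one provider."""
    provider_id: str
    base_currency: str
    quote_currency: str
    bid: Decimal
    ask: Decimal
    observed_at: datetime

    @property
    def mid(self) -> Decimal:
        """Mid-market rate: the midpoint between bid and ask."""
        return (self.bid + self.ask) / 2


def normalize_provider_a(payload: dict) -> RateObservation:
    """Map a hypothetical Provider A JSON payload onto the internal model."""
    base, quote = payload["symbol"].split("/")
    return RateObservation(
        provider_id="provider_a",
        base_currency=base.upper(),
        quote_currency=quote.upper(),
        # Parse prices from strings so no binary floating-point error creeps in.
        bid=Decimal(payload["bid"]),
        ask=Decimal(payload["ask"]),
        # Assumption: this provider sends epoch milliseconds in UTC.
        observed_at=datetime.fromtimestamp(payload["ts_ms"] / 1000, tz=timezone.utc),
    )


obs = normalize_provider_a(
    {"symbol": "eur/gbp", "bid": "0.8570", "ask": "0.8572", "ts_ms": 1773310500000}
)
print(obs.base_currency, obs.quote_currency, obs.mid, obs.observed_at.isoformat())
```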

6. Detailed Design: The “Best Rate” Logic

How do we decide the “Mid-Market” rate when different providers disagree?

A. Outlier Detection (The Drunken Provider)

If Provider A says 1.08, B says 1.09, but C suddenly says 1.50 (due to a bug), we must ignore C.

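  • Trimmed Mean: Sort the current observations, discard the highest and lowest 10%, and average the rest. With only a handful of providers, 10% rounds down to zero, so trim at least one value from each end (or fall back to the median).
  • Z-Score Filtering: Compute the mean and standard deviation of recent observations. If a new rate is more than 3 standard deviations from that mean, exclude it from the aggregate and flag it for review.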
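
Both filters can be sketched in a few lines. The function names, the rule of trimming at least one value per side when three or more rates are present, and the use of floats (acceptable for a statistical filter, not for the stored rate itself) are assumptions for illustration rather than a prescribed implementation.

```python
import math
import statistics


def trimmed_mean(rates: list[float], trim_fraction: float = 0.10) -> float:
    """Drop the lowest and highest trim_fraction of values, then average the rest."""
    if not rates:
        raise ValueError("no rates to aggregate")
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(rates)
    k = math.floor(len(ordered) * trim_fraction)
    if len(ordered) >= 3:
        # Assumption: with few providers 10% rounds down to zero,
        # so trim at least one value from each end.
        k = max(k, 1)
    kept = ordered[k:len(ordered) - k]
    return statistics.fmean(kept)


def is_outlier(new_rate: float, history: list[float], threshold: float = 3.0) -> bool:
    """True if new_rate is more than `threshold` std devs from the history mean."""
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return new_rate != mean
    return abs(new_rate - mean) / stdev > threshold


# Provider C's 1.50 is trimmed away, leaving the middle observation.
print(trimmed_mean([1.08, 1.09, 1.50]))

history = [1.08, 1.09, 1.08, 1.09, 1.085]
print(is_outlier(1.50, history))   # far outside the recent spread
print(is_outlier(1.086, history))  # within the recent spread
```

The two techniques are complementary: the trimmed mean protects a single aggregation round, while the z-score check uses history to catch a provider that drifts away over time.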

B. Stale Data Handling

If a provider stops sending updates, its last observation becomes an “anchor” that pulls the average towards outdated prices.

  • TTL Policy: If an observation is more than 5 minutes old, exclude it from the current aggregate.
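
A minimal sketch of the TTL policy, assuming the aggregator keeps the latest (rate, observed_at) pair per provider in a dictionary. The function and variable names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(minutes=5)  # TTL from the policy above


def fresh_observations(latest_by_provider: dict[str, tuple[float, datetime]],
                       now: datetime,
                       max_age: timedelta = MAX_AGE) -> dict[str, float]:
    """Keep only providers whose latest observation is at most max_age old."""
    return {
        provider: rate
        for provider, (rate, observed_at) in latest_by_provider.items()
        if now - observed_at <= max_age
    }


now = datetime(2026, 3, 12, 10, 15, tzinfo=timezone.utc)
latest = {
    "provider_a": (1.08, now - timedelta(seconds=10)),  # fresh
    "provider_b": (1.07, now - timedelta(minutes=6)),   # stale, excluded
}
print(fresh_observations(latest, now))
```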

C. Data Modeling (Postgres + TimescaleDB)

While the Latest rate lives in Redis for speed, the Historical data belongs in a time-series store.

-- Raw Observation Table
CREATE TABLE rate_observations (
    provider_id VARCHAR(50),
    base_currency CHAR(3),
    quote_currency CHAR(3),
    bid DECIMAL(18, 8),
    ask DECIMAL(18, 8),
    observed_at TIMESTAMPTZ NOT NULL
);
-- Create hypertable (TimescaleDB), partitioned on observed_at, for fast OHLC queries
SELECT create_hypertable('rate_observations', 'observed_at');

7. Deep Dive: Provider Resilience

Financial APIs are notoriously unreliable.

  1. Circuit Breaker: If Provider A returns 5xx errors or 0.0 rates, “trip” the circuit for 60 seconds so that its data stops affecting the aggregate.
  2. Weighting: Give higher weights to high-liquidity providers (e.g., major banks) over smaller, slower brokers.
  3. Fallback: If ALL providers fail, serve the Last Known Good rate but set stale: true in the API response.
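
The circuit breaker in item 1 can be sketched as a small per-provider state machine. The class name, the injectable clock, and the choice to treat a missing or non-positive rate the same as a 0.0 rate are assumptions for illustration; a production breaker would usually also require several consecutive failures before tripping.

```python
import time
from typing import Callable, Optional


class ProviderCircuitBreaker:
    """Excludes a provider from aggregation for a cool-down after a bad response."""

    def __init__(self, cooldown_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._cooldown = cooldown_seconds
        self._clock = clock                          # injectable so tests can control time
        self._open_until: Optional[float] = None     # None means the breaker is closed

    def is_open(self) -> bool:
        return self._open_until is not None and self._clock() < self._open_until

    def record(self, status_code: int, rate: Optional[float]) -> bool:
        """Record one provider response; return True if its rate may be used."""
        if self.is_open():
            return False  # still cooling down: ignore this provider entirely
        # Assumption: a missing or non-positive rate is as bad as a 5xx error.
        bad = status_code >= 500 or rate is None or rate <= 0.0
        if bad:
            self._open_until = self._clock() + self._cooldown
            return False
        return True


# Usage with a fake clock so the 60-second cool-down can be stepped through.
fake_now = [0.0]
breaker = ProviderCircuitBreaker(clock=lambda: fake_now[0])
print(breaker.record(200, 1.08))   # healthy response is accepted
print(breaker.record(503, None))   # 5xx trips the breaker
fake_now[0] = 30.0
print(breaker.record(200, 1.08))   # rejected while the breaker is open
fake_now[0] = 60.0
print(breaker.record(200, 1.08))   # accepted again after the cool-down
```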

8. Summary: The Senior Interview Checklist

  1. Normalization: How do you handle different decimal precision (e.g., JPY amounts have 0 decimal places while EUR amounts have 2)?
  2. Consistency: How do you guarantee that a user keeps a locked rate for 30 minutes during a transfer? (Hint: Link the rate_snapshot_id to the transfer.)
  3. Latency: Use a “Push” architecture (WebSockets) from providers where possible to minimize lag.
  4. Audit: “Provider B sent a bad rate that caused us to lose money. How do we prove it?” (Show the immutable observation log).
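
The rate-lock hint in item 2 can be illustrated with a short sketch. Apart from rate_snapshot_id, every name here (RateSnapshot, Transfer, create_transfer, LOCK_DURATION) is hypothetical. The idea is that the transfer stores a reference to an immutable snapshot plus an expiry, so the locked rate never changes and can always be traced back to the observations behind it.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal

LOCK_DURATION = timedelta(minutes=30)  # lock window from the checklist


@dataclass(frozen=True)
class RateSnapshot:
    """Immutable aggregated rate, traceable to raw observations by its id."""
    rate_snapshot_id: str
    pair: str
    mid_rate: Decimal
    as_of: datetime


@dataclass(frozen=True)
class Transfer:
    transfer_id: str
    rate_snapshot_id: str      # link back to the snapshot used for pricing
    locked_rate: Decimal
    lock_expires_at: datetime

    def lock_is_valid(self, now: datetime) -> bool:
        return now < self.lock_expires_at


def create_transfer(snapshot: RateSnapshot, now: datetime) -> Transfer:
    """Pin the transfer to one snapshot for LOCK_DURATION."""
    return Transfer(
        transfer_id=str(uuid.uuid4()),
        rate_snapshot_id=snapshot.rate_snapshot_id,
        locked_rate=snapshot.mid_rate,
        lock_expires_at=now + LOCK_DURATION,
    )


now = datetime(2026, 3, 12, 10, 15, tzinfo=timezone.utc)
snapshot = RateSnapshot("snap-123", "EURGBP", Decimal("0.8571"), now)
transfer = create_transfer(snapshot, now)
print(transfer.lock_is_valid(now + timedelta(minutes=29)))
print(transfer.lock_is_valid(now + timedelta(minutes=31)))
```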