Citus: Turning Postgres into a Cluster
Standard Postgres scales vertically (bigger CPU, more RAM). But eventually, you hit the limits of a single machine (e.g., 128 cores, 4TB RAM). Moreover, vertical scaling becomes disproportionately expensive at the high end.
Citus allows Postgres to scale horizontally by sharding your data across multiple nodes (machines), while still appearing as a single database to your application. This is essential for highly scalable SaaS architectures, multi-tenant databases, and real-time analytics dashboards.
1. The Genesis: Why Shard?
Real-world Scenario: Imagine you are building a B2B SaaS platform for analytics (like Mixpanel). Initially, you have 100 customers, and all their data fits nicely on one Postgres node. But then, you land a massive enterprise client whose event data alone is 5TB. Your single Postgres instance grinds to a halt. You can’t just buy a bigger server indefinitely (vertical scaling has limits and gets exponentially expensive). If you split that 10TB across 10 machines (1TB each), the entire dataset can fit in aggregate RAM (fast). This is conceptually similar to how a group of people can carry a heavy load much more easily than one person.
The Analogy: Think of a single Postgres node as a massive warehouse managed by one overworked supervisor. Sharding is like splitting the inventory across 10 different regional warehouses, each with its own supervisor. The challenge is ensuring an order (query) goes to the right warehouse without asking all 10 supervisors. If you split that 10TB across 10 machines (1TB each), the entire dataset can fit in aggregate RAM (fast). This is conceptually similar to how a group of people can carry a heavy load much more easily than one person.
The Reality of Distributed Systems Distributing data introduces complexity. The CAP Theorem implies that we must balance Consistency, Availability, and Partition Tolerance. Citus typically prioritizes Consistency and Partition Tolerance (CP), ensuring data integrity across the cluster.
The Challenge: Joins If Table A is on Node 1 and Table B is on Node 2, joining them requires sending data over the network (slow).
The Solution: Co-location
Citus ensures that related data (e.g., all orders for user_id: 123) lives on the same node. This means joins can happen locally on that node without network overhead.
2. Interactive: Sharding Simulator
See how Citus distributes rows based on a Shard Key (e.g., tenant_id).
- Coordinator: Receives the query.
- Workers: Store the actual data.
- Router: Decides which worker gets the data based on
Hash(shard_key) % num_shards.
Control Panel
Simulate inserting rows into a distributed table.
Coordinator: Top Node
Workers: Bottom Nodes
Current Distribution: Uniform
3. Architecture Patterns
A. Distributed Tables
Tables are sharded by a key. Large tables (Users, Orders, Events).
- Write: Hashed to a specific shard/node.
- Read: Routed to a single node (if filter includes shard key) or scatter-gathered (if it doesn’t).
B. Reference Tables
Small tables that are needed everywhere (Countries, Currencies, Product Categories).
- Replication: These tables are replicated to all nodes.
- Benefit: Local joins! You can join a Distributed Table (Orders) with a Reference Table (Currencies) without any network traffic.
4. Case Study: Multi-Tenant SaaS (PEDALS Framework)
Let’s apply the PEDALS framework to design a scalable event-tracking system using Citus.
- Process Requirements: We need to ingest millions of events per second for thousands of tenants. Each tenant queries their own data independently.
- Estimate: 10,000 tenants, 1B events/day, 500GB/day data growth.
- Data Model: We need an
eventstable and atenantstable. - Architecture: A Citus cluster with 1 Coordinator and 10 Worker nodes. We shard by
tenant_id. - Localized Details: We use Reference Tables for small lookup data (e.g.,
event_types) to ensure local joins on all worker nodes. - Scale: As data grows, we seamlessly add more worker nodes and rebalance the shards without downtime.
5. Performance Comparison
| Operation | Single Node Postgres | Citus (Distributed) | Notes | | :— | :— | :— | :— | | Point Read | Very Fast | Fast | Citus adds slight routing overhead (Coordinator -> Worker). | | Local Join | Very Fast | Very Fast | Citus excels when data is co-located on the same shard. | | Cross-Shard Join | N/A | Slow | Requires pulling data across the network. Avoid if possible. | | Aggregate Query | Slower | Very Fast | Citus pushes aggregation down to workers and parallelizes. | | Write Throughput | Limited by disk/CPU | Highly Scalable | Writes are distributed across multiple worker nodes. |
6. Code Implementation
A. Sharding a Table
SQL
-- 1. Create standard tables
CREATE TABLE tenants (
id UUID PRIMARY KEY,
name TEXT
);
CREATE TABLE users (
id UUID,
tenant_id UUID,
email TEXT,
PRIMARY KEY (id, tenant_id) -- Composite key required for co-location
);
-- 2. Distribute the tables (Shard by tenant_id)
SELECT create_distributed_table('tenants', 'id');
SELECT create_distributed_table('users', 'tenant_id');
-- Now, inserting into 'users' will automatically route data to the same node as 'tenants'
-- based on the hash of tenant_id.
B. Multi-Tenant Query
When you run a query filtering by tenant_id, Citus routes it to a single worker node. This is the secret to scaling SaaS apps to millions of tenants.
Java (JDBC)
public void getTenantUsers(Connection conn, UUID tenantId) throws SQLException {
// The coordinator sees this WHERE clause and sends the query
// ONLY to the node holding this tenant's data.
String sql = "SELECT * FROM users WHERE tenant_id = ?";
try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
pstmt.setObject(1, tenantId);
ResultSet rs = pstmt.executeQuery();
// ... process results ...
}
}
Go (pgx)
func GetTenantUsers(ctx context.Context, conn *pgx.Conn, tenantId string) error {
// Citus handles the routing transparently.
// To the application, it looks like a single Postgres instance.
sql := `SELECT * FROM users WHERE tenant_id = $1`
rows, _ := conn.Query(ctx, sql, tenantId)
defer rows.Close()
// ... process results ...
return nil
}