RPC and gRPC: The Microservices Standard

[!TIP] Interview Tip: “REST is for Humans (Public APIs). gRPC is for Machines (Internal Microservices).” If you are building a public-facing service like Stripe, use REST. If you are building internal services like Uber’s ride matching, use gRPC (and protect it with Rate Limiting).

1. What is gRPC?

gRPC (gRPC Remote Procedure Calls) is a high-performance, open-source RPC framework originally developed at Google. Three things define it:

  • Protocol Buffers (Protobuf): Binary serialization (not JSON).
  • HTTP/2: Multiplexing and streaming built-in.
  • Strict Contracts: You define the API in .proto files first.
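That contract-first workflow is what makes the stubs type-safe. Below is a sketch of the shape of Go code that protoc-gen-go-grpc produces from such a contract; the package, service, and message names are illustrative, not real generated output.

```go
// Package userpb sketches the shape of protoc-gen-go-grpc output for a
// hypothetical UserService; real generated code has more plumbing.
package userpb

import (
	"context"

	"google.golang.org/grpc"
)

// Messages defined in the .proto become plain Go structs.
type GetUserRequest struct{ Id int32 }

type User struct {
	Id   int32
	Name string
}

// The service definition becomes a typed client interface, so a wrong field
// name or type is a compile error instead of a runtime 400.
type UserServiceClient interface {
	GetUser(ctx context.Context, in *GetUserRequest, opts ...grpc.CallOption) (*User, error)
}
```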

2. The Power of Protobuf (vs JSON)

Why is gRPC often dramatically faster than REST + JSON? Two reasons:

  1. Size: JSON repeats the full key string in every record ("name": "Alice", "name": "Bob", …). Protobuf replaces each key with a small numbered field tag, usually a single byte on the wire.
  2. Parsing: Parsing text (JSON) is CPU expensive. Decoding binary (Protobuf) is far cheaper: each field is read by its tag straight into a native struct.

Interactive Demo: Serialization Overhead Race

Parsing text (JSON) requires scanning every character for quotes, colons, and brackets. Protobuf just reads bytes directly into memory structs.


Size Comparator: JSON vs Protobuf

The same value encoded both ways shows how Protobuf strips away the metadata overhead:

| Encoding | Payload | Size |
| --- | --- | --- |
| JSON (text) | {"user_name":"Satoshi"} | 23 bytes |
| Protobuf (binary) | 0a 07 53 61 74 6f 73 68 69 | 9 bytes |

Saving: roughly 60% of the space.
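Decoding those 9 bytes by hand (a worked example, assuming user_name is field 1 in its .proto):

  • 0a — field tag: field number 1, wire type 2 (length-delimited).
  • 07 — length of the value: 7 bytes.
  • 53 61 74 6f 73 68 69 — the ASCII bytes of "Satoshi".

The key name never appears on the wire; both sides recover it from the .proto schema.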

3. The “Load Balancing” Nightmare

This is the most common gRPC interview trap. “How do you load balance gRPC?”

The Problem: Sticky Connections

  • REST (HTTP/1.1): Connections are short-lived and carry one request at a time, so the Load Balancer (LB) can easily round-robin requests across servers.
  • gRPC (HTTP/2): Client opens One Persistent Connection and keeps it open for days.
    • If you put a standard L4 LB (AWS NLB) in the middle, it just forwards that one TCP connection to one server.
    • Result: Server A gets 100% of traffic. Server B gets 0%. (See L4 vs L7 Load Balancing).

The Solutions

  1. L7 Load Balancing (Proxy): Use a smart proxy (e.g., Envoy, Nginx). It terminates the HTTP/2 connection, inspects individual requests, and distributes them. (Most common).
  2. Client-Side Balancing (Lookaside): The Client asks a Service Registry (e.g., Consul) for a list of IPs and connects to all of them, doing its own Round Robin. (Complex client logic).
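A minimal sketch of option 2 in Go, assuming the backends are discoverable through a DNS name that returns every instance IP (the service name and port are made up):

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// The dns:/// resolver hands every backend address to the client, and the
	// round_robin policy spreads RPCs across sub-connections instead of
	// pinning all traffic to one TCP connection.
	conn, err := grpc.Dial(
		"dns:///user-service.internal:50051", // hypothetical service name
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin":{}}]}`),
	)
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()
	// ... build stubs from conn and issue RPCs as usual.
}
```

With option 1 the client stays dumb: it dials the L7 proxy's single address, and the proxy terminates HTTP/2 and distributes the individual requests.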

Interactive Demo: L4 vs L7 Load Balancing

Visualize why L4 fails for gRPC.

  • Mode L4: All requests follow the Single Connection to Server 1. Server 2 is idle.
  • Mode L7: The Proxy opens connections to both. Requests are distributed evenly.

The gRPC Load Balancing Trap

Client → L4 Balancer → Server 1 (100% of requests) / Server 2 (0 requests, idle)

L4 sees one persistent TCP connection and sticks to it.


4. Interactive Demo: Schema Evolution (Protobuf)

See why Protobuf is “Backward Compatible”.

  1. We start with a simple message containing just an id and a name.
  2. A new email field is added with tag 3.
  3. The hex output grows, but the original bytes (tags 1 and 2) stay exactly the same. Old clients simply skip the unknown tag and can still read the name and ID (see the payload hex view below).
// user.proto
message User {
  int32 id = 1;
  string name = 2;
  string email = 3;
}
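A concrete payload hex view (a worked example; the values id: 150, name: "Bob", and email: "b@x.io" are made up for illustration):

  • Before adding email: 08 96 01 12 03 42 6f 62
  • After adding email: 08 96 01 12 03 42 6f 62 1a 06 62 40 78 2e 69 6f

The first eight bytes are untouched. A client compiled against the old schema reads tags 1 and 2 exactly as before and skips the unknown tag 3 (1a).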

System Walkthrough: The gRPC Call

When you run client.GetUser({id: 150}), what happens?

  1. Stub: Code generated from .proto takes your object.
  2. Serialization: Converts {id: 150} into 08 96 01 (Protobuf).
  3. Framing (HTTP/2): Wraps it in a DATA frame.
    • Adds a 5-byte prefix: [Compressed Flag (1 byte)] [Message Length (4 bytes)].
  4. Network: Sends over persistent TCP connection.
  5. Server: Decodes frame -> Deserializes Protobuf -> Calls actual Go/Java function.
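What that walkthrough looks like from the caller's side in Go (a sketch; the userpb package, service name, and address stand in for code generated from user.proto):

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	userpb "example.com/gen/userpb" // hypothetical generated package
)

func main() {
	// One persistent HTTP/2 connection, reused for every call.
	conn, err := grpc.Dial("user-service:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()

	client := userpb.NewUserServiceClient(conn) // generated stub

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	// The stub serializes {id: 150} to Protobuf, adds the 5-byte prefix,
	// and sends it as an HTTP/2 DATA frame on the existing connection.
	user, err := client.GetUser(ctx, &userpb.GetUserRequest{Id: 150})
	if err != nil {
		log.Fatalf("GetUser: %v", err)
	}
	log.Printf("user: %v", user)
}
```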

5. Can I use gRPC in the Browser?

No, not directly.

The Problem

gRPC relies heavily on HTTP/2 Trailers (headers sent after the body) to send the Status Code (e.g., grpc-status: 0). Browser JavaScript APIs (fetch, XHR) generally do not give you access to HTTP/2 Trailers. If the request fails, the browser hides the specific gRPC error.
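For reference, a successful unary gRPC response on the wire looks roughly like this; the status the client must check arrives only in the trailers, which is exactly the part browsers cannot see:

  HEADERS   :status: 200
            content-type: application/grpc
  DATA      <length-prefixed Protobuf message>
  TRAILERS  grpc-status: 0
            grpc-message: (empty on success)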

The Solution: gRPC-Web

gRPC-Web is a protocol that wraps the gRPC data in a way browsers can understand (often base64 encoded text). You need a “Translation Layer” (Proxy) in the middle.

Browser --(gRPC-Web over HTTP/1.1 or 2)--> Envoy Proxy (translates the encoding) --(pure gRPC over HTTP/2)--> Go/Java service

The Envoy Proxy strips the "Web" wrapper and talks pure gRPC to the backend.

6. gRPC vs HTTP Status Codes

gRPC doesn’t use 200/404. It has its own status enum, carried in the grpc-status trailer.

| gRPC Status | HTTP Code | Meaning |
| --- | --- | --- |
| OK (0) | 200 | Success. |
| INVALID_ARGUMENT (3) | 400 | Bad Request (validation failed). |
| NOT_FOUND (5) | 404 | Resource missing. |
| PERMISSION_DENIED (7) | 403 | Authenticated, but not allowed. |
| UNAUTHENTICATED (16) | 401 | Missing or invalid token. |
| RESOURCE_EXHAUSTED (8) | 429 | Rate limit hit. |
| UNAVAILABLE (14) | 503 | Server down / maintenance. |
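In Go these codes are checked through the status and codes helpers rather than HTTP numbers (a minimal sketch; handleGetUserError is a made-up helper name):

```go
package example

import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// handleGetUserError maps the gRPC status attached to an RPC error
// onto application behaviour.
func handleGetUserError(err error) {
	switch status.Code(err) { // returns codes.OK when err is nil
	case codes.OK:
		// success, nothing to do
	case codes.NotFound:
		// show a "user not found" result (the 404 equivalent)
	case codes.ResourceExhausted:
		// back off and retry later (the 429 equivalent)
	case codes.Unavailable:
		// retry against another instance (the 503 equivalent)
	default:
		// surface as an unexpected failure
	}
}
```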

6.5 The Silent Killer: No Deadlines (Timeouts)

In microservices, if Service A calls B, and B calls C, and C hangs… the whole chain hangs. gRPC solves this with Deadlines (Context Propagation).

  1. Service A: “I need this done in 100ms.” (Sends the request to B with the header grpc-timeout: 100m, i.e. 100 milliseconds.)
  2. Service B: Spends 20ms processing, then calls Service C, forwarding the remaining budget (80ms).
  3. Service C: Takes 90ms.
  4. Result: When the 80ms budget expires (100ms total), the call to C is cancelled and DEADLINE_EXCEEDED propagates back to A. The system fails fast instead of hanging.

[!TIP] Always set Deadlines. The default is “Infinite”, which is a production outage waiting to happen.
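A minimal sketch of both halves in Go (the Downstream interface is a made-up stand-in for any generated client stub): the caller attaches the 100ms deadline, and the middle service reuses its incoming ctx so the remaining budget is forwarded downstream automatically via grpc-timeout.

```go
package example

import (
	"context"
	"time"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// Downstream stands in for any generated gRPC client stub.
type Downstream interface {
	DoWork(ctx context.Context) error
}

// Service A: attach a 100ms deadline to the outgoing call.
func callWithDeadline(ctx context.Context, b Downstream) error {
	ctx, cancel := context.WithTimeout(ctx, 100*time.Millisecond)
	defer cancel()

	err := b.DoWork(ctx)
	if status.Code(err) == codes.DeadlineExceeded {
		// The budget ran out somewhere down the chain: fail fast
		// (e.g. serve a fallback) instead of retrying blindly.
	}
	return err
}

// Service B's handler: pass the *incoming* ctx straight to C. gRPC converts
// the remaining deadline into the grpc-timeout header, so C inherits
// whatever budget is left (~80ms in the walkthrough above).
func handleInB(ctx context.Context, c Downstream) error {
	// ... ~20ms of local work ...
	return c.DoWork(ctx)
}
```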

Summary: REST vs gRPC

| Feature | REST (Public / Open APIs) | gRPC (Internal) |
| --- | --- | --- |
| Payload | JSON (text) | Protobuf (binary) |
| Contract | Loose (Swagger / OpenAPI) | Strict (.proto) |
| Streaming | Request/Response only | Bi-directional streaming |
| Best For | Mobile apps, public APIs | Microservices, high throughput |

[!IMPORTANT] gRPC-Web: Browsers cannot speak raw gRPC because they don’t have access to HTTP/2 trailers. You need a proxy like Envoy to translate between the browser and the gRPC backend.