Resource Quotas
Welcome to the module on Resource Quotas. Here you will learn how Kubernetes constrains resource consumption at the container, Pod, and namespace level: requests, limits, LimitRanges, and ResourceQuotas.
1. The Problem
If you deploy a Pod without specifying how much CPU or Memory it needs, it can consume all the available resources on the Node, potentially crashing the Node or starving other critical workloads.
2. The Solution: Requests, Limits, and Quotas
Kubernetes provides a multi-tiered approach to resource management:
- Requests and Limits (Pod Level): Define minimum guaranteed resources (requests) and maximum allowed resources (limits) for individual containers.
- LimitRanges (Namespace Level): Enforce default requests/limits for Pods in a namespace that don't specify them.
- ResourceQuotas (Namespace Level): Cap the total aggregate resources (e.g., total RAM, total CPU, total Pods) that can be consumed by all Pods in a given namespace.
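The middle tier, LimitRanges, deserves a concrete example. Here is a minimal sketch of a LimitRange (the name default-limits, the namespace, and the values are illustrative) that injects default requests and limits into containers that omit them:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits        # illustrative name
  namespace: my-namespace
spec:
  limits:
  - type: Container
    defaultRequest:           # applied when a container specifies no requests
      memory: "64Mi"
      cpu: "100m"
    default:                  # applied when a container specifies no limits
      memory: "128Mi"
      cpu: "250m"
```

With this in place, a bare Pod admitted to the namespace still counts against any ResourceQuota, because it now has concrete requests and limits.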
3. Interactive: Resource Allocation Simulator
See how requests and limits interact with a Node’s capacity.
4. Hardware Reality: cgroups and CFS
Resource management in Kubernetes is ultimately enforced by the Linux kernel via cgroups (control groups); the kubelet translates your Pod spec into cgroup settings.
CPU Throttling (Compressible Resource)
CPU is handled using the CFS (Completely Fair Scheduler) quota system.
- Requests: Set cpu.shares. If the Node is idle, your Pod can use more CPU than requested; if the Node is busy, it is guaranteed a proportional share of CPU time.
- Limits: Set cpu.cfs_quota_us and cpu.cfs_period_us. If your Pod tries to exceed the quota, the kernel pauses (throttles) its threads until the next period begins. Your app will not crash, but latency can spike dramatically.
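To make the throttling arithmetic concrete, here is a small sketch (the function name quotaForMillicores is ours, not a Kubernetes API) that converts a CPU limit expressed in millicores into the CFS quota per period, assuming the default 100 ms (100000 us) cpu.cfs_period_us:

```go
package main

import "fmt"

// quotaForMillicores converts a Kubernetes CPU limit in millicores
// (e.g. 500 for "500m") into the cpu.cfs_quota_us value written to the
// cgroup, assuming the default cpu.cfs_period_us of 100000 (100 ms).
func quotaForMillicores(millicores int64) int64 {
	const periodUs = 100000
	return millicores * periodUs / 1000
}

func main() {
	// A limit of cpu: "500m" allows 50000 us of CPU time per 100000 us period.
	fmt.Println(quotaForMillicores(500))
	// A limit of cpu: "2" (2000m) allows 200000 us per period, i.e. two full cores.
	fmt.Println(quotaForMillicores(2000))
}
```

Once a container's threads have consumed their quota within a period, they are paused until the next period starts, which is why tight CPU limits show up as tail-latency spikes rather than errors.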
Memory (Incompressible Resource)
Memory cannot be throttled.
- Requests: Used only by the kube-scheduler to find a Node with enough capacity; they are not enforced at runtime.
- Limits: Set memory.limit_in_bytes. If your process tries to allocate more RAM than the limit, the Linux Out-Of-Memory (OOM) Killer is invoked. The process receives a SIGKILL (which cannot be caught) and the container terminates with status OOMKilled.
5. Implementation: Java & Go Handling Memory Limits
When running inside containers, the runtime (e.g., JVM, Go Runtime) must be aware of cgroup limits, otherwise it might attempt to allocate more memory than the container limit, resulting in an OOM kill.
// Java: Container Awareness
// Modern JVMs (Java 10+) automatically detect cgroup memory limits.
// However, it's best practice to configure MaxRAMPercentage.
public class MemoryAwareApp {
    public static void main(String[] args) {
        // Run with: java -XX:MaxRAMPercentage=75.0 -jar app.jar
        // The JVM will automatically set its max heap size to 75% of the container limit.
        long maxMemory = Runtime.getRuntime().maxMemory();
        System.out.println("Max Heap Size: " + (maxMemory / (1024 * 1024)) + " MB");
    }
}
// Go: GOMEMLIMIT
// Go 1.19 introduced GOMEMLIMIT to help prevent OOMs by making the Garbage Collector
// more aggressive as the process nears the limit.
package main

import (
    "fmt"
    "os"
    "runtime/debug"
)

func main() {
    // Run the container with the env var: GOMEMLIMIT=900MiB
    // This tells Go's GC to keep the total memory footprint under 900MiB.
    if memLimit := os.Getenv("GOMEMLIMIT"); memLimit != "" {
        fmt.Printf("GOMEMLIMIT configured: %s\n", memLimit)
    } else {
        fmt.Println("Warning: GOMEMLIMIT not set. Susceptible to OOM in constrained environments.")
        // Fall back to setting a soft memory limit programmatically (900 MiB).
        debug.SetMemoryLimit(900 * 1024 * 1024)
    }
}
6. YAML Manifests
Pod Requests and Limits
apiVersion: v1
kind: Pod
metadata:
  name: frontend-app
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
Namespace ResourceQuota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: my-namespace
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"