Volume Management: Taming the Beast

The War Story: Imagine a junior engineer tearing down a misbehaving database container with docker rm -f production_db. Because the database was writing to a container layer instead of an external volume, five years of customer records vanish instantly. Volumes are the external hard drives of the container world—they are persistent, meaning they survive container deletion. But with great persistence comes great responsibility: if unmanaged, your disk will fill up with orphan volumes containing gigabytes of stale data.

1. The Anatomy of a Volume

Under the hood, Docker manages volumes natively. On Linux, they are typically stored at /var/lib/docker/volumes/. Because Docker manages this location, it is easier to back up, migrate, and secure than arbitrary bind mounts.

The “-v” vs “--mount” Debate

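Historically, -v (or --volume) was the standard. It takes up to three colon-separated fields: the source, the target path inside the container, and an optional comma-separated list of options such as ro. For example: -v my-vol:/app/data:ro. The --mount flag spells out each field as a key=value pair, which is more verbose but harder to misread, and it is the only form that docker service create accepts: --mount source=my-vol,target=/app/data,readonly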
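To make the correspondence between the two syntaxes concrete, here is a minimal Go sketch that rewrites a -v spec as a --mount string. The helper name mountFromVolumeSpec is hypothetical (it is not part of Docker), and the sketch assumes the source is a named volume and only understands the ro option.

```go
package main

import (
	"fmt"
	"strings"
)

// mountFromVolumeSpec converts a -v spec (source:target[:options]) into the
// equivalent --mount value. Hypothetical helper for illustration only: it
// assumes a named volume as the source and handles just the "ro" option.
func mountFromVolumeSpec(spec string) (string, error) {
	fields := strings.Split(spec, ":")
	if len(fields) < 2 || len(fields) > 3 || fields[0] == "" || fields[1] == "" {
		return "", fmt.Errorf("expected source:target[:options], got %q", spec)
	}
	mount := "type=volume,source=" + fields[0] + ",target=" + fields[1]
	if len(fields) == 3 {
		for _, opt := range strings.Split(fields[2], ",") {
			if opt == "ro" {
				mount += ",readonly"
			}
		}
	}
	return mount, nil
}

func main() {
	m, err := mountFromVolumeSpec("my-vol:/app/data:ro")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("--mount", m)
}
```

The explicit type=volume key is optional for named volumes, but writing it out shows the kind of detail that -v leaves implicit.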

Backups & Migrations

Because a volume is an isolated directory on the host, backing it up involves spinning up a temporary container that mounts both the volume and a local backup directory:

# Backing up a volume to a tarball
docker run --rm --mount source=my-db-data,target=/data -v "$(pwd)":/backup ubuntu tar cvf /backup/db_backup.tar /data
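For a migration, the same temporary-container trick works in reverse. The Go sketch below builds the argument lists for both directions; it only prints the commands, so nothing runs until the slices are passed to exec.Command("docker", args...). The function names, the /tmp/backups directory, and the my-db-data-copy volume are assumptions made for the illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// backupArgs returns the docker CLI arguments that archive a volume into
// hostDir/archive, mirroring the backup command shown above.
func backupArgs(volume, hostDir, archive string) []string {
	return []string{
		"run", "--rm",
		"--mount", "source=" + volume + ",target=/data",
		"-v", hostDir + ":/backup",
		"ubuntu", "tar", "cvf", "/backup/" + archive, "/data",
	}
}

// restoreArgs returns the arguments that unpack the archive into a volume.
// tar drops the leading slash when archiving, so members are stored as
// data/... and extracting with -C / puts them back under /data.
func restoreArgs(volume, hostDir, archive string) []string {
	return []string{
		"run", "--rm",
		"--mount", "source=" + volume + ",target=/data",
		"-v", hostDir + ":/backup",
		"ubuntu", "tar", "xvf", "/backup/" + archive, "-C", "/",
	}
}

func main() {
	// Hypothetical names: adjust the volumes and host directory to your setup.
	fmt.Println("docker", strings.Join(backupArgs("my-db-data", "/tmp/backups", "db_backup.tar"), " "))
	fmt.Println("docker", strings.Join(restoreArgs("my-db-data-copy", "/tmp/backups", "db_backup.tar"), " "))
}
```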

2. Advanced: Volume Plugins (NFS & Cloud)

Volumes aren’t restricted to local disk. Docker volume plugins allow containers to directly mount external storage.

  • NFS (Network File System): Share data across multiple Docker hosts.
  • Cloud Providers: Plugins like rexray/ebs for AWS Elastic Block Store, allowing persistent storage across EC2 instances.

Example of creating an NFS-backed volume. NFS needs no extra plugin, because the built-in local driver accepts mount-style options:

docker volume create --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.1,rw \
  --opt device=:/path/to/dir \
  my-nfs-volume
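If you create such volumes from a script, it can help to keep the driver options in a map and generate the command from it. The following Go sketch does that for the NFS example above; volumeCreateArgs is a hypothetical helper, and the keys are sorted only so that the generated command is deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// volumeCreateArgs builds the arguments for `docker volume create` from a
// driver name and its options. Hypothetical helper for illustration.
func volumeCreateArgs(name, driver string, opts map[string]string) []string {
	args := []string{"volume", "create", "--driver", driver}
	keys := make([]string, 0, len(opts))
	for k := range opts {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort for stable output
	for _, k := range keys {
		args = append(args, "--opt", k+"="+opts[k])
	}
	return append(args, name)
}

func main() {
	args := volumeCreateArgs("my-nfs-volume", "local", map[string]string{
		"type":   "nfs",
		"o":      "addr=192.168.1.1,rw",
		"device": ":/path/to/dir",
	})
	// Prints the command instead of running it.
	fmt.Println("docker", strings.Join(args, " "))
}
```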

3. The Lifecycle of a Volume

  1. Creation: Explicit (docker volume create) or Implicit (via VOLUME instruction in Dockerfile).
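  2. Attachment: Mounted into a container when that container starts.
  3. Detachment: Container stops or is removed. The volume remains, except for anonymous volumes whose container was started with --rm or removed with docker rm -v.
  4. Pruning: Deleting volumes that are no longer referenced by any container, running or stopped. Since Docker Engine 23.0, docker volume prune removes only anonymous volumes unless you pass --all.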
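The four stages above can be sketched as a small state machine. This is a simplified model for illustration, not how Docker tracks volumes internally; the State type, the allowed table, and the next function are all assumptions of the sketch. The one rule worth noticing is that an attached volume cannot be pruned.

```go
package main

import "fmt"

// State is one stage in the simplified volume lifecycle.
type State string

const (
	Created  State = "created"
	Attached State = "attached"
	Detached State = "detached"
	Pruned   State = "pruned"
)

// allowed lists the transitions this model permits. Pruned has no entry
// because a deleted volume cannot come back.
var allowed = map[State][]State{
	Created:  {Attached, Pruned},
	Attached: {Detached},
	Detached: {Attached, Pruned},
}

// next returns the new state, or the unchanged state and an error when the
// transition is not permitted.
func next(from, to State) (State, error) {
	for _, s := range allowed[from] {
		if s == to {
			return to, nil
		}
	}
	return from, fmt.Errorf("cannot go from %s to %s", from, to)
}

func main() {
	state := Created
	// The last step tries to prune a volume that is still attached.
	for _, to := range []State{Attached, Detached, Attached, Pruned} {
		newState, err := next(state, to)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", state, newState)
		state = newState
	}
}
```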

Named vs Anonymous Volumes

  • Named: docker run -v my-db:/data .... Easy to find, back up, and manage.
  • Anonymous: docker run -v /data .... Docker generates a random hash name (e.g., 2d3f4a...). These are the primary cause of “disk leak” because they are easily forgotten.
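When auditing a host, one rough way to spot anonymous volumes is by the shape of their names: Docker's generated names are 64 lowercase hexadecimal characters. The Go sketch below applies that heuristic. The function name looksAnonymous is an assumption, and the check is only a guess based on the name, since nothing stops someone from giving a named volume a 64-character hex name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// anonName matches the 64-character lowercase hex names that Docker
// generates for anonymous volumes.
var anonName = regexp.MustCompile(`^[0-9a-f]{64}$`)

// looksAnonymous reports whether a volume name has the shape of a generated
// one. Heuristic only: it inspects the name, not Docker's own metadata.
func looksAnonymous(name string) bool {
	return anonName.MatchString(name)
}

func main() {
	generated := strings.Repeat("2d3f4a0b", 8) // made-up 64-character hex name
	for _, name := range []string{"my-db", generated} {
		fmt.Printf("%s anonymous=%v\n", name, looksAnonymous(name))
	}
}
```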

4. Code Example: Cleaning Up

How to programmatically find and remove unused volumes.

Go

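package main

import (
    "context"
    "fmt"

    "github.com/docker/docker/api/types/filters"
    "github.com/docker/docker/client"
)

func main() {
    ctx := context.Background()

    // Read DOCKER_HOST and related settings from the environment, and
    // negotiate the API version so an older daemon does not reject the call.
    cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
    if err != nil {
        panic(err)
    }
    defer cli.Close()

    // Prune volumes that no container references.
    // Equivalent to: docker volume prune -f
    // On API 1.42 and later this removes only anonymous volumes; add the
    // filter all=true to include named ones.
    report, err := cli.VolumesPrune(ctx, filters.NewArgs())
    if err != nil {
        panic(err)
    }

    fmt.Println("Deleted Volumes:")
    for _, v := range report.VolumesDeleted {
        fmt.Println(v)
    }
    fmt.Printf("Space Reclaimed: %d bytes\n", report.SpaceReclaimed)
}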
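Pruning deletes without asking, so it can be worth listing the candidates first. The sketch below is one possible dry run: it shells out to docker volume ls -q --filter dangling=true and prints each unused volume name. It uses the CLI rather than the SDK to stay dependency-free, and parseVolumeNames is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// parseVolumeNames splits the output of `docker volume ls -q` into volume
// names, one per line, dropping blank lines.
func parseVolumeNames(out string) []string {
	var names []string
	for _, line := range strings.Split(out, "\n") {
		if name := strings.TrimSpace(line); name != "" {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	// dangling=true selects volumes not referenced by any container.
	out, err := exec.Command("docker", "volume", "ls", "-q", "--filter", "dangling=true").Output()
	if err != nil {
		fmt.Println("could not list volumes:", err)
		return
	}
	for _, name := range parseVolumeNames(string(out)) {
		fmt.Println("unused:", name)
	}
}
```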

Java

import com.github.dockerjava.api.DockerClient;
import com.github.dockerjava.core.DockerClientBuilder;
import com.github.dockerjava.api.model.PruneResponse;
import com.github.dockerjava.api.model.PruneType;

public class VolumeCleaner {
    public static void main(String[] args) {
        DockerClient dockerClient = DockerClientBuilder.getInstance().build();

        // Prune unused volumes
        PruneResponse response = dockerClient.pruneCmd(PruneType.VOLUMES).exec();

        System.out.println("Space Reclaimed: " + response.getSpaceReclaimed());
    }
}