Testing Your Observability
“If a tree falls in a forest and no one traces it, did it make a sound?”
You write unit tests for your business logic. You write integration tests for your database queries. But do you test your Observability?
Imagine this scenario: Your payment service crashes in production. You rush to your dashboard, expecting to see the root cause. Instead, you see… nothing. The trace is broken. The error=true attribute wasn’t set. The critical customer_id tag is missing. You are flying blind.
In this module, we treat Instrumentation as Code. If it’s worth adding to your codebase, it’s worth testing. We will explore how to verify your traces, metrics, and logs using the OpenTelemetry SDKs for Java and Go.
1. The Observability Test Pyramid
Just like application testing, observability testing follows a pyramid: a broad base of fast unit tests that assert on spans captured by an in-memory exporter, a middle layer that verifies context propagation across service boundaries, and a thin top of integration tests that exercise a real Collector and backend. The sections below work up from the base.
2. Interactive: Trace Assertion Builder
Before we dive into code, let’s practice the mental model of testing traces. Below is a simulated trace generated by a Checkout Service. Your job is to verify its correctness.
Mock Trace Data
```json
{
  "name": "checkout",
  "kind": "SERVER",
  "status": { "code": "ERROR" },
  "attributes": {
    "http.method": "POST",
    "http.route": "/checkout",
    "user.id": "u-12345",
    "cart.total": 99.99
  },
  "events": [
    { "name": "exception", "attributes": { "exception.type": "PaymentFailed" } }
  ]
}
```
> [!NOTE]
> In a real test, you don’t manually check boxes. You write code that fails the build if these conditions aren’t met.
3. Unit Testing in Java
For Java, OpenTelemetry provides the `opentelemetry-sdk-testing` artifact. This includes `InMemorySpanExporter`, which stores finished spans in a list for your test to inspect.
1. Dependencies
Add these to your pom.xml (Maven) or build.gradle (Gradle):
```xml
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-sdk-testing</artifactId>
    <version>1.34.1</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter-api</artifactId>
    <version>5.10.1</version>
    <scope>test</scope>
</dependency>
```
2. The OpenTelemetryExtension
This JUnit 5 extension handles the boilerplate of setting up a TracerProvider wired to an in-memory span exporter.
```java
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.sdk.testing.junit5.OpenTelemetryExtension;
import io.opentelemetry.sdk.trace.data.SpanData;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.RegisterExtension;

import java.util.List;

import static org.junit.jupiter.api.Assertions.*;

class PaymentServiceTest {

    // 1. Register the extension
    @RegisterExtension
    static final OpenTelemetryExtension otelTesting = OpenTelemetryExtension.create();

    // 2. Inject the Tracer into your service
    private final PaymentService service =
        new PaymentService(otelTesting.getOpenTelemetry().getTracer("test-tracer"));

    @Test
    void processPayment_Success_ShouldRecordSpans() {
        // 3. Run business logic
        service.processPayment("user-123", 99.99);

        // 4. Retrieve spans
        List<SpanData> spans = otelTesting.getSpans();
        assertEquals(1, spans.size(), "Should produce exactly one span");
        SpanData span = spans.get(0);

        // 5. Assertions
        assertEquals("process-payment", span.getName());
        assertEquals(StatusCode.UNSET, span.getStatus().getStatusCode(),
            "Status should be UNSET on success");

        // Check attributes
        assertEquals("user-123",
            span.getAttributes().get(AttributeKey.stringKey("payment.user_id")));
        assertEquals(99.99,
            span.getAttributes().get(AttributeKey.doubleKey("payment.amount")));
    }

    @Test
    void processPayment_Failure_ShouldRecordError() {
        assertThrows(RuntimeException.class, () -> service.processPayment(null, 0));

        List<SpanData> spans = otelTesting.getSpans();
        SpanData span = spans.get(0);

        // Verify error status
        assertEquals(StatusCode.ERROR, span.getStatus().getStatusCode());

        // Verify exception event
        boolean hasException = span.getEvents().stream()
            .anyMatch(e -> e.getName().equals("exception"));
        assertTrue(hasException, "Should record exception event");
    }
}
```
> [!TIP]
> **Why `StatusCode.UNSET`?** In OpenTelemetry, a successful span typically has a status of `UNSET`. You only explicitly set `OK` if you want to override a previous error. `ERROR` is used for failures.
4. Unit Testing in Go
Go’s testing ecosystem is simpler but equally powerful. We use the tracetest package from the SDK.
1. The InMemoryExporter
```go
package payment_test

import (
	"context"
	"testing"

	"go.opentelemetry.io/otel/codes"
	"go.opentelemetry.io/otel/sdk/trace"
	"go.opentelemetry.io/otel/sdk/trace/tracetest"

	"github.com/stretchr/testify/assert"
)

func TestProcessPayment_Success(t *testing.T) {
	// 1. Set up the InMemoryExporter
	exporter := tracetest.NewInMemoryExporter()

	// 2. Create a TracerProvider with a synchronous exporter,
	// so spans are available as soon as they end
	tp := trace.NewTracerProvider(
		trace.WithSyncer(exporter),
	)

	// 3. Create the service with the tracer
	tracer := tp.Tracer("test-tracer")
	service := NewPaymentService(tracer)

	// 4. Run logic
	err := service.ProcessPayment(context.Background(), "user-123", 99.99)
	assert.NoError(t, err)

	// 5. Get spans
	spans := exporter.GetSpans()

	// 6. Assertions
	if assert.Len(t, spans, 1) {
		span := spans[0]
		assert.Equal(t, "process-payment", span.Name)
		assert.Equal(t, codes.Unset, span.Status.Code)

		// Check attributes (a slice of KeyValue, so scan for the key)
		var userID string
		for _, kv := range span.Attributes {
			if kv.Key == "payment.user_id" {
				userID = kv.Value.AsString()
			}
		}
		assert.Equal(t, "user-123", userID)
	}
}

func TestProcessPayment_Error(t *testing.T) {
	exporter := tracetest.NewInMemoryExporter()
	tp := trace.NewTracerProvider(trace.WithSyncer(exporter))
	service := NewPaymentService(tp.Tracer("test-tracer"))

	// Force error
	err := service.ProcessPayment(context.Background(), "", 0)
	assert.Error(t, err)

	spans := exporter.GetSpans()
	if assert.Len(t, spans, 1) {
		span := spans[0]
		assert.Equal(t, codes.Error, span.Status.Code)
		assert.Equal(t, "invalid payment details", span.Status.Description)
	}
}
```
5. Testing Context Propagation
Testing a single span is easy. But what about verifying that your service passes the torch? If Service A calls Service B, it must inject the Trace Context into the HTTP headers.
Conceptual Verification
You don’t need a real second service. You just need to mock the HTTP client and inspect the headers it would have sent.
```java
@Test
void shouldPropagateContext() {
    // 1. Start a parent span
    Span parent = tracer.spanBuilder("parent").startSpan();
    try (Scope scope = parent.makeCurrent()) {
        // 2. Call the method that makes an HTTP request
        // (here we mock the HttpClient to capture the request)
        myServiceClient.callDownstream();

        // 3. Verify the mock received headers
        HttpRequest request = mockHttpClient.getLastRequest();
        String traceParent = request.getHeader("traceparent");

        // 4. Assert the header exists and contains the trace ID
        assertNotNull(traceParent);
        assertTrue(traceParent.contains(parent.getSpanContext().getTraceId()));
    } finally {
        parent.end();
    }
}
```
> [!IMPORTANT]
> **Baggage Handling:** If you use Baggage (e.g., `request-id` or `tenant-id`), verify that it is also propagated in the `baggage` header. Losing baggage breaks distributed context.
6. Integration Testing with Testcontainers
For higher-level tests, you might want to spin up a real OpenTelemetry Collector and a backend like Jaeger to verify the full pipeline. This is where Testcontainers shines.
- **Start Jaeger:** Run `jaegertracing/all-in-one` in a container.
- **Configure App:** Point your app’s OTLP exporter to the container’s port.
- **Run Test:** Execute a real flow (e.g., `POST /checkout`).
- **Verify:** Query the Jaeger API to confirm the trace exists and has the expected structure.
This is overkill for every test but excellent for a “smoke test” to ensure your exporter configuration (gRPC/HTTP, ports, encryption) is correct.
7. Common Pitfalls
- **Forgetting `span.end()`:** If you don’t end the span, it never gets exported. Your tests will show 0 spans.
- **Async Leaks:** If your business logic runs in a separate thread, your test might finish before the span is recorded. Use `Awaitility` or Go’s `Eventually` to wait for spans.
- **Mocking Too Much:** Don’t mock the `Tracer` interface itself. Use the real SDK with the in-memory exporter. Mocking the API leads to brittle tests that don’t reflect reality.
8. Next Steps
Now that you can trust your instrumentation, it’s time to measure performance.