Day 39 of 60 · Production & continuous
Distributed tracing & observability
Past three services, debugging without traces is folklore. With them, you read the call graph like a story.
Problem
Errors in distributed systems are invisible without correlation.
How it works
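Every request gets a trace ID, which is propagated to every service the request touches. Each span (one timed unit of work) emits structured data tagged with that ID. A trace UI assembles the spans into the whole call graph. Tracing pairs with metrics and logs; together they are the three pillars of observability.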
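To make the mechanism concrete, here is a minimal sketch using only the Python standard library. It is not how OpenTelemetry is implemented; it only imitates the idea. The header layout follows the W3C `traceparent` format (`version-traceid-parentid-flags`). The service names, span names, the `span` helper, and the `FINISHED_SPANS` list are all hypothetical, and the "network call" is a plain function call.

```python
import json
import secrets
import time
from contextlib import contextmanager

# Hypothetical in-memory sink; a real setup would export spans to a backend.
FINISHED_SPANS = []


def new_trace_id() -> str:
    """16 random bytes, rendered as 32 hex characters."""
    return secrets.token_hex(16)


def new_span_id() -> str:
    """8 random bytes, rendered as 16 hex characters."""
    return secrets.token_hex(8)


def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Build a W3C-style traceparent header value."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str):
    """Return (trace_id, parent_span_id) from a traceparent header value."""
    _version, trace_id, parent_id, _flags = header.split("-")
    return trace_id, parent_id


@contextmanager
def span(name, service, trace_id, parent_id=None):
    """Time a unit of work and emit it as one structured JSON record."""
    record = {
        "trace_id": trace_id,
        "span_id": new_span_id(),
        "parent_id": parent_id,
        "service": service,
        "name": name,
        "attributes": {},
    }
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000.0
        FINISHED_SPANS.append(record)
        print(json.dumps(record))


def checkout_service(headers):
    """Downstream service: continues the caller's trace instead of starting one."""
    trace_id, parent_id = parse_traceparent(headers["traceparent"])
    with span("charge-card", "checkout", trace_id, parent_id) as s:
        s["attributes"]["order.items"] = 3  # illustrative attribute


def frontend_service():
    """Entry point: creates the trace ID and passes it along in a header."""
    trace_id = new_trace_id()
    with span("GET /cart", "frontend", trace_id) as root:
        headers = {"traceparent": make_traceparent(trace_id, root["span_id"])}
        checkout_service(headers)  # stands in for an HTTP call
    return trace_id


if __name__ == "__main__":
    frontend_service()
```

Both emitted records share one `trace_id`, and the child's `parent_id` equals the root's `span_id`. That pair of fields is what lets a trace UI rebuild the call graph from spans emitted by separate services.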
What it catches
Cross-service latency, dependency cascades, hot spots. Required for debugging anything past three services.
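As a rough illustration of how a trace exposes a hot spot, the sketch below computes each span's self time (its own duration minus the time spent in its direct children) and picks the largest. The span data is invented for the example, and the calculation assumes children run one after another without overlapping; trace backends handle concurrent children more carefully.

```python
from collections import defaultdict

# Hypothetical spans from one trace; durations are made up for illustration.
EXAMPLE_SPANS = [
    {"span_id": "a", "parent_id": None, "service": "frontend",
     "name": "GET /checkout", "duration_ms": 480.0},
    {"span_id": "b", "parent_id": "a", "service": "checkout",
     "name": "place-order", "duration_ms": 450.0},
    {"span_id": "c", "parent_id": "b", "service": "payments",
     "name": "charge-card", "duration_ms": 120.0},
    {"span_id": "d", "parent_id": "b", "service": "inventory",
     "name": "reserve-stock", "duration_ms": 300.0},
]


def self_times(spans):
    """Map span_id -> duration minus the summed duration of direct children.

    Assumes sequential, non-overlapping children.
    """
    child_total = defaultdict(float)
    for s in spans:
        if s["parent_id"] is not None:
            child_total[s["parent_id"]] += s["duration_ms"]
    return {s["span_id"]: s["duration_ms"] - child_total[s["span_id"]]
            for s in spans}


def hot_spot(spans):
    """Return the span with the largest self time."""
    times = self_times(spans)
    return max(spans, key=lambda s: times[s["span_id"]])


if __name__ == "__main__":
    worst = hot_spot(EXAMPLE_SPANS)
    print(worst["service"], worst["name"])  # inventory reserve-stock
```

The frontend span is the slowest end to end at 480 ms, but almost all of that time is spent waiting on downstream calls. Self time points at the inventory service, which is the kind of cross-service attribution that per-service logs alone do not give you.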
Tools
- OpenTelemetry · OSS
- Jaeger · OSS
- Tempo · OSS
- Honeycomb · SaaS
Verdict by project size
| Project size | Verdict |
|---|---|
| Small | Optional |
| Medium | Recommended |
| Large | Must |
| Extra-large | Must |
Cost
| Project size | Setup | Maint / mo | Tool / mo | CI / run |
|---|---|---|---|---|
| Small <10k LOC | 1d | 1h | $0 | n/a |
| Medium 10–100k LOC | 3d | 5h | $200 | n/a |
| Large 100k–1M LOC | 15d | 30h | $3k | n/a |
| Extra-large >1M LOC | 60d | 150h | $20k | n/a |
Setup = engineer-days to first useful run ·
Maint = engineer-hours / month at steady state ·
Tool = out-of-pocket $ / month ·
CI = minutes added (or saved) per pipeline run
Lifecycle & ownership
When in lifecycle
Release · Operate · Observe
Continuous in prod · Always-on, observing real traffic.
Who owns it
SRE / DevOps / Platform
CI/CD, observability, reliability
Collaborates with: Developer
Reference implementations
- OpenTelemetry Demo: realistic microservice demo instrumented with traces, metrics, and logs.
- Jaeger HotROD: distributed tracing demo app with realistic service interactions.
- OpenTelemetry Collector examples: Collector pipelines for exporting traces, metrics, and logs.