Understanding the OpenTelemetry Collector: Part 1 - Architecture Overview
At its core, the OpenTelemetry Collector is a data-processing engine built on a simple pipeline model. Its architecture consists of modular components (receivers, processors, and exporters), each with a specific job, that work together through a shared set of rules to keep telemetry moving smoothly.
1. Pipelines as Isolation Boundaries
At runtime, the Collector constructs independent signal pipelines for traces, metrics, and logs. This isolation is a core reliability feature: each pipeline has its own data path, so a surge in trace volume does not stall the metrics pipeline, and a failure in one signal's pipeline stays contained instead of spreading to the others.
This modularity allows the Collector to remain stable even when one specific signal type is under extreme stress.
2. The Blueprint: Wiring the Pipeline
The YAML configuration serves as a blueprint for a connection chain: it tells the Collector how to wire its internal components together.
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter]
      exporters: [debug]
The runtime resolves this graph by building and wiring in reverse. It starts with the Exporter, connects it to the Processor, and finally hooks both to the Receiver. This bottom-up wiring ensures that by the time the Receiver starts accepting data, every downstream component is already standing by to catch it.
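The reverse wiring described above can be sketched in a few lines of Go. This is a simplified illustration, not the Collector's actual graph builder: the `consumer` interface, the `stage` type, and the `build` function are hypothetical names, and a plain string stands in for real telemetry.

```go
package main

import "fmt"

// consumer is a hypothetical stand-in for the Collector's consumer interfaces.
type consumer interface {
	consume(batch string) error
}

// stage is one link in the chain. It records what it saw and forwards
// the batch to the next stage, if there is one.
type stage struct {
	name string
	next consumer // nil for the last stage (the exporter)
	seen *[]string
}

func (s *stage) consume(batch string) error {
	*s.seen = append(*s.seen, s.name+" got "+batch)
	if s.next == nil {
		return nil
	}
	return s.next.consume(batch)
}

// build wires the chain back to front: a stage is created only after the
// stage it forwards to already exists. It returns the head of the chain
// (the receiver side) and the order in which stages were created.
func build(names []string, seen *[]string) (head consumer, order []string) {
	var next consumer
	for i := len(names) - 1; i >= 0; i-- {
		next = &stage{name: names[i], next: next, seen: seen}
		order = append(order, names[i])
	}
	return next, order
}

func main() {
	var seen []string
	head, order := build([]string{"otlp", "memory_limiter", "debug"}, &seen)
	fmt.Println("build order:", order)
	if err := head.consume("span-1"); err != nil {
		fmt.Println("error:", err)
	}
	for _, line := range seen {
		fmt.Println(line)
	}
}
```

The stages are created exporter-first, yet data still flows receiver-first, because each stage only ever holds a reference to the stage after it.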
3. The Backbone: Interface Standardization
To make this reverse wiring work, the Collector relies on fixed Go interfaces. Regardless of the vendor, every component speaks the same language.
Data Handoff (pdata)
Components don't exchange raw bytes. Instead, they use pdata (Pipeline Data), a standardized internal model. For example, the Traces consumer interface, with its ConsumeTraces method, is the handshake between any two components in a traces pipeline:
Go
// Traces is an interface that receives ptrace.Traces, processes it
// as needed, and sends it to the next processing node if any or to the destination.
type Traces interface {
	internal.BaseConsumer
	ConsumeTraces(ctx context.Context, td ptrace.Traces) error
}
The ptrace.Traces type comes from the pdata layer and represents the structured trace data model.
Lifecycle Management
Every component must also cooperate with the service's runtime through a standard Component interface. This gives each component a defined point to stop its background work, so that when the service exits, no goroutines are left leaking in the background.
Go
type Component interface {
	Start(ctx context.Context, host Host) error
	Shutdown(ctx context.Context) error
}
4. Decentralized Concurrency and Backpressure
The Collector doesn't have a central event loop. Instead, concurrency is decentralized: each component manages its own goroutines, and the handoffs between them are generally synchronous calls.
Because the calls are synchronous, backpressure flows upstream naturally.
If an exporter is slow, the ConsumeTraces call simply doesn't return until the exporter is done. This blocks the processor, which blocks the receiver, which eventually forces the source application to back off. Processors like the Memory Limiter act as circuit breakers in this chain: they reject incoming data early, returning an error upstream, to prevent the process from crashing due to memory growth.
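Both behaviors can be sketched with plain Go. In this simplified illustration, the `limiter` and `slowExporter` types are hypothetical stand-ins: the real Memory Limiter checks actual process memory, whereas here the usage reading is supplied by a function, and the slow backend is simulated with a channel.

```go
package main

import (
	"errors"
	"fmt"
)

// consumer is a hypothetical stand-in for the Collector's consumer interfaces.
type consumer interface {
	consume(batch []byte) error
}

var errOverLimit = errors.New("memory limit exceeded, data refused")

// limiter is a simplified stand-in for the Memory Limiter: it refuses
// batches while the reported usage is above the limit.
type limiter struct {
	next  consumer
	usage func() int // assumption: supplied by the caller for this sketch
	limit int
}

func (l limiter) consume(b []byte) error {
	if l.usage() > l.limit {
		return errOverLimit // early reject, nothing is passed downstream
	}
	return l.next.consume(b)
}

// slowExporter blocks every call until ready is closed, simulating a
// backend that is not keeping up.
type slowExporter struct{ ready chan struct{} }

func (e slowExporter) consume(_ []byte) error {
	<-e.ready
	return nil
}

func main() {
	exp := slowExporter{ready: make(chan struct{})}
	used := 0
	lim := limiter{next: exp, usage: func() int { return used }, limit: 100}

	// The goroutine plays the role of a receiver handing off a batch.
	result := make(chan error, 1)
	go func() { result <- lim.consume([]byte("batch-1")) }()

	select {
	case <-result:
		fmt.Println("unexpected: call returned early")
	default:
		fmt.Println("receiver is still waiting on the exporter")
	}

	close(exp.ready) // the backend catches up
	fmt.Println("after exporter catches up:", <-result)

	used = 150 // simulate memory growth past the limit
	fmt.Println("over the limit:", lim.consume([]byte("batch-2")))
}
```

No queue or signaling mechanism is needed for the first case: the receiver waits simply because its function call has not returned yet.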
5. Lifecycle Coordination
Because the components depend on one another, the order in which we start and stop the system is vital for data integrity.
Startup: Bottom-Up
We start from the destination and work back to the source to ensure the pipes are connected before the water starts flowing.
Shutdown: Top-Down
We stop the ingress first, allowing the remaining data to be processed and flushed safely.
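The two orderings can be expressed as a pair of loops over the same pipeline. This is only a sketch of the ordering rule under simplified assumptions: the `component` type, `startAll`, and `shutdownAll` are hypothetical, and the components do nothing except record the order in which they were called.

```go
package main

import "fmt"

// component is a hypothetical stand-in that only records lifecycle events.
type component struct {
	name   string
	events *[]string
}

func (c component) start()    { *c.events = append(*c.events, "start "+c.name) }
func (c component) shutdown() { *c.events = append(*c.events, "stop "+c.name) }

// startAll walks the pipeline from the exporter back to the receiver, so
// every consumer is running before anything can send to it.
func startAll(pipeline []component) {
	for i := len(pipeline) - 1; i >= 0; i-- {
		pipeline[i].start()
	}
}

// shutdownAll walks the pipeline from the receiver forward to the
// exporter, so ingress stops first and downstream stages can drain.
func shutdownAll(pipeline []component) {
	for _, c := range pipeline {
		c.shutdown()
	}
}

func main() {
	var events []string
	// Listed in data-flow order: receiver, processor, exporter.
	pipeline := []component{
		{name: "otlp", events: &events},
		{name: "memory_limiter", events: &events},
		{name: "debug", events: &events},
	}

	startAll(pipeline)
	shutdownAll(pipeline)

	for _, e := range events {
		fmt.Println(e)
	}
}
```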
6. Final Reflections
By keeping the call stack intact and synchronous from Receiver to Exporter, the Collector ensures every batch of telemetry is accounted for until it leaves the process. This creates a pressure-sensitive and performant pipeline that prioritizes guaranteed delivery over blind throughput. Understanding these mechanics is the key to predicting how an OpenTelemetry deployment handles the weight of production traffic.
Now that we have established the high-level workings of the pipeline, we can begin to zoom in.
In the next part of this series, I’ll go in depth on the specific components (the Receivers, Processors, and Exporters) to see how each one implements these rules in practice.
