# Observability Guide
Enzyme has built-in OpenTelemetry (OTel) instrumentation for traces and metrics. When enabled, telemetry data is exported via OTLP to any compatible backend (Datadog, Honeycomb, HyperDX, Grafana Cloud, etc.). Telemetry is disabled by default and adds zero overhead when off.
This guide covers what Enzyme captures, how to set it up, and how to add system-level metrics with an OTel Collector.
## Quick Start
Enable telemetry with a single environment variable:
```bash
ENZYME_TELEMETRY_ENABLED=true ./enzyme
```
Or in `config.yaml`:

```yaml
telemetry:
  enabled: true
  endpoint: 'localhost:4317'
  protocol: 'grpc'
  sample_rate: 1.0
  service_name: 'enzyme'
```
Enzyme pushes data to the configured OTLP endpoint. You need a collector or backend listening there to receive it. See Setup Examples below.
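If you just want to see data locally before wiring up a hosted backend, one option is a Jaeger all-in-one container, which accepts OTLP gRPC on port 4317 (the image name and ports below are Jaeger's, not part of Enzyme):

```bash
# Local Jaeger as a quick OTLP backend: UI on 16686, OTLP gRPC on 4317.
docker run --rm \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4317:4317 \
  jaegertracing/all-in-one:latest
```

Start Enzyme with `ENZYME_TELEMETRY_ENABLED=true` and traces show up in the Jaeger UI at `http://localhost:16686`.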
## Configuration
| Key | Env Var | Default | Description |
|---|---|---|---|
| `telemetry.enabled` | `ENZYME_TELEMETRY_ENABLED` | `false` | Enable OpenTelemetry instrumentation. |
| `telemetry.endpoint` | `ENZYME_TELEMETRY_ENDPOINT` | `localhost:4317` | OTLP collector endpoint (`host:port`). |
| `telemetry.protocol` | `ENZYME_TELEMETRY_PROTOCOL` | `grpc` | Export protocol: `grpc` (port 4317) or `http` (port 4318). |
| `telemetry.insecure` | `ENZYME_TELEMETRY_INSECURE` | `true` | Use plaintext (no TLS) for OTLP export. |
| `telemetry.sample_rate` | `ENZYME_TELEMETRY_SAMPLE_RATE` | `1.0` | Trace sampling rate. `1.0` = all, `0.1` = 10%, `0` = none. |
| `telemetry.service_name` | `ENZYME_TELEMETRY_SERVICE_NAME` | `enzyme` | Service name reported to the collector. |
| `telemetry.headers` | — | — | Map of headers sent with every OTLP export request (e.g., API keys). |
| `telemetry.traces` | `ENZYME_TELEMETRY_TRACES` | `true` | Export traces. Disable to keep only metrics and/or logs. |
| `telemetry.metrics` | `ENZYME_TELEMETRY_METRICS` | `true` | Export metrics. Disable to keep only traces and/or logs. |
| `telemetry.logs` | `ENZYME_TELEMETRY_LOGS` | `true` | Export logs via OTLP. Disable to keep only traces and/or metrics. |
See the full Configuration Reference for details.
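Every key with an env var in the table can be set either way; for example, the shell equivalent of the Quick Start YAML:

```bash
# Environment-variable equivalent of the Quick Start config.yaml example.
ENZYME_TELEMETRY_ENABLED=true \
ENZYME_TELEMETRY_ENDPOINT=localhost:4317 \
ENZYME_TELEMETRY_PROTOCOL=grpc \
ENZYME_TELEMETRY_SAMPLE_RATE=1.0 \
ENZYME_TELEMETRY_SERVICE_NAME=enzyme \
./enzyme
```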
## What's Captured

### Traces
Traces show the lifecycle of a request from start to finish. Each trace is a tree of spans representing operations.
#### HTTP Request Spans
Every incoming HTTP request creates a root span with:
- Span name: `METHOD /route/pattern` (e.g., `GET /api/workspaces/{wid}/channels`)
- Attributes: HTTP method, status code, route pattern, response size, user agent
- Duration: Total request processing time

The span name uses the route pattern (not the raw URL), so `/api/workspaces/01J5X.../channels` appears as `/api/workspaces/{wid}/channels`. This keeps span cardinality manageable.
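For illustration, such a span might render roughly like this in a trace viewer (the attribute names below follow OTel HTTP semantic conventions; Enzyme's exact attribute keys may differ):

```text
GET /api/workspaces/{wid}/channels              12ms
  http.request.method:       GET
  http.route:                /api/workspaces/{wid}/channels
  http.response.status_code: 200
```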
#### Database Spans
Key database operations create child spans under the HTTP request span. These help identify slow queries:
| Span Name | Operation |
|---|---|
| `message.Create` | Insert a new message (transaction) |
| `message.List` | List messages in a channel (paginated) |
| `message.Search` | Full-text search across workspace messages |
| `channel.GetByID` | Fetch a single channel |
| `channel.ListForWorkspace` | List all channels with membership info (JOIN) |
| `workspace.GetMembership` | Check a user's membership and role in a workspace |
All database spans include the `db.system: sqlite` attribute.
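In a trace view, these appear as children of the HTTP root span, making it easy to see where request time goes; for example (timings invented for illustration):

```text
GET /api/workspaces/{wid}/channels            48ms
└─ channel.ListForWorkspace                   41ms   db.system=sqlite
```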
#### Trace Context Propagation
When frontend telemetry is enabled, the browser injects a W3C `traceparent` header into API requests. The backend extracts this header and creates child spans under the frontend's trace, giving you end-to-end visibility from button click to database query.

The CORS configuration automatically allows the `traceparent` and `tracestate` headers when telemetry is enabled.
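The `traceparent` value follows the W3C Trace Context format (version, trace ID, parent span ID, and trace flags):

```text
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
```

The trailing `01` flag marks the trace as sampled; parent-based sampling (see Sampling below) honors this bit.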
### Metrics
Metrics are exported every 60 seconds via OTLP.
| Metric | Type | Attributes | Description |
|---|---|---|---|
| `sse.connections.active` | UpDownCounter | — | Current number of active SSE connections |
| `sse.events.broadcast` | Counter | `scope` | Total SSE events broadcast |
`sse.events.broadcast` attributes:

- `scope`: `workspace` (broadcast to all members), `channel` (broadcast to channel members only), or `user` (targeted to a single user)
### Log Correlation
When telemetry is enabled, every log line is enriched with `trace_id` and `span_id` fields from the active request context. This lets you jump from a log entry directly to the corresponding trace in your observability backend.
Example log output with telemetry enabled (JSON format):

```json
{
  "time": "2026-02-27T10:15:30Z",
  "level": "INFO",
  "msg": "request completed",
  "method": "POST",
  "path": "/api/workspaces/01J5X/channels/01J6Y/messages",
  "status": 200,
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "span_id": "00f067aa0ba902b7"
}
```
The text log format includes the same fields:

```text
time=2026-02-27T10:15:30Z level=INFO msg="request completed" method=POST path=... trace_id=4bf92f3577b34da6a3ce929d0e0e4736 span_id=00f067aa0ba902b7
```
### Resource Attributes
Every trace and metric is tagged with resource attributes that identify the Enzyme instance:
| Attribute | Value |
|---|---|
| `service.name` | Configured `service_name` (default: `enzyme`) |
| `service.version` | Server version (from build, e.g., `0.1.0`) |
| `host.name` | Hostname of the machine |
| `os.type` | Operating system (e.g., `linux`, `darwin`) |
## Frontend Telemetry
The web client has separate telemetry for browser-side fetch calls and page loads, with trace context propagation to the backend.
### Enabling
Frontend telemetry activates automatically when the Go server serves the embedded web client with `telemetry.enabled` and `telemetry.traces` both set to `true` (the defaults when telemetry is enabled). No build-time configuration or environment variables are needed.
The server injects a `window.__ENZYME_CONFIG__` script into `index.html` at runtime, telling the frontend to load the OTel SDK. The Go server also proxies `/api/telemetry/traces` to the configured OTLP collector, so no reverse proxy setup is needed. Self-hosters who download the pre-built binary get frontend telemetry without rebuilding the web client.
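Conceptually, the injected snippet just sets a global before the app bundle loads; a minimal sketch, assuming hypothetical field names (the actual keys are internal to Enzyme):

```js
// Hypothetical sketch of the runtime config injected into index.html.
// Field names here are illustrative, not Enzyme's actual keys.
window.__ENZYME_CONFIG__ = {
  otelEnabled: true,                      // mirrors telemetry.enabled && telemetry.traces
  otelEndpoint: "/api/telemetry/traces"   // proxied by the Go server to the collector
};
```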
For development without the Go server (e.g., `pnpm --filter @enzyme/web dev` against a separate API), use the build-time env vars as a fallback:
```bash
VITE_OTEL_ENABLED=true VITE_OTEL_ENDPOINT=/api/telemetry/traces pnpm --filter @enzyme/web dev
```
| Env Var | Default | Description |
|---|---|---|
| `VITE_OTEL_ENABLED` | not set | Dev-only fallback. Set to `true` to enable telemetry. |
| `VITE_OTEL_ENDPOINT` | `/api/telemetry/traces` | Dev-only fallback. OTLP HTTP endpoint for trace export. |
### What's Instrumented
- Fetch calls: Every `fetch()` request creates a span with URL, method, status, and duration. Since the `@enzyme/api-client` package uses native `fetch`, all API calls are automatically covered.
- Page loads: Document load timing (DNS, TCP, TLS, TTFB, DOM content loaded, load complete).
- Unhandled errors: Uncaught exceptions (`error.unhandled`) and unhandled promise rejections (`error.unhandled_rejection`) are captured as error spans with `exception.type`, `exception.message`, and `exception.stacktrace` attributes.
- Trace propagation: The W3C `traceparent` header is injected into requests to the same origin and to the configured API base URL. Third-party requests are not affected.
### How It Works
Frontend telemetry is lazy-loaded: the OTel SDK is only imported when the runtime config or `VITE_OTEL_ENABLED` enables it. The telemetry chunk is included in the production bundle but only fetched by the browser when telemetry is active.
The frontend reports as service name `enzyme-web`, separate from the backend's `enzyme` service. In your observability backend, you'll see traces that start in `enzyme-web` and continue into `enzyme` via the propagated trace context.
### Collector Routing
The frontend exports traces over OTLP/HTTP (not gRPC, since browsers can't speak gRPC) to `/api/telemetry/traces`. The Go server includes a built-in reverse proxy that forwards these requests to the collector's OTLP/HTTP receiver:
- HTTP protocol (`telemetry.protocol: http`): forwards to the configured `telemetry.endpoint`
- gRPC on port 4317 (local collector default): swaps to port `4318` (the standard OTLP/HTTP port)
- gRPC on any other port (cloud providers like Honeycomb, Grafana Cloud): uses the same endpoint, since these providers serve both gRPC and HTTP on the same port
To override the auto-derived endpoint, set `telemetry.frontend_endpoint`:
```yaml
telemetry:
  enabled: true
  endpoint: api.honeycomb.io:443
  protocol: grpc
  frontend_endpoint: collector.internal:4318 # explicit override
```
The proxy also forwards any `telemetry.headers` to the collector for authentication (e.g., Honeycomb API keys).
## Sampling
For high-traffic deployments, trace sampling reduces data volume without losing visibility. The `sample_rate` setting controls what fraction of traces are captured:
| Value | Effect |
|---|---|
| `1.0` | Sample everything (default). Good for development and low traffic. |
| `0.5` | Sample 50% of traces. A good balance for moderate traffic. |
| `0.1` | Sample 10%. Suitable for high-traffic production. |
| `0.0` | Disable tracing entirely. Metrics are still exported. |
Sampling is parent-based: if an incoming request already has a `traceparent` header with a sampling decision, Enzyme respects it. This ensures that frontend-initiated traces are complete even at low sample rates.
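A typical high-traffic production config keeps the backend rate low and relies on parent-based sampling for propagated traces:

```yaml
telemetry:
  enabled: true
  sample_rate: 0.1  # sample 10% of new traces; incoming traceparent decisions are honored as-is
```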
## Graceful Shutdown
When Enzyme receives `SIGINT` or `SIGTERM`, it flushes all pending traces and metrics to the collector before shutting down. The flush happens within the 30-second shutdown timeout. No data is lost during normal restarts or deployments.
## Performance Impact
When telemetry is disabled (the default), there is zero overhead — no providers are initialized, no metrics are recorded, and no spans are created.
When enabled, the overhead is minimal:
- HTTP middleware adds ~1-2 microseconds per request for span creation
- Database spans add ~0.5 microseconds per instrumented operation
- SSE metrics use atomic counter operations (negligible cost)
- Traces are batched and exported asynchronously in the background
- Metrics are collected and exported every 60 seconds
For most deployments, telemetry overhead is unmeasurable relative to actual request processing time.
## Setup Examples
Enzyme exports standard OTLP, so it works with any backend that accepts OTLP. Below are examples for common platforms.
### Datadog
The Datadog Agent runs alongside Enzyme and accepts OTLP locally, then forwards to Datadog. No API key header is needed from Enzyme — the agent holds the key.
Enable the OTLP receiver in your Datadog Agent config:
```yaml
# datadog.yaml
otlp_config:
  receiver:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
```
Then point Enzyme at the agent:
```yaml
telemetry:
  enabled: true
  endpoint: 'localhost:4317'
```
If the agent runs on a different host (e.g., a sidecar in Kubernetes), replace `localhost` with the agent's address.
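For example, with the agent deployed as a Kubernetes service (the address below is a hypothetical in-cluster DNS name):

```yaml
telemetry:
  enabled: true
  endpoint: 'datadog-agent.monitoring.svc.cluster.local:4317'  # hypothetical agent Service address
```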
### Honeycomb
Honeycomb accepts OTLP directly, with no collector or agent needed. Authentication uses the `x-honeycomb-team` header with your API key.
```yaml
telemetry:
  enabled: true
  endpoint: 'api.honeycomb.io:443'
  protocol: 'grpc'
  insecure: false
  headers:
    x-honeycomb-team: 'YOUR_API_KEY'
```
For the EU region, use `api.eu1.honeycomb.io:443` instead.
### HyperDX
HyperDX accepts OTLP directly. Authentication uses the `authorization` header with your ingestion API key (found under Team Settings in HyperDX).
```yaml
telemetry:
  enabled: true
  endpoint: 'in-otel.hyperdx.io:4317'
  protocol: 'grpc'
  insecure: false
  headers:
    authorization: 'YOUR_HYPERDX_API_KEY'
```
### Other Backends
Any OTLP-compatible backend (Axiom, Grafana Cloud, New Relic, etc.) works the same way: set `telemetry.endpoint` to their OTLP receiver, `telemetry.protocol` to `grpc` or `http`, and pass any required auth headers via `telemetry.headers`.
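A generic template looks like this (the endpoint and header below are placeholders; substitute the values from your provider's OTLP documentation):

```yaml
telemetry:
  enabled: true
  endpoint: 'otlp.example-backend.com:443'  # placeholder: your provider's OTLP receiver
  protocol: 'grpc'
  insecure: false  # hosted backends use TLS
  headers:
    authorization: 'YOUR_API_KEY'  # placeholder: whatever auth header your provider requires
```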
## Adding System Metrics with a Collector
The examples above (except Datadog) send OTLP directly from Enzyme to the backend. This covers Enzyme's own traces and metrics, but not system-level metrics like CPU, memory, and disk usage.
To get system metrics, run an OpenTelemetry Collector alongside Enzyme with the Host Metrics receiver enabled. Point Enzyme at the collector instead of the backend, and have the collector forward everything upstream.
```yaml
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu:
      memory:
      disk:
      network:

exporters:
  otlphttp:
    endpoint: https://api.honeycomb.io
    headers:
      x-honeycomb-team: 'YOUR_API_KEY'

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp]
    metrics:
      receivers: [otlp, hostmetrics]
      exporters: [otlphttp]
```
Then point Enzyme at the collector:
```yaml
telemetry:
  enabled: true
  endpoint: 'localhost:4317'
```
The Datadog Agent already works this way — it acts as a collector that gathers both OTLP data and system metrics.
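One way to run that collector is the contrib Docker image, which bundles the Host Metrics receiver (the mount path below matches the contrib image's default config location):

```bash
# Run the OpenTelemetry Collector contrib build with the config above.
docker run --rm \
  -p 4317:4317 \
  -v "$(pwd)/otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml" \
  otel/opentelemetry-collector-contrib:latest
```

Keep in mind that a containerized collector reports the container's view of CPU, memory, and disk; for true host-level metrics, run the collector directly on the host, or follow the hostmetrics receiver's documentation for mounting the host filesystem.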