Debug Setup

How to debug services with OpenTelemetry traces in HyperDX

👤 Sai Tharun

HyperDX debugging should start from a trace, not from unrelated logs. The goal is to follow one request from entrypoint to dependency calls and understand what happened without guessing.

What Was Missing Before

| Previous Gap | Debugging Impact |
| --- | --- |
| No reliable request timeline | Engineers could not see the exact order of API, auth, database, cache, and search work |
| No child spans for dependencies | Slow requests were hard to split between application code and external systems |
| Weak log correlation | Logs existed, but matching them to one request required manual timestamp matching |
| Missing domain context | A trace or log did not always explain which lab, route, query shape, or scope was involved |
| No consistent trace ID handoff | Frontend, backend, logs, and dashboards were not always connected by one identifier |

Required Request Flow

Every instrumented service should make this path possible:

| Step | Where | Requirement |
| --- | --- | --- |
| 1 | Client or caller | Capture the request that failed or was slow |
| 2 | Response headers or logs | Find the request trace ID, preferably X-Trace-Id for HTTP APIs |
| 3 | HyperDX | Search the trace ID |
| 4 | Trace timeline | Inspect the root route span and child spans |
| 5 | Dependency span | Check Redis, MySQL, Elasticsearch, HTTP, queue, or other external calls |
| 6 | Domain attributes | Confirm the business context that shaped the behavior |
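
Step 2 can be sketched as a small helper that pulls the trace ID out of a captured response. This is a minimal sketch, not the services' actual code. It assumes X-Trace-Id carries the raw 32-character hex trace ID, and it assumes some services may echo a W3C `traceparent` header on responses as a fallback. The function name `extract_trace_id` is hypothetical.

```python
import re
from typing import Mapping, Optional

# W3C Trace Context layout: version-traceid-spanid-flags, all lowercase hex.
_TRACEPARENT = re.compile(r"^[0-9a-f]{2}-([0-9a-f]{32})-[0-9a-f]{16}-[0-9a-f]{2}$")
_TRACE_ID = re.compile(r"^[0-9a-f]{32}$")
_INVALID_TRACE_ID = "0" * 32  # an all-zero trace ID is invalid per the W3C spec


def extract_trace_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the 32-hex-char trace ID from response headers, or None."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value.strip().lower() for name, value in headers.items()}

    # Assumption: X-Trace-Id holds the bare trace ID with no prefix or suffix.
    explicit = lowered.get("x-trace-id", "")
    if _TRACE_ID.match(explicit) and explicit != _INVALID_TRACE_ID:
        return explicit

    # Assumption: fall back to traceparent if the service echoes it.
    match = _TRACEPARENT.match(lowered.get("traceparent", ""))
    if match and match.group(1) != _INVALID_TRACE_ID:
        return match.group(1)
    return None
```

The returned value is what step 3 pastes into the HyperDX search. Returning None rather than a partial match keeps a malformed header from sending the investigation to the wrong trace.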

What To Inspect In HyperDX

| Signal | What It Answers |
| --- | --- |
| Root span duration | How long the API request or job took end to end |
| Child span duration | Which dependency or internal block consumed time |
| Span status | Whether the failure is attached to a specific span |
| Error events | Exception type, message, and where the error was recorded |
| Trace ID | Shared correlation key for frontend, logs, traces, and backend investigation |
| Service name | Which service emitted the span |
| Route or operation name | Which endpoint, worker job, or operation ran |
| Span attributes | Domain-specific context needed to explain the behavior |

Debugging Common Issues

| Symptom | First Check | Follow-up |
| --- | --- | --- |
| Slow API request | Compare root route span duration with dependency span durations | If dependencies are fast, inspect auth/session work, response shaping, serialization, and runtime overhead |
| Empty or wrong search result | Inspect domain attributes and search dependency spans | Confirm query shape, routing, filters, hit count, and target index |
| Auth failure | Inspect auth/session span and logs for the same trace | Confirm token/session values, Redis lookup, lab context, and rejected reason |
| 5xx response | Open the failed trace by trace ID | Find the span with error status, then inspect logs and Sentry for the same trace ID |
| Stale data | Check freshness metrics and the request trace together | Confirm whether the read path is healthy before moving to CDC or sync runbooks |
| Dependency timeout | Inspect dependency span status and duration | Check retry behavior, timeout config, and downstream health dashboards |
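
The first check for a slow API request (root span duration versus dependency span durations) comes down to simple interval arithmetic. The sketch below illustrates that comparison and is not a HyperDX feature. The `Span` shape and the `breakdown` helper are hypothetical, and timestamps are assumed to be in milliseconds. Overlapping child spans are counted once so that parallel dependency calls do not overstate dependency time.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Span:
    name: str
    start_ms: float
    end_ms: float


def covered_ms(intervals: List[Tuple[float, float]]) -> float:
    """Total time covered by the intervals, counting overlaps only once."""
    total = 0.0
    current: Optional[Tuple[float, float]] = None
    for start, end in sorted(intervals):
        if current is None:
            current = (start, end)
        elif start > current[1]:
            # Gap found: close the merged interval and start a new one.
            total += current[1] - current[0]
            current = (start, end)
        else:
            # Overlap: extend the merged interval if this one ends later.
            current = (current[0], max(current[1], end))
    if current is not None:
        total += current[1] - current[0]
    return total


def breakdown(root: Span, children: List[Span]) -> Dict[str, float]:
    """Split the root duration into dependency time and application self-time."""
    # Clip children to the root window so clock skew cannot inflate the total.
    clipped = [
        (max(c.start_ms, root.start_ms), min(c.end_ms, root.end_ms))
        for c in children
    ]
    clipped = [(s, e) for s, e in clipped if e > s]
    total = root.end_ms - root.start_ms
    dependency = covered_ms(clipped)
    return {"total_ms": total, "dependency_ms": dependency, "self_ms": total - dependency}
```

A large `self_ms` next to a small `dependency_ms` points at the follow-up column for slow requests: auth/session work, response shaping, serialization, or runtime overhead.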

Phoenix Search Example

Phoenix Search should expose the trace ID on search responses and emit enough span context to explain slow, empty, or failed searches.

Important Phoenix Search attributes:

| Attribute | Why It Matters |
| --- | --- |
| search.lab_id | Confirms the lab context used by the request |
| search.search_key | Shows which search mode was requested |
| search.query_shape | Explains how the input was classified |
| search.routing | Confirms Elasticsearch routing |
| search.is_multi_center | Shows whether related lab scope was used |
| search.hit_count | Shows how many results came back |
| search.zero_results | Makes empty-result traces searchable |
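
One way to keep these attributes consistent is to build them in a single place. The sketch below is illustrative only: the helper name `search_span_attributes`, the value types, and the rule that `search.zero_results` is derived from `search.hit_count` are assumptions, not Phoenix Search's confirmed implementation.

```python
from typing import Dict, Union

AttributeValue = Union[str, int, bool]


def search_span_attributes(
    lab_id: str,
    search_key: str,
    query_shape: str,
    routing: str,
    is_multi_center: bool,
    hit_count: int,
) -> Dict[str, AttributeValue]:
    """Build the domain attributes for one search request (hypothetical helper)."""
    return {
        "search.lab_id": lab_id,
        "search.search_key": search_key,
        "search.query_shape": query_shape,
        "search.routing": routing,
        "search.is_multi_center": is_multi_center,
        "search.hit_count": hit_count,
        # Assumption: derived from hit_count so the two can never disagree.
        "search.zero_results": hit_count == 0,
    }
```

With the OpenTelemetry Python API, a dictionary like this could be passed to `span.set_attributes(...)` on the active span. Deriving the boolean instead of accepting it as a parameter keeps a trace filtered on `search.zero_results` from ever showing a non-zero hit count.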

Important Elasticsearch span fields:

| Field | Why It Matters |
| --- | --- |
| db.system.name | Confirms the dependency is Elasticsearch |
| db.operation.name | Confirms the operation, usually search |
| db.operation.parameter.index | Confirms the target index |
| db.query.text | Shows the generated query body when query capture is enabled |
| db.response.status_code | Separates Elasticsearch failures from application failures |

Debug Setup Checklist

| Check | Expected Result |
| --- | --- |
| Service exports OTLP | Traces appear in HyperDX under the correct OTEL_SERVICE_NAME |
| Framework instrumentation is enabled | HTTP route spans are created automatically |
| Dependency instrumentation is enabled | Redis, MySQL, Elasticsearch, HTTP, or queue spans appear as child spans |
| Logs include trace context | Logs carry trace_id and span_id |
| HTTP response exposes trace ID | Operators can copy X-Trace-Id from the failing request |
| Domain attributes are added | HyperDX traces explain business context, not only technical timing |
| Errors are recorded on spans | Failed traces show the failing span and exception context |
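
The "Logs include trace context" check can be illustrated with the standard library alone. In the sketch below, the context variables `current_trace_id` and `current_span_id` are hypothetical stand-ins for whatever the tracing SDK exposes as the active span context. In a real service an OpenTelemetry logging integration would normally fill these fields, so treat this as a picture of the expected log shape rather than the recommended wiring.

```python
import contextvars
import io
import json
import logging

# Hypothetical holders for the active trace context (normally set by the SDK).
current_trace_id = contextvars.ContextVar("current_trace_id", default="")
current_span_id = contextvars.ContextVar("current_span_id", default="")


class TraceContextFilter(logging.Filter):
    """Copy the active trace context onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = current_trace_id.get()
        record.span_id = current_span_id.get()
        return True  # never drop the record, only enrich it


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per line with the correlation fields included."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": record.trace_id,
            "span_id": record.span_id,
        })


def build_logger(stream) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.addFilter(TraceContextFilter())
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("trace_context_example")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


# Usage: set the context as a request starts, then log as usual.
buffer = io.StringIO()
logger = build_logger(buffer)
current_trace_id.set("4bf92f3577b34da6a3ce929d0e0e4736")
current_span_id.set("00f067aa0ba902b7")
logger.info("search finished")
```

Each emitted line then carries the same trace_id that appears in X-Trace-Id, which removes the manual timestamp matching described earlier.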

Do not call the setup complete until a real request can be opened in HyperDX and the timeline shows the route span, dependency spans, logs, and required domain attributes.
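
That completion rule can be written down as an explicit check. The sketch below assumes a trace has been exported into a plain dictionary with `trace_id`, `spans`, and `logs` keys. That shape, the `parent_span_id` field, and the function name `missing_setup_items` are all hypothetical and do not reflect a HyperDX export format.

```python
from typing import Any, Dict, Iterable, List


def missing_setup_items(trace: Dict[str, Any], required_attributes: Iterable[str]) -> List[str]:
    """Return the reasons this trace does not yet prove the setup is complete."""
    spans = trace.get("spans", [])
    problems: List[str] = []

    # A root span is assumed to be the one without a parent.
    if not any(span.get("parent_span_id") is None for span in spans):
        problems.append("no root route span")

    # Dependency calls are assumed to show up as spans with a parent.
    if not any(span.get("parent_span_id") is not None for span in spans):
        problems.append("no dependency child spans")

    # Logs count only if they carry this trace's ID.
    trace_id = trace.get("trace_id")
    logs = trace.get("logs", [])
    if not trace_id or not any(log.get("trace_id") == trace_id for log in logs):
        problems.append("no logs correlated by trace_id")

    # Domain attributes may sit on any span in the trace.
    seen = set()
    for span in spans:
        seen.update(span.get("attributes", {}))
    for name in required_attributes:
        if name not in seen:
            problems.append(f"missing domain attribute: {name}")
    return problems
```

An empty list means the sampled request satisfies the rule above. Anything else names the checklist row that still needs work.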