Start Here

Reading guide and Phase 1 context for Phoenix Search

👤 Sai Tharun

Use this page first. Phoenix Search has a read side and a sync side, and mixing them makes the docs hard to follow.

| If You Need To Understand | Read |
| --- | --- |
| What Phoenix Search is and where it fits | Phoenix Search Overview |
| How the real system is wired | Architecture |
| How frontend authentication and endpoints work | Auth and Endpoints |
| How search requests and query shapes work | API Reference |
| How MySQL changes reach Elasticsearch | CDC |
| How data is migrated, repaired, or reindexed | CDC Tools and Backfill |
| How to run, monitor, or debug production | Operations |
| How to debug one API request end to end | API Debugging |
| What improved after migration | Post-Migration Results |

Mental Model

The frontend does not call a Phoenix Search login endpoint. It calls crelio-app to mint a short-lived Phoenix Search token, then uses that token against the Phoenix Search API.
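
The token handoff described above can be sketched as follows. This is a minimal illustration, not the crelio-app or Phoenix Search implementation: it assumes an HS256-signed JWT, and the claim names (`sid`, `lab_id`), the 60-second lifetime, and the `active_sessions` set standing in for the Redis session check are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional, Set


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: str, signing_input: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def mint_token(secret: str, session_id: str, lab_id: int,
               ttl_seconds: int = 60, now: Optional[float] = None) -> str:
    """Minting side (the role crelio-app plays). Claim names are hypothetical."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sid": session_id, "lab_id": lab_id, "iat": issued, "exp": issued + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, claims)
    )
    return signing_input + "." + _b64url(_sign(secret, signing_input))


def validate_token(secret: str, token: str, active_sessions: Set[str],
                   now: Optional[float] = None) -> dict:
    """Validating side (the role Phoenix Search plays): signature, expiry, session."""
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token")
    expected = _sign(secret, header_b64 + "." + claims_b64)
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(claims_b64))
    if (time.time() if now is None else now) >= claims["exp"]:
        raise ValueError("token expired")
    # Stand-in for the Redis session-state check; the real lookup is not shown here.
    if claims["sid"] not in active_sessions:
        raise ValueError("session not active")
    return claims
```

Both sides must share the same secret value for validation to succeed, even though the two services read it from differently named settings.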

The API does not own source-of-truth patient data. It searches Elasticsearch and uses MySQL for detail lookup and session-scope resolution.

Elasticsearch documents are routed by lab_id. Search requests derive routing from the authenticated session, CDC writes use the document's current lab_id, and backfill writes use user_meta.lab_id.
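
As a rough sketch of the routing rule above, the helper below picks the routing value per path and builds a routed document-check URL. The function names, the dictionary shapes, and the local Elasticsearch address used in the usage line are illustrative assumptions; only the `lab_id` sources and the `routing` query parameter come from the text.

```python
from urllib.parse import quote, urlencode


def routing_key(path: str, *, session: dict = None, document: dict = None,
                user_meta: dict = None) -> str:
    """Return the Elasticsearch routing value for one of the three paths.

    search   -> lab_id from the authenticated session
    cdc      -> the document's current lab_id
    backfill -> user_meta.lab_id
    """
    if path == "search":
        source = session
    elif path == "cdc":
        source = document
    elif path == "backfill":
        source = user_meta
    else:
        raise ValueError(f"unknown path: {path!r}")
    if not source or source.get("lab_id") is None:
        raise ValueError(f"{path} path has no lab_id to route by")
    return str(source["lab_id"])


def doc_check_url(es_base: str, index: str, doc_id, lab_id) -> str:
    """Build a GET-document URL that carries the required routing parameter."""
    return "{}/{}/_doc/{}?{}".format(
        es_base.rstrip("/"),
        quote(index, safe=""),
        quote(str(doc_id), safe=""),
        urlencode({"routing": lab_id}),
    )


# Usage: a direct check for document 123 in lab 42 (host is a placeholder).
url = doc_check_url("http://localhost:9200", "user_details", 123,
                    routing_key("cdc", document={"lab_id": 42}))
```

A document check without the routing parameter may be sent to a different shard and report the document as missing, which is why the routing value is required in the helper rather than optional.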

CDC is not a separate service in the docs. It is the write-side sync subsystem that keeps Phoenix Search's Elasticsearch index current.


Phase 1 Why / Context

This is the Phase 1 reasoning the docs assume. The goal is not only to move code into a new service; it is to make preview user detail search simpler to operate, easier to debug, and cheaper to serve at production scale.

| Question | Answer |
| --- | --- |
| Why was Phoenix Search introduced in Phase 1? | Preview user detail search had too much fan-out. One user search could require roughly 6-7 lookup paths across patient identity, LRF / lab-report identifiers, manual sample ID, bill ID, and order identifiers. Phoenix Search turns that into one routed Elasticsearch search over user_details. |
| Which legacy search paths are replaced first? | The Phase 1 target is preview user detail search: patient identity lookup plus identifier lookups that previously needed separate backend/search calls and merge behavior. |
| Why use one user_details Elasticsearch document? | The search input can match many identifiers, but the user-facing result is still a user/patient entity. Denormalizing the searchable identifiers, org/referral/branch filters, and display fields into one projection lets the API return bucketed results without calling multiple lookup APIs. |
| Why use analyzer-backed mappings? | Exact keyword matching alone is not enough for patient and identifier search. The mapping supports exact, prefix, suffix, segment, and search_as_you_type fields so Phoenix Search can handle partial names, phone-like input, lab patient IDs, manual sample IDs, bills, orders, and national IDs in one query shape. |
| Why is CDC required? | Elasticsearch is a read projection, not the source of truth. MySQL remains the source. CDC keeps user_meta and user_details current after the initial backfill without making the frontend wait on source-table joins during every search. |
| Why Debezium, Redpanda, and the Phoenix CDC consumer? | Debezium reads MySQL binlog changes, Redpanda gives a replayable Kafka-compatible transport, and the Python consumer owns Phoenix-specific materialization rules. That split lets operators inspect connector state, topic lag, DLQ records, consumer health, and Elasticsearch sync separately. |
| Why is backfill still needed if CDC exists? | CDC keeps changes current from a point in time. Backfill migrates the existing historical dataset into user_meta and Elasticsearch so the index starts complete, then CDC handles ongoing inserts, updates, deletes, and lab reroutes. |
| Why is Elasticsearch routing by lab_id required? | Search scope is lab/session driven. Routing documents by lab_id keeps reads and writes scoped to the same shard route, makes point lookups deterministic, and avoids cross-lab scatter for the hot search path. Any direct ES document check must include `routing=<LAB_ID>`. |
| Why use ephemeral tokens from crelio-app? | LiveHealth already has the user session. crelio-app mints a short-lived Phoenix Search token containing session and lab context, and Phoenix Search validates that token plus Redis session state. The frontend does not need a separate Phoenix login flow. |
| Why is the Phase 1 endpoint `https://phoenix-search-in.crelio.solutions/api`? | The current frontend fallback and crelio-app token response point to the IN Phoenix Search base URL. The docs therefore describe the real Phase 1 endpoint instead of inventing region segregation that is not present in the current integration. |
| Why OpenTelemetry now? | The new service boundary makes tracing valuable. A single request can expose the browser X-Trace-Id, API route, search attributes, Elasticsearch query span, MySQL dependency timing, errors, CDC freshness, and logs around the same trace. |
| What proves Phase 1 worked? | The post-migration evidence shows one ES search request replacing the previous multi-lookup pattern, roughly 50% lower index storage, 0% observed HTTP request error rate, ES p50 around 2.5 ms, ES p95 around 4.75 ms, and trace-level debugging from frontend request to ES query. |
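
To make the CDC rows above more concrete, here is a hedged sketch of how a consumer might turn one Debezium change event into routed Elasticsearch actions, including the lab reroute case. The `op`, `before`, and `after` fields follow the standard Debezium change-event envelope; the `id` primary-key column, the action dictionaries, and the function name are hypothetical, and the real consumer's materialization rules are not shown on this page. The reroute branch assumes the event carries a `before` image.

```python
from typing import List


def plan_es_actions(event: dict, index: str = "user_details") -> List[dict]:
    """Translate one Debezium change event into routed index/delete actions."""
    op = event["op"]          # "c" create, "u" update, "d" delete, "r" snapshot read
    before = event.get("before")
    after = event.get("after")

    if op == "d":
        # Deletes must target the route the document was last written under.
        return [{"action": "delete", "index": index,
                 "id": before["id"], "routing": str(before["lab_id"])}]

    if op in ("c", "r", "u"):
        actions = []
        if op == "u" and before and before["lab_id"] != after["lab_id"]:
            # Lab reroute: remove the copy on the old route first, otherwise
            # the stale document stays searchable under the previous lab.
            actions.append({"action": "delete", "index": index,
                            "id": before["id"], "routing": str(before["lab_id"])})
        actions.append({"action": "index", "index": index, "id": after["id"],
                        "routing": str(after["lab_id"]), "doc": after})
        return actions

    raise ValueError(f"unsupported Debezium op: {op!r}")


# Usage: an update that moves a row from lab 10 to lab 20.
reroute = plan_es_actions({
    "op": "u",
    "before": {"id": 5, "lab_id": 10, "name": "A"},
    "after": {"id": 5, "lab_id": 20, "name": "A"},
})
```

Returning a plan instead of calling Elasticsearch directly keeps the routing decision easy to unit test apart from the transport.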

Known Phase 1 implementation facts from code:

| Fact | Evidence |
| --- | --- |
| Frontend gets an ephemeral Phoenix token from crelio-app | livehealth-frontend/src/services/phoenixSearch/axios/tokenManager.ts |
| crelio-app signs the token using PHOENIX_SEARCH_JWT_SECRET | crelio-app/core/views/phoenix_search_token.py |
| Phoenix Search validates it using SEARCH_EPHEMERAL_JWT_SECRET | search/search/services/auth/ephemeral.py |
| Frontend default Phoenix Search URL is https://phoenix-search-in.crelio.solutions/api | livehealth-frontend/src/services/phoenixSearch/axios/config.ts |
| Search calls use /v1/users/search under that base URL | livehealth-frontend/src/services/phoenixSearch/search.ts |
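
For orientation, the sketch below assembles a search request from the facts in this table without sending it. The base URL, the `/v1/users/search` path, and the `X-Trace-Id` header come from this page; the POST method, the `Authorization: Bearer` scheme, and the `{"query": ...}` body are assumptions for illustration only, so check the API Reference for the real request shape.

```python
import json
import urllib.request
import uuid
from typing import Optional

PHOENIX_SEARCH_BASE_URL = "https://phoenix-search-in.crelio.solutions/api"
SEARCH_PATH = "/v1/users/search"


def build_search_request(token: str, query: str, trace_id: Optional[str] = None,
                         base_url: str = PHOENIX_SEARCH_BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a Phoenix Search request."""
    url = base_url.rstrip("/") + SEARCH_PATH
    body = json.dumps({"query": query}).encode("utf-8")   # hypothetical body shape
    headers = {
        "Authorization": f"Bearer {token}",                # assumed auth scheme
        "Content-Type": "application/json",
        # Reuse the caller's trace id when there is one so the request can be
        # followed end to end; otherwise generate a fresh one.
        "X-Trace-Id": trace_id or uuid.uuid4().hex,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Usage: inspect the request that would be sent.
request = build_search_request("<ephemeral-token>", "ravi", trace_id="abc123")
```

Passing the result to `urllib.request.urlopen` would perform the call; that step is left out so the sketch stays free of network access.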

Folder Structure

phoenix-search/
├── index                 # service overview
├── start-here            # reading path + Phase 1 context
├── architecture          # high-level and low-level system design
├── read-path/
│   ├── auth-and-endpoints
│   └── api-reference
├── sync-path/
│   ├── cdc
│   └── backfill
└── operate/
    ├── operations
    ├── debugging
    └── post-migration-results

Keep new docs in the same split:

| Content Type | Folder |
| --- | --- |
| Frontend integration, auth, endpoints, request shapes | read-path/ |
| Debezium, Redpanda, consumer behavior, migration, backfill | sync-path/ |
| On-call, metrics, dashboards, deploy, debug | operate/ |
| System-wide design and boundaries | architecture.mdx |
