Start Here
Reading guide and Phase 1 context for Phoenix Search
Use this page first. Phoenix Search has a read side and a sync side, and mixing them makes the docs hard to follow.
| If You Need To Understand | Read |
|---|---|
| What Phoenix Search is and where it fits | Phoenix Search Overview |
| How the real system is wired | Architecture |
| How frontend authentication and endpoints work | Auth and Endpoints |
| How search requests and query shapes work | API Reference |
| How MySQL changes reach Elasticsearch | CDC |
| How data is migrated, repaired, or reindexed | CDC Tools and Backfill |
| How to run, monitor, or debug production | Operations |
| How to debug one API request end to end | API Debugging |
| What improved after migration | Post-Migration Results |
Mental Model
The frontend does not call a Phoenix Search login endpoint. It calls crelio-app to mint a short-lived Phoenix Search token, then uses that token against the Phoenix Search API.
The API does not own source-of-truth patient data. It searches Elasticsearch and uses MySQL for detail lookup and session-scope resolution.
Elasticsearch documents are routed by lab_id. Search requests derive routing from the authenticated session, CDC writes use the document's current lab_id, and backfill writes use user_meta.lab_id.
CDC is not a separate service in the docs. It is the write-side sync subsystem that keeps Phoenix Search's Elasticsearch index current.
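
The routing rule above can be sketched as three small resolvers, one per path, plus a helper that builds a routed document lookup path. This is an illustrative sketch, not the service code: the function names, the `lab_id` key on the session dict, and the use of `user_details` as the literal index name are assumptions. The `routing` query parameter on the document GET API is standard Elasticsearch.

```python
from urllib.parse import quote, urlencode

INDEX = "user_details"  # assumed index name, taken from the projection name in these docs


def routing_for_search(session: dict) -> str:
    """Read path: routing comes from the authenticated session (assumed key: lab_id)."""
    return str(session["lab_id"])


def routing_for_cdc(document: dict) -> str:
    """CDC write path: routing comes from the document's current lab_id."""
    return str(document["lab_id"])


def routing_for_backfill(user_meta_row: dict) -> str:
    """Backfill write path: routing comes from user_meta.lab_id."""
    return str(user_meta_row["lab_id"])


def doc_check_path(doc_id: str, lab_id: str) -> str:
    """Build the path for a direct document lookup that includes the routing value."""
    return f"/{INDEX}/_doc/{quote(str(doc_id), safe='')}?{urlencode({'routing': lab_id})}"
```

All three resolvers must return the same value for the same document. If they disagree, a routed lookup can land on a different shard route than the one the document was written to.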
Phase 1 Why / Context
This is the Phase 1 reasoning the docs assume. The goal is not only to move code into a new service; it is to make preview user detail search simpler to operate, easier to debug, and cheaper to serve at production scale.
| Question | Answer |
|---|---|
| Why was Phoenix Search introduced in Phase 1? | Preview user detail search had too much fan-out. One user search could require roughly 6-7 lookup paths across patient identity, LRF / lab-report identifiers, manual sample ID, bill ID, and order identifiers. Phoenix Search turns that into one routed Elasticsearch search over user_details. |
| Which legacy search paths are replaced first? | The Phase 1 target is preview user detail search: patient identity lookup plus identifier lookups that previously needed separate backend/search calls and merge behavior. |
| Why use one user_details Elasticsearch document? | The search input can match many identifiers, but the user-facing result is still a user/patient entity. Denormalizing the searchable identifiers, org/referral/branch filters, and display fields into one projection lets the API return bucketed results without calling multiple lookup APIs. |
| Why use analyzer-backed mappings? | Exact keyword matching alone is not enough for patient and identifier search. The mapping supports exact, prefix, suffix, segment, and search_as_you_type fields so Phoenix Search can handle partial names, phone-like input, lab patient IDs, manual sample IDs, bills, orders, and national IDs in one query shape. |
| Why is CDC required? | Elasticsearch is a read projection, not the source of truth. MySQL remains the source. CDC keeps user_meta and user_details current after the initial backfill without making the frontend wait on source-table joins during every search. |
| Why Debezium, Redpanda, and the Phoenix CDC consumer? | Debezium reads MySQL binlog changes, Redpanda gives a replayable Kafka-compatible transport, and the Python consumer owns Phoenix-specific materialization rules. That split lets operators inspect connector state, topic lag, DLQ records, consumer health, and Elasticsearch sync separately. |
| Why is backfill still needed if CDC exists? | CDC keeps changes current from a point in time. Backfill migrates the existing historical dataset into user_meta and Elasticsearch so the index starts complete, then CDC handles ongoing inserts, updates, deletes, and lab reroutes. |
| Why is Elasticsearch routing by lab_id required? | Search scope is lab/session driven. Routing documents by lab_id keeps reads and writes scoped to the same shard route, makes point lookups deterministic, and avoids cross-lab scatter for the hot search path. Any direct ES document check must include routing=<LAB_ID>. |
| Why use ephemeral tokens from crelio-app? | LiveHealth already has the user session. crelio-app mints a short-lived Phoenix Search token containing session and lab context, and Phoenix Search validates that token plus Redis session state. The frontend does not need a separate Phoenix login flow. |
| Why is the Phase 1 endpoint https://phoenix-search-in.crelio.solutions/api? | The current frontend fallback and crelio-app token response point to the IN Phoenix Search base URL. The docs therefore describe the real Phase 1 endpoint instead of inventing region segregation that is not present in the current integration. |
| Why OpenTelemetry now? | The new service boundary makes tracing valuable. A single request can expose the browser X-Trace-Id, API route, search attributes, Elasticsearch query span, MySQL dependency timing, errors, CDC freshness, and logs around the same trace. |
| What proves Phase 1 worked? | The post-migration evidence shows one ES search request replacing the previous multi-lookup pattern, roughly 50% lower index storage, 0% observed HTTP request error rate, ES p50 around 2.5 ms, ES p95 around 4.75 ms, and trace-level debugging from frontend request to ES query. |
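
Because documents are routed by lab_id, a lab reroute cannot be a plain in-place update: the old copy lives under the old route. One way a consumer could handle this is sketched below, assuming it knows the previous lab_id from the change event. The function names and the delete-then-index strategy are assumptions for illustration, not a description of the Phoenix CDC consumer. The action shapes follow the standard Elasticsearch bulk API, where `routing` is a per-action metadata field.

```python
import json

INDEX = "user_details"  # assumed index name


def reroute_actions(doc_id, old_lab_id, new_doc):
    """Plan bulk actions for one change event.

    If lab_id changed, delete the copy under the old route first,
    then index the document under its current lab_id.
    """
    new_lab_id = new_doc["lab_id"]
    actions = []
    if old_lab_id is not None and str(old_lab_id) != str(new_lab_id):
        actions.append(
            {"delete": {"_index": INDEX, "_id": str(doc_id), "routing": str(old_lab_id)}}
        )
    actions.append(
        {"index": {"_index": INDEX, "_id": str(doc_id), "routing": str(new_lab_id)}}
    )
    actions.append(new_doc)  # source line that follows the index action
    return actions


def to_ndjson(actions):
    """Serialize actions as newline-delimited JSON; the bulk body must end with a newline."""
    return "\n".join(json.dumps(a, separators=(",", ":")) for a in actions) + "\n"
```

Skipping the delete would leave a stale copy under the old lab's route that searches scoped to that lab could still return.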
Known Phase 1 implementation facts from code:
| Fact | Evidence |
|---|---|
| Frontend gets an ephemeral Phoenix token from crelio-app | livehealth-frontend/src/services/phoenixSearch/axios/tokenManager.ts |
| crelio-app signs the token using PHOENIX_SEARCH_JWT_SECRET | crelio-app/core/views/phoenix_search_token.py |
| Phoenix Search validates it using SEARCH_EPHEMERAL_JWT_SECRET | search/search/services/auth/ephemeral.py |
| Frontend default Phoenix Search URL is https://phoenix-search-in.crelio.solutions/api | livehealth-frontend/src/services/phoenixSearch/axios/config.ts |
| Search calls use /v1/users/search under that base URL | livehealth-frontend/src/services/phoenixSearch/search.ts |
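
The sign-and-validate split in the table can be illustrated with a minimal, stdlib-only sketch. It assumes an HS256 (HMAC-SHA256) JWT. The claim names (`sid`, `lab_id`), the 60-second TTL, and the function names are hypothetical, and the Redis session check that Phoenix Search also performs is omitted. If the real token is symmetric like this, the two secret variables would need to hold the same value on both services.

```python
import base64
import hashlib
import hmac
import json
import time

BASE_URL = "https://phoenix-search-in.crelio.solutions/api"  # from the facts table
SEARCH_PATH = "/v1/users/search"  # from the facts table


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: str, signing_input: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def mint_token(secret, session_id, lab_id, ttl_seconds=60, now=None):
    """Issuer side (crelio-app in the docs): build a short-lived signed token."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    # Claim names are hypothetical; the docs only say the token carries session and lab context.
    claims = {"sid": session_id, "lab_id": lab_id, "iat": issued, "exp": issued + ttl_seconds}
    signing_input = ".".join(_b64url(json.dumps(part).encode()) for part in (header, claims))
    return signing_input + "." + _b64url(_sign(secret, signing_input))


def validate_token(secret, token, now=None):
    """Validator side (Phoenix Search in the docs): check algorithm, signature, and expiry."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, claims_b64, sig_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _sign(secret, f"{header_b64}.{claims_b64}")
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(claims_b64))
    if (time.time() if now is None else now) >= claims["exp"]:
        raise ValueError("token expired")
    return claims


def search_url(base_url=BASE_URL):
    """Join the base URL and the search route without doubling slashes."""
    return base_url.rstrip("/") + SEARCH_PATH
```

A production service would normally use a maintained JWT library rather than hand-rolled signing; the stdlib version here only keeps the example self-contained.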
Folder Structure
phoenix-search/
├── index # service overview
├── start-here # reading path + Phase 1 context
├── architecture # high-level and low-level system design
├── read-path/
│ ├── auth-and-endpoints
│ └── api-reference
├── sync-path/
│ ├── cdc
│ └── backfill
└── operate/
    ├── operations
    ├── debugging
    └── post-migration-results

Keep new docs in the same split:
| Content Type | Folder |
|---|---|
| Frontend integration, auth, endpoints, request shapes | read-path/ |
| Debezium, Redpanda, consumer behavior, migration, backfill | sync-path/ |
| On-call, metrics, dashboards, deploy, debug | operate/ |
| System-wide design and boundaries | architecture.mdx |