System Architecture

Database schema, triggers, nightly scheduler, and the end-to-end data flow for Historical Summary

👤 Aakash Pawar📅 Updated: Apr 1, 2026🏷️ feature

System Architecture

This page walks through every layer of the Historical Summary system, from the MySQL tables that store aggregated counts, through the triggers and nightly jobs that populate them, to the migration procedures that handle first-time onboarding for new labs.


High-Level Data Flow


Database Schema

operation_historical_summary

This is the central fact table. Think of it as a pre-aggregated star schema: every row represents "for lab X, on date Y, grouping Z had N occurrences of event E."

CREATE TABLE operation_historical_summary (
    id             BIGINT       NOT NULL AUTO_INCREMENT,
    dimension_a_id INT          NOT NULL,          -- Primary grouping entity (org/ref/branch/dept ID)
    dimension_b_id INT          DEFAULT NULL,      -- Secondary entity (test/sample ID), NULL for bill counts
    event_type     VARCHAR(250) NOT NULL,          -- e.g., 'organization_test', 'referral_sample_accessed'
    event_value    INT          NOT NULL DEFAULT 0, -- The aggregated count
    lab_id         INT          NOT NULL,
    ordered_at     DATE         NOT NULL,          -- The date (in lab timezone) when activity occurred
    ordered_month  DATE GENERATED ALWAYS AS (
        DATE_FORMAT(ordered_at, '%Y-%m-01')
    ) STORED,                                      -- Precomputed first-of-month for fast monthly grouping
    PRIMARY KEY (id, ordered_at),
    UNIQUE KEY (lab_id, ordered_at, dimension_a_id, dimension_b_id, event_type),
    KEY idx_lab_event_month_granularity (lab_id, event_type, ordered_month,
        dimension_a_id, dimension_b_id, event_value)
) PARTITION BY RANGE COLUMNS (ordered_at) (
    PARTITION summary_table_previous VALUES LESS THAN ('2022-01-01'),
    PARTITION summary_table_2022     VALUES LESS THAN ('2023-01-01'),
    PARTITION summary_table_2023     VALUES LESS THAN ('2024-01-01'),
    -- ... per-year partitions up to 'future' ...
    PARTITION summary_table_future   VALUES LESS THAN (MAXVALUE)
);

Key design points:

  • Yearly partitions keep queries fast; the nightly job only ever touches yesterday's partition.
  • Composite unique key (lab_id, ordered_at, dimension_a_id, dimension_b_id, event_type) enables INSERT ... ON DUPLICATE KEY UPDATE for idempotent upserts.
  • Stored generated column ordered_month avoids repeated DATE_FORMAT() calls in monthly aggregation queries. The covering index idx_lab_event_month_granularity uses this column.

operation_event_log

A simple append-only audit log for the nightly scheduler:

CREATE TABLE operation_event_log (
    id         BIGINT       NOT NULL AUTO_INCREMENT,
    event_name VARCHAR(200) NOT NULL,  -- e.g., '[IST] Nightly-Operation-Summary-Job'
    status     VARCHAR(30)  NOT NULL,  -- 'STARTED' or 'COMPLETED'
    created_at DATETIME     NOT NULL,
    PRIMARY KEY (id)
);

Trigger System (Real-Time Delta Updates)

The trigger system ensures that when operational data changes after the nightly job has run, the summary table stays consistent. There are two triggers:

trg_lab_report_relation_after_update

Fires on: AFTER UPDATE on labReportRelation

What it handles:

Change DetectedAction
dismissed flag flippedIncrements or decrements all relevant test/sample event counts for that record's org, referral, branch, and department
sampleCollected or sampleArrivedAndCollected changedMoves counts between *_accessed and *_not_accessed event types

Early exit conditions:

  1. None of dismissed, sampleCollected, or sampleArrivedAndCollected actually changed → LEAVE trigger
  2. billTime is NULLLEAVE trigger
  3. The computed ordered_at date is today or in the future → LEAVE trigger (today's data is handled by the nightly job tomorrow)

Access status logic:

Accessed     = sampleCollected = 1 OR sampleArrivedAndCollected = 1
Not Accessed = sampleCollected = 0 AND sampleArrivedAndCollected = 0

trg_billing_after_update

Fires on: AFTER UPDATE on billing

What it handles:

Change DetectedAction
isCancel flag flippedAdjusts *_bill counts for org, referral, and branch
orgId_id changedDecrements old org's counts, inserts/increments new org's test, sample, and bill counts
docId_id changedSame pattern for referral dimension
branch_id changedSame pattern for branch dimension

When inserting new counts for re-assigned bills, the trigger recalculates from billingInfo (for tests) and collectedSample (for samples) to get accurate grouped counts.


Nightly Event Scheduler

operation_summary_nightly_event

Schedule: Runs every day at 04:00 UTC (approximately 09:30 IST).

Logic:

  1. Cursor over timezones - Groups all enabled labs by their labTimeZone, producing (timezone, comma_separated_lab_ids) pairs.
  2. Compute yesterday's boundaries - For each timezone group:
    • Convert UTC_TIMESTAMP() to local time
    • Build local_start and local_end for "yesterday" in that timezone
    • Convert back to UTC for the actual query
  3. Call the aggregation procedure - insert_operation_summary_for_all_events(utc_start, utc_end, tz, lab_ids)
  4. Audit log - Writes STARTED and COMPLETED entries to operation_event_log.

insert_operation_summary_for_all_events

Iterates over all 16 event types via an internal cursor and calls insert_operation_summary_for_event for each one, wrapped in a single START TRANSACTION / COMMIT block.

insert_operation_summary_for_event

The workhorse procedure. For each event type, it:

  1. Maps dimensions - Determines which columns become dimension_a_id and dimension_b_id (e.g., b.orgId_id for organization, bI.testId_id for test).
  2. Builds dynamic SQL - Two query modes:
    • Sample events join through labReportRelation → billing → collectedSample → allTests, using COUNT(DISTINCT collectedSample.id) and adding sample access filters.
    • Non-sample events join through billingInfo → billing → allTests, using COUNT(*) or COUNT(DISTINCT b.Id) for bill counts.
  3. Executes with INSERT ... ON DUPLICATE KEY UPDATE - idempotent; safe to re-run for the same day.

Migration Procedure (First-Time Onboarding)

When a lab is first enabled for Historical Summary via the Support Dashboard, they have months or years of historical billing data that the nightly job has never processed. The operation_summary_migration stored procedure handles this data backfill.

Activation Flow

operation_summary_migration

This stored procedure:

  1. Opens a cursor over crelio_data_migrations where is_scheduled = 1 and migration_type = 'HISTORICAL_SUMMARY'.
  2. For each lab, determines the earliest billTime in the billing table as the start date.
  3. Deletes any existing data in operation_historical_summary for that lab (clean slate).
  4. Calls operation_summary_testing_long_range() to recompute the full history from start to today.
  5. Updates the crelio_data_migrations record with is_scheduled=0, job_status='Completed', and the execution time.

operation_summary_migration_event

A MySQL scheduled event that runs daily at 03:30 UTC and simply calls operation_summary_migration(). This is 30 minutes before the nightly summary event, so new labs get their backfill processed before the regular daily job kicks in.


Feature Flag & Onboarding

enable_operation_summary

PropertyValue
Django modelLabFeatures.enable_operation_summary (BooleanField, default=False)
DB tablelabFeatures
Frontend checksessionState.enable_operation_summary (from session data)
Support DashboardUnder "Workflow Configurations → Enable Workflows/Features"

What Happens When the Flag is Toggled

In support_dashboard_settings_view.py, when the diff contains enable_operation_summary:

if "enable_operation_summary" in diff:
    CrelioDataMigrations.objects.filter(
        lab_id=lab_id, migration_type="HISTORICAL_SUMMARY"
    ).delete()
    value and CrelioDataMigrations.objects.create(
        lab_id=lab_id,
        is_scheduled=True,
        initiated_by_id=account_manager_id,
        migration_type="HISTORICAL_SUMMARY",
    )

This schedules the lab for historical data backfill. The operation_summary_migration_event MySQL event (daily at 03:30 UTC) picks up these scheduled labs and runs the full recomputation.


Source File Index

FilePurpose
operation/sql/operation_historical_summary.sqlMaster SQL: table DDL, triggers, procedures, and schedulers
report/db/scheduler/operation_historical_summary.sqlStandalone nightly scheduler DDL
report/db/procedures/operation_summary_testing.sqlManual testing procedure for ad-hoc runs
admin/account/models/lab_features.pyFeature flag model
admin/account/views/account_configuration_constant.pySupport Dashboard config constants

On this page