Data Repair & Rebuild Queue

How the nightly aggregation job repairs and rebuilds historical summary data, plus the migration queue for full lab recomputes

👤 Aakash Pawar📅 Updated: Apr 3, 2026🏷️ feature

Data Repair & Rebuild Queue

Historical Summary data is pre-aggregated nightly. The system has two mechanisms for ensuring data correctness:

  1. Nightly Repair Queue — The primary mechanism. Runs every day, picks up yesterday's billing date, and rebuilds all event aggregations for that day. This is the "repair queue."
  2. Migration Queue — Used for first-time onboarding or full lab recomputes. Processes the entire billing history from scratch.

Nightly Repair Queue

How It Works

The nightly repair queue is driven by the MySQL event operation_summary_nightly_event, which runs every day at 04:00 UTC (~09:30 IST).

Step-by-Step

  1. Timezone cursor — The job groups all enabled labs by their labTimeZone (from the labs table), producing (timezone, comma_separated_lab_ids) pairs.

  2. Compute yesterday's boundaries — For each timezone group:

    • Converts UTC_TIMESTAMP() to the lab's local time
    • Builds local_start = yesterday 00:00:00 and local_end = yesterday 23:59:59
    • Converts back to UTC for the actual query
  3. Aggregate all event types — Calls insert_operation_summary_for_all_events, which iterates over all 16 event types:

    CategoryEvent Types
    Organizationorganization_test, organization_sample, organization_bill, organization_sample_accessed, organization_sample_not_accessed
    Referralreferral_test, referral_sample, referral_bill, referral_sample_accessed, referral_sample_not_accessed
    Branchbranch_test, branch_sample, branch_bill, branch_sample_accessed, branch_sample_not_accessed
    Departmentdepartment_test
  4. Idempotent upsert — Each event type's aggregation uses INSERT ... ON DUPLICATE KEY UPDATE. If a row already exists for that (lab_id, ordered_at, dimension_a_id, dimension_b_id, event_type) combination (e.g., because a trigger already created it), the nightly job overwrites it with the freshly computed count. This is the "repair" aspect — stale or partially-correct rows left by triggers are corrected.

  5. Audit log — The job writes STARTED and COMPLETED entries to operation_event_log with timestamps, so you can verify it ran.

Why It's a Repair Queue

The nightly job is not just a "builder" — it's also a repair mechanism:

  • Triggers handle real-time corrections to historical data (e.g., bill cancellation for a past date). But triggers only fire on UPDATE, not INSERT. And they apply deltas (+1 / -1), which can drift if multiple changes happen in quick succession.
  • The nightly job recomputes from source — It queries billing, billingInfo, labReportRelation, collectedSample, and allTests directly to produce accurate counts. This means any drift accumulated from trigger deltas is corrected the next morning.
  • Idempotent by designINSERT ON DUPLICATE KEY UPDATE ensures the job can be safely re-run for the same day without creating duplicates. The fresh count always wins.

Client Perspective

From a lab's point of view:

QuestionAnswer
When does yesterday's data appear?After the nightly job runs (around 04:00 UTC / 09:30 IST)
What if numbers look off for yesterday?Wait for the nightly job to run — it recomputes from source and self-corrects
What about today's data?Not available in Historical View until tomorrow's nightly run
What if a bill from last week is cancelled today?The trigger adjusts the summary immediately for last week's date. The nightly job does NOT re-process last week (only yesterday)

Monitoring the Nightly Job

Check recent runs

SELECT event_name, status, created_at
FROM operation_event_log
WHERE event_name LIKE '%Nightly-Operation-Summary%'
ORDER BY created_at DESC
LIMIT 20;

A healthy system shows paired STARTED / COMPLETED entries for each day. If you see STARTED without a matching COMPLETED, the job failed mid-execution.

Check if the MySQL event is enabled

SELECT event_name, status, last_executed
FROM information_schema.events
WHERE event_name = 'operation_summary_nightly_event';

Ad-Hoc Repair: Single-Day Rerun

If a specific day's data needs to be repaired (e.g., you discover that yesterday's nightly job failed), you can manually re-run the aggregation:

CALL operation_summary_testing('<UTC_TIMESTAMP_OF_THE_DAY_AFTER>');

For example, to reprocess March 15th 2026:

CALL operation_summary_testing('2026-03-16 04:00:00');

This re-runs the exact same logic as the nightly event for "yesterday" relative to the given timestamp, for all enabled labs. It uses INSERT ON DUPLICATE KEY UPDATE, so it's safe to run multiple times.

Note: This procedure processes all enabled labs, not just a single lab. It's the same procedure the nightly event uses internally.

Source: report/db/procedures/operation_summary_testing.sql


Full Lab Rebuild (Migration Queue)

For situations where the entire history needs to be rebuilt (not just a single day), the system provides a migration queue.

When to Use

ScenarioUse Migration Queue?
Lab reports numbers wrong for a past period✅ Yes — full rebuild
Bulk data import bypassed triggers✅ Yes — full rebuild
Database restore from backup✅ Yes — full rebuild
Single day's data looks off❌ No — use operation_summary_testing instead

How to Trigger

Toggle the feature off and back on through the Support Dashboard:

  1. Navigate to Support Dashboard → Workflow Configurations → Enable Workflows/Features.
  2. Toggle "Historical View in Operations Dashboard" OFF → Save.
  3. Toggle it back ON → Save.

This creates a new migration record. The migration event (operation_summary_migration_event, daily at 03:30 UTC) picks it up and runs a clean-slate rebuild:

  1. DELETE all existing rows in operation_historical_summary for that lab.
  2. Recompute from the lab's earliest billTime through yesterday, day by day.
  3. Update the crelio_data_migrations record with job_status='Completed' and execution time.

The Migration Queue Table

Source: admin/account/models/crelio_migrations.py

class CrelioDataMigrations(BaseModel):
    lab = models.ForeignKey("account.Labs", on_delete=models.CASCADE)
    is_scheduled = models.BooleanField(default=False)
    start_date = models.DateField(null=True)
    end_date = models.DateField(null=True)
    job_status = models.CharField(max_length=100, default="Pending")
    time_taken = models.PositiveIntegerField(null=True, default=0)  # seconds
    initiated_by = models.ForeignKey("support.AccountsManager", on_delete=models.CASCADE)
    migration_type = models.CharField(max_length=200)  # 'HISTORICAL_SUMMARY'

Checking Migration Status

SELECT lab_id, is_scheduled, job_status, time_taken, start_date, end_date, created_at
FROM crelio_data_migrations
WHERE lab_id = <lab_id>
  AND migration_type = 'HISTORICAL_SUMMARY'
ORDER BY created_at DESC
LIMIT 1;
is_scheduledjob_statusMeaning
1PendingQueued, waiting for next 03:30 UTC run
0CompletedRebuild finished; time_taken shows duration in seconds

Manual Queue Insert (Without Toggling)

INSERT INTO crelio_data_migrations (lab_id, is_scheduled, job_status, migration_type, initiated_by_id, created_at, modified_at)
VALUES (<lab_id>, 1, 'Pending', 'HISTORICAL_SUMMARY', <account_manager_id>, NOW(), NOW());

Source File Index

FilePurpose
operation/sql/operation_historical_summary.sqlNightly event, migration procedure, all stored procedures
report/db/procedures/operation_summary_testing.sqlAd-hoc single-day rerun procedure
admin/account/models/crelio_migrations.pyCrelioDataMigrations Django model
admin/account/support_dashboard_settings_view.pyFeature toggle handler — creates/deletes migration records

On this page