Skip to content

ADR-0035: Appointment-Slot Inverted-Window DB CHECK Backstop (wijs)

Accepted wijs Territory-Specific

Date: 2026-06-24 Compliance: AVG/GDPR (care-adjacent resident scheduling data)

Context

The 2026-06-17 ghost-group production incident was a class of data corruption in which slot-mutation paths wrote appointment_slots rows with end_date < start_date ("inverted windows"). These rows are invisible to the read path since the date-precise calendar fix (PR #49) but accumulate silently and require periodic manual cleanup.

The two known generators are now guarded at the application layer and tested:

  • UpdateLocationAppointmentSlotsAction ($hasAfterRemainder skip of the inverting after-slot).
  • UpdateLocationAction::transferAppointmentsToNewSlot (hard-delete the superseded original instead of shorten-to-invert).
  • LocationRequest::after() (rejects end_date < start_date per opening-day).

Those guards close the known write paths. They do not protect against a future write path — a new Action, a console command, a raw query, a seeder — reintroducing the same corruption. Doctrine's enforcement-escalation ladder says: when a pattern recurs, push enforcement to the lowest level that catches it before a human sees it. A database CHECK constraint is below Level 1 in spirit — it catches the violation at write time regardless of which code path produced it.

A 2026-06-24 read-only prod census established the two load-bearing facts:

  1. The engine enforces CHECK. wijs prod is MySQL 8.0.45 (CHECK enforced since 8.0.16). This is not the MySQL 5.7 parse-but-ignore no-op — the constraint will actually reject writes.
  2. Existing data blocks a naive ADD CONSTRAINT. 576 inverted rows exist (48 live, 528 soft-deleted). MySQL 8.0 validates existing rows when adding an ENFORCED CHECK → ERROR 3819. The constraint applies to every physical row, so soft-deleting the live rows does not clear the obstruction. The CHECK can only land after a hard data repair. The 48 live rows are contained in 2 groups across 2 locations; 8 of them carry 9 real resident-appointment occupations (RESTRICT FK) that must be repointed before deletion. The 528 trashed rows carry no occupations.

Full census + evidence: campaigns/wijs/2026-06-24-appointment-slot-db-check-backstop-census.md.

Decision

Add a CHECK constraint to appointment_slots enforcing the window invariant, preceded by a data repair, via a two-step rollout (cleanup verified on prod, then constraint):

Step 1 — data repair migration (cleanup PR)

The 2026-06-24 governing-slot dig resolved every occupation to exactly one live, non-inverted, date-covering governing slot, all collision-clear against the (appointment_id, appointment_slot_id) unique index. All 9 occupations belong to cancelled (soft-deleted) appointments — zero live-scheduling risk. The migration encodes this explicit mapping (surgical, by occ_id — not a heuristic):

occupation idappointmentinverted slot→ governing slot
6204630076818 (loc 6, ts 1300)6963
6204730076819 (loc 6, ts 1330)6965
6206830146820 (loc 6, ts 1400)6967
6207230156821 (loc 6, ts 1430)6969
6207330156822 (loc 6, ts 1500)6971
6207530166821 (loc 6, ts 1430)6969
65603434622452 (loc 58, ts 1500)27694
65604434622453 (loc 58, ts 1530)27696
65605434622454 (loc 58, ts 1600)27698
sql
-- 1a. Repoint the 9 occupation pivots onto their governing slot (by occ_id, explicit).
--     Raw pivot UPDATE bypasses the SoftDeletes scope (mirrors UpdateLocationAction c81aa1bd).
UPDATE appointment_slot_occupation SET appointment_slot_id = 6963  WHERE id = 62046;
UPDATE appointment_slot_occupation SET appointment_slot_id = 6965  WHERE id = 62047;
UPDATE appointment_slot_occupation SET appointment_slot_id = 6967  WHERE id = 62068;
UPDATE appointment_slot_occupation SET appointment_slot_id = 6969  WHERE id = 62072;
UPDATE appointment_slot_occupation SET appointment_slot_id = 6971  WHERE id = 62073;
UPDATE appointment_slot_occupation SET appointment_slot_id = 6969  WHERE id = 62075;
UPDATE appointment_slot_occupation SET appointment_slot_id = 27694 WHERE id = 65603;
UPDATE appointment_slot_occupation SET appointment_slot_id = 27696 WHERE id = 65604;
UPDATE appointment_slot_occupation SET appointment_slot_id = 27698 WHERE id = 65605;

-- 1b. Now every inverted row (48 live + 528 trashed) is occupation-free → one DELETE clears all 576.
--     forceDelete equivalent; no RESTRICT FK violation remains.
DELETE FROM appointment_slots WHERE end_date IS NOT NULL AND end_date < start_date;

Re-census guard (build/deploy time): immediately before running the repair on prod, re-assert the set is unchanged — exactly 576 inverted rows, the 9 occupation ids above still present and still mapping to the same governing slots, all collision-clear. The application guards are live, so no new inverted rows should have appeared; abort and re-dig if the census drifts.

Step 2 — constraint migration

php
// after Step 1 is merged and verified clean on prod
DB::statement(<<<'SQL'
    ALTER TABLE appointment_slots
    ADD CONSTRAINT chk_appointment_slots_window
    CHECK (end_date IS NULL OR end_date >= start_date)
SQL);

end_date IS NULL (open-ended) and end_date = start_date (single-day) remain valid — only a strict end_date < start_date is rejected. The down migration drops the constraint by name.

Complementary enforcement

File a Level-1/2 entry on enforcement/queue.md: a guard that flags any future migration dropping chk_appointment_slots_window (the constraint is only protective while it exists). Optional — decided in deliberation, not pre-committed here.

Options Considered

OptionVerdictReason
Application guards only (status quo)RejectedCloses known write paths but not future ones; corruption stays silent until the next read-path regression or manual census surfaces it.
Bare ADD CONSTRAINT CHECKRejectedErrors on the 576 existing inverted rows (ERROR 3819); cannot land without repair.
Soft-delete the live inverted rows, then add CHECKRejectedCHECK applies to physical rows including trashed ones — soft-delete leaves all 576 in violation. Does not clear the obstruction.
Single atomic clean-then-constrain migrationRejected (for now)Atomically correct, but bundles a destructive repair of live resident-appointment data with a schema change in one irreversible prod step. For an AVG care-data territory, separating cleanup (verifiable on prod) from the constraint is the more conservative path.
Two-step: repair PR (verify on prod) → constraint PRAcceptedRepair is verifiable in isolation before the constraint hard-locks the invariant; the 9 live occupations get a checkpoint between "data moved" and "schema enforced."
Window invariant as a DB trigger instead of CHECKRejectedHeavier, harder to read, MySQL trigger semantics are surprising; CHECK is declarative and self-documenting, and the engine enforces it natively here.

Consequences

Positive

  • Write-path-agnostic backstop. Any future code path (Action, command, raw query, seeder) that would write an inverted window now fails at the database, not silently. Defense in depth below the application guards.
  • Self-documenting invariant. The constraint encodes the domain rule ("a slot's window may be open-ended or a single day, but never end before it starts") in the schema itself.
  • One-time data hygiene. The repair clears all 576 existing inverted rows, including the 19-ghost residue the 2026-06-17 cleanup did not fully target.

Negative

  • Failure mode flips to a 500. A write that previously corrupted silently now throws an integrity-constraint violation. This is intended (fail loud > corrupt silent) but means any un-guarded legitimate write path becomes a user-facing error until fixed. Mitigated by the application guards already shipped.
  • Repair touches live resident-appointment data. 9 occupations must be repointed correctly; a wrong destination orphans or mis-attributes a real appointment.

Risks

  • Governing-slot ambiguity for the 8 occupied rowsMitigation: verify the destination slot per group at build time before writing the migration; do not generalize. If a group's governing slot is ambiguous, escalate to the Commander rather than guess.
  • Prod deploy vehicle — wijs prod ships via manual fly deploy from a pristine clone (the automated path is WR-0104). Mitigation: run the repair migration with the same discipline as the 2026-06-17 incident deploy (pristine clone, Bitbucket/GitHub-ancestry check, off-hours resident-free window).
  • New inverted rows appearing between census and repairMitigation: the application guards are already live, so no new inverted rows should be generated; re-run the census immediately before the repair to confirm the set is still 576 / the 2 live groups unchanged.

Enforcement

  • Level 1 (DB): the CHECK constraint itself — write-time rejection, engine-enforced (MySQL 8.0.45).
  • Level 1/2 (arch/static): optional guard against a future migration dropping the constraint (enforcement queue entry, decided in deliberation).
  • Level 4 (doctrine): a line in the wijs CLAUDE.md slot section recording the invariant + the constraint name, so future slot work knows the floor exists.

Implementation

Territory-specific (wijs only). The window invariant is a wijs schema concern; no fleet projection.

TerritoryStateDetail
wijsComplete (code) — prod-deploy is the residualBoth rollout steps merged to development with feature tests; prod data-repair + constraint apply on the next manual fly deploy (WR-0104 vehicle).
  • Step 1 — data repair migration: database/migrations/2026_06_24_120000_repair_inverted_appointment_slot_windows.php (the surgical 9-occupation repoint + DELETE of all 576 inverted rows), covered by tests/Feature/Migrations/RepairInvertedAppointmentSlotWindowsTest.php. Merged.
  • Step 2 — constraint migration: database/migrations/2026_06_26_120000_add_appointment_slots_window_check_constraint.php adds chk_appointment_slots_window (e3628c64), covered by tests/Feature/Migrations/AppointmentSlotsWindowCheckConstraintTest.php. A driver-guard follow-up (0319db1d) skips the CHECK on sqlite so the local suite stays green (CHECK is MySQL 8.0 / CI MariaDB only). Merged.
  • Level-4 doctrine: the invariant + constraint name are recorded in the wijs CLAUDE.md slot section. Done.
  • Complementary drop-guard (the optional Level-1/2 arch test flagging a future migration that drops the constraint): Not started — left optional per deliberation; promote to the enforcement queue if a second window-invariant regression surfaces.
  • Residual: confirm the repair + constraint actually ran against prod (576-row repair is the load-bearing, irreversible step) once the manual fly deploy lands. The automated GitHub deploy flow is WR-0104.

Architecture documentation for contributors and collaborators.