# Structured Fail-Closed Improvement Coverage Delta Report
> Date: 2026-04-19
> Plan: `docs/superpowers/plans/2026-04-19-structured-fail-closed-improvement-roadmap-plan.md`
> Baseline: `tests/fixtures/generated_scene/full_sweep_status_reconciliation_2026-04-19.json`
> Follow-up: `tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
## Purpose
Measure the impact of the bounded structured fail-closed improvement cycle against the reconciled `102`-scene baseline.
This report does not promote scenes, update `scene_execution_board_2026-04-18.json`, introduce new scene families, or relax readiness gates.
## Baseline
The reconciled baseline before this roadmap was:
| Status | Count |
| --- | ---: |
| `auto-pass` | 48 |
| `fail-closed-known` | 48 |
| `adjudicated-valid-host-bridge` | 4 |
| `source-unreadable` | 2 |
| Total | 102 |
The baseline actionable count was `100`, using:
`auto-pass + fail-closed-known + adjudicated-valid-host-bridge`
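The actionable-count formula can be checked with a minimal sketch. The dictionary below is copied from the baseline table; the names `ACTIONABLE` and `actionable_count` are illustrative assumptions and do not come from the repository.

```python
# Status counts copied from the baseline table in this report.
baseline = {
    "auto-pass": 48,
    "fail-closed-known": 48,
    "adjudicated-valid-host-bridge": 4,
    "source-unreadable": 2,
}

# Statuses that count as actionable, per the formula stated in this report.
# (Hypothetical constant name, used only for this sketch.)
ACTIONABLE = ("auto-pass", "fail-closed-known", "adjudicated-valid-host-bridge")


def actionable_count(counts):
    """Sum the counts of the actionable statuses; missing statuses count as 0."""
    return sum(counts.get(status, 0) for status in ACTIONABLE)


total = sum(baseline.values())
actionable = actionable_count(baseline)
print(total, actionable)
```

Because `source-unreadable` is the only status excluded from the formula, the actionable count always equals the total minus the `source-unreadable` count.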
## Follow-Up Sweep
The follow-up sweep was run over the same fixed `102`-scene set and wrote results to:
`tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
Follow-up status counts:
| Status | Count |
| --- | ---: |
| `auto-pass` | 48 |
| `fail-closed-known` | 47 |
| `adjudicated-valid-host-bridge` | 4 |
| `source-unreadable` | 3 |
| Total | 102 |
Follow-up actionable count is `99`, using:
`auto-pass + fail-closed-known + adjudicated-valid-host-bridge`
## Delta
| Metric | Baseline | Follow-up | Delta |
| --- | ---: | ---: | ---: |
| `auto-pass` | 48 | 48 | 0 |
| `fail-closed-known` | 48 | 47 | -1 |
| `adjudicated-valid-host-bridge` | 4 | 4 | 0 |
| `source-unreadable` | 2 | 3 | +1 |
| actionable count | 100 | 99 | -1 |
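The delta column can be reproduced with a short sketch. Both dictionaries are copied from the status tables in this report; the helper name `delta_rows` is an assumption made for illustration. Since both sweeps cover the same fixed scene set, the per-status deltas should sum to zero, which the last line checks.

```python
# Status counts copied from the baseline and follow-up tables in this report.
baseline = {
    "auto-pass": 48,
    "fail-closed-known": 48,
    "adjudicated-valid-host-bridge": 4,
    "source-unreadable": 2,
}
followup = {
    "auto-pass": 48,
    "fail-closed-known": 47,
    "adjudicated-valid-host-bridge": 4,
    "source-unreadable": 3,
}


def delta_rows(before, after):
    """Return (status, before, after, delta) tuples, keeping the order of `before`."""
    statuses = list(before) + [s for s in after if s not in before]
    return [
        (s, before.get(s, 0), after.get(s, 0), after.get(s, 0) - before.get(s, 0))
        for s in statuses
    ]


rows = delta_rows(baseline, followup)
for status, b, a, d in rows:
    print(f"{status}: {b} -> {a} ({d:+d})")

# A fixed scene set means scenes only move between statuses.
net_change = sum(d for _, _, _, d in rows)
```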
## Interpretation
This roadmap did not increase `auto-pass` coverage.
The bounded implementation improved the structure of pre-package fail-closed reports by preserving the full `SceneIr` as `contractSnapshot`. This makes fail-closed records more diagnosable, but it does not close missing contracts by itself.
The follow-up sweep exposed one new timeout relative to the reconciled baseline:
| Scene id | Scene | Baseline status | Follow-up status | Reason |
| --- | --- | --- | --- | --- |
| `sweep-040-scene` | `嘉峪关日报` | `fail-closed-known` | `source-unreadable` | `generator timeout after 45s` |
The two existing timeout records remain timeout records:
| Scene id | Scene | Status |
| --- | --- | --- |
| `sweep-015-scene` | `任务报表` | `source-unreadable` |
| `sweep-025-scene` | `力禾动环系统巡视记录` | `source-unreadable` |
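A per-scene comparison of the two sweeps is one way to surface transitions such as the `sweep-040-scene` timeout. The sketch below assumes each sweep has already been reduced to a mapping from scene id to status; the actual layout of the JSON fixtures is not described in this report, so that shape and the name `status_transitions` are hypothetical. Only the three scenes named above are included.

```python
# Scene id -> status, limited to the three scenes named in this report.
# The mapping shape is an assumption; the real fixture layout may differ.
baseline_status = {
    "sweep-015-scene": "source-unreadable",
    "sweep-025-scene": "source-unreadable",
    "sweep-040-scene": "fail-closed-known",
}
followup_status = {
    "sweep-015-scene": "source-unreadable",
    "sweep-025-scene": "source-unreadable",
    "sweep-040-scene": "source-unreadable",
}


def status_transitions(before, after):
    """Return {scene_id: (old_status, new_status)} for scenes whose status changed."""
    shared_ids = sorted(before.keys() & after.keys())
    return {
        scene_id: (before[scene_id], after[scene_id])
        for scene_id in shared_ids
        if before[scene_id] != after[scene_id]
    }


changed = status_transitions(baseline_status, followup_status)
for scene_id, (old, new) in changed.items():
    print(f"{scene_id}: {old} -> {new}")
```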
## Remaining Structured Fail-Closed Buckets
The follow-up sweep still has `47` structured fail-closed records:
| Archetype | Reason | Count |
| --- | --- | ---: |
| `paginated_enrichment` | `workflow evidence is incomplete before package generation` | 34 |
| `local_doc_pipeline` | `workflow evidence is incomplete before package generation` | 5 |
| `multi_mode_request` | `workflow evidence is incomplete before package generation` | 4 |
| `single_request_enrichment` | `workflow evidence is incomplete before package generation` | 2 |
| `host_bridge_workflow` | `workflow evidence is incomplete before package generation` | 1 |
| `page_state_eval` | `structured fail-closed report emitted` | 1 |
The largest remaining bucket is still `paginated_enrichment`, now `34` records.
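The bucket table can be sanity-checked against the `47` fail-closed total with a small sketch. The rows are copied from the table above; the variable names are illustrative assumptions.

```python
from collections import Counter

# (archetype, reason, count) rows copied from the bucket table in this report.
INCOMPLETE = "workflow evidence is incomplete before package generation"
buckets = [
    ("paginated_enrichment", INCOMPLETE, 34),
    ("local_doc_pipeline", INCOMPLETE, 5),
    ("multi_mode_request", INCOMPLETE, 4),
    ("single_request_enrichment", INCOMPLETE, 2),
    ("host_bridge_workflow", INCOMPLETE, 1),
    ("page_state_eval", "structured fail-closed report emitted", 1),
]

# The bucket counts should add up to the follow-up fail-closed total.
fail_closed_total = sum(count for _, _, count in buckets)

# Largest bucket by record count.
largest_archetype, _, largest_count = max(buckets, key=lambda row: row[2])

# Records grouped by reason, regardless of archetype.
by_reason = Counter()
for _, reason, count in buckets:
    by_reason[reason] += count

print(fail_closed_total, largest_archetype, largest_count)
```

Grouping by reason shows that all but one of the remaining records share the same incomplete-workflow-evidence reason.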
## Boundaries Preserved
This follow-up did not:
1. update `scene_execution_board_2026-04-18.json`
2. promote scenes
3. introduce a new scene family
4. reopen adjudicated host-bridge records
5. treat timeout records as implementation targets
6. loosen readiness gates
## Conclusion
The roadmap produced a better fail-closed evidence artifact, but it did not increase `auto-pass` coverage. The actionable count fell from `100` to `99` because one scene that was `fail-closed-known` in the baseline timed out in the follow-up sweep.
The next step, outside this roadmap, is either a bounded timeout regression diagnostic for `sweep-040-scene` and the two persistent timeout records, or a new recovery roadmap focused on the remaining `34` `paginated_enrichment` fail-closed records.