claw/docs/superpowers/specs/2026-04-19-102-full-sweep-dry-run-triage-design.md

# 102 Full Sweep Dry-Run Triage Design

> Date: 2026-04-19
> Status: Draft
> Upstream Result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
> Upstream Report: `docs/superpowers/reports/2026-04-19-102-full-sweep-dry-run-report.md`

## Design Intent

Split the non-pass buckets from the `102`-scene full sweep into concrete, actionable triage categories, without changing generator behavior or promoting any scene's status.

The design answers:

`why did 62 scenes fail to reach dry-run auto-pass, and which blocker should be handled first?`

## Starting Point

The upstream dry-run produced:

| Status | Count |
| --- | ---: |
| `auto-pass` | 40 |
| `fail-closed-known` | 26 |
| `misclassified` | 5 |
| `source-unreadable` | 31 |
| `missing-source` | 0 |
| `unsupported-family` | 0 |
| Total | 102 |

Triage covers only the `62` non-pass records (`26` + `5` + `31`).
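The bucket arithmetic can be checked mechanically. The following is a minimal sketch with the counts copied from the table above; the variable names are illustrative and not part of the dry-run fixture.

```python
# Counts copied from the upstream dry-run table.
status_counts = {
    "auto-pass": 40,
    "fail-closed-known": 26,
    "misclassified": 5,
    "source-unreadable": 31,
    "missing-source": 0,
    "unsupported-family": 0,
}

total = sum(status_counts.values())
non_pass = total - status_counts["auto-pass"]

# The sweep covers the fixed 102 scene set; triage covers the 62 non-pass records.
assert total == 102
assert non_pass == 62
```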

## Scope Guardrails

1. do not edit `src/generated_scene/analyzer.rs`
2. do not edit `src/generated_scene/generator.rs`
3. do not change scene generation logic
4. do not update `scene_execution_board_2026-04-18.json`
5. do not promote scenes from this triage
6. do not add family baselines
7. do not create implementation plans from a single failure
8. do not rerun outside the fixed `102` scene set

## Fixed Inputs

1. dry-run result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
2. dry-run output root: `examples/full_sweep_dry_run_2026-04-19`
3. execution board: `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
4. scene root: `D:/desk/智能体资料/全量业务场景/一平台场景`

## Fixed Outputs

1. triage result: `tests/fixtures/generated_scene/full_sweep_dry_run_triage_2026-04-19.json`
2. triage report: `docs/superpowers/reports/2026-04-19-102-full-sweep-dry-run-triage-report.md`

## Triage Order

The order is fixed:

1. timeout triage
2. misclassification triage
3. no-report failure triage

This order is deliberate:

1. timeouts are the largest bucket and include already-mapped `G2` scenes
2. misclassification has the cleanest routing-quality signal
3. no-report failures are too broad to classify until the higher-signal buckets are separated out

## Timeout Triage Model

Input bucket:

`dryRunStatus = source-unreadable`

Current count:

`31`

Current reason:

`generator timeout after 30s`

Target second-level labels:

1. `timeout-known-family-sample`
2. `timeout-unvalidated-source`
3. `timeout-large-source`
4. `timeout-command-hang`
5. `timeout-generator-slow-but-progressing`
6. `timeout-undetermined`

Minimum evidence per timeout record:

1. source directory exists
2. file count
3. total source bytes
4. current group
5. current board status
6. real sample record id if present
7. whether a partial skill directory exists
8. whether a partial generation report exists

Diagnostic reruns are allowed only for classification. A rerun that succeeds with a longer timeout does not promote the scene.
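The filesystem part of the evidence list (items 1 to 3 and 7) could be gathered along these lines. This is a sketch only: the function name, the output keys, and the idea of passing the skill directory in as a parameter are assumptions, since this design does not fix the layout of the dry-run output root.

```python
from pathlib import Path


def collect_source_evidence(source_dir: Path, skill_dir: Path) -> dict:
    """Gather filesystem evidence for one timeout record.

    Key names are hypothetical; they are not defined by the triage schema.
    """
    evidence = {
        "sourceDirExists": source_dir.is_dir(),
        "fileCount": 0,
        "totalSourceBytes": 0,
        "partialSkillDirExists": skill_dir.is_dir(),
    }
    if evidence["sourceDirExists"]:
        # Count regular files only, recursing into subdirectories.
        files = [p for p in source_dir.rglob("*") if p.is_file()]
        evidence["fileCount"] = len(files)
        evidence["totalSourceBytes"] = sum(p.stat().st_size for p in files)
    return evidence
```

The group, board status, and real sample record id would come from the execution board and are not covered by this sketch.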

## Misclassification Triage Model

Input bucket:

`dryRunStatus = misclassified`

Current count:

`5`

Current shape:

1. `G3 -> host_bridge_workflow`: `3`
2. `G1-E -> host_bridge_workflow`: `2`

Target second-level labels:

1. `route-overprefer-host-bridge`
2. `board-expectation-stale`
3. `mixed-workflow-host-bridge-valid`
4. `scene-family-split-needed`
5. `misclassification-undetermined`

Minimum evidence per misclassification record:

1. board expected group
2. expected archetype
3. dry-run inferred archetype
4. current source asset
5. real sample layer status
6. generated report path
7. failed or conflicting signal summary

This phase does not correct routing logic.
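The "current shape" above is a tally of (board expected group, dry-run inferred archetype) pairs. A minimal sketch of that tally follows; the field names `expectedGroup` and `inferredArchetype` are assumptions, and the records are placeholders that mirror the counts listed above rather than real fixture data.

```python
from collections import Counter


def tally_shapes(records: list) -> Counter:
    """Count misclassified records per (expected group, inferred archetype) pair."""
    return Counter((r["expectedGroup"], r["inferredArchetype"]) for r in records)


# Placeholder records mirroring the documented shape: 3 from G3, 2 from G1-E.
placeholder_records = (
    [{"expectedGroup": "G3", "inferredArchetype": "host_bridge_workflow"}] * 3
    + [{"expectedGroup": "G1-E", "inferredArchetype": "host_bridge_workflow"}] * 2
)

shape = tally_shapes(placeholder_records)
```

The tally only describes the bucket; assigning a second-level label still requires the per-record evidence listed above.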

## No-Report Failure Triage Model

Input bucket:

`dryRunStatus = fail-closed-known` and reason is `generator failed without generation report`

Current count:

`25`

Target failure stages:

1. `source-scan`
2. `analyzer`
3. `ir-assembly`
4. `readiness-before-report`
5. `compiler-package-write`
6. `panic-or-process-error`
7. `unknown-no-report`

The one `bootstrap_target` failure is tracked separately and is not merged into the no-report failures. Together with the `25` no-report records it accounts for the `26` `fail-closed-known` records.

Minimum evidence per no-report record:

1. exit code if available
2. stdout tail
3. stderr tail
4. partial skill directory exists
5. partial references directory exists
6. generated report exists
7. inferred failure stage
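A no-report evidence record might be assembled as sketched below. The function names, key names, and the 20-line tail length are assumptions. The stage defaults to `unknown-no-report` until a reviewer assigns one; that default is a choice made for this sketch, not something this design specifies.

```python
from pathlib import Path
from typing import Optional

# Stage names copied from the target failure stages above.
FAILURE_STAGES = {
    "source-scan", "analyzer", "ir-assembly", "readiness-before-report",
    "compiler-package-write", "panic-or-process-error", "unknown-no-report",
}


def tail(text: str, max_lines: int = 20) -> str:
    """Return at most the last max_lines lines of text."""
    if max_lines <= 0:
        return ""
    return "\n".join(text.splitlines()[-max_lines:])


def build_no_report_evidence(
    exit_code: Optional[int], stdout: str, stderr: str,
    skill_dir: Path, references_dir: Path, report_path: Path,
    stage: str = "unknown-no-report",
) -> dict:
    """Assemble one evidence record; key names are hypothetical."""
    if stage not in FAILURE_STAGES:
        raise ValueError(f"unknown failure stage: {stage}")
    return {
        "exitCode": exit_code,  # None when the exit code is unavailable
        "stdoutTail": tail(stdout),
        "stderrTail": tail(stderr),
        "partialSkillDirExists": skill_dir.is_dir(),
        "partialReferencesDirExists": references_dir.is_dir(),
        "generatedReportExists": report_path.is_file(),
        "inferredFailureStage": stage,
    }
```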

## Result Schema

Top-level fields:

```json
{
  "triageDate": "2026-04-19",
  "scope": "102-full-sweep-dry-run-triage",
  "sourceDryRun": "tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json",
  "summary": {},
  "timeoutTriage": [],
  "misclassificationTriage": [],
  "noReportFailureTriage": [],
  "bootstrapTargetFailures": [],
  "recommendations": []
}
```

Each triage record keeps the original dry-run scene id and scene name.
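A lightweight check of the schema above might look like the following sketch. The top-level field names are taken from the JSON block; the per-record field names `sceneId` and `sceneName` are assumptions, since this design states only that the id and name are kept, not how they are keyed.

```python
TOP_LEVEL_KEYS = {
    "triageDate", "scope", "sourceDryRun", "summary", "timeoutTriage",
    "misclassificationTriage", "noReportFailureTriage",
    "bootstrapTargetFailures", "recommendations",
}
RECORD_SECTIONS = (
    "timeoutTriage", "misclassificationTriage",
    "noReportFailureTriage", "bootstrapTargetFailures",
)
IDENTITY_FIELDS = ("sceneId", "sceneName")  # hypothetical key names


def schema_problems(result: dict) -> list:
    """Return human-readable schema problems; an empty list means none found."""
    problems = [
        f"missing top-level field: {key}"
        for key in sorted(TOP_LEVEL_KEYS - set(result))
    ]
    for section in RECORD_SECTIONS:
        for index, record in enumerate(result.get(section, [])):
            for field in IDENTITY_FIELDS:
                if not record.get(field):
                    problems.append(f"{section}[{index}] is missing {field}")
    return problems
```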

## Completion Criteria

This triage is complete when:

1. all `31` timeout records have a second-level timeout label
2. all `5` misclassified records have a routing triage label
3. all `25` no-report failures have an inferred failure stage
4. the `bootstrap_target` case remains separately visible
5. no scene status is promoted
6. no generator or analyzer logic is changed
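Criteria 1 to 4 can be checked against the triage JSON itself. The sketch below assumes each record carries its label under a hypothetical key (`timeoutLabel`, `routingLabel`, `failureStage`); the expected counts are the ones stated above. Criteria 5 and 6 concern the execution board and the source tree, so they cannot be verified from this file and are out of scope for the sketch.

```python
# section name -> (expected record count, hypothetical label key)
EXPECTED = {
    "timeoutTriage": (31, "timeoutLabel"),
    "misclassificationTriage": (5, "routingLabel"),
    "noReportFailureTriage": (25, "failureStage"),
}


def completion_problems(triage: dict) -> list:
    """Return unmet completion criteria; an empty list means criteria 1-4 hold."""
    problems = []
    for section, (expected_count, label_key) in EXPECTED.items():
        records = triage.get(section, [])
        if len(records) != expected_count:
            problems.append(
                f"{section}: expected {expected_count} records, found {len(records)}"
            )
        unlabeled = sum(1 for record in records if not record.get(label_key))
        if unlabeled:
            problems.append(f"{section}: {unlabeled} records lack {label_key}")
    if not triage.get("bootstrapTargetFailures"):
        problems.append("bootstrapTargetFailures: bootstrap_target case not visible")
    return problems
```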

## Stop Rule

Stop after publishing the triage JSON and report.

Do not start implementation fixes based on this triage unless a new, bounded implementation plan is explicitly created later.