# 102 Full Sweep Dry-Run Triage Design

> Date: 2026-04-19
> Status: Draft
> Upstream Result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
> Upstream Report: `docs/superpowers/reports/2026-04-19-102-full-sweep-dry-run-report.md`

## Design Intent

Split the non-pass buckets from the `102` scene full sweep into concrete, actionable triage categories without changing generator behavior or promoting scene status.

The design answers:

`why did 62 scenes not become dry-run auto-pass, and which blocker should be handled first?`

## Starting Point

The upstream dry-run produced:

| Status | Count |
| --- | ---: |
| `auto-pass` | 40 |
| `fail-closed-known` | 26 |
| `misclassified` | 5 |
| `source-unreadable` | 31 |
| `missing-source` | 0 |
| `unsupported-family` | 0 |
| Total | 102 |

The triage scope is only the `62` non-pass records.

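
As a quick consistency check on the table above, the non-pass total can be recomputed from the per-status counts. This is only a sketch: the counts are copied from the table, and the variable names are illustrative rather than part of any existing tooling.

```python
# Status counts copied from the upstream dry-run table.
status_counts = {
    "auto-pass": 40,
    "fail-closed-known": 26,
    "misclassified": 5,
    "source-unreadable": 31,
    "missing-source": 0,
    "unsupported-family": 0,
}

total = sum(status_counts.values())
# Every status other than auto-pass is in triage scope.
non_pass = total - status_counts["auto-pass"]

print(total, non_pass)  # 102 62
```
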
## Scope Guardrails

1. do not edit `src/generated_scene/analyzer.rs`
2. do not edit `src/generated_scene/generator.rs`
3. do not change scene generation logic
4. do not update `scene_execution_board_2026-04-18.json`
5. do not promote scenes from this triage
6. do not add family baselines
7. do not create implementation plans from a single failure
8. do not rerun outside the fixed `102` scene set

## Fixed Inputs

1. dry-run result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
2. dry-run output root: `examples/full_sweep_dry_run_2026-04-19`
3. execution board: `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
4. scene root: `D:/desk/智能体资料/全量业务场景/一平台场景`

## Fixed Outputs

1. triage result: `tests/fixtures/generated_scene/full_sweep_dry_run_triage_2026-04-19.json`
2. triage report: `docs/superpowers/reports/2026-04-19-102-full-sweep-dry-run-triage-report.md`

## Triage Order

The order is fixed:

1. timeout triage
2. misclassification triage
3. no-report failure triage

This order is deliberate:

1. timeouts are the largest bucket and include already-mapped `G2` scenes
2. misclassification has the cleanest routing-quality signal
3. no-report failures are too broad until the higher-signal buckets are separated

## Timeout Triage Model

Input bucket:

`dryRunStatus = source-unreadable`

Current count:

`31`

Current reason:

`generator timeout after 30s`

Target second-level labels:

1. `timeout-known-family-sample`
2. `timeout-unvalidated-source`
3. `timeout-large-source`
4. `timeout-command-hang`
5. `timeout-generator-slow-but-progressing`
6. `timeout-undetermined`

Minimum evidence per timeout record:

1. source directory exists
2. file count
3. total source bytes
4. current group
5. current board status
6. real sample record id if present
7. whether a partial skill directory exists
8. whether a partial generation report exists

Diagnostic reruns are allowed only for classification. A rerun that succeeds with a longer timeout does not promote the scene.

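
The filesystem part of the timeout evidence (items 1, 2, 3, 7, and 8) could be gathered along the following lines. This is a minimal sketch, not the project's tooling: the function name, the returned keys, and the default `report_name` are all assumptions, and the group, board status, and sample record id would still have to come from the execution board.

```python
from pathlib import Path


def collect_source_evidence(source_dir, skill_dir, report_name="generation-report.json"):
    """Gather filesystem evidence for one timeout record.

    `report_name` is a hypothetical file name; substitute whatever the
    generator actually writes into the skill directory.
    """
    source = Path(source_dir)
    skill = Path(skill_dir)
    # Walk the source tree only when the directory is really there.
    files = [p for p in source.rglob("*") if p.is_file()] if source.is_dir() else []
    return {
        "sourceDirExists": source.is_dir(),
        "fileCount": len(files),
        "totalSourceBytes": sum(p.stat().st_size for p in files),
        "partialSkillDirExists": skill.is_dir(),
        "partialReportExists": (skill / report_name).is_file(),
    }
```

A missing source directory yields zero files and zero bytes rather than an error, so such a record can still be written out and labelled.
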
## Misclassification Triage Model

Input bucket:

`dryRunStatus = misclassified`

Current count:

`5`

Current shape:

1. `G3 -> host_bridge_workflow`: `3`
2. `G1-E -> host_bridge_workflow`: `2`

Target second-level labels:

1. `route-overprefer-host-bridge`
2. `board-expectation-stale`
3. `mixed-workflow-host-bridge-valid`
4. `scene-family-split-needed`
5. `misclassification-undetermined`

Minimum evidence per misclassification record:

1. board expected group
2. expected archetype
3. dry-run inferred archetype
4. current source asset
5. real sample layer status
6. generated report path
7. failed or conflicting signal summary

This phase does not correct routing logic.

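
The "current shape" summary above is a count of misclassified records grouped by expected group and inferred archetype. One way to recompute it is sketched below. `dryRunStatus` appears in this document, but `expectedGroup` and `inferredArchetype` are assumed field names, and the real dry-run JSON may name them differently.

```python
from collections import Counter


def misclassification_shape(records):
    """Count misclassified records by 'expected group -> inferred archetype'."""
    return Counter(
        # `expectedGroup` and `inferredArchetype` are hypothetical field names.
        f"{r['expectedGroup']} -> {r['inferredArchetype']}"
        for r in records
        if r.get("dryRunStatus") == "misclassified"
    )
```

Run against the upstream dry-run records, the result should match the two lines of the current shape; a mismatch would suggest the bucket has changed since the upstream report.
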
## No-Report Failure Triage Model

Input bucket:

`dryRunStatus = fail-closed-known` and reason is `generator failed without generation report`

Current count:

`25`

Target failure stages:

1. `source-scan`
2. `analyzer`
3. `ir-assembly`
4. `readiness-before-report`
5. `compiler-package-write`
6. `panic-or-process-error`
7. `unknown-no-report`

The one `bootstrap_target` failure remains separately tracked and is not merged into no-report failures.

Minimum evidence per no-report record:

1. exit code if available
2. stdout tail
3. stderr tail
4. partial skill directory exists
5. partial references directory exists
6. generated report exists
7. inferred failure stage

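
The input-bucket rule for this phase amounts to a two-way split of the `fail-closed-known` records: those whose reason is the no-report message, and everything else (where the `bootstrap_target` case would land). A sketch of that split follows, assuming the failure reason is stored under a `reason` field, which is a guess at the dry-run JSON layout.

```python
NO_REPORT_REASON = "generator failed without generation report"


def split_fail_closed(records):
    """Separate no-report failures from other fail-closed-known records."""
    no_report, other = [], []
    for r in records:
        if r.get("dryRunStatus") != "fail-closed-known":
            continue  # other statuses belong to other triage phases
        # `reason` is an assumed field name for the dry-run failure reason.
        if r.get("reason") == NO_REPORT_REASON:
            no_report.append(r)
        else:
            other.append(r)
    return no_report, other
```

Keeping the second list separate instead of discarding it preserves the requirement that the `bootstrap_target` failure stays visible on its own.
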
## Result Schema

Top-level fields:

```json
{
  "triageDate": "2026-04-19",
  "scope": "102-full-sweep-dry-run-triage",
  "sourceDryRun": "tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json",
  "summary": {},
  "timeoutTriage": [],
  "misclassificationTriage": [],
  "noReportFailureTriage": [],
  "bootstrapTargetFailures": [],
  "recommendations": []
}
```

Each triage record keeps the original dry-run scene id and scene name.

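
To illustrate how the skeleton and the per-record rule fit together, the sketch below builds an empty result and a record that carries the scene id and name forward. The top-level keys come from the schema above; the record keys `sceneId` and `sceneName` and both helper names are assumptions, since the per-record field names are not fixed here.

```python
import json


def empty_triage_result():
    """Return the top-level triage result with empty sections."""
    return {
        "triageDate": "2026-04-19",
        "scope": "102-full-sweep-dry-run-triage",
        "sourceDryRun": "tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json",
        "summary": {},
        "timeoutTriage": [],
        "misclassificationTriage": [],
        "noReportFailureTriage": [],
        "bootstrapTargetFailures": [],
        "recommendations": [],
    }


def make_record(scene_id, scene_name, **fields):
    """Build one triage record that keeps the dry-run scene id and name.

    `sceneId` and `sceneName` are hypothetical key names.
    """
    return {"sceneId": scene_id, "sceneName": scene_name, **fields}


result = empty_triage_result()
# ensure_ascii=False keeps non-ASCII scene names readable in the output file.
serialized = json.dumps(result, indent=2, ensure_ascii=False)
```
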
## Completion Criteria

This triage is complete when:

1. all `31` timeout records have a second-level timeout label
2. all `5` misclassified records have a routing triage label
3. all `25` no-report failures have an inferred failure stage
4. the `bootstrap_target` case remains separately visible
5. no scene status is promoted
6. no generator or analyzer logic is changed

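
Criteria 1 through 4 can be checked mechanically against the triage JSON; criteria 5 and 6 concern the repository and the execution board and need a separate review. The sketch below assumes each record stores its label under `timeoutLabel`, `routingLabel`, or `failureStage`. Those three field names are hypothetical, while the label values and expected counts are taken from this design.

```python
TIMEOUT_LABELS = {
    "timeout-known-family-sample", "timeout-unvalidated-source",
    "timeout-large-source", "timeout-command-hang",
    "timeout-generator-slow-but-progressing", "timeout-undetermined",
}
ROUTING_LABELS = {
    "route-overprefer-host-bridge", "board-expectation-stale",
    "mixed-workflow-host-bridge-valid", "scene-family-split-needed",
    "misclassification-undetermined",
}
FAILURE_STAGES = {
    "source-scan", "analyzer", "ir-assembly", "readiness-before-report",
    "compiler-package-write", "panic-or-process-error", "unknown-no-report",
}


def completion_problems(result):
    """Return unmet completion criteria; an empty list means criteria 1-4 hold."""
    checks = [
        # (section, assumed label field, allowed labels, expected record count)
        ("timeoutTriage", "timeoutLabel", TIMEOUT_LABELS, 31),
        ("misclassificationTriage", "routingLabel", ROUTING_LABELS, 5),
        ("noReportFailureTriage", "failureStage", FAILURE_STAGES, 25),
    ]
    problems = []
    for section, field, allowed, expected in checks:
        records = result.get(section, [])
        if len(records) != expected:
            problems.append(f"{section}: expected {expected} records, found {len(records)}")
        bad = [r for r in records if r.get(field) not in allowed]
        if bad:
            problems.append(f"{section}: {len(bad)} records lack a valid {field}")
    if not result.get("bootstrapTargetFailures"):
        problems.append("bootstrapTargetFailures: bootstrap_target case is not separately visible")
    return problems
```
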
## Stop Rule

Stop after publishing the triage JSON and report.

Do not start implementation correction from this triage unless a new bounded implementation plan is explicitly created later.