claw/docs/superpowers/reports/2026-04-19-102-full-sweep-dry-run-report.md

# 102 Full Sweep Dry-Run Report

> Date: 2026-04-19
> Plan: `docs/superpowers/plans/2026-04-19-102-full-sweep-dry-run-plan.md`
> Result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
> Output Root: `examples/full_sweep_dry_run_2026-04-19`

## Scope

This run measured the current generic `scene -> skill` coverage across the fixed `102`-scene execution board.

It was a measurement-only dry-run:

1. no analyzer logic was changed
2. no generator logic was changed
3. `scene_execution_board_2026-04-18.json` was not updated
4. no scene was promoted from this result
5. failures were recorded, not fixed

## Headline Numbers

| Metric | Count |
| --- | ---: |
| Real-sample executed pass | 5 / 102 |
| Code-backed ledger coverage | 23 / 102 |
| Dry-run auto-pass | 40 / 102 |
| Dry-run actionable coverage | 66 / 102 |

`dry-run actionable coverage` is `auto-pass + fail-closed-known`, that is `40 + 26 = 66`.

## Dry-Run Summary

| Dry-run status | Count |
| --- | ---: |
| `auto-pass` | 40 |
| `fail-closed-known` | 26 |
| `misclassified` | 5 |
| `unsupported-family` | 0 |
| `missing-source` | 0 |
| `source-unreadable` | 31 |
| Total | 102 |
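As a cross-check, the actionable-coverage figure can be recomputed from the status counts in the table above. This is a minimal sketch: the counts are copied from this report, while the function names and the `dry_run_status` field name are assumptions for illustration and are not taken from the result file's actual schema.

```python
from collections import Counter

# Status counts copied from the Dry-Run Summary table in this report.
STATUS_COUNTS = {
    "auto-pass": 40,
    "fail-closed-known": 26,
    "misclassified": 5,
    "unsupported-family": 0,
    "missing-source": 0,
    "source-unreadable": 31,
}

# Per this report, actionable coverage = auto-pass + fail-closed-known.
ACTIONABLE_STATUSES = ("auto-pass", "fail-closed-known")


def actionable_coverage(counts):
    """Return (actionable, total) for a status -> count mapping."""
    actionable = sum(counts.get(status, 0) for status in ACTIONABLE_STATUSES)
    total = sum(counts.values())
    return actionable, total


def tally_statuses(records, status_key="dry_run_status"):
    """Count records per status.

    `status_key` is a hypothetical field name; adjust it to whatever the
    result JSON actually uses.
    """
    return Counter(record[status_key] for record in records)


actionable, total = actionable_coverage(STATUS_COUNTS)
print(f"{actionable} / {total}")
```

With the counts above this reproduces the `66 / 102` headline figure.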

## Archetype Distribution

| Inferred archetype | Count |
| --- | ---: |
| `host_bridge_workflow` | 31 |
| `paginated_enrichment` | 8 |
| `multi_mode_request` | 3 |
| `multi_endpoint_inventory` | 2 |
| `page_state_eval` | 2 |
| `none` | 56 |

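The `none` bucket covers scenes that did not produce a `generation-report.json`: generator failures and timeout cases. Its size is consistent with the status counts in this report: `31` timeouts plus `25` no-report generator failures gives `56`.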

## Auto-Pass Shape

The `40` auto-pass scenes are distributed as:

| Inferred archetype | Auto-pass count |
| --- | ---: |
| `host_bridge_workflow` | 26 |
| `paginated_enrichment` | 8 |
| `multi_mode_request` | 3 |
| `multi_endpoint_inventory` | 2 |
| `page_state_eval` | 1 |

This means the current generic generator is no longer limited to the `23` code-backed ledger scenes. The conservative ledger coverage is lower because it only counts scenes already mapped into formal baseline or boundary assets.
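The two archetype tables above can be cross-checked by subtraction: inferred count minus auto-pass count leaves the scenes per archetype that did not auto-pass. The sketch below uses only counts copied from this report; the function name is an assumption for illustration.

```python
# Counts copied from the Archetype Distribution table.
INFERRED = {
    "host_bridge_workflow": 31,
    "paginated_enrichment": 8,
    "multi_mode_request": 3,
    "multi_endpoint_inventory": 2,
    "page_state_eval": 2,
    "none": 56,
}

# Counts copied from the Auto-Pass Shape table.
AUTO_PASS = {
    "host_bridge_workflow": 26,
    "paginated_enrichment": 8,
    "multi_mode_request": 3,
    "multi_endpoint_inventory": 2,
    "page_state_eval": 1,
}


def non_pass_by_archetype(inferred, auto_pass):
    """Return archetype -> number of scenes that were inferred but did not auto-pass."""
    residual = {}
    for archetype, count in inferred.items():
        remaining = count - auto_pass.get(archetype, 0)
        if remaining > 0:
            residual[archetype] = remaining
    return residual


residual = non_pass_by_archetype(INFERRED, AUTO_PASS)
print(residual)
print(sum(residual.values()))  # equals 102 - 40
```

The residual of `5` under `host_bridge_workflow` matches the five misclassified scenes listed in this report. The single remaining `page_state_eval` scene is not attributed to a status by the tables here, so that attribution should be confirmed against the result file.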

## Non-Pass Buckets

### Source-Unreadable

`31` scenes timed out during this bounded dry-run.

All timeout records use:

`generator timeout after 30s`

These should not be interpreted as evidence of an unsupported family. They are dry-run execution-limit failures and need separate timeout/performance triage before any capability conclusions are drawn.
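To illustrate how a bounded run yields this kind of record, the sketch below wraps a command in a wall-clock limit and maps a timeout to the reason string used in this report. It is an assumption-laden example, not the harness used for this run: `run_bounded`, the returned field names, and the `completed` status are hypothetical, and the demo command is a stand-in for the real generator invocation.

```python
import subprocess
import sys


def run_bounded(cmd, timeout_s=30):
    """Run `cmd` with a wall-clock limit and return a small result record.

    The record shape is hypothetical. A timeout is reported as an
    execution-limit failure, not as a verdict about the scene's family.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return {
            "status": "source-unreadable",
            "reason": f"generator timeout after {timeout_s}s",
        }
    # "completed" only means the process exited within the limit;
    # pass/fail classification would happen in a later step.
    return {
        "status": "completed",
        "returncode": proc.returncode,
        "stderr_tail": proc.stderr[-500:],
    }


if __name__ == "__main__":
    # Stand-in command: a child process that sleeps longer than the limit.
    demo = run_bounded(
        [sys.executable, "-c", "import time; time.sleep(2)"], timeout_s=0.2
    )
    print(demo)
```

A record produced this way cannot distinguish a slow but healthy generator from an oversized source or a hung command, which is why the timeout bucket needs its own triage.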

### Fail-Closed-Known

`26` scenes did not auto-pass but failed closed, each recorded with a known dry-run failure category.

Top reasons:

| Reason | Count |
| --- | ---: |
| `generator failed without generation report` | 25 |
| `bootstrap_target` | 1 |

The `generator failed without generation report` bucket is actionable but too broad for implementation work. It should be split in a later bounded triage pass before any fixes are attempted.

### Misclassified

`5` scenes produced a package, but the inferred archetype conflicted with the current board group:

| Scene | Current group | Inferred archetype |
| --- | --- | --- |
| `95598报修工单日管控` | `G3` | `host_bridge_workflow` |
| `95598重要服务事项报备统计表` | `G3` | `host_bridge_workflow` |
| `用电报装信息统计列表` | `G1-E` | `host_bridge_workflow` |
| `配网支撑月报(95598抢修统计报表)` | `G3` | `host_bridge_workflow` |
| `高低压新增报装容量月度统计表` | `G1-E` | `host_bridge_workflow` |

This is the clearest blocker category from the dry-run because it indicates current generic routing can over-prefer `host_bridge_workflow` on some scenes that already have board-level family expectations.
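A board-vs-archetype check of this kind can be sketched as a lookup against per-group expectations. This report does not state which archetypes `G3` or `G1-E` expect, so the expectation mapping below uses clearly marked placeholder values; `find_mismatches` and the record field names are also hypothetical.

```python
def find_mismatches(records, allowed_by_group):
    """Return records whose inferred archetype is not allowed for their board group.

    Groups with no recorded expectation are skipped rather than flagged,
    because there is nothing to compare against.
    """
    mismatches = []
    for record in records:
        allowed = allowed_by_group.get(record["group"])
        if allowed is None:
            continue
        if record["archetype"] not in allowed:
            mismatches.append(record)
    return mismatches


# Rows copied from the Misclassified table in this report.
RECORDS = [
    {"scene": "95598报修工单日管控", "group": "G3", "archetype": "host_bridge_workflow"},
    {"scene": "95598重要服务事项报备统计表", "group": "G3", "archetype": "host_bridge_workflow"},
    {"scene": "用电报装信息统计列表", "group": "G1-E", "archetype": "host_bridge_workflow"},
    {"scene": "配网支撑月报(95598抢修统计报表)", "group": "G3", "archetype": "host_bridge_workflow"},
    {"scene": "高低压新增报装容量月度统计表", "group": "G1-E", "archetype": "host_bridge_workflow"},
]

# PLACEHOLDER expectations: the real allowed archetypes per group are not
# given in this report and must come from the execution board.
ALLOWED_BY_GROUP = {
    "G3": {"<expected archetype(s) for G3>"},
    "G1-E": {"<expected archetype(s) for G1-E>"},
}

for record in find_mismatches(RECORDS, ALLOWED_BY_GROUP):
    print(record["scene"], record["group"], record["archetype"])
```

Keeping the expectations in data rather than in code means the check can be rerun unchanged once the board's real group-to-archetype mapping is plugged in.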

## Interpretation

The four coverage numbers answer different questions:

1. `5 / 102` is the strict real-sample pass count.
2. `23 / 102` is the formal code-backed ledger coverage.
3. `40 / 102` is the current generic dry-run auto-pass count.
4. `66 / 102` is the current generic actionable coverage count.
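For quick comparison, the four counts above can be expressed as shares of the `102`-scene board. The sketch uses only the counts from this report; the variable and function names are assumptions for illustration.

```python
BOARD_SIZE = 102

# Counts copied from the Headline Numbers table.
COVERAGE_COUNTS = {
    "real-sample executed pass": 5,
    "code-backed ledger coverage": 23,
    "dry-run auto-pass": 40,
    "dry-run actionable coverage": 66,
}


def as_shares(counts, total):
    """Return metric -> percentage string with one decimal place."""
    return {metric: f"{count / total:.1%}" for metric, count in counts.items()}


for metric, share in as_shares(COVERAGE_COUNTS, BOARD_SIZE).items():
    print(f"{metric}: {share}")
```

This gives roughly 4.9%, 22.5%, 39.2%, and 64.7% respectively.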

The key result is that the generic generator currently auto-passes more scenes than the formal ledger coverage shows, but the result is not clean enough to promote automatically because:

1. `31` scenes hit bounded dry-run timeouts.
2. `5` scenes show board-vs-archetype mismatch.
3. `26` fail-closed-known scenes need more specific failure extraction before implementation work; `25` of them carry only the generic no-report reason.

## Recommended Next Blocker

Do not start implementation from this report directly.

The next bounded step should be a dry-run triage pass, with priority:

1. split the `31` timeout cases into true timeout, oversized source, and command-level hang
2. inspect the `5` misclassified cases as the first routing-quality sample
3. refine the `25` generic no-report failures into concrete failure categories

This report does not update the execution board and does not promote any scene.