claw/docs/superpowers/plans/2026-04-19-102-full-sweep-dry-run-triage-plan.md

102 Full Sweep Dry-Run Triage Plan

Date: 2026-04-19
Status: Draft
Upstream Spec: docs/superpowers/specs/2026-04-19-102-full-sweep-dry-run-triage-design.md

Plan Intent

Turn the 62 non-pass records from the full sweep into concrete triage buckets while staying measurement-only.

The plan must not fix generator failures. It only explains them.

Fixed Inputs

  1. dry-run result: tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json
  2. dry-run output root: examples/full_sweep_dry_run_2026-04-19
  3. execution board: tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json
  4. scene root: D:/desk/智能体资料/全量业务场景/一平台场景

Fixed Outputs

  1. triage result: tests/fixtures/generated_scene/full_sweep_dry_run_triage_2026-04-19.json
  2. triage report: docs/superpowers/reports/2026-04-19-102-full-sweep-dry-run-triage-report.md

Non-Negotiable Scope Guardrails

  1. do not edit analyzer implementation
  2. do not edit generator implementation
  3. do not update scene_execution_board_2026-04-18.json
  4. do not promote any scene
  5. do not add new family baselines
  6. do not start implementation correction during triage
  7. do not expand beyond the fixed 102 scene set

Workstreams

  1. WS1 Timeout Triage
  2. WS2 Misclassification Triage
  3. WS3 No-Report Failure Triage
  4. WS4 Publish Triage Result

Phase 0: Freeze Triage Boundary

Objective

Make the triage a classification exercise only.

Tasks

  1. read the upstream dry-run result
  2. verify the upstream result has 102 scenes
  3. verify non-pass buckets are:
    • 31 timeout records
    • 5 misclassified records
    • 25 no-report records
    • 1 bootstrap-target record
  4. freeze the triage order:
    • timeout first
    • misclassification second
    • no-report third
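The count checks in tasks 2 and 3 can be sketched as below. This is a minimal illustration, not the project's tooling: the function name and bucket keys are hypothetical, and the caller is assumed to have already counted the records in the dry-run result.

```python
# Frozen numbers taken from this plan; the dictionary keys are
# hypothetical bucket names chosen for this sketch.
EXPECTED_TOTAL_SCENES = 102
EXPECTED_BUCKETS = {
    "timeout": 31,
    "misclassified": 5,
    "no-report": 25,
    "bootstrap-target": 1,
}


def check_frozen_boundary(total_scenes, bucket_counts):
    """Return a list of mismatches; an empty list means the boundary holds."""
    problems = []
    if total_scenes != EXPECTED_TOTAL_SCENES:
        problems.append(
            f"expected {EXPECTED_TOTAL_SCENES} scenes, found {total_scenes}"
        )
    for bucket, expected in EXPECTED_BUCKETS.items():
        actual = bucket_counts.get(bucket, 0)
        if actual != expected:
            problems.append(f"{bucket}: expected {expected}, found {actual}")
    return problems
```

The four expected bucket sizes sum to the 62 non-pass records named in the plan intent, so a clean result here also confirms the overall non-pass count.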

Deliverables

  1. frozen triage input statement
  2. frozen non-pass bucket counts
  3. frozen triage order

Acceptance Criteria

  1. triage input count is stable
  2. no code is changed
  3. no board status is updated

Phase 1: Timeout Triage

Objective

Split the 31 timeout records into second-level reasons.

Tasks

  1. select records where dryRunStatus = source-unreadable
  2. verify reason is generator timeout after 30s
  3. collect source directory metadata:
    • source directory exists
    • file count
    • total source bytes
    • largest file path
    • largest file bytes
  4. collect dry-run artifact metadata:
    • generated skill directory exists
    • references directory exists
    • generation report exists
  5. preserve board context:
    • current group
    • current status
    • current source asset
    • real sample record id
  6. optionally run one diagnostic attempt with a longer timeout, for classification only
  7. assign one timeout label:
    • timeout-known-family-sample
    • timeout-unvalidated-source
    • timeout-large-source
    • timeout-command-hang
    • timeout-generator-slow-but-progressing
    • timeout-undetermined
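The source directory metadata in task 3 could be gathered with a read-only walk like the one below. This is a sketch under assumptions: the function name and the camelCase keys are hypothetical, and the triage JSON may use a different schema.

```python
from pathlib import Path


def collect_source_metadata(source_dir):
    """Read-only scan of one scene source directory (hypothetical helper)."""
    root = Path(source_dir)
    meta = {
        "sourceDirectoryExists": root.is_dir(),
        "fileCount": 0,
        "totalSourceBytes": 0,
        "largestFilePath": None,
        "largestFileBytes": 0,
    }
    if not meta["sourceDirectoryExists"]:
        return meta
    for path in root.rglob("*"):
        if not path.is_file():
            continue  # skip directories and other non-file entries
        size = path.stat().st_size
        meta["fileCount"] += 1
        meta["totalSourceBytes"] += size
        # Track the largest file seen so far (the first file always wins initially).
        if meta["largestFilePath"] is None or size > meta["largestFileBytes"]:
            meta["largestFilePath"] = str(path)
            meta["largestFileBytes"] = size
    return meta
```

The scan only reads file sizes, which keeps it inside the measurement-only scope of this plan.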

Deliverables

  1. timeoutTriage[] records in the triage JSON
  2. timeout label summary
  3. timeout size/source metadata summary

Acceptance Criteria

  1. all 31 timeout records have a second-level label
  2. no timeout is treated as unsupported family by default
  3. no result from a longer-timeout rerun promotes a scene

Phase 2: Misclassification Triage

Objective

Explain the 5 board-vs-archetype conflicts.

Tasks

  1. select records where dryRunStatus = misclassified
  2. preserve:
    • board expected group
    • expected archetype
    • inferred archetype
    • current source asset
    • real sample layer status
  3. inspect existing dry-run report path when present
  4. collect route-conflict evidence:
    • whether host bridge evidence dominates
    • whether G3 or G1-E evidence is still present
    • whether current board expectation came from baseline or expansion
  5. assign one routing triage label:
    • route-overprefer-host-bridge
    • board-expectation-stale
    • mixed-workflow-host-bridge-valid
    • scene-family-split-needed
    • misclassification-undetermined
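One possible shape for a misclassification triage record is sketched below. The routing labels are the ones listed above; the class name, field names, and the `scene_id` field are assumptions for illustration and may not match the real triage JSON.

```python
from dataclasses import dataclass

# Routing labels copied from task 5 of this phase.
ROUTING_LABELS = frozenset({
    "route-overprefer-host-bridge",
    "board-expectation-stale",
    "mixed-workflow-host-bridge-valid",
    "scene-family-split-needed",
    "misclassification-undetermined",
})


@dataclass(frozen=True)
class MisclassificationTriage:
    """Hypothetical record; field names are illustrative only."""
    scene_id: str
    board_expected_group: str
    expected_archetype: str
    inferred_archetype: str
    current_source_asset: str
    real_sample_layer_status: str
    routing_label: str

    def __post_init__(self):
        # Reject any label outside the fixed set so every record carries
        # exactly one of the planned routing labels.
        if self.routing_label not in ROUTING_LABELS:
            raise ValueError(f"unknown routing label: {self.routing_label}")
```

Validating the label at construction time is one way to keep the "all 5 misclassified records have a routing label" criterion checkable without touching routing code.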

Deliverables

  1. misclassificationTriage[] records in the triage JSON
  2. routing conflict summary
  3. high-priority routing risk list

Acceptance Criteria

  1. all 5 misclassified records have a routing label
  2. no routing code is changed
  3. the report identifies whether implementation correction is justified later

Phase 3: No-Report Failure Triage

Objective

Split the 25 generic no-report failures into concrete failure stages.

Tasks

  1. select records where:
    • dryRunStatus = fail-closed-known
    • reason = generator failed without generation report
  2. collect command artifacts:
    • exit code
    • stdout tail
    • stderr tail
  3. inspect output artifacts:
    • skill directory exists
    • references directory exists
    • any report file exists
  4. infer one failure stage:
    • source-scan
    • analyzer
    • ir-assembly
    • readiness-before-report
    • compiler-package-write
    • panic-or-process-error
    • unknown-no-report
  5. keep bootstrap_target failure separate
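The evidence gathered in tasks 2 and 3 could be bundled per record as sketched below. Everything beyond the stage names is an assumption: the helper names, the output keys, the `references` subdirectory name, and the `*report*` filename pattern are hypothetical, and the stage itself is supplied by the person doing the triage rather than inferred automatically.

```python
from pathlib import Path

# Failure stages copied from task 4 of this phase.
FAILURE_STAGES = (
    "source-scan",
    "analyzer",
    "ir-assembly",
    "readiness-before-report",
    "compiler-package-write",
    "panic-or-process-error",
    "unknown-no-report",
)


def tail(text, max_lines=20):
    """Keep only the last max_lines lines of captured output."""
    return "\n".join(text.splitlines()[-max_lines:])


def collect_no_report_evidence(exit_code, stdout, stderr, skill_dir,
                               stage="unknown-no-report"):
    """Bundle command and output artifacts for one no-report failure."""
    if stage not in FAILURE_STAGES:
        raise ValueError(f"unknown failure stage: {stage}")
    skill = Path(skill_dir)
    skill_exists = skill.is_dir()
    return {
        "exitCode": exit_code,
        "stdoutTail": tail(stdout),
        "stderrTail": tail(stderr),
        "skillDirectoryExists": skill_exists,
        # Assumed layout: references live directly under the skill directory.
        "referencesDirectoryExists": (skill / "references").is_dir(),
        # Assumed naming: any file with "report" in its name counts.
        "anyReportFileExists": skill_exists and any(
            p.is_file() for p in skill.rglob("*report*")
        ),
        "failureStage": stage,
    }
```

Defaulting to `unknown-no-report` keeps a record labeled even when the evidence does not point to a specific stage.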

Deliverables

  1. noReportFailureTriage[] records in the triage JSON
  2. bootstrapTargetFailures[] records in the triage JSON
  3. failure-stage summary

Acceptance Criteria

  1. all 25 no-report failures have an inferred failure stage
  2. the bootstrap_target case is not hidden in the no-report bucket
  3. every non-pass record remains explainable without implementation changes

Phase 4: Publish Triage Result

Objective

Publish a bounded triage result and stop.

Tasks

  1. write full_sweep_dry_run_triage_2026-04-19.json
  2. write 2026-04-19-102-full-sweep-dry-run-triage-report.md
  3. include:
    • timeout triage summary
    • misclassification triage summary
    • no-report triage summary
    • recommended next blocker
  4. explicitly state that the triage does not promote scenes or start fixes
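Assembling the triage JSON could look like the sketch below. The four array keys come from the deliverables in Phases 1 to 3; the function name and the `recommendedNextBlocker` and `scopeStatement` keys are hypothetical.

```python
import json

EXPECTED_NON_PASS = 62  # 31 + 5 + 25 + 1, per this plan


def build_triage_document(timeouts, misclassified, no_report, bootstrap,
                          next_blocker):
    """Assemble the triage result and refuse to publish partial coverage."""
    covered = (len(timeouts) + len(misclassified)
               + len(no_report) + len(bootstrap))
    if covered != EXPECTED_NON_PASS:
        raise ValueError(
            f"expected {EXPECTED_NON_PASS} non-pass records, got {covered}"
        )
    return {
        "timeoutTriage": list(timeouts),
        "misclassificationTriage": list(misclassified),
        "noReportFailureTriage": list(no_report),
        "bootstrapTargetFailures": list(bootstrap),
        # Hypothetical keys for the report-level statements.
        "recommendedNextBlocker": next_blocker,
        "scopeStatement": "Triage only: no scenes promoted, no fixes started.",
    }


def to_json(document):
    """Serialize with stable formatting; ensure_ascii=False keeps non-ASCII paths readable."""
    return json.dumps(document, indent=2, ensure_ascii=False)
```

Failing loudly on a coverage mismatch is one way to make the "all 62 non-pass records are covered" criterion hard to miss at publish time.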

Deliverables

  1. triage JSON
  2. triage report

Acceptance Criteria

  1. all 62 non-pass records are covered
  2. every non-pass record has a second-level explanation
  3. the report identifies the next blocker without implementing it
  4. no generator/analyzer file is modified
  5. scene_execution_board_2026-04-18.json is not modified
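One way to make the "not modified" criteria checkable is to record a checksum of each protected file before triage starts and compare it at the end. The sketch below assumes SHA-256 is acceptable for this purpose; the function names are hypothetical and this is not an existing project tool.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, digest_before):
    """True when the file still matches the digest recorded before triage."""
    return file_sha256(path) == digest_before
```

The same check could be applied to the execution board and to each generator or analyzer file named in the scope guardrails.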

Completion Criteria

This plan is complete when:

  1. 31 timeout records have timeout labels
  2. 5 misclassified records have routing labels
  3. 25 no-report failures have failure stages
  4. 1 bootstrap-target failure is separately tracked
  5. the triage JSON and report are published
  6. execution stops without implementation work