# Scene Skill 102 Static, Mock, And Pseudo-Production Validation Plan

> Date: 2026-04-20
> Status: Draft
> Upstream Design: `docs/superpowers/specs/2026-04-20-scene-skill-102-static-mock-pseudoprod-validation-design.md`
> Parent Stage: validation after final materialization and deterministic invocation readiness

## Plan Intent

Define the next validation stage for the fully materialized `102` scene skill set.

This plan validates package health, deterministic dispatch readiness, mock runtime feasibility, and pseudo-production readiness. It does not perform real production execution.

## Fixed Inputs

1. `examples/scene_skill_102_final_materialization_2026-04-19/skills`
2. `examples/scene_skill_102_final_materialization_2026-04-19/SCENE_INDEX.md`
3. `examples/scene_skill_102_final_materialization_2026-04-19/scene_skill_102_index.json`
4. `tests/fixtures/generated_scene/scene_skill_102_final_materialization_manifest_2026-04-19.json`
5. `tests/fixtures/generated_scene/scene_skill_102_final_materialization_failures_2026-04-19.json`
6. `tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json`

## Planned Outputs

1. `tests/fixtures/generated_scene/scene_skill_102_static_validation_2026-04-20.json`
2. `tests/fixtures/generated_scene/scene_skill_102_dispatch_dry_run_validation_2026-04-20.json`
3. `tests/fixtures/generated_scene/scene_skill_102_mock_runtime_validation_matrix_2026-04-20.json`
4. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_readiness_2026-04-20.json`
5. `docs/superpowers/reports/2026-04-20-scene-skill-102-static-mock-pseudoprod-validation-report.md`

## Scope Guardrails

Allowed:

1. read final materialized skill packages
2. parse `SKILL.toml`, `scene.toml`, and generation reports
3. run deterministic dispatch dry-run without executing selected skills
4. build mock runtime validation matrix
5. publish validation JSON and report assets

Forbidden:

1. do not modify `src/generated_scene/analyzer.rs`
2. do not modify `src/generated_scene/generator.rs`
3. do not modify generated skill scripts during this validation plan
4. do not rematerialize the `102` skills
5. do not update `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
6. do not start real browser or production environment execution
7. do not require production credentials, SSO, VPN, or real system access
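
The "do not modify" guardrails can be checked mechanically by hashing the protected files before and after a validation run. The following is a minimal sketch under that assumption; `snapshot` and `changed_paths` are hypothetical helper names, and only the three paths named in the Forbidden list are included.

```python
import hashlib
from pathlib import Path

# Paths copied from the Forbidden list; extend as needed.
PROTECTED = [
    "src/generated_scene/analyzer.rs",
    "src/generated_scene/generator.rs",
    "tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json",
]


def snapshot(root, relative_paths):
    """Map each path to the SHA-256 of its contents, or None if the file is absent."""
    digests = {}
    for rel in relative_paths:
        path = Path(root) / rel
        digests[rel] = (
            hashlib.sha256(path.read_bytes()).hexdigest() if path.is_file() else None
        )
    return digests


def changed_paths(before, after):
    """Return the sorted paths whose digest differs between two snapshots."""
    return sorted(p for p in before if before[p] != after.get(p))
```

Taking `snapshot(repo_root, PROTECTED)` at the start and end of the run and requiring `changed_paths(...)` to be empty would give the report a concrete guardrail result.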

## Workstreams

1. `WS1` Static package validation
2. `WS2` Deterministic invocation dry-run validation
3. `WS3` Mock runtime validation matrix
4. `WS4` Pseudo-production readiness checklist
5. `WS5` Validation report and next-stage decision

## Phase 0: Freeze Validation Baseline

### Objective

Freeze the `102` final skill set as the input to validation.

### Tasks

1. Confirm final materialization count is `102 / 102`.
2. Confirm materialization failure count is `0`.
3. Confirm deterministic invocation readiness is `102 / 102`.
4. Confirm this plan does not rematerialize skills.
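
The three numeric confirmations above can be expressed as one small check. This is a sketch only: the JSON layout of the manifest, failures, and readiness fixtures is not specified here, so the counts are passed in as plain integers and `check_baseline` is a hypothetical helper name.

```python
EXPECTED_TOTAL = 102


def check_baseline(materialized, failures, dispatch_ready, expected=EXPECTED_TOTAL):
    """Return human-readable baseline violations; an empty list means the baseline holds."""
    problems = []
    if materialized != expected:
        problems.append(f"materialized {materialized} / {expected}")
    if failures != 0:
        problems.append(f"materialization failures: {failures}")
    if dispatch_ready != expected:
        problems.append(f"dispatch-ready {dispatch_ready} / {expected}")
    return problems
```

A non-empty result would mean validation should not start, since the baseline the plan assumes is not in place.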

### Deliverables

1. validation baseline section in final report

### Acceptance Criteria

1. validation begins from the final materialization root
2. no source scene directories are rescanned
3. no generated scene logic is changed

## Phase 1: Static Package Validation

### Objective

Validate that all `102` skill packages are structurally complete and parseable.

### Tasks

For each skill:

1. check that `SKILL.toml` exists
2. check that `SKILL.md` exists
3. check that `scene.toml` exists
4. check that `references/generation-report.json` exists
5. check that at least one script exists under `scripts/`
6. parse TOML and JSON files
7. compare `sceneId`, display name, archetype, readiness, suffix, and keyword fields against index and manifest assets
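
The presence and parse checks (tasks 1 to 6) could look roughly like the sketch below. The record keys (`skillDir`, `status`, `reasons`) and the reason strings are assumptions made for illustration, not a defined schema, and the field comparison in task 7 is left out because the index and manifest layouts are not described here.

```python
import json
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters: TOML parsing is skipped
    tomllib = None

REQUIRED_FILES = [
    "SKILL.toml",
    "SKILL.md",
    "scene.toml",
    "references/generation-report.json",
]


def validate_skill_dir(skill_dir):
    """Return one static validation record with a named reason per failure."""
    root = Path(skill_dir)
    reasons = []
    for rel in REQUIRED_FILES:
        path = root / rel
        if not path.is_file():
            reasons.append(f"missing:{rel}")
            continue
        try:
            text = path.read_text(encoding="utf-8")
            if rel.endswith(".json"):
                json.loads(text)
            elif rel.endswith(".toml") and tomllib is not None:
                tomllib.loads(text)
        except ValueError:  # JSON, TOML, and UTF-8 decode errors all derive from ValueError
            reasons.append(f"unparseable:{rel}")
    scripts = root / "scripts"
    if not (scripts.is_dir() and any(p.is_file() for p in scripts.iterdir())):
        reasons.append("missing:scripts/*")
    return {
        "skillDir": root.name,
        "status": "fail" if reasons else "pass",
        "reasons": reasons,
    }
```

Running this once per directory under the `skills` root would yield one record per skill, which matches the "exactly one record, named reason" shape the acceptance criteria ask for.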

### Deliverables

1. `scene_skill_102_static_validation_2026-04-20.json`

### Acceptance Criteria

1. every scene has exactly one static validation record
2. every static failure has a named reason
3. total records equal `102`

## Phase 2: Deterministic Invocation Dry-Run Validation

### Objective

Validate deterministic dispatch selection for the `U+3002 x3` suffix (three consecutive ideographic full stops, `。。。`) without executing the selected skills.

### Tasks

For each skill:

1. construct one canonical utterance from the scene display name plus the `U+3002 x3` suffix
2. optionally construct one keyword-based utterance when safe
3. dry-run deterministic selection against the skill registry
4. record selected skill id, ambiguity count, and no-match status
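
To make the dry-run steps concrete, here is a small sketch of utterance construction and selection. The matching rule (exact display-name equality after stripping the suffix) and the registry shape (skill id mapped to display name) are assumptions for illustration; the real dispatcher's rule is not described in this plan, and the result keys are hypothetical.

```python
SUFFIX = "\u3002" * 3  # U+3002 x3: three ideographic full stops


def canonical_utterance(display_name):
    """Scene display name followed by the deterministic suffix."""
    return display_name + SUFFIX


def dry_run_dispatch(utterance, registry):
    """Select candidates without executing anything.

    `registry` maps skill id -> display name. Exact-name matching is an
    assumption of this sketch, not the real dispatcher's rule.
    """
    if not utterance.endswith(SUFFIX):
        return {"selectedSkillId": None, "ambiguityCount": 0, "noMatch": True}
    name = utterance[: -len(SUFFIX)].strip()
    candidates = sorted(sid for sid, display in registry.items() if display == name)
    return {
        "selectedSkillId": candidates[0] if len(candidates) == 1 else None,
        "ambiguityCount": len(candidates) if len(candidates) > 1 else 0,
        "noMatch": not candidates,
    }
```

Because the function only returns a record and never calls into a skill, it satisfies the "no selected skill is executed" constraint by construction.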

### Deliverables

1. `scene_skill_102_dispatch_dry_run_validation_2026-04-20.json`

### Acceptance Criteria

1. every complete skill has a dispatch dry-run result
2. ambiguous and no-match outcomes are explicit
3. no selected skill is executed

## Phase 3: Mock Runtime Validation Matrix

### Objective

Define and, where safe, prepare mock runtime validation by archetype.

### Tasks

1. group `102` skills by workflow archetype
2. identify one to three representatives per archetype
3. define mock dependencies required by each archetype:
   - fake fetch
   - fake browser DOM
   - fake host bridge
   - fake local-doc service
   - fake artifact writer
4. classify each skill as:
   - `mock-covered-by-representative`
   - `mock-needs-harness`
   - `mock-not-safe-yet`
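
The grouping and classification steps can be sketched as below. The classification rule is an assumption made for illustration: a skill counts as covered when its archetype already has a mock harness, needs a harness otherwise, and is not safe when explicitly flagged. Representatives are simply the first few safe scene ids in sorted order; the plan does not prescribe how they are chosen.

```python
from collections import defaultdict


def build_mock_matrix(skills, harness_ready, unsafe_ids=frozenset(), max_reps=3):
    """Assign each skill a mock-runtime coverage status.

    skills: list of {"sceneId": ..., "archetype": ...} dicts.
    harness_ready: set of archetypes whose mock harness already exists.
    unsafe_ids: scene ids judged unsafe to mock for now.
    """
    by_archetype = defaultdict(list)
    for skill in skills:
        by_archetype[skill["archetype"]].append(skill["sceneId"])

    matrix = {}
    for archetype, scene_ids in by_archetype.items():
        # One to three representatives per archetype, drawn from the safe skills.
        representatives = sorted(s for s in scene_ids if s not in unsafe_ids)[:max_reps]
        for sid in scene_ids:
            if sid in unsafe_ids:
                status = "mock-not-safe-yet"
            elif archetype in harness_ready:
                status = "mock-covered-by-representative"
            else:
                status = "mock-needs-harness"
            matrix[sid] = {
                "archetype": archetype,
                "status": status,
                "representatives": representatives,
            }
    return matrix
```

Every input skill receives exactly one entry, so the matrix length can be compared directly against `102`.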

### Deliverables

1. `scene_skill_102_mock_runtime_validation_matrix_2026-04-20.json`

### Acceptance Criteria

1. every scene is assigned a mock-runtime coverage status
2. every archetype has a named harness requirement
3. this phase does not require real network or browser credentials

## Phase 4: Pseudo-Production Readiness Checklist

### Objective

Define what evidence is required before real-environment validation can start.

### Tasks

For each scene or archetype:

1. record required runtime dependencies
2. record expected artifact type
3. record whether host bridge, browser, localhost service, or document pipeline is required
4. define required execution evidence:
   - console logs
   - network logs
   - screenshots
   - exported files
   - generated artifact metadata
5. define failure taxonomy:
   - `login-blocked`
   - `network-blocked`
   - `host-bridge-blocked`
   - `data-mismatch`
   - `artifact-mismatch`
   - `environment-unavailable`
   - `runtime-error`
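
One possible shape for a readiness record is sketched below. The evidence types and failure categories come from the lists above; the class name, field names, and the choice to reject unknown values at construction time are assumptions for illustration.

```python
from dataclasses import dataclass, field

FAILURE_TAXONOMY = frozenset({
    "login-blocked", "network-blocked", "host-bridge-blocked",
    "data-mismatch", "artifact-mismatch", "environment-unavailable",
    "runtime-error",
})
EVIDENCE_TYPES = frozenset({
    "console logs", "network logs", "screenshots",
    "exported files", "generated artifact metadata",
})


@dataclass
class ReadinessRecord:
    scene_id: str
    runtime_dependencies: list = field(default_factory=list)
    expected_artifact_type: str = ""
    required_evidence: list = field(default_factory=list)
    blockers: list = field(default_factory=list)

    def __post_init__(self):
        # Reject values outside the named lists so every blocker stays categorized.
        for label, values, allowed in (
            ("blocker", self.blockers, FAILURE_TAXONOMY),
            ("evidence", self.required_evidence, EVIDENCE_TYPES),
        ):
            unknown = sorted(set(values) - allowed)
            if unknown:
                raise ValueError(f"unknown {label} values: {unknown}")
```

Validating at construction time means a record with an uncategorized blocker cannot be written at all, which is one way to enforce that every real-environment blocker has a named category.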

### Deliverables

1. `scene_skill_102_pseudoprod_readiness_2026-04-20.json`

### Acceptance Criteria

1. every scene has a pseudo-production readiness record
2. every real-environment blocker has a named category
3. no production credentials are required by this phase

## Phase 5: Publish Validation Report

### Objective

Publish one report that separates static readiness, dispatch readiness, mock readiness, and pseudo-production readiness.

### Tasks

1. summarize static validation results
2. summarize dispatch dry-run results
3. summarize mock runtime coverage matrix
4. summarize pseudo-production readiness categories
5. recommend whether to start real-environment validation and at what batch size

### Deliverables

1. `docs/superpowers/reports/2026-04-20-scene-skill-102-static-mock-pseudoprod-validation-report.md`

### Acceptance Criteria

1. report explains that `102 / 102` materialization is not the same as `102 / 102` production execution
2. report lists remaining validation blockers, if any
3. report does not promote any scene to real executed-pass

## Expected Status Outputs

This plan should produce these independent status counts:

1. `staticValidationStatus`
2. `dispatchDryRunStatus`
3. `mockRuntimeCoverageStatus`
4. `pseudoProductionReadinessStatus`
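
Keeping the four counts independent can be as simple as tallying each field on its own. The sketch below assumes one merged record per scene carrying the four keys above; that record shape and the example status values are hypothetical.

```python
from collections import Counter

STATUS_FIELDS = [
    "staticValidationStatus",
    "dispatchDryRunStatus",
    "mockRuntimeCoverageStatus",
    "pseudoProductionReadinessStatus",
]


def status_counts(records):
    """Tally each status field on its own; absent fields are counted as 'missing'."""
    return {
        name: dict(Counter(record.get(name, "missing") for record in records))
        for name in STATUS_FIELDS
    }
```

Counting per field, rather than computing a single combined pass rate, keeps a scene that passes static validation from being read as dispatch-ready or production-ready.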

## Completion Criteria

This plan is complete when:

1. all planned validation assets are published
2. all `102` scenes have static validation records
3. all `102` scenes have dispatch dry-run records
4. all `102` scenes have mock runtime matrix records
5. all `102` scenes have pseudo-production readiness records
6. the validation report is published
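
Criteria 2 to 5 amount to a coverage check of each asset against the same set of scene ids. A minimal sketch, assuming each asset has been loaded as a list of records with a `sceneId` key (the asset layout and the helper name `completion_gaps` are assumptions):

```python
def completion_gaps(expected_scene_ids, assets):
    """For each asset, list scene ids that are missing or recorded more than once.

    assets: mapping of asset name -> list of {"sceneId": ...} records.
    Returns only the assets that have gaps; an empty dict means full coverage.
    """
    expected = set(expected_scene_ids)
    gaps = {}
    for name, records in assets.items():
        ids = [record["sceneId"] for record in records]
        missing = sorted(expected - set(ids))
        duplicated = sorted({i for i in ids if ids.count(i) > 1})
        if missing or duplicated:
            gaps[name] = {"missing": missing, "duplicated": duplicated}
    return gaps
```

An empty result across the static, dispatch, mock matrix, and readiness assets would indicate that the per-scene record criteria are met.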

## Stop Statement

Stop after publishing the static, dispatch, mock-runtime matrix, and pseudo-production readiness assets and the validation report.

Do not execute real production validation under this plan.