claw/docs/superpowers/plans/2026-04-20-scene-skill-102-static-mock-pseudoprod-validation-plan.md

Scene Skill 102 Static, Mock, And Pseudo-Production Validation Plan

Date: 2026-04-20
Status: Draft
Upstream Design: docs/superpowers/specs/2026-04-20-scene-skill-102-static-mock-pseudoprod-validation-design.md
Parent Stage: validation after final materialization and deterministic invocation readiness

Plan Intent

Define the next validation stage for the fully materialized 102 scene skill set.

This plan validates package health, deterministic dispatch readiness, mock runtime feasibility, and pseudo-production readiness. It does not perform real production execution.

Fixed Inputs

  1. examples/scene_skill_102_final_materialization_2026-04-19/skills
  2. examples/scene_skill_102_final_materialization_2026-04-19/SCENE_INDEX.md
  3. examples/scene_skill_102_final_materialization_2026-04-19/scene_skill_102_index.json
  4. tests/fixtures/generated_scene/scene_skill_102_final_materialization_manifest_2026-04-19.json
  5. tests/fixtures/generated_scene/scene_skill_102_final_materialization_failures_2026-04-19.json
  6. tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json

Planned Outputs

  1. tests/fixtures/generated_scene/scene_skill_102_static_validation_2026-04-20.json
  2. tests/fixtures/generated_scene/scene_skill_102_dispatch_dry_run_validation_2026-04-20.json
  3. tests/fixtures/generated_scene/scene_skill_102_mock_runtime_validation_matrix_2026-04-20.json
  4. tests/fixtures/generated_scene/scene_skill_102_pseudoprod_readiness_2026-04-20.json
  5. docs/superpowers/reports/2026-04-20-scene-skill-102-static-mock-pseudoprod-validation-report.md

Scope Guardrails

Allowed:

  1. read final materialized skill packages
  2. parse SKILL.toml, scene.toml, and generation reports
  3. run deterministic dispatch dry-run without executing selected skills
  4. build mock runtime validation matrix
  5. publish validation JSON and report assets

Forbidden:

  1. do not modify src/generated_scene/analyzer.rs
  2. do not modify src/generated_scene/generator.rs
  3. do not modify generated skill scripts during this validation plan
  4. do not rematerialize the 102 skills
  5. do not update tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json
  6. do not start real browser or production environment execution
  7. do not require production credentials, SSO, VPN, or real system access

Workstreams

  1. WS1 Static package validation
  2. WS2 Deterministic invocation dry-run validation
  3. WS3 Mock runtime validation matrix
  4. WS4 Pseudo-production readiness checklist
  5. WS5 Validation report and next-stage decision

Phase 0: Freeze Validation Baseline

Objective

Freeze the 102 final skill set as the input to validation.

Tasks

  1. Confirm final materialization count is 102 / 102.
  2. Confirm materialization failure count is 0.
  3. Confirm deterministic invocation readiness is 102 / 102.
  4. Confirm this plan does not rematerialize skills.

Deliverables

  1. validation baseline section in final report

Acceptance Criteria

  1. validation begins from the final materialization root
  2. no source scene directories are rescanned
  3. no generated scene logic is changed
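
The baseline confirmation can be expressed as a small gate that refuses to proceed unless all three counts match. This is a minimal sketch only: the `Baseline` type and its field names are hypothetical, and how the counts are read from the manifest, failures, and readiness fixtures is left open because their JSON schemas are not described in this plan.

```python
from dataclasses import dataclass

EXPECTED_SCENES = 102  # size of the final materialized skill set


@dataclass(frozen=True)
class Baseline:
    # Hypothetical container; the real counts come from the fixed input fixtures.
    materialized: int
    materialization_failures: int
    dispatch_ready: int


def baseline_problems(b: Baseline, expected: int = EXPECTED_SCENES) -> list[str]:
    """Return named reasons the baseline is not frozen; empty means validation may start."""
    problems = []
    if b.materialized != expected:
        problems.append(f"materialized {b.materialized} / {expected}")
    if b.materialization_failures != 0:
        problems.append(f"{b.materialization_failures} materialization failure(s)")
    if b.dispatch_ready != expected:
        problems.append(f"dispatch-ready {b.dispatch_ready} / {expected}")
    return problems
```

An empty list corresponds to the 102 / 102, 0 failures, 102 / 102 baseline; any other result would be recorded in the baseline section of the report rather than fixed by rematerializing.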

Phase 1: Static Package Validation

Objective

Validate that all 102 skill packages are structurally complete and parseable.

Tasks

For each skill:

  1. check SKILL.toml
  2. check SKILL.md
  3. check scene.toml
  4. check references/generation-report.json
  5. check at least one script under scripts/
  6. parse TOML and JSON files
  7. compare sceneId, display name, archetype, readiness, suffix, and keyword fields against index and manifest assets

Deliverables

  1. scene_skill_102_static_validation_2026-04-20.json

Acceptance Criteria

  1. every scene has exactly one static validation record
  2. every static failure has a named reason
  3. total records equal 102

Phase 2: Deterministic Invocation Dry-Run Validation

Objective

Validate deterministic dispatch selection on the 。。。 suffix (U+3002 repeated three times) without executing the selected skills.

Tasks

For each skill:

  1. construct one canonical utterance from the scene display name plus the 。。。 suffix (three U+3002 characters)
  2. optionally construct one keyword-based utterance when safe
  3. dry-run deterministic selection against the skill registry
  4. record selected skill id, ambiguity count, and no-match status

Deliverables

  1. scene_skill_102_dispatch_dry_run_validation_2026-04-20.json

Acceptance Criteria

  1. every complete skill has a dispatch dry-run result
  2. ambiguous and no-match outcomes are explicit
  3. no selected skill is executed
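
A dry-run selector that only reports what would be chosen might be sketched as below. The registry shape (skill id mapped to display name) and the exact-match rule are assumptions made for illustration; the real deterministic matcher may normalize or use keywords differently. The result field names are also hypothetical.

```python
SUFFIX = "\u3002" * 3  # three U+3002 (ideographic full stop) characters


def canonical_utterance(display_name: str) -> str:
    """Scene display name followed by the deterministic suffix."""
    return f"{display_name}{SUFFIX}"


def dry_run_select(utterance: str, registry: dict[str, str]) -> dict:
    """Report the selection outcome without ever invoking a skill.

    `registry` maps skill id -> display name (hypothetical shape).
    """
    if utterance.endswith(SUFFIX):
        name = utterance[: -len(SUFFIX)]
        matches = sorted(sid for sid, disp in registry.items() if disp == name)
    else:
        matches = []  # no suffix means deterministic dispatch does not apply
    return {
        "utterance": utterance,
        "selectedSkillId": matches[0] if len(matches) == 1 else None,
        "ambiguityCount": len(matches) if len(matches) > 1 else 0,
        "noMatch": not matches,
    }
```

Because the function returns data only, ambiguous and no-match outcomes stay explicit in the output and no selected skill can be executed as a side effect.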

Phase 3: Mock Runtime Validation Matrix

Objective

Define and, where safe, prepare mock runtime validation by archetype.

Tasks

  1. group 102 skills by workflow archetype
  2. identify one to three representatives per archetype
  3. define mock dependencies required by each archetype:
    • fake fetch
    • fake browser DOM
    • fake host bridge
    • fake local-doc service
    • fake artifact writer
  4. classify each skill as:
    • mock-covered-by-representative
    • mock-needs-harness
    • mock-not-safe-yet

Deliverables

  1. scene_skill_102_mock_runtime_validation_matrix_2026-04-20.json

Acceptance Criteria

  1. every scene is assigned a mock-runtime coverage status
  2. every archetype has a named harness requirement
  3. this phase does not require real network or browser credentials
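
One way to build the matrix is sketched below. The classification rule is an assumption, not something this plan fixes: a skill is treated as not safe when it needs something outside the five listed fakes, as covered when every fake it needs already exists, and as needing a harness otherwise. The input keys `sceneId`, `archetype`, and `mockDeps` and the output field names are hypothetical.

```python
from collections import defaultdict

# Mock dependencies named in the plan.
KNOWN_FAKES = frozenset({
    "fake fetch",
    "fake browser DOM",
    "fake host bridge",
    "fake local-doc service",
    "fake artifact writer",
})


def classify(mock_deps, available_fakes):
    """Assumed rule mapping a skill's mock dependencies to a coverage status."""
    deps = set(mock_deps)
    if not deps <= KNOWN_FAKES:
        return "mock-not-safe-yet"
    if deps <= set(available_fakes):
        return "mock-covered-by-representative"
    return "mock-needs-harness"


def build_matrix(skills, available_fakes, max_representatives=3):
    """Group skills by archetype and emit one matrix row per scene."""
    by_archetype = defaultdict(list)
    for skill in sorted(skills, key=lambda s: s["sceneId"]):
        by_archetype[skill["archetype"]].append(skill)
    matrix = []
    for archetype, members in sorted(by_archetype.items()):
        # One to three representatives per archetype (here: first by sceneId).
        representatives = [m["sceneId"] for m in members[:max_representatives]]
        harness = sorted({d for m in members for d in m["mockDeps"]})
        for m in members:
            matrix.append({
                "sceneId": m["sceneId"],
                "archetype": archetype,
                "representatives": representatives,
                "harnessRequirement": harness,
                "mockRuntimeCoverageStatus": classify(m["mockDeps"], available_fakes),
            })
    return matrix
```

Picking representatives by sorted `sceneId` is only a placeholder; a reviewer would likely choose them by hand for each archetype.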

Phase 4: Pseudo-Production Readiness Checklist

Objective

Define what evidence is required before real-environment validation can start.

Tasks

For each scene or archetype:

  1. record required runtime dependencies
  2. record expected artifact type
  3. record whether host bridge, browser, localhost service, or document pipeline is required
  4. define required execution evidence:
    • console logs
    • network logs
    • screenshots
    • exported files
    • generated artifact metadata
  5. define failure taxonomy:
    • login-blocked
    • network-blocked
    • host-bridge-blocked
    • data-mismatch
    • artifact-mismatch
    • environment-unavailable
    • runtime-error

Deliverables

  1. scene_skill_102_pseudoprod_readiness_2026-04-20.json

Acceptance Criteria

  1. every scene has a pseudo-production readiness record
  2. every real-environment blocker has a named category
  3. no production credentials are required by this phase
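
The failure taxonomy and evidence list lend themselves to a closed vocabulary, so that a readiness record cannot carry an unnamed blocker. The sketch below assumes a record shape of its own invention; only the seven category values and five evidence kinds come from this plan.

```python
from enum import Enum


class Blocker(str, Enum):
    """Failure taxonomy from this plan."""
    LOGIN_BLOCKED = "login-blocked"
    NETWORK_BLOCKED = "network-blocked"
    HOST_BRIDGE_BLOCKED = "host-bridge-blocked"
    DATA_MISMATCH = "data-mismatch"
    ARTIFACT_MISMATCH = "artifact-mismatch"
    ENVIRONMENT_UNAVAILABLE = "environment-unavailable"
    RUNTIME_ERROR = "runtime-error"


EVIDENCE_KINDS = (
    "console logs",
    "network logs",
    "screenshots",
    "exported files",
    "generated artifact metadata",
)


def readiness_record(scene_id, runtime_deps, artifact_type, evidence, blockers=()):
    """Build one readiness record (hypothetical field names)."""
    unknown = sorted(set(evidence) - set(EVIDENCE_KINDS))
    if unknown:
        raise ValueError(f"unknown evidence kind(s): {unknown}")
    return {
        "sceneId": scene_id,
        "requiredRuntimeDependencies": sorted(runtime_deps),
        "expectedArtifactType": artifact_type,
        # Keep evidence in the plan's listed order for stable diffs.
        "requiredEvidence": [k for k in EVIDENCE_KINDS if k in evidence],
        # Blocker(...) raises ValueError for any category outside the taxonomy.
        "blockers": [Blocker(b).value for b in blockers],
    }
```

Rejecting unknown values at construction time is a design choice that enforces the criterion that every real-environment blocker has a named category.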

Phase 5: Publish Validation Report

Objective

Publish one report that separates static readiness, dispatch readiness, mock readiness, and pseudo-production readiness.

Tasks

  1. summarize static validation results
  2. summarize dispatch dry-run results
  3. summarize mock runtime coverage matrix
  4. summarize pseudo-production readiness categories
  5. recommend whether to start real-environment validation and at what batch size

Deliverables

  1. docs/superpowers/reports/2026-04-20-scene-skill-102-static-mock-pseudoprod-validation-report.md

Acceptance Criteria

  1. report explains that 102 / 102 materialization is not the same as 102 / 102 production execution
  2. report lists remaining validation blockers, if any
  3. report does not promote any scene to real executed-pass

Expected Status Outputs

This plan should produce these independent status counts:

  1. staticValidationStatus
  2. dispatchDryRunStatus
  3. mockRuntimeCoverageStatus
  4. pseudoProductionReadinessStatus
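
Keeping the four counts independent can be as simple as tallying each status field separately. This sketch assumes per-scene records that carry the four field names above as keys; the status values themselves (such as `pass` or `fail`) are placeholders, since this plan does not enumerate them.

```python
from collections import Counter

STATUS_FIELDS = (
    "staticValidationStatus",
    "dispatchDryRunStatus",
    "mockRuntimeCoverageStatus",
    "pseudoProductionReadinessStatus",
)


def status_counts(records):
    """Tally each status field on its own; a missing field is simply not counted."""
    return {
        field: dict(Counter(r[field] for r in records if field in r))
        for field in STATUS_FIELDS
    }
```

Because each field is tallied on its own, a scene that passes static validation does not inflate any of the other three counts.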

Completion Criteria

This plan is complete when:

  1. all planned validation assets are published
  2. all 102 scenes have static validation records
  3. all 102 scenes have dispatch dry-run records
  4. all 102 scenes have mock runtime matrix records
  5. all 102 scenes have pseudo-production readiness records
  6. the validation report is published
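
The four "all 102 scenes have ... records" criteria reduce to the same check applied to each asset: exactly one record per expected scene. A minimal sketch follows, assuming each record carries a `sceneId` key (hypothetical) and that the expected ids come from the scene index.

```python
from collections import Counter


def coverage_gaps(expected_scene_ids, records_by_asset):
    """Return, per asset, scenes that are missing, duplicated, or unexpected.

    An empty result means every asset has exactly one record per scene.
    """
    expected = set(expected_scene_ids)
    gaps = {}
    for asset, records in records_by_asset.items():
        seen = Counter(r["sceneId"] for r in records)
        missing = sorted(expected - set(seen))
        duplicated = sorted(s for s, n in seen.items() if n > 1)
        unexpected = sorted(set(seen) - expected)
        if missing or duplicated or unexpected:
            gaps[asset] = {
                "missing": missing,
                "duplicated": duplicated,
                "unexpected": unexpected,
            }
    return gaps
```

For example, an asset with two records for scene `a` and none for scene `b` would be reported with `b` missing and `a` duplicated.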

Stop Statement

Stop after publishing the static validation, dispatch dry-run, mock-runtime matrix, and pseudo-production readiness assets, together with the report.

Do not execute real production validation under this plan.