Scene Skill 102 Mock Runtime Harness Implementation Design

Date: 2026-04-20
Status: Draft
Upstream Validation: docs/superpowers/plans/2026-04-20-scene-skill-102-static-mock-pseudoprod-validation-plan.md
Input Matrix: tests/fixtures/generated_scene/scene_skill_102_mock_runtime_validation_matrix_2026-04-20.json

Intent

Define a bounded implementation stage for mock runtime harnesses after the 102 materialized skill set has passed static package validation and deterministic dispatch dry-run.

This design is not production validation. It exists to prove that generated skill scripts can be loaded and exercised against controlled fake dependencies before any real browser, host bridge, or production system is touched.

Current Baseline

From scene_skill_102_mock_runtime_validation_matrix_2026-04-20.json:

| Archetype | Count | Representative scenes |
| --- | --- | --- |
| paginated_enrichment | 51 | sweep-001-scene, sweep-002-scene, sweep-003-scene |
| host_bridge_workflow | 26 | sweep-007-scene, sweep-009-scene, sweep-010-scene |
| multi_mode_request | 10 | sweep-020-scene, sweep-023-scene, sweep-030-scene |
| local_doc_pipeline | 6 | sweep-012-scene, sweep-017-scene, sweep-019-scene |
| single_request_enrichment | 5 | sweep-013-scene, sweep-016-scene, sweep-068-scene |
| multi_endpoint_inventory | 2 | sweep-084-scene, sweep-085-scene |
| page_state_eval | 2 | sweep-066-scene, sweep-094-scene |

Current matrix status:

| Status | Count |
| --- | --- |
| mock-covered-by-representative | 19 |
| mock-needs-harness | 83 |

Important interpretation:

mock-covered-by-representative currently means representative selection only. It does not mean scripts have been executed in a mock runtime.
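
As a quick consistency check on the baseline, the archetype counts and the status counts should each add up to the 102 materialized skills. The sketch below copies the counts from the two tables above; the function name `check_baseline` is hypothetical and not part of the repository.

```python
# Counts copied from the two baseline tables in this design.
ARCHETYPE_COUNTS = {
    "paginated_enrichment": 51,
    "host_bridge_workflow": 26,
    "multi_mode_request": 10,
    "local_doc_pipeline": 6,
    "single_request_enrichment": 5,
    "multi_endpoint_inventory": 2,
    "page_state_eval": 2,
}
STATUS_COUNTS = {
    "mock-covered-by-representative": 19,
    "mock-needs-harness": 83,
}
EXPECTED_TOTAL = 102


def check_baseline(archetypes, statuses, expected_total):
    """Return a list of problems; an empty list means the counts agree."""
    problems = []
    archetype_total = sum(archetypes.values())
    status_total = sum(statuses.values())
    if archetype_total != expected_total:
        problems.append(f"archetype counts sum to {archetype_total}")
    if status_total != expected_total:
        problems.append(f"status counts sum to {status_total}")
    return problems


problems = check_baseline(ARCHETYPE_COUNTS, STATUS_COUNTS, EXPECTED_TOTAL)
```

A harness could run a check like this before any route starts, so that a drifted matrix file is caught before results are propagated.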

Harness Layers

Layer 1: Script Load Harness

Purpose:

  1. load generated browser scripts in a controlled JavaScript runtime
  2. verify the entry module does not fail during parse/load
  3. verify referenced helper files are present

Output status:

script-load-pass or script-load-fail
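
The decision logic of this layer can be sketched independently of the JavaScript runtime that does the actual parsing. In the Python sketch below, `exists` and `parses` are injected callables and are assumptions for illustration: the real harness would supply a file-system check and a parse/load step from whatever controlled JavaScript runtime is chosen. The function name and the example paths are hypothetical.

```python
from typing import Callable, Iterable


def script_load_status(
    entry_path: str,
    helper_paths: Iterable[str],
    exists: Callable[[str], bool],
    parses: Callable[[str], bool],
) -> str:
    """Map the three Layer 1 checks onto script-load-pass / script-load-fail."""
    # The entry module and every referenced helper file must be present.
    if not exists(entry_path):
        return "script-load-fail"
    if any(not exists(helper) for helper in helper_paths):
        return "script-load-fail"
    # Parsing/loading is delegated to the injected runtime check.
    return "script-load-pass" if parses(entry_path) else "script-load-fail"


# Hypothetical package layout used only for this example.
present_files = {"skill/entry.js", "skill/helper.js"}
status = script_load_status(
    "skill/entry.js",
    ["skill/helper.js"],
    exists=lambda path: path in present_files,
    parses=lambda path: True,
)
```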

Layer 2: Fake Dependency Harness

Purpose:

Provide controlled fake implementations for:

  1. fetch
  2. browser DOM
  3. host bridge action/callback
  4. local document service
  5. artifact writer

Output status:

mock-dependency-ready or mock-dependency-missing
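
A minimal sketch of what a controlled fake and the readiness check might look like, written in Python for illustration only. `FakeFetch`, `dependency_status`, the dependency keys, and the `example.invalid` URL are all hypothetical; the response shape is an assumption, not the shape the generated scripts expect.

```python
class FakeFetch:
    """Recording stand-in for fetch: serves canned payloads, never uses the network."""

    def __init__(self, canned):
        self.canned = dict(canned)  # url -> payload
        self.calls = []             # every attempted request, in order

    def __call__(self, url, **options):
        self.calls.append((url, options))
        if url not in self.canned:
            # Unknown URLs produce a controlled error response, not an exception.
            return {"ok": False, "status": 404, "body": None}
        return {"ok": True, "status": 200, "body": self.canned[url]}


def dependency_status(required, available):
    """Return the Layer 2 status and the sorted names of any missing fakes."""
    missing = sorted(set(required) - set(available))
    status = "mock-dependency-ready" if not missing else "mock-dependency-missing"
    return status, missing


fake_fetch = FakeFetch({"https://example.invalid/items": {"items": []}})
response = fake_fetch("https://example.invalid/items", method="GET")
status, missing = dependency_status(["fetch", "dom"], {"fetch": fake_fetch})
```

Recording every call is what later lets the representative flow harness assert that the expected request was attempted.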

Layer 3: Representative Flow Harness

Purpose:

Run representative scene scripts far enough to prove control-flow integrity.

Checks:

  1. expected request or host action is attempted
  2. controlled empty-data response is handled
  3. controlled non-empty response is normalized
  4. artifact metadata is produced when declared
  5. an error response is reported through the structured failure path and does not crash the script

Output status:

mock-runtime-pass, mock-runtime-fail, or mock-runtime-partial
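
This design does not state how the five checks combine into the three statuses. The sketch below assumes one possible rule, purely for illustration: every applicable check passing gives mock-runtime-pass, no applicable check passing gives mock-runtime-fail, and anything in between gives mock-runtime-partial. The check names and `flow_status` are hypothetical.

```python
# Hypothetical keys, one per check in the list above.
CHECKS = (
    "request_attempted",
    "empty_data_handled",
    "non_empty_normalized",
    "artifact_metadata_produced",
    "error_structured",
)


def flow_status(results):
    """Combine check results into a Layer 3 status.

    `results` maps each name in CHECKS to True, False, or None. None marks a
    check that does not apply, e.g. artifact metadata when none is declared.
    """
    applicable = [results[name] for name in CHECKS if results[name] is not None]
    if not applicable or not any(applicable):
        return "mock-runtime-fail"
    if all(applicable):
        return "mock-runtime-pass"
    return "mock-runtime-partial"


example = dict.fromkeys(CHECKS, True)
example["artifact_metadata_produced"] = None  # scene declares no artifact
example_status = flow_status(example)
```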

Layer 4: Matrix Propagation

Purpose:

Propagate representative results to same-archetype scenes without claiming direct execution for every scene.

Output statuses:

  1. mock-runtime-representative-pass
  2. mock-runtime-covered-by-representative
  3. mock-runtime-needs-direct-run
  4. mock-runtime-fail
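
One way the propagation could work is sketched below. The mapping rule is an assumption, since this design names the statuses but not the exact rule: a representative keeps its own result, and a non-representative scene is marked covered only when every representative of its archetype passed; otherwise it needs a direct run. The `propagate` function and the scene id `other-scene` are hypothetical.

```python
def propagate(archetype_scenes, representatives, rep_results):
    """Return scene id -> Layer 4 status.

    archetype_scenes: archetype -> list of all scene ids in that archetype
    representatives:  archetype -> list of representative scene ids
    rep_results:      representative scene id -> Layer 3 status
    """
    statuses = {}
    for archetype, scenes in archetype_scenes.items():
        reps = representatives.get(archetype, [])
        all_reps_pass = bool(reps) and all(
            rep_results.get(rep) == "mock-runtime-pass" for rep in reps
        )
        for scene in scenes:
            if scene in reps:
                outcome = rep_results.get(scene)
                if outcome == "mock-runtime-pass":
                    statuses[scene] = "mock-runtime-representative-pass"
                elif outcome == "mock-runtime-fail":
                    statuses[scene] = "mock-runtime-fail"
                else:  # partial or not yet run
                    statuses[scene] = "mock-runtime-needs-direct-run"
            elif all_reps_pass:
                statuses[scene] = "mock-runtime-covered-by-representative"
            else:
                statuses[scene] = "mock-runtime-needs-direct-run"
    return statuses


result = propagate(
    {"paginated_enrichment": ["sweep-001-scene", "other-scene"]},
    {"paginated_enrichment": ["sweep-001-scene"]},
    {"sweep-001-scene": "mock-runtime-pass"},
)
```

Under this rule no scene is ever reported as directly executed unless it was, which matches the stated purpose of the layer.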

Route Order

The route order is fixed:

  1. Route 1: paginated_enrichment mock harness
  2. Route 2: multi_mode_request and single_request_enrichment mock harnesses
  3. Route 3: multi_endpoint_inventory and page_state_eval mock harnesses
  4. Route 4: local_doc_pipeline and host_bridge_workflow mock harnesses
  5. Route 5: publish integrated mock runtime validation report

Rationale:

  1. paginated_enrichment is the largest bucket and should validate the most reused generated flow first.
  2. multi_mode_request and single_request_enrichment are mainline API flows and can share fake fetch infrastructure.
  3. multi_endpoint_inventory and page_state_eval are small buckets and should be validated after the mainline fetch harness exists.
  4. local_doc_pipeline and host_bridge_workflow require more specialized fakes and must not drive the first harness implementation.
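
The fixed order can be encoded as data so that a runner cannot skip ahead. This is a sketch only; `ROUTES`, `route_for`, and `next_route` are hypothetical names, while the route contents are taken from the list above.

```python
# Route order from this design; Route 5 publishes the report and owns no archetype.
ROUTES = (
    ("Route 1", ("paginated_enrichment",)),
    ("Route 2", ("multi_mode_request", "single_request_enrichment")),
    ("Route 3", ("multi_endpoint_inventory", "page_state_eval")),
    ("Route 4", ("local_doc_pipeline", "host_bridge_workflow")),
    ("Route 5", ()),
)


def route_for(archetype):
    """Return the route that owns the given archetype."""
    for name, archetypes in ROUTES:
        if archetype in archetypes:
            return name
    raise KeyError(archetype)


def next_route(completed):
    """Return the first route not yet completed, or None when all are done."""
    for name, _ in ROUTES:
        if name not in completed:
            return name
    return None


upcoming = next_route({"Route 1"})
```

Because `next_route` always returns the earliest unfinished route, completing a later route out of order does not advance the sequence.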

Scope Guardrails

Allowed:

  1. add mock validation harness files
  2. add mock validation tests
  3. read generated final skill packages
  4. execute generated scripts only inside a mock runtime
  5. publish mock runtime validation result assets and reports

Forbidden:

  1. modifying generated skill scripts under examples/scene_skill_102_final_materialization_2026-04-19/skills
  2. modifying src/generated_scene/analyzer.rs
  3. modifying src/generated_scene/generator.rs
  4. rematerializing the 102 skills
  5. updating scene_execution_board_2026-04-18.json
  6. starting real browser execution
  7. connecting to real business systems
  8. requiring production credentials, VPN, SSO, or internal network access
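
The file-level items in the forbidden list can be enforced mechanically by routing every harness write through a guard. The sketch below assumes repository-relative POSIX paths; `is_write_forbidden` is a hypothetical helper, and the board file is matched by name because this design gives no directory for it.

```python
import posixpath
from pathlib import PurePosixPath

# Paths taken from the forbidden list above.
FORBIDDEN_PREFIXES = (
    PurePosixPath("examples/scene_skill_102_final_materialization_2026-04-19/skills"),
    PurePosixPath("src/generated_scene/analyzer.rs"),
    PurePosixPath("src/generated_scene/generator.rs"),
)
FORBIDDEN_NAMES = ("scene_execution_board_2026-04-18.json",)


def is_write_forbidden(path):
    """Return True if writing to `path` would violate the scope guardrails."""
    # Collapse "." and ".." segments so a path cannot sidestep the check.
    normalized = PurePosixPath(posixpath.normpath(path))
    if normalized.name in FORBIDDEN_NAMES:
        return True
    return any(
        normalized.parts[: len(prefix.parts)] == prefix.parts
        for prefix in FORBIDDEN_PREFIXES
    )


blocked = is_write_forbidden(
    "examples/scene_skill_102_final_materialization_2026-04-19/skills/demo/run.js"
)
```

The remaining items (real browsers, business systems, credentials) concern runtime behavior rather than file writes and would need separate safeguards, such as fakes that never open a network connection.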

Expected Assets

  1. tests/fixtures/generated_scene/scene_skill_102_mock_runtime_harness_results_2026-04-20.json
  2. docs/superpowers/reports/2026-04-20-scene-skill-102-mock-runtime-harness-report.md

Stop Rules

Stop after representative mock runtime results and the integrated report are published.

Do not continue into production validation under this plan.

Do not claim 102 / 102 real runtime pass from mock results.