Scene Skill 102 Mock Runtime Harness Implementation Design

Date: 2026-04-20
Status: Draft
Upstream Validation: docs/superpowers/plans/2026-04-20-scene-skill-102-static-mock-pseudoprod-validation-plan.md
Input Matrix: tests/fixtures/generated_scene/scene_skill_102_mock_runtime_validation_matrix_2026-04-20.json

Intent

Define a bounded implementation stage for mock runtime harnesses after the 102 materialized skill set has passed static package validation and deterministic dispatch dry-run.

This design is not production validation. It exists to prove that generated skill scripts can be loaded and exercised against controlled fake dependencies before any real browser, host bridge, or production system is touched.

Current Baseline

From scene_skill_102_mock_runtime_validation_matrix_2026-04-20.json:

| Archetype | Count | Representative scenes |
| --- | --- | --- |
| `paginated_enrichment` | 51 | `sweep-001-scene`, `sweep-002-scene`, `sweep-003-scene` |
| `host_bridge_workflow` | 26 | `sweep-007-scene`, `sweep-009-scene`, `sweep-010-scene` |
| `multi_mode_request` | 10 | `sweep-020-scene`, `sweep-023-scene`, `sweep-030-scene` |
| `local_doc_pipeline` | 6 | `sweep-012-scene`, `sweep-017-scene`, `sweep-019-scene` |
| `single_request_enrichment` | 5 | `sweep-013-scene`, `sweep-016-scene`, `sweep-068-scene` |
| `multi_endpoint_inventory` | 2 | `sweep-084-scene`, `sweep-085-scene` |
| `page_state_eval` | 2 | `sweep-066-scene`, `sweep-094-scene` |

Current matrix status:

| Status | Count |
| --- | --- |
| `mock-covered-by-representative` | 19 |
| `mock-needs-harness` | 83 |

Important interpretation:

The status mock-covered-by-representative currently records representative selection only. It does not mean that any script has been executed in a mock runtime.
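As a quick sanity check, the archetype table and the status table should both describe the same 102 scenes. The sketch below copies the counts from the two tables and sums them; the helper names `total` and `baselineIsConsistent` are hypothetical and not part of the repository.

```typescript
// Counts copied from the archetype table in this section.
const archetypeCounts: Record<string, number> = {
  paginated_enrichment: 51,
  host_bridge_workflow: 26,
  multi_mode_request: 10,
  local_doc_pipeline: 6,
  single_request_enrichment: 5,
  multi_endpoint_inventory: 2,
  page_state_eval: 2,
};

// Counts copied from the matrix status table in this section.
const statusCounts: Record<string, number> = {
  "mock-covered-by-representative": 19,
  "mock-needs-harness": 83,
};

// Sum every count in one table.
function total(counts: Record<string, number>): number {
  return Object.keys(counts).reduce((sum, key) => sum + counts[key], 0);
}

// Hypothetical helper: true when both tables add up to the expected scene count.
function baselineIsConsistent(expectedScenes: number): boolean {
  return total(archetypeCounts) === expectedScenes && total(statusCounts) === expectedScenes;
}

// 51 + 26 + 10 + 6 + 5 + 2 + 2 = 102 and 19 + 83 = 102
const consistent: boolean = baselineIsConsistent(102);
```

Running the same check against the real matrix JSON would require knowing its field names, which this document does not specify.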

Harness Layers

Layer 1: Script Load Harness

Purpose:

load generated browser scripts in a controlled JavaScript runtime
verify the entry module does not fail during parse/load
verify referenced helper files are present

Output status:

script-load-pass or script-load-fail
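A minimal sketch of what a Layer 1 check could look like, assuming the package has already been read into an in-memory map of path to source text. The `ScriptPackage` shape and the `checkScriptLoad` name are assumptions made for illustration. The parse step uses the standard `Function` constructor, which compiles source without running it; a script that uses top-level `import` or `export` would need a module-aware loader instead.

```typescript
type ScriptLoadStatus = "script-load-pass" | "script-load-fail";

// Hypothetical in-memory view of one generated skill package.
interface ScriptPackage {
  entry: string;                  // path of the entry script
  helpers: string[];              // helper paths the entry is declared to reference
  files: Record<string, string>;  // path -> source text
}

interface ScriptLoadResult {
  status: ScriptLoadStatus;
  reasons: string[];
}

function checkScriptLoad(pkg: ScriptPackage): ScriptLoadResult {
  const reasons: string[] = [];
  const has = (path: string): boolean =>
    Object.prototype.hasOwnProperty.call(pkg.files, path);

  // Every referenced helper file must be present.
  pkg.helpers.forEach((helper) => {
    if (!has(helper)) reasons.push("missing helper: " + helper);
  });

  if (!has(pkg.entry)) {
    reasons.push("missing entry: " + pkg.entry);
  } else {
    try {
      // Compiles the source as a function body without executing it,
      // so syntax errors surface here as exceptions.
      new Function(pkg.files[pkg.entry]);
    } catch (err) {
      reasons.push("parse error in " + pkg.entry + ": " + String(err));
    }
  }

  const status: ScriptLoadStatus =
    reasons.length === 0 ? "script-load-pass" : "script-load-fail";
  return { status: status, reasons: reasons };
}
```

Collecting reasons instead of stopping at the first problem keeps a failing result actionable when both a helper and the entry script are broken.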

Layer 2: Fake Dependency Harness

Purpose:

Provide controlled fake implementations for:

fetch
browser DOM
host bridge action/callback
local document service
artifact writer

Output status:

mock-dependency-ready or mock-dependency-missing
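One way to picture Layer 2 is a registry of fakes plus a readiness check. The sketch below is illustrative only: the dependency identifiers, `createFakeFetch`, and `checkDependencies` are hypothetical names, and the fake fetch is synchronous for brevity, whereas a real browser `fetch` returns a Promise.

```typescript
// Hypothetical identifiers for the five dependencies listed above.
type DependencyName = "fetch" | "dom" | "hostBridge" | "localDocService" | "artifactWriter";
type DependencyStatus = "mock-dependency-ready" | "mock-dependency-missing";
type FakeRegistry = Partial<Record<DependencyName, unknown>>;

interface FakeResponse {
  status: number;
  body: unknown;
}

// A fake fetch that records every requested URL and answers from canned data.
// Synchronous on purpose: this is a simplification of the real Promise-based API.
function createFakeFetch(canned: Record<string, FakeResponse>) {
  const calls: string[] = [];
  return {
    calls: calls,
    fetch(url: string): FakeResponse {
      calls.push(url);
      const known = Object.prototype.hasOwnProperty.call(canned, url);
      return known ? canned[url] : { status: 404, body: null };
    },
  };
}

// Report which required fakes are not registered yet.
function checkDependencies(
  required: DependencyName[],
  fakes: FakeRegistry
): { status: DependencyStatus; missing: DependencyName[] } {
  const missing = required.filter((name) => fakes[name] === undefined);
  const status: DependencyStatus =
    missing.length === 0 ? "mock-dependency-ready" : "mock-dependency-missing";
  return { status: status, missing: missing };
}

// Usage: only fetch is registered, so a scene that also writes artifacts is not ready.
const fakeFetch = createFakeFetch({ "/api/items": { status: 200, body: [] } });
const readiness = checkDependencies(["fetch", "artifactWriter"], { fetch: fakeFetch });
```

Recording calls on the fake is what later lets a flow check assert that the expected request was actually attempted.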

Layer 3: Representative Flow Harness

Purpose:

Run representative scene scripts far enough to prove control-flow integrity.

Checks:

expected request or host action is attempted
controlled empty-data response is handled
controlled non-empty response is normalized
artifact metadata is produced when declared
an error response is handled through the structured failure path and does not crash the script

Output status:

mock-runtime-pass, mock-runtime-fail, or mock-runtime-partial
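The checks above can be sketched as three scenarios run against one scene. Everything here is an assumption made for illustration: the scene is modeled as a plain function that receives a synchronous fake fetch, the `FlowOutcome` shape is hypothetical, the artifact metadata check is omitted, and `mock-runtime-partial` is taken to mean "some but not all checks passed", which this design does not define.

```typescript
type FlowStatus = "mock-runtime-pass" | "mock-runtime-fail" | "mock-runtime-partial";

interface FakeResponse { status: number; items: unknown[]; }
interface FlowOutcome { ok: boolean; rows: unknown[]; error?: string; }
type FakeFetch = (url: string) => FakeResponse;
type SceneScript = (fetchFn: FakeFetch) => FlowOutcome;
interface CheckResult { name: string; passed: boolean; }

function runRepresentative(scene: SceneScript): { status: FlowStatus; checks: CheckResult[] } {
  const checks: CheckResult[] = [];

  // Run the scene once against a canned response and evaluate one expectation.
  function scenario(
    name: string,
    response: FakeResponse,
    expect: (outcome: FlowOutcome, calls: string[]) => boolean
  ): void {
    const calls: string[] = [];
    let passed = false;
    try {
      const outcome = scene((url: string) => {
        calls.push(url);
        return response;
      });
      passed = expect(outcome, calls);
    } catch {
      passed = false; // an uncaught throw counts as a crash outside the failure path
    }
    checks.push({ name: name, passed: passed });
  }

  scenario("request attempted, empty data handled", { status: 200, items: [] },
    (o, calls) => calls.length > 0 && o.ok && o.rows.length === 0);
  scenario("non-empty response normalized", { status: 200, items: [{ id: 1 }] },
    (o) => o.ok && o.rows.length === 1);
  scenario("error response reported structurally", { status: 500, items: [] },
    (o) => !o.ok && typeof o.error === "string");

  const passedCount = checks.filter((c) => c.passed).length;
  const status: FlowStatus =
    passedCount === checks.length ? "mock-runtime-pass"
    : passedCount === 0 ? "mock-runtime-fail"
    : "mock-runtime-partial";
  return { status: status, checks: checks };
}

// Hypothetical stand-in for a generated scene script.
const exampleScene: SceneScript = (fetchFn) => {
  const res = fetchFn("/api/list?page=1");
  if (res.status !== 200) return { ok: false, rows: [], error: "http " + res.status };
  return { ok: true, rows: res.items };
};
const exampleReport = runRepresentative(exampleScene);
```

Keeping the per-check results alongside the overall status makes a partial result easier to triage than a single label would.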

Layer 4: Matrix Propagation

Purpose:

Propagate representative results to same-archetype scenes without claiming direct execution for every scene.

Output statuses:

mock-runtime-representative-pass
mock-runtime-covered-by-representative
mock-runtime-needs-direct-run
mock-runtime-fail
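The design names the propagated statuses but not the exact rule that assigns them, so the sketch below encodes one plausible rule as an explicit assumption: a directly run scene gets `mock-runtime-representative-pass` or `mock-runtime-fail`; a scene that was not run is `mock-runtime-covered-by-representative` only if every directly run scene of its archetype passed; otherwise it is `mock-runtime-needs-direct-run`. The `SceneRow` shape and `propagate` name are hypothetical.

```typescript
type PropagatedStatus =
  | "mock-runtime-representative-pass"
  | "mock-runtime-covered-by-representative"
  | "mock-runtime-needs-direct-run"
  | "mock-runtime-fail";

interface SceneRow { scene: string; archetype: string; }

// direct: scene name -> true (passed) or false (failed) for scenes that were actually run.
function propagate(
  scenes: SceneRow[],
  direct: Record<string, boolean>
): Record<string, PropagatedStatus> {
  const wasRun = (scene: string): boolean =>
    Object.prototype.hasOwnProperty.call(direct, scene);

  // archetype -> true only if every directly run scene of that archetype passed.
  const archetypeOk: Record<string, boolean | undefined> = {};
  scenes.forEach((row) => {
    if (wasRun(row.scene)) {
      const prev = archetypeOk[row.archetype];
      archetypeOk[row.archetype] = (prev === undefined ? true : prev) && direct[row.scene];
    }
  });

  const out: Record<string, PropagatedStatus> = {};
  scenes.forEach((row) => {
    if (wasRun(row.scene)) {
      out[row.scene] = direct[row.scene]
        ? "mock-runtime-representative-pass"
        : "mock-runtime-fail";
    } else {
      // Never upgraded to a direct pass: coverage is inherited, not claimed as execution.
      out[row.scene] = archetypeOk[row.archetype] === true
        ? "mock-runtime-covered-by-representative"
        : "mock-runtime-needs-direct-run";
    }
  });
  return out;
}

// Illustrative rows only; scene and archetype names here are placeholders.
const propagated = propagate(
  [
    { scene: "a1", archetype: "A" },
    { scene: "a2", archetype: "A" },
    { scene: "b1", archetype: "B" },
    { scene: "b2", archetype: "B" },
    { scene: "c1", archetype: "C" },
  ],
  { a1: true, b1: false }
);
```

Under this rule a scene whose archetype has no executed representative, or a failing one, is flagged for a direct run rather than silently marked as covered.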

Route Order

The route order is fixed:

Route 1: paginated_enrichment mock harness
Route 2: multi_mode_request and single_request_enrichment mock harnesses
Route 3: multi_endpoint_inventory and page_state_eval mock harnesses
Route 4: local_doc_pipeline and host_bridge_workflow mock harnesses
Route 5: publish integrated mock runtime validation report

Rationale:

paginated_enrichment is the largest bucket and should validate the most reused generated flow first.
multi_mode_request and single_request_enrichment are mainline API flows and can share fake fetch infrastructure.
multi_endpoint_inventory and page_state_eval are small buckets and should be validated after the mainline fetch harness exists.
local_doc_pipeline and host_bridge_workflow require more specialized fakes and must not drive the first harness implementation.
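To make the ordering concrete, the routes can be written down as data and combined with the archetype counts from the baseline. This is only a sketch: `routeOrder`, `sceneCounts`, `routeFor`, and `scenesPerRoute` are hypothetical names, and the numbers are copied from the Current Baseline section.

```typescript
interface Route {
  route: number;
  archetypes: string[];
  note: string;
}

// Route order as stated in this section.
const routeOrder: Route[] = [
  { route: 1, archetypes: ["paginated_enrichment"], note: "largest bucket" },
  { route: 2, archetypes: ["multi_mode_request", "single_request_enrichment"], note: "shared fake fetch" },
  { route: 3, archetypes: ["multi_endpoint_inventory", "page_state_eval"], note: "small buckets" },
  { route: 4, archetypes: ["local_doc_pipeline", "host_bridge_workflow"], note: "specialized fakes" },
  { route: 5, archetypes: [], note: "integrated report" },
];

// Archetype counts from the Current Baseline section.
const sceneCounts: Record<string, number> = {
  paginated_enrichment: 51,
  host_bridge_workflow: 26,
  multi_mode_request: 10,
  local_doc_pipeline: 6,
  single_request_enrichment: 5,
  multi_endpoint_inventory: 2,
  page_state_eval: 2,
};

// Which route handles a given archetype, or undefined if none does.
function routeFor(archetype: string): number | undefined {
  for (let i = 0; i < routeOrder.length; i++) {
    if (routeOrder[i].archetypes.indexOf(archetype) !== -1) return routeOrder[i].route;
  }
  return undefined;
}

// Number of scenes each route is responsible for, in route order.
function scenesPerRoute(): number[] {
  return routeOrder.map((r) =>
    r.archetypes.reduce((sum, a) => sum + (sceneCounts[a] || 0), 0)
  );
}

const perRoute = scenesPerRoute(); // [51, 15, 4, 32, 0]
```

Routes 1 and 2 together account for 66 of the 102 scenes, which is consistent with building the fetch-based fakes first and deferring the specialized ones to Route 4.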

Scope Guardrails

Allowed:

add mock validation harness files
add mock validation tests
read generated final skill packages
execute generated scripts only inside a mock runtime
publish mock runtime validation result assets and reports

Forbidden:

modifying generated skill scripts under examples/scene_skill_102_final_materialization_2026-04-19/skills
modifying src/generated_scene/analyzer.rs
modifying src/generated_scene/generator.rs
rematerializing the 102 skills
updating scene_execution_board_2026-04-18.json
starting real browser execution
connecting to real business systems
requiring production credentials, VPN, SSO, or internal network access
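The file-level guardrails lend themselves to an automated check over a list of changed paths. The sketch below assumes such a list is already available (for example from a version-control diff, which is outside this sketch) and that the execution board file can be matched by file name alone, since this document does not give its directory. The function names are hypothetical.

```typescript
// Directory whose contents must stay untouched.
const forbiddenPrefixes: string[] = [
  "examples/scene_skill_102_final_materialization_2026-04-19/skills/",
];

// Exact files that must stay untouched.
const forbiddenFiles: string[] = [
  "src/generated_scene/analyzer.rs",
  "src/generated_scene/generator.rs",
];

// Matched by file name only; the board's directory is an unknown here.
const forbiddenBasenames: string[] = ["scene_execution_board_2026-04-18.json"];

function isForbidden(path: string): boolean {
  const basename = path.split("/").pop() || "";
  return (
    forbiddenPrefixes.some((prefix) => path.indexOf(prefix) === 0) ||
    forbiddenFiles.indexOf(path) !== -1 ||
    forbiddenBasenames.indexOf(basename) !== -1
  );
}

// Return the subset of changed paths that violate the guardrails.
function findForbiddenChanges(changedPaths: string[]): string[] {
  return changedPaths.filter(isForbidden);
}

// Usage with placeholder paths: only the second entry is a violation.
const violations = findForbiddenChanges([
  "tests/mock_runtime/example_harness.ts",
  "src/generated_scene/generator.rs",
]);
```

The runtime guardrails (no real browser, no real business systems, no production credentials) cannot be enforced by a path check and would need to be covered by how the fakes are wired.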

Expected Assets

tests/fixtures/generated_scene/scene_skill_102_mock_runtime_harness_results_2026-04-20.json
docs/superpowers/reports/2026-04-20-scene-skill-102-mock-runtime-harness-report.md

Stop Rules

Stop after representative mock runtime results and the integrated report are published.

Do not continue into production validation under this plan.

Do not claim a 102/102 real-runtime pass on the basis of mock results.
