wip: checkpoint 2026-03-29 runtime work

2026-03-29 22:44:30 +08:00
parent 7d9036b2d4
commit e294fbb9b1
30 changed files with 6759 additions and 161 deletions
--- a/docs/plans/2026-03-29-sgclaw-zeroclaw-planner-first-execution-plan.md
+++ b/docs/plans/2026-03-29-sgclaw-zeroclaw-planner-first-execution-plan.md
@@ -0,0 +1,451 @@
+# SGClaw ZeroClaw Planner-First Realignment Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
+**Goal:** Realign the browser submit path so `sgclaw` uses `zeroclaw` as the primary planner/executor, with `sgclaw` acting only as the secure SuperRPA host plus custom tool bridge.
+
+**Architecture:** Stop treating `zeroclaw` as a thin LLM wrapper. The browser message path should enter a `zeroclaw`-native orchestration entry point first, let `zeroclaw` perform planning/tool-loop control, and expose SuperRPA-specific browser/office/screen capabilities as regular tools inside that runtime. Any deterministic fast paths for Zhihu/Office must be implemented as `zeroclaw`-aligned execution components, not as frontend-owned control flow. The frontend may display the generated plan and current stage for UX, but it must not own planning or execution decisions.
+
+**Tech Stack:** Rust, `sgclaw` compat bridge, `third_party/zeroclaw` agent loop, SuperRPA browser pipe, local skill library, OpenXML office export, HTML screen export, cargo tests, Python live acceptance.
+
+### Task 1: Freeze The Current Architecture Gap With Characterization Tests
+
+**Files:**
+- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
+- Reference only: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
+- Reference only: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
+- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/agent/loop_.rs`
+
+**Step 1: Write the failing test**
+
+Add a test that submits `读取知乎热榜前10，并导出 excel 文件` through `handle_browser_message_with_context(...)` and asserts the browser submit path does **not** terminate inside the current thin `Agent::turn_streamed(...)` compat bridge.
+
+The test should assert at least one of these observable signals:
+- presence of a new orchestration mode log such as `zeroclaw_process_message_primary`
+- absence of the old `compat_llm_primary` mode log
+- absence of selector-thrashing logs, such as repeated `getText` calls against `.HotList-item`, `[data-hot-item]`, or `ol li`
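+
+A minimal sketch of how those signals could be checked against captured log lines. The helper names, the plain-string log format, and the `max_probes` threshold are assumptions for illustration, not existing `sgclaw` test utilities. The sketch checks all three signals together, which is stricter than the plan requires.
+
+```rust
+/// Hypothetical helper: counts log lines that look like hotlist selector probes.
+/// The selector strings come from the plan; the log format is an assumption.
+pub fn selector_probe_count(logs: &[&str]) -> usize {
+    const PROBES: [&str; 3] = [".HotList-item", "[data-hot-item]", "ol li"];
+    logs.iter()
+        .filter(|line| line.contains("getText") && PROBES.iter().any(|p| line.contains(*p)))
+        .count()
+}
+
+/// True when the logs show the new orchestration mode, never the old compat
+/// mode, and no more than `max_probes` selector probes.
+pub fn uses_planner_first_path(logs: &[&str], max_probes: usize) -> bool {
+    let has_new = logs.iter().any(|l| l.contains("zeroclaw_process_message_primary"));
+    let has_old = logs.iter().any(|l| l.contains("compat_llm_primary"));
+    has_new && !has_old && selector_probe_count(logs) <= max_probes
+}
+```
+
+In the characterization test, the log lines would come from whatever sink the test harness already captures. Until Task 2 lands, `uses_planner_first_path` should return `false`.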
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+cargo test --test compat_runtime_test browser_submit_path_prefers_zeroclaw_process_message_orchestrator -- --nocapture
+```
+
+Expected: FAIL because the current implementation still enters `src/compat/runtime.rs` and drives `agent.turn_streamed(...)` directly.
+
+**Step 3: Write the smallest additional characterization test**
+
+Add a second failing test that proves SuperRPA-specific tools remain available after the orchestration switch:
+- browser host tool
+- `openxml_office`
+- `screen_html_export`
+
+This test should not require real network calls.
+
+**Step 4: Run both failing tests**
+
+Run:
+```bash
+cargo test --test compat_runtime_test -- --nocapture
+```
+
+Expected: at least the new characterization tests fail for the expected reason.
+
+**Step 5: Commit**
+
+```bash
+git add tests/compat_runtime_test.rs
+git commit -m "test: characterize browser path bypass of zeroclaw orchestrator"
+```
+
+### Task 2: Introduce A ZeroClaw-Native Browser Orchestration Entry Point
+
+**Files:**
+- Create: `/home/zyl/projects/sgClaw/claw/src/compat/orchestration.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/mod.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
+- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/agent/loop_.rs:4752`
+
+**Step 1: Write the failing unit test for the new entry point**
+
+Add a test for a new helper in `src/compat/orchestration.rs` that:
+- receives browser task context
+- builds a `zeroclaw` config
+- returns a browser-safe orchestration handle or result
+
+The test should prove the new helper is chosen by `handle_browser_message_with_context(...)`.
+
+**Step 2: Run the new test to verify it fails**
+
+Run:
+```bash
+cargo test --test compat_runtime_test browser_submit_path_prefers_zeroclaw_process_message_orchestrator -- --nocapture
+```
+
+Expected: FAIL because the helper does not exist yet.
+
+**Step 3: Implement the minimal entry point**
+
+Create `src/compat/orchestration.rs` with one responsibility:
+- bridge browser submit tasks into a `zeroclaw`-native orchestration path
+
+Do not implement Zhihu-specific logic here. This layer must only:
+- map config
+- map task context/history
+- inject SuperRPA tools
+- call the chosen `zeroclaw` orchestration function
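+
+The four responsibilities above can be sketched as a single thin function. Every type and name here (`BrowserTask`, `OrchestrationRequest`, `Orchestrator`, `submit_browser_task`) is a hypothetical stand-in; the real `zeroclaw` entry point and config types will differ.
+
+```rust
+/// Hypothetical browser task context; not a real sgclaw type.
+#[derive(Debug, Clone, PartialEq)]
+pub struct BrowserTask {
+    pub instruction: String,
+    pub history: Vec<String>,
+}
+
+/// Hypothetical request handed to the orchestrator; not a real zeroclaw type.
+#[derive(Debug, Clone, PartialEq)]
+pub struct OrchestrationRequest {
+    pub model: String,
+    pub messages: Vec<String>,
+    pub tool_names: Vec<String>,
+}
+
+/// Stand-in for whichever zeroclaw orchestration function is chosen.
+pub trait Orchestrator {
+    fn run(&self, request: OrchestrationRequest) -> Result<String, String>;
+}
+
+/// The bridge maps config, maps history, injects tool names, and delegates.
+/// It contains no site-specific (for example Zhihu) branching.
+pub fn submit_browser_task(
+    orchestrator: &dyn Orchestrator,
+    model: &str,
+    task: &BrowserTask,
+    tool_names: &[&str],
+) -> Result<String, String> {
+    let mut messages = task.history.clone();
+    messages.push(task.instruction.clone());
+    let request = OrchestrationRequest {
+        model: model.to_string(),
+        messages,
+        tool_names: tool_names.iter().map(|n| n.to_string()).collect(),
+    };
+    orchestrator.run(request)
+}
+```
+
+Keeping the orchestrator behind a trait lets the unit test in Step 1 substitute a fake and assert on the request it receives, without network calls.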
+
+**Step 4: Switch `handle_browser_message_with_context(...)` to the new entry point**
+
+Modify:
+- `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
+
+Replace the direct `compat::runtime::execute_task_with_sgclaw_settings(...)` primary path with the new orchestration bridge.
+
+**Step 5: Run the test to verify it passes**
+
+Run:
+```bash
+cargo test --test compat_runtime_test browser_submit_path_prefers_zeroclaw_process_message_orchestrator -- --nocapture
+```
+
+Expected: PASS.
+
+**Step 6: Commit**
+
+```bash
+git add src/compat/orchestration.rs src/compat/mod.rs src/agent/mod.rs src/compat/runtime.rs tests/compat_runtime_test.rs
+git commit -m "refactor: route browser submit flow through zeroclaw orchestration bridge"
+```
+
+### Task 3: Register SuperRPA Browser/Office/Screen Capabilities As Native ZeroClaw Tools
+
+**Files:**
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/browser_tool_adapter.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/openxml_office_tool.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/screen_html_export_tool.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/engine.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/orchestration.rs`
+- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
+- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_openxml_office_tool_test.rs`
+- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_screen_html_export_tool_test.rs`
+
+**Step 1: Write the failing tool-registration test**
+
+Add a test that asserts the `zeroclaw` orchestration path exposes:
+- the preferred SuperRPA browser tool
+- `openxml_office` when Excel export is requested
+- `screen_html_export` when screen export is requested
+
+The test must verify this through the new orchestration path, not the old compat path.
+
+**Step 2: Run the test to verify it fails**
+
+Run:
+```bash
+cargo test --test compat_runtime_test browser_orchestration_registers_superrpa_tools_natively -- --nocapture
+```
+
+Expected: FAIL until tool wiring is complete.
+
+**Step 3: Implement minimal native tool registration**
+
+Ensure the new orchestration bridge injects `sgclaw` tools into the `zeroclaw` runtime without changing frontend code. Keep tool naming stable:
+- `superrpa_browser`
+- `openxml_office`
+- `screen_html_export`
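+
+One way to keep those names stable is to define them once as constants and reject duplicate registrations. This is a sketch under assumptions: the `Tool` trait, `NamedTool`, and `ToolRegistry` are hypothetical, and `zeroclaw`'s actual tool interface may look quite different.
+
+```rust
+use std::collections::BTreeMap;
+
+pub const SUPERRPA_BROWSER: &str = "superrpa_browser";
+pub const OPENXML_OFFICE: &str = "openxml_office";
+pub const SCREEN_HTML_EXPORT: &str = "screen_html_export";
+
+/// Hypothetical tool interface; the real zeroclaw trait may differ.
+pub trait Tool {
+    fn name(&self) -> &'static str;
+}
+
+/// Placeholder tool that only carries its name.
+pub struct NamedTool(pub &'static str);
+
+impl Tool for NamedTool {
+    fn name(&self) -> &'static str {
+        self.0
+    }
+}
+
+#[derive(Default)]
+pub struct ToolRegistry {
+    tools: BTreeMap<&'static str, Box<dyn Tool>>,
+}
+
+impl ToolRegistry {
+    /// Registers a tool and rejects duplicates so each name stays unambiguous.
+    pub fn register(&mut self, tool: Box<dyn Tool>) -> Result<(), String> {
+        let name = tool.name();
+        if self.tools.contains_key(name) {
+            return Err(format!("tool already registered: {}", name));
+        }
+        self.tools.insert(name, tool);
+        Ok(())
+    }
+
+    /// Returns the registered names in sorted order.
+    pub fn names(&self) -> Vec<&'static str> {
+        self.tools.keys().copied().collect()
+    }
+}
+
+/// Builds a registry holding the three SuperRPA tool names from the plan.
+pub fn superrpa_registry() -> Result<ToolRegistry, String> {
+    let mut registry = ToolRegistry::default();
+    for name in [SUPERRPA_BROWSER, OPENXML_OFFICE, SCREEN_HTML_EXPORT] {
+        registry.register(Box::new(NamedTool(name)))?;
+    }
+    Ok(registry)
+}
+```
+
+A registration test can then compare `names()` against the three constants instead of repeating string literals.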
+
+**Step 4: Verify tool-level tests still pass**
+
+Run:
+```bash
+cargo test --test compat_openxml_office_tool_test -- --nocapture
+cargo test --test compat_screen_html_export_tool_test -- --nocapture
+```
+
+Expected: PASS.
+
+**Step 5: Run the new orchestration registration test**
+
+Run:
+```bash
+cargo test --test compat_runtime_test browser_orchestration_registers_superrpa_tools_natively -- --nocapture
+```
+
+Expected: PASS.
+
+**Step 6: Commit**
+
+```bash
+git add src/compat/browser_tool_adapter.rs src/compat/openxml_office_tool.rs src/compat/screen_html_export_tool.rs src/runtime/engine.rs src/compat/orchestration.rs tests/compat_runtime_test.rs tests/compat_openxml_office_tool_test.rs tests/compat_screen_html_export_tool_test.rs
+git commit -m "feat: expose superrpa browser and export tools through zeroclaw orchestration"
+```
+
+### Task 4: Remove Frontend-Owned Or Custom Compat Mainline Control Flow
+
+**Files:**
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/skill_runner.rs`
+- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
+- Reference only: `/home/zyl/projects/sgClaw/claw/docs/plans/2026-03-29-sgclaw-zeroclaw-planner-first-execution-plan.md`
+
+**Step 1: Write the failing regression test**
+
+Add a test that proves Zhihu hotlist export no longer depends on a frontend-owned mainline such as:
+- `compat_skill_runner_primary`
+- direct `sgclaw`-local branching before `zeroclaw`
+
+The expected primary mode should be a `zeroclaw`-owned orchestration mode.
+
+**Step 2: Run the regression test to verify it fails**
+
+Run:
+```bash
+cargo test --test compat_runtime_test zhihu_export_does_not_use_frontend_owned_mainline -- --nocapture
+```
+
+Expected: FAIL while `src/compat/skill_runner.rs` still owns primary control flow.
+
+**Step 3: Remove or demote the custom mainline**
+
+Change the code so:
+- `src/compat/skill_runner.rs` becomes either a helper invoked inside the `zeroclaw` tool/runtime ecosystem, or is removed if redundant
+- `src/agent/mod.rs` no longer branches to a custom primary executor before `zeroclaw`
+
+Do not leave two competing primary modes.
+
+**Step 4: Run the regression test**
+
+Run:
+```bash
+cargo test --test compat_runtime_test zhihu_export_does_not_use_frontend_owned_mainline -- --nocapture
+```
+
+Expected: PASS.
+
+**Step 5: Run the broader compat suite**
+
+Run:
+```bash
+cargo test --test compat_runtime_test -- --nocapture
+```
+
+Expected: PASS.
+
+**Step 6: Commit**
+
+```bash
+git add src/compat/runtime.rs src/agent/mod.rs src/compat/skill_runner.rs tests/compat_runtime_test.rs
+git commit -m "refactor: remove frontend-owned primary control flow from browser submit path"
+```
+
+### Task 5: Align Skills With ZeroClaw Execution Semantics Instead Of Prompt-Only Semantics
+
+**Files:**
+- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/engine.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/tools/read_skill.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/tests/read_skill_tool_test.rs`
+- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
+- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/office-export-xlsx/SKILL.md`
+- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist-screen/SKILL.md`
+
+**Step 1: Write the failing skill-execution regression test**
+
+Add a test that proves skill usage in the browser submit path is not just:
+- prompt injection
+- `read_skill` text stuffing
+- model-led selector wandering
+
+Instead, the test should verify the task produces:
+- a plan-driven collection/execution flow
+- a real `.xlsx` or `.html` artifact path
+- no selector-thrashing loop
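+
+The artifact-path assertion could be backed by a small helper like the one below. It is a sketch only: `find_artifact_path` is a hypothetical name, and it assumes the final summary is plain text in which the path contains no spaces and is not followed by a trailing period.
+
+```rust
+/// Hypothetical helper: returns the first whitespace-delimited token in
+/// `summary` that ends with `.xlsx` or `.html` (case-insensitive), after
+/// stripping common surrounding punctuation such as backticks and quotes.
+pub fn find_artifact_path(summary: &str) -> Option<&str> {
+    summary
+        .split_whitespace()
+        .map(|token| {
+            token.trim_matches(|c: char| matches!(c, '`' | '"' | '\'' | ',' | ';' | '(' | ')'))
+        })
+        .find(|token| {
+            let lower = token.to_ascii_lowercase();
+            // Require more than the bare extension so ".xlsx" alone is not a path.
+            token.len() > 5 && (lower.ends_with(".xlsx") || lower.ends_with(".html"))
+        })
+}
+```
+
+A stricter test would also check that the returned path exists on disk, which this sketch deliberately leaves out.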
+
+**Step 2: Run the test to verify it fails**
+
+Run:
+```bash
+cargo test --test compat_runtime_test browser_skill_usage_is_execution_not_prompt_only -- --nocapture
+```
+
+Expected: FAIL until skill semantics are aligned with execution.
+
+**Step 3: Implement the minimal alignment**
+
+Change the orchestration so `read_skill` is a fallback for missing context, not the primary means of making high-frequency browser workflows executable.
+
+Keep:
+- skill discovery
+- skill references
+- artifact contract wording
+
+Reduce:
+- over-reliance on prompt stuffing
+- over-reliance on model-led selector discovery for known workflows
+
+**Step 4: Re-run the skill regression tests**
+
+Run:
+```bash
+cargo test --test compat_runtime_test browser_skill_usage_is_execution_not_prompt_only -- --nocapture
+cargo test --test read_skill_tool_test -- --nocapture
+```
+
+Expected: PASS.
+
+**Step 5: Commit**
+
+```bash
+git add src/runtime/engine.rs src/compat/runtime.rs third_party/zeroclaw/src/tools/read_skill.rs tests/compat_runtime_test.rs tests/read_skill_tool_test.rs
+git commit -m "refactor: align browser skill execution with zeroclaw-native workflow semantics"
+```
+
+### Task 6: Verify The Planner-First Path End-To-End
+
+**Files:**
+- Modify: `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
+- Test: `/home/zyl/projects/sgClaw/claw/tests/runtime_profile_test.rs`
+- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_config_test.rs`
+- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
+- Test: `/home/zyl/projects/sgClaw/claw/tests/live_acceptance_score_test.py`
+- Reference only: `/home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw`
+
+**Step 1: Run the Rust regression suites**
+
+Run:
+```bash
+cargo test --test runtime_profile_test -- --nocapture
+cargo test --test compat_config_test -- --nocapture
+cargo test --test compat_runtime_test -- --nocapture
+cargo test --test read_skill_tool_test -- --nocapture
+```
+
+Expected: PASS.
+
+**Step 2: Run the Python scoring test**
+
+Run:
+```bash
+python3 -m unittest tests/live_acceptance_score_test.py
+```
+
+Expected: PASS.
+
+**Step 3: Run the live Zhihu hotlist Excel acceptance**
+
+Run:
+```bash
+python3 tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py
+```
+
+Expected:
+- total score returns to `100`
+- logs show planner-first `zeroclaw` orchestration instead of selector-thrashing
+- no `shell`, `web_fetch`, or `web_search_tool` calls appear in the tool sequence
+- final summary includes a real `.xlsx` path
+
+**Step 4: Update the acceptance note**
+
+Record:
+- new orchestration mode
+- tool sequence
+- timing notes
+- any remaining selector or latency risk
+
+**Step 5: Rebuild and sync the runtime binary used by SuperRPA**
+
+Run:
+```bash
+cargo build
+cp /home/zyl/projects/sgClaw/claw/target/debug/sgclaw /home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw
+sha256sum /home/zyl/projects/sgClaw/claw/target/debug/sgclaw /home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw
+```
+
+Expected: the two hashes match exactly.
+
+**Step 6: Commit**
+
+```bash
+git add docs/acceptance/2026-03-29-zhihu-hotlist-excel.md tests/runtime_profile_test.rs tests/compat_config_test.rs tests/compat_runtime_test.rs tests/live_acceptance_score_test.py
+git commit -m "test: verify planner-first zeroclaw browser orchestration end to end"
+```
+
+### Task 7: Surface The Generated Plan In The Chat UI Without Giving Frontend Control
+
+**Files:**
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/event_bridge.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/pipe/protocol.rs`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/` (the active sgClaw chat UI files that render task progress)
+- Test: `/home/zyl/projects/sgClaw/claw/tests/pipe_protocol_test.rs`
+- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
+
+**Step 1: Write the failing protocol/UI test**
+
+Add a test that proves the backend can emit a structured planning event before tool execution starts. The event must carry:
+- a short plan title
+- a flat ordered step list
+- current phase such as `planning`, `executing`, `completed`
+
+The frontend test or fixture should verify the chat can render the plan summary without waiting for final completion.
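+
+A possible shape for the planning event described above, assuming a plain Rust struct on the backend side. The type names, field names, and the text rendering are all hypothetical; the actual wire format in `src/pipe/protocol.rs` is not specified by this plan.
+
+```rust
+/// Hypothetical phase enum covering the three phases named in the plan.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum PlanPhase {
+    Planning,
+    Executing,
+    Completed,
+}
+
+impl PlanPhase {
+    pub fn as_str(self) -> &'static str {
+        match self {
+            PlanPhase::Planning => "planning",
+            PlanPhase::Executing => "executing",
+            PlanPhase::Completed => "completed",
+        }
+    }
+}
+
+/// Hypothetical event: a short title, a flat ordered step list, and a phase.
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub struct PlanEvent {
+    pub title: String,
+    pub steps: Vec<String>,
+    pub phase: PlanPhase,
+}
+
+impl PlanEvent {
+    /// Renders a compact summary with one numbered line per step.
+    pub fn render(&self) -> String {
+        let mut out = format!("[{}] {}", self.phase.as_str(), self.title);
+        for (index, step) in self.steps.iter().enumerate() {
+            out.push_str(&format!("\n{}. {}", index + 1, step));
+        }
+        out
+    }
+
+    /// Returns a copy of the event in a new phase; the steps never change.
+    pub fn with_phase(&self, phase: PlanPhase) -> PlanEvent {
+        PlanEvent { phase, ..self.clone() }
+    }
+}
+```
+
+Because `with_phase` only swaps the phase, a consumer of the event has nothing to edit: stage transitions arrive as new events carrying the same steps.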
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+cargo test --test pipe_protocol_test -- --nocapture
+cargo test --test compat_runtime_test plan_events_are_emitted_before_browser_execution -- --nocapture
+```
+
+Expected: FAIL because the protocol does not yet expose a dedicated plan-progress event.
+
+**Step 3: Add the minimal backend event shape**
+
+Extend the `sgclaw` pipe/event bridge so the orchestration layer can emit:
+- planner summary
+- execution stage transitions
+
+Keep the event read-only from the frontend's perspective. The UI may display it, but must not edit it or use it to branch execution.
+
+**Step 4: Render the plan in the active chat UI**
+
+Update the SuperRPA sgClaw chat UI so it:
+- prints the generated plan immediately after planning completes
+- keeps the plan compact and collapsible
+- highlights the current phase while waiting
+
+Do not add frontend-owned retry logic, decision logic, or browser action generation.
+
+**Step 5: Run verification**
+
+Run:
+```bash
+cargo test --test pipe_protocol_test -- --nocapture
+cargo test --test compat_runtime_test -- --nocapture
+```
+
+Expected: PASS.
+
+**Step 6: Manual browser validation**
+
+Submit:
+```text
+读取知乎热榜前10，并导出 excel 文件
+```
+
+Expected:
+- the chat first shows a short generated plan
+- the user sees stage transitions instead of a blank wait
+- execution still follows the backend-owned `zeroclaw` path
+
+**Step 7: Commit**
+
+```bash
+git add src/compat/event_bridge.rs src/pipe/protocol.rs tests/pipe_protocol_test.rs tests/compat_runtime_test.rs
+git commit -m "feat: surface backend-generated execution plans in sgclaw chat ui"
+```