skill-lib/docs/plans/2026-03-27-sgclaw-floating-chat-plan.md
2026-03-27 00:50:47 +08:00

# sgClaw Floating Chat Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Replace the current debug-style `sgclaw-chat` page as the primary UX with a floating page button + popup chat window, add real multi-turn conversation support, and harden the DeepSeek/browser tool protocol so browser automation is stable.

**Architecture:** Keep `chrome://superrpa-functions/sgclaw-chat` and `chrome://superrpa-functions/sgclaw-config` as debug/config pages, but make the user-facing entry a floating page launcher injected into allowed HTTP/HTTPS pages via existing SuperRPA page-injection capabilities. Reuse the browser-side persistent `SgClawSessionService` as the session owner, extend it from “logs + final result” to “conversation + runtime state”, and extend the sgClaw pipe path so each submit can carry conversation context instead of behaving like a fresh one-shot task. Fix protocol bugs in parallel: strict action-schema validation, better browser/sgClaw error attribution, and DeepSeek tool-call history compatibility.

**Tech Stack:** Chromium WebUI + Lit, existing SuperRPA page injection (`sg_compat.js` / hook injection), browser-side `FunctionsUI`/`SgClawSessionService`, Rust `sgClaw`, ZeroClaw compatibility runtime, DeepSeek OpenAI-compatible chat API.

### Task 1: Freeze Current Baseline And Add Pure UI State Tests
**Files:**
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state_mainline_unittest.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/BUILD.gn`
**Step 1: Write the failing test**
Write a pure state test that describes the floating UX:
```ts
import {
  collapseFloatingWindow,
  createFloatingViewState,
  openFloatingWindow,
  toggleSettingsPanel,
} from './sgclaw-floating_state.js';

test('opens from fab and collapses back on blur', () => {
  let state = createFloatingViewState();
  state = openFloatingWindow(state);
  expect(state.windowOpen).toBe(true);
  state = collapseFloatingWindow(state);
  expect(state.windowOpen).toBe(false);
  expect(state.fabVisible).toBe(true);
});
```
**Step 2: Run test to verify it fails**
Run: `autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts`
Expected: build/test target fails because the new state module and test do not exist yet.
**Step 3: Write minimal implementation**
Create a small pure state module with:
- `fabVisible`
- `windowOpen`
- `settingsOpen`
- `statusBadge`
- `unreadCount`
Keep it logic-only; no DOM code here.
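
A minimal sketch of such a module, assuming the five fields listed above and the function names imported by the Step 1 test. The `StatusBadge` values, the rule that opening the window clears `unreadCount`, and the rule that the settings panel only toggles while the window is open are assumptions for illustration, not decisions made by this plan.

```ts
// Hypothetical shape for sgclaw-floating_state.ts: logic only, no DOM access.
// In the real module these declarations would be exported.
type StatusBadge = 'stopped' | 'running' | 'error';  // assumed values

interface FloatingViewState {
  readonly fabVisible: boolean;
  readonly windowOpen: boolean;
  readonly settingsOpen: boolean;
  readonly statusBadge: StatusBadge;
  readonly unreadCount: number;
}

function createFloatingViewState(): FloatingViewState {
  return {
    fabVisible: true,
    windowOpen: false,
    settingsOpen: false,
    statusBadge: 'stopped',
    unreadCount: 0,
  };
}

// Assumption: opening the popup marks pending messages as read.
function openFloatingWindow(state: FloatingViewState): FloatingViewState {
  return {...state, windowOpen: true, unreadCount: 0};
}

// Collapsing closes the popup and the settings panel, leaving only the button.
function collapseFloatingWindow(state: FloatingViewState): FloatingViewState {
  return {...state, windowOpen: false, settingsOpen: false, fabVisible: true};
}

// Assumption: the settings panel lives inside the popup, so toggling it is a
// no-op while the popup is closed.
function toggleSettingsPanel(state: FloatingViewState): FloatingViewState {
  if (!state.windowOpen) {
    return state;
  }
  return {...state, settingsOpen: !state.settingsOpen};
}
```

Every function returns a new object instead of mutating its argument, which keeps the Step 1 test style (`state = openFloatingWindow(state)`) straightforward.
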
**Step 4: Run test to verify it passes**
Run the same `autoninja` target or the relevant TS unit target once wired.
Expected: the new state test compiles and passes.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state_mainline_unittest.ts \
chrome/browser/resources/superrpa/devtools/BUILD.gn
git -C /home/zyl/projects/superRpa/src commit -m "test: add sgclaw floating UI state"
```
### Task 2: Build The Floating Page Entry Using Existing SuperRPA Overlay Capabilities
**Files:**
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sg_compat.js`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/hooks/hook_injector.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/hooks/hook_injector.h`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
**Step 1: Write the failing smoke expectation**
Update the browser smoke so it expects:
- a floating button exists on a normal page
- clicking it opens the sgClaw popup
- clicking outside collapses the popup back to the button
Use an assertion like:
```js
await waitFor(() => page.evaluate(() =>
    !!document.querySelector('#superrpa-sgclaw-fab')));
```
**Step 2: Run smoke to verify it fails**
Run: `node /home/zyl/projects/sgClaw/claw/tools/browser_smoke/run_deepseek_browser_smoke.mjs`
Expected: smoke fails because the floating entry does not exist.
**Step 3: Write minimal implementation**
Implement the launcher inside injected page JS, not a side panel:
- floating circular button in bottom-right
- popup window anchored to the button
- button actions: open chat, stop/start runtime, open settings
- blur/outside-click collapses popup back to button
Prefer reusing the existing SuperRPA overlay/dialog/message primitives in `sg_compat.js` instead of inventing a second overlay stack.
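
The open/collapse rule above can be kept DOM-free and unit-tested as a small decision function. The sketch below is one possible shape, not the planned implementation: the `superrpa-sgclaw-fab` id comes from the smoke assertion, while the popup id, the `resolvePointerDown` name, and the choice to treat a button click on an open popup as "collapse" are assumptions.

```ts
// 'superrpa-sgclaw-fab' is the id used by the smoke assertion.
// The popup id is a hypothetical name chosen for this sketch.
const FAB_ID = 'superrpa-sgclaw-fab';
const POPUP_ID = 'superrpa-sgclaw-popup';

type PointerOutcome = 'open' | 'collapse' | 'ignore';

// pathIds: ids of the elements on the event's propagation path. In the
// injected script they could be read from event.composedPath().
function resolvePointerDown(
    windowOpen: boolean, pathIds: readonly string[]): PointerOutcome {
  const onFab = pathIds.indexOf(FAB_ID) !== -1;
  const inPopup = pathIds.indexOf(POPUP_ID) !== -1;
  if (!windowOpen) {
    // Collapsed: only the floating button reacts.
    return onFab ? 'open' : 'ignore';
  }
  if (inPopup) {
    // Clicks inside the popup belong to the chat UI.
    return 'ignore';
  }
  // Open and the click landed elsewhere (page content or, by assumption,
  // the button itself): collapse back to the button.
  return 'collapse';
}
```

The injected script would then map `'open'` and `'collapse'` onto the overlay primitives it reuses from `sg_compat.js`.
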
**Step 4: Run smoke to verify it passes**
Run the same smoke command.
Expected: smoke reaches the popup, submits a task, and collapses correctly after blur.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/sgclaw_overlay.js \
chrome/browser/resources/superrpa/sg_compat.js \
chrome/browser/superrpa/hooks/hook_injector.cc \
chrome/browser/superrpa/hooks/hook_injector.h \
chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw floating launcher"
```
### Task 3: Upgrade Browser Session State From “Result Page” To “Real Conversation”
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.h`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui.h`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc`
**Step 1: Write the failing browser-side tests**
Add tests for:
- conversation messages are returned by `sgclawConnect`
- reopening the chat keeps prior user/assistant turns
- `sgclawSubmitTask` appends a user turn immediately and an assistant turn when complete
Example expectation:
```cc
EXPECT_EQ("user", FindStringValue(*message, "role"));
// Task text: "Open Baidu and search for the weather".
EXPECT_EQ("打开百度搜索天气", FindStringValue(*message, "content"));
```
**Step 2: Run test to verify it fails**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease \
functions_ui_mainline_unittests
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests
```
Expected: tests fail because runtime state only has logs/final result.
**Step 3: Write minimal implementation**
Extend `SgClawSessionService` to store:
- conversation id
- ordered messages
- pending assistant reply state
- runtime status/logs
Keep the debug page and popup both consuming the same runtime shape.
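
On the WebUI side, the shared runtime shape might look like the sketch below (for example in `sgclaw-chat_messages.ts`). All field and function names here are assumptions derived from the four items listed above, and the `'idle' | 'running'` status values are invented for illustration; the authoritative shape is whatever `SgClawSessionService` ends up serializing.

```ts
type ChatRole = 'user' | 'assistant';

interface ChatMessage {
  readonly role: ChatRole;
  readonly content: string;
}

type RuntimeStatus = 'idle' | 'running';  // assumed values

// Hypothetical runtime shape consumed by both the debug page and the popup.
interface SgClawRuntimeState {
  readonly conversationId: string;
  readonly messages: readonly ChatMessage[];
  readonly pendingAssistantReply: boolean;
  readonly status: RuntimeStatus;
  readonly logs: readonly string[];
}

function createRuntimeState(conversationId: string): SgClawRuntimeState {
  return {
    conversationId,
    messages: [],
    pendingAssistantReply: false,
    status: 'idle',
    logs: [],
  };
}

// Submit: the user turn is appended immediately and a reply becomes pending.
function appendUserTurn(
    state: SgClawRuntimeState, content: string): SgClawRuntimeState {
  return {
    ...state,
    messages: state.messages.concat([{role: 'user', content}]),
    pendingAssistantReply: true,
    status: 'running',
  };
}

// Completion: the assistant turn is appended and the pending flag cleared.
// Assumption: a completion with no pending reply is ignored.
function completeAssistantTurn(
    state: SgClawRuntimeState, content: string): SgClawRuntimeState {
  if (!state.pendingAssistantReply) {
    return state;
  }
  return {
    ...state,
    messages: state.messages.concat([{role: 'assistant', content}]),
    pendingAssistantReply: false,
    status: 'idle',
  };
}
```
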
**Step 4: Run test to verify it passes**
Run the same test command.
Expected: connect/reopen behavior passes and the conversation persists while the browser stays open.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/ui/webui/superrpa/sgclaw_session_service.h \
chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc \
chrome/browser/ui/webui/superrpa/functions_ui.h \
chrome/browser/ui/webui/superrpa/functions_ui.cc \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts \
chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc
git -C /home/zyl/projects/superRpa/src commit -m "feat: persist sgclaw conversation state"
```
### Task 4: Extend sgClaw Submit Protocol For Multi-Turn Context
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.h`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_process_host.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
- Modify: `/home/zyl/projects/sgClaw/claw/src/pipe/protocol.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_process_host_mainline_unittest.cc`
**Step 1: Write the failing protocol tests**
Add tests that `submit_task` can carry:
- current user input
- prior user/assistant turns
- active page URL / title hints if needed
For Rust, add a test that two consecutive submits produce a provider request containing prior turns.
**Step 2: Run tests to verify they fail**
Run:
```bash
python3 /home/zyl/projects/superRpa/src/tools/crates/run_cargo.py test \
--manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_runtime_test
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease \
sgclaw_process_host_mainline_unittests
/home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw_process_host_mainline_unittests \
--gtest_filter='SgClawProcessHostMainlineTest.*'
```
Expected: tests fail because submit currently only sends a raw instruction string.
**Step 3: Write minimal implementation**
Change the pipe payload from one-shot instruction to:
```json
{
  "type": "submit_task",
  "instruction": "...",
  "messages": [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
  ]
}
```
On the Rust side, feed this history into the ZeroClaw turn so the next submit is a continuation, not a new session.
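
To make the payload shape concrete, here is a small TypeScript sketch that assembles it from stored turns. The real encoder lives in the C++ pipe protocol code, so this only illustrates the data flow; `buildSubmitTaskPayload`, the empty-instruction check, and the `maxTurns` cap are hypothetical additions, not part of the protocol described above.

```ts
interface PipeMessage {
  role: 'user' | 'assistant';
  content: string;
}

interface SubmitTaskPayload {
  type: 'submit_task';
  instruction: string;
  messages: PipeMessage[];
}

// Builds the submit payload: `instruction` is the current user input and
// `priorTurns` are the earlier user/assistant turns of the conversation.
// maxTurns is an assumed safety cap so the payload cannot grow without bound.
function buildSubmitTaskPayload(
    instruction: string, priorTurns: readonly PipeMessage[],
    maxTurns = 20): SubmitTaskPayload {
  const trimmed = instruction.trim();
  if (trimmed.length === 0) {
    throw new Error('instruction must not be empty');
  }
  // Keep only the most recent turns, oldest first.
  const recent = priorTurns.slice(Math.max(0, priorTurns.length - maxTurns));
  return {
    type: 'submit_task',
    instruction: trimmed,
    messages: recent.map((m) => ({role: m.role, content: m.content})),
  };
}
```

A first submit passes an empty `priorTurns` array, so a one-shot task stays a special case of the same payload rather than a separate message type.
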
**Step 4: Run tests to verify they pass**
Run the same Rust + browser unit commands.
Expected: previous-turn context reaches the provider path.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/sgClaw/claw add \
src/pipe/protocol.rs src/agent/mod.rs src/compat/runtime.rs tests/compat_runtime_test.rs
git -C /home/zyl/projects/sgClaw/claw commit -m "feat: carry conversation history through sgclaw pipe"
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.h \
chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.cc \
chrome/browser/superrpa/sgclaw/sgclaw_process_host.cc \
chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc \
chrome/browser/superrpa/sgclaw/sgclaw_process_host_mainline_unittest.cc
git -C /home/zyl/projects/superRpa/src commit -m "feat: send sgclaw conversation context"
```
### Task 5: Harden Tool Schema And DeepSeek Compatibility
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/browser_tool_adapter.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_browser_tool_test.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/tools/browser_smoke/run_deepseek_browser_smoke.mjs`
**Step 1: Write the failing tests**
Cover:
- `getText` without `selector` is rejected before it hits the browser
- `click` without `selector` is rejected
- `navigate` without `url` is rejected
- DeepSeek multi-round tool-call history does not trigger the `role=tool` 400 anymore
- a non-task greeting is handled explicitly: it is either rejected or answered in chat-only mode, never silently treated as a browser task
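
For the DeepSeek history case, one invariant a test could assert is sketched below in TypeScript (the real test would live in the Rust suite). It assumes the 400 is the usual OpenAI-compatible complaint that a `role=tool` message has no matching `tool_calls` entry on the assistant message before it; the plan does not state the exact cause, so treat this as a hypothesis. `firstOrphanToolMessage` is a hypothetical helper name.

```ts
interface ToolCall {
  id: string;
}

interface HistoryMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | null;
  tool_calls?: ToolCall[];  // present on assistant messages that call tools
  tool_call_id?: string;    // present on tool messages
}

// Returns the index of the first tool message that does not answer a
// tool call opened by the closest preceding assistant message, or -1 if
// every tool message is matched. It does not check the reverse direction
// (tool calls that never receive a result).
function firstOrphanToolMessage(messages: readonly HistoryMessage[]): number {
  let openCallIds: string[] = [];
  for (let i = 0; i < messages.length; i++) {
    const message = messages[i];
    if (message.role === 'tool') {
      const id = message.tool_call_id;
      const at = id === undefined ? -1 : openCallIds.indexOf(id);
      if (at === -1) {
        return i;
      }
      openCallIds.splice(at, 1);
    } else if (message.role === 'assistant' && message.tool_calls) {
      openCallIds = message.tool_calls.map((call) => call.id);
    } else {
      // Any other message ends the window in which tool results are valid.
      openCallIds = [];
    }
  }
  return -1;
}
```
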
**Step 2: Run tests to verify they fail**
Run:
```bash
python3 /home/zyl/projects/superRpa/src/tools/crates/run_cargo.py test \
--manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --lib --tests
node /home/zyl/projects/sgClaw/claw/tools/browser_smoke/run_deepseek_browser_smoke.mjs
```
Expected: current code allows incomplete tool args and still has DeepSeek history edge cases.
**Step 3: Write minimal implementation**
Implement:
- action-specific required param validation in `browser_tool_adapter.rs`
- better tool-result/history formatting if needed for DeepSeek compatibility
- explicit user-facing handling for chat input that is not a browser task
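
The action-specific validation is essentially a lookup table. The TypeScript sketch below shows only the rule; the real check belongs in `browser_tool_adapter.rs`. The three entries mirror the Step 1 test cases, while `validateToolCall`, the error strings, and the rejection of unknown actions are assumptions.

```ts
// Required parameters per browser action, taken from the Step 1 test list.
const REQUIRED_PARAMS: Record<string, readonly string[]> = {
  getText: ['selector'],
  click: ['selector'],
  navigate: ['url'],
};

// Returns null when the call may be forwarded to the browser, otherwise a
// message that can be handed back to the model as a tool error.
function validateToolCall(
    action: string, params: Record<string, unknown>): string | null {
  const required =
      Object.prototype.hasOwnProperty.call(REQUIRED_PARAMS, action) ?
      REQUIRED_PARAMS[action] :
      undefined;
  if (required === undefined) {
    // Assumption: actions outside the table are rejected rather than passed on.
    return `unknown action: ${action}`;
  }
  for (const key of required) {
    const value = params[key];
    if (typeof value !== 'string' || value.trim().length === 0) {
      return `${action} requires a non-empty "${key}"`;
    }
  }
  return null;
}
```

Rejecting the call before it reaches the browser also helps error attribution: a missing `selector` is reported as a model/tool-schema error instead of surfacing as a browser failure.
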
**Step 4: Run tests to verify they pass**
Run the same Rust tests and browser smoke.
Expected: no malformed tool actions, no DeepSeek `role=tool` 400 in smoke.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/sgClaw/claw add \
src/compat/browser_tool_adapter.rs \
src/compat/runtime.rs \
tests/compat_browser_tool_test.rs \
tests/compat_runtime_test.rs \
tools/browser_smoke/run_deepseek_browser_smoke.mjs
git -C /home/zyl/projects/sgClaw/claw commit -m "fix: harden sgclaw tool protocol for DeepSeek"
```
### Task 6: Final Verification And Manual Smoke Checklist
**Files:**
- Modify if needed: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
- Document manual steps in PR/summary, not code
**Step 1: Run automated verification**
```bash
python3 /home/zyl/projects/superRpa/src/tools/crates/run_cargo.py test \
--manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --lib --tests
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease \
functions_ui_mainline_unittests \
sgclaw_process_host_mainline_unittests
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests
/home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw_process_host_mainline_unittests \
--gtest_filter='SgClawProcessHostMainlineTest.*'
node /home/zyl/projects/sgClaw/claw/tools/browser_smoke/run_deepseek_browser_smoke.mjs
```
Expected: all pass.
**Step 2: Manual smoke**
1. Open a normal HTTP/HTTPS page.
2. Verify the floating button appears.
3. Click to open popup.
4. Start sgClaw from popup.
5. Submit one browser task and one follow-up task.
6. Click outside popup and verify it collapses to the button.
7. Reopen popup and verify conversation history is still present.
8. Open settings from the launcher, update the model/base URL, return to the popup, submit again, and verify the new settings take effect without a restart (hot update).
**Step 3: Final commit if verification requires touch-ups**
Use focused commit messages only for actual fixes found during verification.