<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>sgClaw Intelligent Browser Automation Platform - Component Responsibilities and Flow Overview</title>
<script src="https://cdn.jsdelivr.net/npm/mermaid@10.9.5/dist/mermaid.min.js"></script>
<style>
*{margin:0;padding:0;box-sizing:border-box}
body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI","PingFang SC","Hiragino Sans GB","Microsoft YaHei",sans-serif;background:#0d1117;color:#c9d1d9;line-height:1.8}
.header{background:linear-gradient(135deg,#0a1628,#16213e,#1a3a5c);padding:3rem 2rem;text-align:center;border-bottom:3px solid #e65100}
.header h1{font-size:2.2rem;color:#e6edf3;margin-bottom:.5rem}
.header .subtitle{color:#8b949e;font-size:1rem}
.container{max-width:1400px;margin:0 auto;padding:2rem}
.section{background:#161b22;border:1px solid #30363d;border-radius:12px;margin-bottom:2rem;overflow:hidden}
.section-header{background:linear-gradient(90deg,#1a1a2e,#16213e);padding:1rem 1.5rem;border-bottom:1px solid #30363d;display:flex;align-items:center;gap:.8rem}
.section-number{background:#e65100;color:#fff;width:32px;height:32px;border-radius:50%;display:flex;align-items:center;justify-content:center;font-weight:700;flex-shrink:0}
.section-title{font-size:1.2rem;color:#e6edf3;font-weight:600}
.section-body{padding:1.5rem;overflow-x:auto}
.mermaid{display:flex;justify-content:center;padding:1rem 0}
.mermaid svg{max-width:100%;height:auto}
.desc{background:#1a1a2e;border-left:3px solid #e65100;padding:1rem 1.2rem;margin:1rem 0;border-radius:0 8px 8px 0;font-size:.95rem;color:#8b949e}
.desc strong{color:#e6edf3}
.component-grid{display:grid;grid-template-columns:repeat(auto-fit,minmax(300px,1fr));gap:1.2rem;margin:1.5rem 0}
.component-card{background:#1a1a2e;border:1px solid #30363d;border-radius:10px;padding:1.3rem;transition:all .2s}
.component-card:hover{border-color:#e65100;transform:translateY(-2px)}
.component-card h3{color:#e65100;font-size:1.05rem;margin-bottom:.6rem;display:flex;align-items:center;gap:.5rem}
.component-card .badge{background:#e65100;color:#fff;padding:.15rem .5rem;border-radius:12px;font-size:.75rem;font-weight:600}
.component-card .badge.external{background:#4a9eff}
.component-card p{color:#8b949e;font-size:.9rem;margin-bottom:.5rem}
.component-card .meta{display:flex;flex-direction:column;gap:.3rem;margin-top:.8rem;padding-top:.8rem;border-top:1px solid #30363d}
.component-card .meta-item{display:flex;gap:.5rem;font-size:.85rem}
.component-card .meta-label{color:#6e7681;white-space:nowrap;min-width:60px}
.component-card .meta-value{color:#c9d1d9}
.flow-container{background:#1a1a2e;border-radius:10px;padding:1.5rem;margin:1rem 0}
.flow-step{display:flex;gap:1rem;align-items:flex-start;margin-bottom:1rem;padding:.8rem;background:#161b22;border-radius:8px;border-left:3px solid #e65100}
.flow-step:last-child{margin-bottom:0}
.flow-step-number{background:#e65100;color:#fff;width:28px;height:28px;border-radius:50%;display:flex;align-items:center;justify-content:center;font-weight:700;flex-shrink:0}
.flow-step-content{flex:1}
.flow-step-content h4{color:#e6edf3;font-size:1rem;margin-bottom:.3rem}
.flow-step-content p{color:#8b949e;font-size:.9rem}
.flow-step-content .highlight{color:#4a9eff;font-weight:600}
.footer{text-align:center;padding:2rem;color:#484f58;font-size:.85rem;border-top:1px solid #21262d}
</style>
</head>
<body>
<div class="header">
<h1>sgClaw Intelligent Browser Automation Platform</h1>
<div class="subtitle">Core component responsibilities and end-to-end flow: what each component is, what it does, and when it is called</div>
</div>
<div class="container">

<!-- Section 1: Overview -->
<div class="section">
<div class="section-header">
<div class="section-number">1</div>
<div class="section-title">Overview - The Complete Path from User Instruction to Browser Execution</div>
</div>
<div class="section-body">
<div class="desc">
When a user says "Check this month's line loss rate for me" (帮我查本月线损率), several components inside sgClaw work together. The diagram below shows the <strong>complete execution path</strong>: at which stage each component is called and what it is responsible for.
</div>
<div class="mermaid">
graph TB
U["User\nenters a natural-language instruction"] -->|"1. SubmitTask"| GW["Communication Gateway\nSTDIO Pipe / Service WS\nreceives request, opens session"]
GW -->|"2. Load config"| CFG["SgClawSettings\nloads sgclaw_config.json\nLLM Provider, RuntimeProfile, SkillsDir"]
CFG -->|"3. Four-level routing decision"| RT["Agent Runtime\ntask_runner task scheduling"]

RT -->|"3a. Scene match"| DS["Deterministic execution\ndeterministic_submit\nscene_platform matches scene manifests\nruns preset script directly, no LLM"]
RT -->|"3b. Primary orchestration"| PO["Primary orchestration path\nzeroclaw_process_message_primary\nfull Agent tool loop, LLM plans autonomously"]
RT -->|"3c. Direct skill"| DSK["Direct skill path\ndirect_skill_primary\nruns the configured skill.tool directly"]
RT -->|"3d. Standard LLM"| ZC["Standard LLM path\ncompat_llm_primary\nzeroclaw agent turn, default fallback"]

DS -->|"4. Execute action"| BB["Browser Backend\nBrowserBackend trait\nPipeBrowser / WsBrowser"]
PO -->|"4. Call tool"| BB
DSK -->|"4. Execute action"| BB
ZC -->|"4. Call tool"| BB

BB -->|"5. Security check"| SC["MAC Policy\nchecks rules.json\ndomain allowlist, action allowlist, HMAC"]
SC -->|"6. Execute command"| EXT["SuperRPA Chromium\nperforms the actual DOM operations\nnavigate, click, type, read"]
EXT -->|"7. Return result"| BB
BB -->|"8. Pass result back"| RT
RT -->|"9. Post-processing"| PH["Report Artifact\nopenxml_office generates Excel\nscreen_html_export generates dashboard"]
PH -->|"10. TaskComplete"| GW
GW -->|"11. Result"| U

classDef userNode fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef coreNode fill:#4a2c17,stroke:#e65100,color:#e6edf3
classDef routeNode fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
classDef extNode fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
classDef cfgNode fill:#484f58,stroke:#8b949e,color:#e6edf3

class U userNode
class GW,RT,BB,PH coreNode
class DS,PO,DSK,ZC routeNode
class SC,EXT extNode
class CFG cfgNode
</div>
</div>
</div>

<!-- Section 2: Core Components Detail -->
<div class="section">
<div class="section-header">
<div class="section-number">2</div>
<div class="section-title">Core Components in Detail - Responsibilities, When Called, Inputs and Outputs</div>
</div>
<div class="section-body">
<div class="desc">
The cards below describe each core component. Each card lists <strong>when it is called</strong>, <strong>what it takes as input</strong>, and <strong>what it outputs</strong>.
</div>
<div class="component-grid">
<div class="component-card">
<h3><span class="badge">Internal</span>Communication Gateway</h3>
<p>Receives user requests, opens sessions, and returns the final result. It supports two modes: STDIO Pipe (the default; talks to the browser host over stdin/stdout using JSON Lines) and Service WS (WebSocket service mode; accepts connections from external clients).</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">First to respond when a user issues a request</span></div>
<div class="meta-item"><span class="meta-label">Input:</span><span class="meta-value">SubmitTask message (instruction, conversationId, pageUrl, pageTitle)</span></div>
<div class="meta-item"><span class="meta-label">Output:</span><span class="meta-value">TaskComplete, LogEntry, and StatusChanged messages</span></div>
</div>
</div>

<div class="component-card">
<h3><span class="badge">Internal</span>Agent Runtime Task Scheduling</h3>
<p>run_submit_task() is the entry point for task execution. It walks the four-level routing decision in order: ① deterministic_submit (deterministic scene match) ② primary_orchestration (primary orchestration) ③ direct_submit_skill (direct skill) ④ compat_llm_primary (standard LLM fallback).</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">After a SubmitTask message arrives</span></div>
<div class="meta-item"><span class="meta-label">Input:</span><span class="meta-value">instruction, AgentRuntimeContext, BrowserPipeTool</span></div>
<div class="meta-item"><span class="meta-label">Output:</span><span class="meta-value">AgentMessage::TaskComplete</span></div>
</div>
</div>

<div class="component-card">
<h3><span class="badge">Internal</span>Scene Platform</h3>
<p>Scans the scene manifests (scene.toml) under the skills/ directory and parses the keyword rules in each manifest's deterministic section. When the user instruction matches, it builds a DeterministicExecutionPlan (with execution parameters such as target_url, org_code, and period_mode) and runs the preset script directly.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">First step of the four-level routing decision</span></div>
<div class="meta-item"><span class="meta-label">Input:</span><span class="meta-value">user instruction, pageUrl, pageTitle, skills directory</span></div>
<div class="meta-item"><span class="meta-label">Output:</span><span class="meta-value">DeterministicExecutionPlan or NotDeterministic</span></div>
</div>
</div>

<div class="component-card">
<h3><span class="badge">Internal</span>SgClawSettings Configuration Management</h3>
<p>Loads runtime configuration from a JSON config file or environment variables: multi-provider management (apiKey, baseUrl, model), Runtime Profile, SkillsDir, BrowserBackend type, OfficeBackend, Service WS listen address, and so on.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">Loaded on every task submission</span></div>
<div class="meta-item"><span class="meta-label">Input:</span><span class="meta-value">sgclaw_config.json or environment variables</span></div>
<div class="meta-item"><span class="meta-label">Output:</span><span class="meta-value">SgClawSettings struct</span></div>
</div>
</div>

<div class="component-card">
<h3><span class="badge">Internal</span>Runtime Engine</h3>
<p>Based on the Runtime Profile (BrowserAttached/BrowserHeavy/GeneralAssistant), builds the Tool Policy allowlist, loads skill packages, injects Memory, and constructs the Agent instance. It also handles instruction augmentation (appending the browser contract prompt and detecting specific task types).</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">When the primary orchestration and standard LLM paths build an Agent</span></div>
<div class="meta-item"><span class="meta-label">Core methods:</span><span class="meta-value">build_agent() build_instruction()</span></div>
<div class="meta-item"><span class="meta-label">Profile:</span><span class="meta-value">BrowserAttached / BrowserHeavy / GeneralAssistant</span></div>
</div>
</div>

<div class="component-card">
<h3><span class="badge external">External</span>ZeroClaw Core (agent core)</h3>
<p>The vendored Agent core library located at third_party/zeroclaw/. It provides Agent construction, Provider management, the tool loop, the Memory interface, skill loading, prompt assembly, and other core capabilities. sgClaw layers a security envelope on top of it.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">In the primary orchestration and standard LLM paths</span></div>
<div class="meta-item"><span class="meta-label">Location:</span><span class="meta-value">third_party/zeroclaw/</span></div>
<div class="meta-item"><span class="meta-label">Core capabilities:</span><span class="meta-value">Agent, Provider, ToolLoop, Memory, Skills</span></div>
</div>
</div>

<div class="component-card">
<h3><span class="badge">Internal</span>Browser Backend</h3>
<p>A unified browser operation interface (the BrowserBackend trait) with two implementations: PipeBrowserBackend (talks to the host over STDIO) and WsBrowserBackend (connects directly to DevTools over WebSocket). Supported backend types include SuperRpa, AgentBrowser, RustNative, and ComputerUse.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">Whenever the browser needs to be operated</span></div>
<div class="meta-item"><span class="meta-label">Operations:</span><span class="meta-value">navigate, click, type, getText, eval, select, scrollTo, and others (15 in total)</span></div>
</div>
</div>

<div class="component-card">
<h3><span class="badge">Internal</span>MAC Policy (security policy)</h3>
<p>Loads security rules from resources/rules.json. Three-layer security model: ① HMAC seed exchange and session key derivation at handshake ② domain and action allowlist checks on the Rust side ③ a second HMAC verification on the host side. Domains not on the allowlist and disabled actions are rejected.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">Before every browser operation</span></div>
<div class="meta-item"><span class="meta-label">Checks:</span><span class="meta-value">domain allowlist, action type, HMAC verification</span></div>
</div>
</div>

<div class="component-card">
<h3><span class="badge external">External</span>SuperRPA Chromium Browser Host</h3>
<p>The external system that actually performs DOM operations. It receives a Command (with HMAC) from sgClaw, verifies it, performs operations such as navigate/click/type/getText, and returns a Response (operation result plus HMAC). In STDIO mode it communicates with the sgClaw process over pipes.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">When BrowserBackend sends a command</span></div>
<div class="meta-item"><span class="meta-label">Protocol:</span><span class="meta-value">STDIO JSON Lines or WebSocket</span></div>
</div>
</div>
</div>
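<div class="desc">
The Rust-side allowlist layer described in the MAC Policy card can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not sgClaw's actual code: the MacPolicy struct, its fields, the PolicyError enum, and the check signature are all hypothetical, the real rules.json schema is not shown in this document, and both HMAC layers are left out because they would need a crypto crate.
</div>
<pre>
```rust
use std::collections::HashSet;

/// Hypothetical in-memory form of the rules loaded from resources/rules.json.
/// Field names are assumptions; the real schema is not shown in this document.
pub struct MacPolicy {
    allowed_domains: HashSet<String>,
    allowed_actions: HashSet<String>,
}

/// Why a browser command was rejected (hypothetical error type).
#[derive(Debug, PartialEq)]
pub enum PolicyError {
    DomainNotAllowed(String),
    ActionNotAllowed(String),
}

impl MacPolicy {
    /// Builds a policy from literal lists; domains are stored lowercased.
    pub fn new(domains: &[&str], actions: &[&str]) -> Self {
        Self {
            allowed_domains: domains.iter().map(|d| d.to_ascii_lowercase()).collect(),
            allowed_actions: actions.iter().map(|a| a.to_string()).collect(),
        }
    }

    /// Allowlist check run before a command is handed to the browser backend.
    /// The action is checked first, then the domain (case-insensitively).
    pub fn check(&self, action: &str, domain: &str) -> Result<(), PolicyError> {
        if !self.allowed_actions.contains(action) {
            return Err(PolicyError::ActionNotAllowed(action.to_string()));
        }
        if !self.allowed_domains.contains(&domain.to_ascii_lowercase()) {
            return Err(PolicyError::DomainNotAllowed(domain.to_string()));
        }
        Ok(())
    }
}
```
</pre>
<div class="desc">
In this sketch a caller would build the policy once, for example MacPolicy::new(&amp;["example.internal"], &amp;["navigate", "click", "getText"]), and call check before every browser command; any Err result means the command is never sent.
</div>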
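</div>
</div>

<!-- Section 3: LLM Detail Flow -->
<div class="section">
<div class="section-header">
<div class="section-number">3</div>
<div class="section-title">LLM Workflow End to End - From Semantic Recognition to Task Planning</div>
</div>
<div class="section-body">
<div class="desc">
When a user instruction does not match a known skill, the LLM takes over. The steps below trace <strong>the full process by which the LLM goes from understanding user intent to producing an executable plan</strong>.
</div>
<div class="flow-container">
<div class="flow-step">
<div class="flow-step-number">1</div>
<div class="flow-step-content">
<h4>Semantic recognition - understand what the user said</h4>
<p>The LLM receives the user's natural-language instruction and identifies the user's <strong>actual intent</strong>. For example, "Check this month's line loss rate for me" → recognized as "query line loss rate data".</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">2</div>
<div class="flow-step-content">
<h4>Scene matching - decide whether this is a known scene</h4>
<p>Using the historical task records stored in <span class="highlight">Memory (the memory module)</span>, the LLM judges whether the instruction matches an existing skill. If it does, the task is handed to the fast path.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">3</div>
<div class="flow-step-content">
<h4>Task decomposition - break the goal into small steps</h4>
<p>For a new scene, the LLM decomposes the user's goal into a sequence of concrete, actionable steps. For example: open the system → select the month → click Query → read the data → export to Excel.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">4</div>
<div class="flow-step-content">
<h4>Tool selection - decide which capability completes each step</h4>
<p>The LLM picks a suitable tool from the available toolset for each step. For example, opening a web page calls for the "navigate tool", clicking a button for the "click tool", and reading data for the "read tool".</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">5</div>
<div class="flow-step-content">
<h4>Parameter filling - determine the concrete arguments for each tool</h4>
<p>The LLM fills in concrete arguments for each tool. For example, the click tool needs to know "which button to click" and the navigate tool needs to know "which URL to open". These arguments are extracted from the user instruction and the context.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">6</div>
<div class="flow-step-content">
<h4>Execution plan generation - emit executable JSON / structured instructions</h4>
<p>The LLM combines the decomposed steps, the selected tools, and the filled-in arguments into a <strong>structured execution plan</strong>, which the tool execution engine runs step by step.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">7</div>
<div class="flow-step-content">
<h4>Iteration - adjust dynamically based on execution results</h4>
<p>If a step fails or its result is not what was expected, the LLM receives that feedback and replans the remaining steps. For example, if a page fails to open it tries a backup URL, and if an element cannot be found it switches to a different selector.</p>
</div>
</div>
</div>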
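<div class="desc">
Steps 6 and 7 above (run a structured plan, then adjust when a step fails) can be illustrated with a deliberately simplified sketch. It is an assumption-laden stand-in, not the zeroclaw tool loop: the LLM's replanning is replaced by a fixed list of fallback arguments per step, and PlannedStep, run_plan, and the execute callback are hypothetical names invented for this example.
</div>
<pre>
```rust
/// One planned tool call: the tool name plus candidate arguments to try in order
/// (for example a primary URL followed by a backup URL). Hypothetical type.
pub struct PlannedStep {
    pub tool: &'static str,
    pub candidates: Vec<&'static str>,
}

/// Runs the plan step by step. `execute` stands in for the real tool executor.
/// When a call fails, the next candidate argument is tried; if every candidate
/// fails, the whole plan stops with an error naming the step.
pub fn run_plan<F>(plan: &[PlannedStep], mut execute: F) -> Result<Vec<String>, String>
where
    F: FnMut(&str, &str) -> Result<String, String>,
{
    let mut log = Vec::new();
    for step in plan {
        let mut succeeded = false;
        for &arg in &step.candidates {
            if let Ok(output) = execute(step.tool, arg) {
                // Record the successful call and move on to the next step.
                log.push(format!("{}({}) -> {}", step.tool, arg, output));
                succeeded = true;
                break;
            }
        }
        if !succeeded {
            return Err(format!("step '{}' failed for every candidate", step.tool));
        }
    }
    Ok(log)
}
```
</pre>
<div class="desc">
In the real flow the feedback goes back to the LLM, which may produce entirely new steps; the fixed fallback list here only shows the shape of the retry loop.
</div>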
</div>
</div>

<!-- Section 4: Memory, Skills & Runtime Engine -->
<div class="section">
<div class="section-header">
<div class="section-number">4</div>
<div class="section-title">Memory, Skill Management, and Runtime Engine - The Core Runtime Engines</div>
</div>
<div class="section-body">
<div class="desc">
sgClaw's runtime core consists of three cooperating engines: <strong>Memory (the memory module)</strong> persists conversation history and task state, the <strong>skill management system</strong> loads skill packages and injects them into the Agent, and the <strong>Runtime Engine</strong> builds the complete Agent runtime environment from the Runtime Profile (tool policy + skill loading + instruction augmentation).
</div>
<div class="mermaid">
graph TB
subgraph Memory["Memory module zeroclaw::memory"]
M1["SQLite storage brain.db\nconversation history, task state, execution results"]
M2["Memory trait interface\ncreateMemoryWithStorage\nmultiple backends: SQLite / file"]
M1 -.->|"read/write"| M2
end

subgraph SkillMgmt["Skills Management"]
S1["Skill loader\nloadSkillsFromDirectory\nscans skill packages by directory"]
S2["Skill filter\nfilters by browser availability\nprunes browser_script tools"]
S3["ReadSkill Tool\nreads skill details on demand at runtime\nsupports open_skills config"]
S4["Skill directory resolution\nskills/ default directory\ncustom skillsDir"]
S1 --> S2
S4 --> S1
S1 --> S3
end

subgraph RuntimeEngine["Runtime Engine"]
R1["Runtime Profile\nBrowserAttached / BrowserHeavy / GeneralAssistant"]
R2["Tool Policy\nper-Profile tool allowlist\nallowed_tools list"]
R3["Agent Builder\nassembles Provider + Tools + Memory + Skills\nbuilds the complete Agent instance"]
R4["Instruction augmenter\nappends browser contract prompt\ndetects Zhihu hot list / Excel export / dashboard tasks"]
R1 -->|"determines"| R2
R2 -->|"constrains"| R3
R3 -->|"uses"| R4
end

Memory -->|"injects into"| RuntimeEngine
SkillMgmt -->|"injects into"| RuntimeEngine

classDef memFill fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef skillFill fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
classDef runtimeFill fill:#4a2c17,stroke:#e65100,color:#e6edf3

class Memory,M1,M2 memFill
class SkillMgmt,S1,S2,S3,S4 skillFill
class RuntimeEngine,R1,R2,R3,R4 runtimeFill
</div>
<div class="component-grid" style="margin-top:1.5rem">
<div class="component-card">
<h3><span class="badge">Internal</span>Memory Module</h3>
<p><strong>Responsibility:</strong> Persists conversation history, task state, and execution results in SQLite (brain.db). It exposes a unified interface through the zeroclaw::memory::Memory trait and supports multiple storage backends.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">Created when the Agent is built; read and written before and after every LLM call</span></div>
<div class="meta-item"><span class="meta-label">Caller:</span><span class="meta-value">Runtime Engine (build_agent method)</span></div>
<div class="meta-item"><span class="meta-label">Storage path:</span><span class="meta-value">workspace/memory/brain.db</span></div>
</div>
</div>
<div class="component-card">
<h3><span class="badge">Internal</span>Skill Management System</h3>
<p><strong>Responsibility:</strong> Scans and loads skill packages from the skills/ directory (or a custom skillsDir), filters browser_script tools according to whether a browser is available, and lets the Agent read skill details on demand through the ReadSkill Tool. A separate skill directory can be configured via open_skills.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">Loads the skill list every time an Agent is built</span></div>
<div class="meta-item"><span class="meta-label">Caller:</span><span class="meta-value">Runtime Engine (load_skills_for_surface)</span></div>
<div class="meta-item"><span class="meta-label">Skill source:</span><span class="meta-value">workspace/skills/ or the skillsDir setting</span></div>
</div>
</div>
<div class="component-card">
<h3><span class="badge">Internal</span>Runtime Engine</h3>
<p><strong>Responsibility:</strong> The core runtime orchestrator. It determines the tool allowlist from the Runtime Profile, loads skills, injects Memory, and builds the Agent instance. It also handles instruction augmentation (appending the browser contract prompt and detecting specific task types such as Zhihu hot list, Excel export, and dashboard display).</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">On every task submission, before the Agent is built</span></div>
<div class="meta-item"><span class="meta-label">Core methods:</span><span class="meta-value">build_agent() build_instruction()</span></div>
<div class="meta-item"><span class="meta-label">Profile:</span><span class="meta-value">BrowserAttached / BrowserHeavy / GeneralAssistant</span></div>
</div>
</div>
</div>
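<div class="desc">
The link between Runtime Profile and Tool Policy (the profile determines the allowlist, and the allowlist constrains what the Agent is built with) might look like the sketch below. The three profile names come from this document; everything else is an assumption made for illustration, including the ToolPolicy type, its methods, the read_skill tool name, and which tools each profile is given.
</div>
<pre>
```rust
/// The three profiles named in this document.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RuntimeProfile {
    BrowserAttached,
    BrowserHeavy,
    GeneralAssistant,
}

/// Hypothetical tool allowlist derived from a profile.
pub struct ToolPolicy {
    allowed_tools: Vec<&'static str>,
}

impl ToolPolicy {
    /// Maps a profile to its allowlist. The per-profile tool lists are
    /// illustrative assumptions, not the real sgClaw configuration.
    pub fn for_profile(profile: RuntimeProfile) -> Self {
        let allowed_tools = match profile {
            RuntimeProfile::BrowserAttached => {
                vec!["superrpa_browser", "browser_action", "read_skill"]
            }
            RuntimeProfile::BrowserHeavy => vec![
                "superrpa_browser",
                "browser_action",
                "read_skill",
                "openxml_office",
                "screen_html_export",
            ],
            RuntimeProfile::GeneralAssistant => vec!["read_skill"],
        };
        Self { allowed_tools }
    }

    /// True when the named tool is on the allowlist.
    pub fn is_allowed(&self, tool: &str) -> bool {
        self.allowed_tools.iter().any(|allowed| *allowed == tool)
    }

    /// Keeps only the requested tools that the profile allows, in request order.
    pub fn filter_requested<'a>(&self, requested: &[&'a str]) -> Vec<&'a str> {
        requested
            .iter()
            .copied()
            .filter(|tool| self.is_allowed(tool))
            .collect()
    }
}
```
</pre>
<div class="desc">
Keeping the policy as data derived from the profile means the Agent builder only ever sees tools that passed the filter, which matches the "determines" and "constrains" arrows in the diagram above.
</div>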
</div>
</div>

<!-- Section 5: Task Routing - 4 Execution Paths -->
<div class="section">
<div class="section-header">
<div class="section-number">5</div>
<div class="section-title">Task Routing - Decision Tree for the Four Execution Paths</div>
</div>
<div class="section-body">
<div class="desc">
After a task is submitted to sgClaw, the <strong>Agent Runtime</strong> decides which execution path to take by checking them in priority order. This is not a simple "fast path vs. AI" binary choice but a <strong>four-level decision tree</strong>.
</div>
<div class="mermaid">
graph TB
A["SubmitTask: user instruction arrives"] --> B["1. deterministic_submit\nscene platform match"]
B -->|"matches a known deterministic scene"| C["Deterministic execution path\ndeterministic_submit\nruns the preset scene script directly"]
B -->|"no match, not deterministic"| D["2. Primary Orchestration\nzeroclaw process_message"]

D -->|"browser_surface_enabled\nand should_use_primary"| E["Primary orchestration path\nzeroclaw_process_message_primary\nfull Agent tool loop"]
D -->|"conditions not met"| F["3. direct_submit_skill\ndirect skill configured"]

F -->|"directSubmitSkill configured"| G["Direct skill path\ndirect_skill_primary\nbypasses the Agent and runs directly"]
F -->|"not configured"| H["4. compat_llm_primary\nstandard LLM path\nzeroclaw agent turn"]

C --> I["TaskComplete: return result"]
E --> I
G --> I
H --> I

classDef routeFill fill:#e65100,stroke:#ff6d00,color:#fff
classDef path1Fill fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
classDef path2Fill fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
classDef path3Fill fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef path4Fill fill:#4a2c17,stroke:#e65100,color:#e6edf3
classDef endFill fill:#484f58,stroke:#8b949e,color:#e6edf3

class B,D,F routeFill
class C path1Fill
class E path2Fill
class G path3Fill
class H path4Fill
class I endFill
</div>
<div class="flow-container" style="margin-top:1.5rem">
<div class="flow-step">
<div class="flow-step-number">1</div>
<div class="flow-step-content">
<h4>Deterministic scene match - deterministic_submit</h4>
<p>The <span class="highlight">scene_platform</span> module scans the scene manifests (scene.toml) under skills/ and matches on instruction keywords, URL, and page title. On a match it builds a <span class="highlight">DeterministicExecutionPlan</span> and directly runs the scene's preset browser script, <strong>with no LLM involved</strong>. Typical scenes are fixed workflows such as line loss queries and report exports.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">2</div>
<div class="flow-step-content">
<h4>Primary orchestration path - zeroclaw_process_message_primary</h4>
<p>When the Runtime Profile enables browser tools (browser_surface_enabled) and <span class="highlight">orchestration::should_use_primary</span> decides in favor of primary orchestration, sgClaw calls zeroclaw's process_message, the full Agent loop. The LLM can call every permitted tool (browser operations, skill tools, and so on), with support for multi-turn tool calls and dynamic planning.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">3</div>
<div class="flow-step-content">
<h4>Direct skill path - direct_skill_primary</h4>
<p>When <span class="highlight">directSubmitSkill</span> is set in the configuration (format: skillName.toolName), the normal Agent loop is bypassed and the specified skill tool runs directly. This suits in-between cases that need a fixed workflow but do not fit a deterministic scene.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">4</div>
<div class="flow-step-content">
<h4>Standard LLM path - compat_llm_primary</h4>
<p>The default fallback when none of the three paths above applies. It creates a standard zeroclaw Agent turn in which the LLM decides for itself which tools to use. This is the most flexible path but also the slowest.</p>
</div>
</div>
</div>
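<div class="desc">
The priority order of the four paths can be condensed into a single function. The sketch below is illustrative only: the Route variants borrow the path names used in this section, while RoutingInput, its boolean fields, and choose_route are hypothetical stand-ins for checks the real task_runner performs. Treating a directSubmitSkill value without a dot as "not configured" is also an assumption.
</div>
<pre>
```rust
/// The four execution paths, named after the identifiers in this section.
#[derive(Debug, PartialEq)]
pub enum Route {
    DeterministicSubmit,
    ZeroclawProcessMessagePrimary,
    DirectSkillPrimary { skill: String, tool: String },
    CompatLlmPrimary,
}

/// Hypothetical inputs to the decision. Each field stands in for a check the
/// real runtime performs (scene matching, profile lookup, config lookup).
#[derive(Clone, Copy)]
pub struct RoutingInput<'a> {
    pub scene_matched: bool,
    pub browser_surface_enabled: bool,
    pub should_use_primary: bool,
    /// Expected format: "skillName.toolName".
    pub direct_submit_skill: Option<&'a str>,
}

/// Checks the paths in priority order and returns the first that applies.
pub fn choose_route(input: &RoutingInput) -> Route {
    // 1. A matched deterministic scene always wins.
    if input.scene_matched {
        return Route::DeterministicSubmit;
    }
    // 2. Primary orchestration needs both conditions.
    if input.browser_surface_enabled && input.should_use_primary {
        return Route::ZeroclawProcessMessagePrimary;
    }
    // 3. A configured direct skill; a value without '.' falls through (assumption).
    if let Some((skill, tool)) = input.direct_submit_skill.and_then(|s| s.split_once('.')) {
        return Route::DirectSkillPrimary {
            skill: skill.to_string(),
            tool: tool.to_string(),
        };
    }
    // 4. Default fallback.
    Route::CompatLlmPrimary
}
```
</pre>
<div class="desc">
Writing the decision as early returns keeps the priority explicit: a later path is only reachable when every earlier condition has failed.
</div>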
</div>
</div>

<!-- Section 6: Browser Execution Full Process -->
<div class="section">
<div class="section-header">
<div class="section-number">6</div>
<div class="section-title">Browser Execution End to End - Command Transport from sgClaw to the SuperRPA Browser</div>
</div>
<div class="section-body">
<div class="desc">
sgClaw has two browser backend modes: <strong>STDIO Pipe mode</strong> (the sgClaw process talks to the browser host over stdin/stdout) and <strong>WebSocket mode</strong> (connects directly to the browser's DevTools WebSocket). In both modes, security checks are the responsibility of the MAC Policy layer.
</div>
<div class="mermaid">
graph TB
subgraph PipeMode["STDIO Pipe mode (embedded in SuperRPA)"]
TE1["ZeroClawBrowserTool\nimplements zeroclaw::tools::Tool trait\nexposes browser_action / superrpa_browser"]
SC1["MAC Policy\nchecks rules.json domain allowlist\naction allowlist, HMAC verification"]
BC1["BrowserPipeTool\nassigns seq, computes HMAC\nsends Command, waits for Response"]
TP1["StdioTransport\nJSON Lines protocol\nstdin/stdout, 1MB limit"]
HOST1["Browser host process\nSuperRPA Chromium\nverifies HMAC, performs DOM operations"]

TE1 -->|"tool call"| SC1
SC1 -->|"check passed"| BC1
BC1 -->|"Command + HMAC"| TP1
TP1 -->|"JSON Line"| HOST1
HOST1 -->|"Response + HMAC"| TP1
TP1 -->|"match seq and return"| BC1
BC1 -->|"result"| TE1
end

subgraph WsMode["WebSocket mode (standalone)"]
TE2["ZeroClawBrowserTool\nsame Tool interface"]
SC2["MAC Policy, same security checks"]
BC2["WsBrowserBackend\nWebSocket connection\nDevTools Protocol"]
WS1["WebSocket protocol layer\ntungstenite library"]
HOST2["Browser DevTools\nChrome DevTools Protocol"]

TE2 -->|"tool call"| SC2
SC2 -->|"check passed"| BC2
BC2 -->|"CDP Command"| WS1
WS1 -->|"ws://host:port"| HOST2
HOST2 -->|"CDP Response"| WS1
WS1 -->|"result"| BC2
BC2 -->|"result"| TE2
end

classDef teFill fill:#4a2c17,stroke:#e65100,color:#e6edf3
classDef scFill fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
classDef bcFill fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef tpFill fill:#484f58,stroke:#8b949e,color:#e6edf3
classDef hostFill fill:#1f3d2d,stroke:#4caf50,color:#e6edf3

class TE1,TE2 teFill
class SC1,SC2 scFill
class BC1,BC2 bcFill
class TP1,WS1 tpFill
class HOST1,HOST2 hostFill
</div>
<div class="component-grid" style="margin-top:1.5rem">
<div class="component-card">
<h3><span class="badge">Internal</span>ZeroClawBrowserTool</h3>
<p><strong>Responsibility:</strong> Implements the zeroclaw::tools::Tool trait, adapting BrowserBackend into a tool the LLM can call. It exposes two tool names: browser_action (legacy alias) and superrpa_browser (SuperRPA-specific, preferred).</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">When the LLM decides to operate the browser</span></div>
<div class="meta-item"><span class="meta-label">File:</span><span class="meta-value">compat/browser_tool_adapter.rs</span></div>
</div>
</div>
<div class="component-card">
<h3><span class="badge">Internal</span>MAC Policy (security policy)</h3>
<p><strong>Responsibility:</strong> Loads security rules from resources/rules.json. Three layers of security checks: ① HMAC seed exchange at handshake ② domain and action allowlist checks on the Rust side ③ a second HMAC verification on the host side. Domains not on the allowlist and disabled actions are rejected.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">Before every browser tool call</span></div>
<div class="meta-item"><span class="meta-label">Rules file:</span><span class="meta-value">resources/rules.json</span></div>
</div>
</div>
<div class="component-card">
<h3><span class="badge">Internal</span>BrowserBackend</h3>
<p><strong>Responsibility:</strong> A unified browser operation interface (the BrowserBackend trait) with two implementations: PipeBrowserBackend (talks to the host via StdioTransport) and WsBrowserBackend (connects directly to DevTools over WebSocket). The BrowserBackend setting in the configuration decides which one is used.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Backend types:</span><span class="meta-value">SuperRpa / AgentBrowser / RustNative / ComputerUse / Auto</span></div>
<div class="meta-item"><span class="meta-label">File:</span><span class="meta-value">browser/pipe_backend.rs browser/ws_backend.rs</span></div>
</div>
</div>
<div class="component-card">
<h3><span class="badge">Internal</span>BrowserPipeTool</h3>
<p><strong>Responsibility:</strong> The privileged browser tool for STDIO Pipe mode. It assigns each command a monotonically increasing seq, computes an HMAC with the derived session key, sends the Command message, and then blocks until the matching Response arrives, with timeout support.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">On every browser operation in Pipe mode</span></div>
<div class="meta-item"><span class="meta-label">File:</span><span class="meta-value">pipe/browser_tool.rs</span></div>
</div>
</div>
</div>
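<div class="desc">
The seq handling described for BrowserPipeTool (assign a monotonically increasing seq, then block until the Response with the same seq arrives or a timeout expires) can be approximated with a standard-library channel. This is a sketch under assumptions: a channel stands in for the stdin/stdout transport, HMAC computation is omitted, and SeqAllocator, Response, and wait_for_response are hypothetical names. Starting seq at 1 and silently dropping non-matching responses are choices made only for this example.
</div>
<pre>
```rust
use std::sync::mpsc::Receiver;
use std::time::{Duration, Instant};

/// Hypothetical response message; the real Response also carries an HMAC.
pub struct Response {
    pub seq: u64,
    pub payload: String,
}

/// Hands out monotonically increasing sequence numbers, starting at 1.
pub struct SeqAllocator {
    last: u64,
}

impl SeqAllocator {
    pub fn new() -> Self {
        Self { last: 0 }
    }

    /// Returns the next seq; every call yields a larger value than the last.
    pub fn allocate(&mut self) -> u64 {
        self.last += 1;
        self.last
    }
}

/// Blocks until a response with the given seq arrives or the timeout elapses.
/// Responses with a different seq are discarded in this sketch.
pub fn wait_for_response(
    rx: &Receiver<Response>,
    seq: u64,
    timeout: Duration,
) -> Option<Response> {
    let deadline = Instant::now() + timeout;
    loop {
        // `?` returns None once the deadline has already passed.
        let remaining = deadline.checked_duration_since(Instant::now())?;
        match rx.recv_timeout(remaining) {
            Ok(response) if response.seq == seq => return Some(response),
            Ok(_) => continue,     // stale or unrelated response
            Err(_) => return None, // timed out or sender disconnected
        }
    }
}
```
</pre>
<div class="desc">
Recomputing the remaining time on each loop iteration keeps the overall wait bounded by the original timeout even when unrelated responses keep arriving.
</div>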
</div>
</div>

<!-- Section 7: External Systems -->
<div class="section">
<div class="section-header">
<div class="section-number">7</div>
<div class="section-title">External Systems Map - Who sgClaw Talks To</div>
</div>
<div class="section-body">
<div class="desc">
sgClaw does not run in isolation; it works together with several <strong>external systems</strong>. The diagram below shows how sgClaw interacts with them.
</div>
<div class="mermaid">
graph TB
subgraph External["External systems - not controlled by sgClaw"]
E1["LLM providers\nDeepSeek, OpenAI, Claude\nHTTP API calls"]
E2["SuperRPA Chromium\nbrowser host process\nSTDIO or WebSocket"]
E3["Business systems\nline loss system, customer service system\naccessed through the browser"]
E4["Client\nsg_claw_client CLI\nService WebSocket connection"]
end

subgraph sgClawInternal["sgClaw internals"]
S1["Communication Gateway\nSTDIO Pipe / Service WS"]
S2["Agent Runtime\ntask_runner task scheduling"]
S3["Runtime Engine\nbuilds Agent, tool policy"]
S4["ZeroClaw Core\nthird_party/zeroclaw\nAgent loop, tool loop"]
S5["MAC Policy\nsecurity policy rules.json"]
S6["Browser Backend\nPipeBrowser / WsBrowser"]
end

E4 -->|"SubmitTask"| S1
S1 -->|"TaskComplete / LogEntry"| E4

S2 -->|"build Agent"| S3
S3 -->|"build_agent"| S4

S4 -->|"send prompt, receive response"| E1
S4 -->|"call tool"| S5
S5 -->|"check passed"| S6
S6 -->|"browser command"| E2
E2 -->|"DOM operations"| E3
E3 -->|"page data"| E2
E2 -->|"command result"| S6
S6 -->|"result"| S4
S4 -->|"event bridge log_entry"| S1

classDef extFill fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
classDef intFill fill:#4a2c17,stroke:#e65100,color:#e6edf3

class External,E1,E2,E3,E4 extFill
class sgClawInternal,S1,S2,S3,S4,S5,S6 intFill
</div>
</div>
</div>

<!-- Section 8: Complete Lifecycle -->
<div class="section">
<div class="section-header">
<div class="section-number">8</div>
<div class="section-title">Complete Lifecycle - A Task from Start to Finish</div>
</div>
<div class="section-body">
<div class="desc">
Taking a real scenario as an example, <strong>"Check this month's line loss rate for me and export it to Excel" (帮我查本月线损率并导出Excel)</strong>, the steps below show the complete lifecycle from the moment sgClaw receives the instruction to the moment it returns the result.
</div>
<div class="flow-container">
<div class="flow-step">
<div class="flow-step-number">1</div>
<div class="flow-step-content">
<h4><span class="highlight">Communication Gateway</span> receives the instruction</h4>
<p>The browser host process sends a SubmitTask message over STDIO (JSON Lines protocol). sgClaw creates a session and parses the instruction, page_url, page_title, and conversation_id.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">2</div>
<div class="flow-step-content">
<h4><span class="highlight">Load configuration</span> SgClawSettings</h4>
<p>Loads configuration from sgclaw_config.json or environment variables: LLM provider (apiKey/baseUrl/model), runtimeProfile, skillsDir, directSubmitSkill, and so on.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">3</div>
<div class="flow-step-content">
<h4><span class="highlight">Deterministic scene match</span> deterministic_submit</h4>
<p>Scans the scene manifests (scene.toml) under skills/, finds that the instruction contains the keywords "线损率" (line loss rate) and "本月" (this month), and matches the "line loss query" scene. It then builds a DeterministicExecutionPlan (with parameters such as target_url, org_code, and period_mode).</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">4</div>
<div class="flow-step-content">
<h4><span class="highlight">MAC Policy</span> security check</h4>
<p>Checks whether the target domain is on the rules.json allowlist → passes. Checks whether the operation types (navigate, click, getText) are on the action allowlist → passes.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">5</div>
<div class="flow-step-content">
<h4><span class="highlight">BrowserPipeTool</span> executes browser commands</h4>
<p>Assigns each command a monotonically increasing seq and computes an HMAC with the derived session key. Sends the Command messages to the browser host via StdioTransport. The sequence is: navigate to the line loss system → select the month → click Query → read the table data.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">6</div>
<div class="flow-step-content">
<h4><span class="highlight">SuperRPA Chromium</span> performs DOM operations</h4>
<p>The browser host receives each Command, verifies its HMAC, performs the actual DOM operation (navigating, choosing from a dropdown, clicking a button, reading table contents), and returns a Response (operation result plus HMAC).</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">7</div>
<div class="flow-step-content">
<h4><span class="highlight">Report Artifact</span> post-processing</h4>
<p>Parses the table data returned by the browser into a structured format. Following the scene's postprocess configuration, it uses the openxml_office tool to generate an .xlsx file. The result includes the local file path.</p>
</div>
</div>
<div class="flow-step">
<div class="flow-step-number">8</div>
<div class="flow-step-content">
<h4><span class="highlight">Communication Gateway</span> returns the result</h4>
<p>Sends a TaskComplete message to the browser host via StdioTransport, containing success=true and an execution summary (including the path of the generated .xlsx file). The browser host tells the user that the download is complete.</p>
</div>
</div>
</div>
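<div class="desc">
The keyword match in step 3 of this lifecycle can be sketched as below. It is a hedged illustration rather than the scene_platform implementation: the Scene struct, the rule that every keyword must appear in the instruction, the example scene name, and the example URL are all assumptions, and the plan is reduced to two fields (the document's org_code and period_mode are left out). The names DeterministicExecutionPlan and NotDeterministic are taken from this document.
</div>
<pre>
```rust
/// Hypothetical in-memory form of one scene manifest (scene.toml).
pub struct Scene {
    pub name: &'static str,
    pub keywords: Vec<&'static str>,
    pub target_url: &'static str,
}

/// Reduced plan: the real one also carries org_code, period_mode, and more.
#[derive(Debug, PartialEq)]
pub struct DeterministicExecutionPlan {
    pub scene: String,
    pub target_url: String,
}

#[derive(Debug, PartialEq)]
pub enum MatchOutcome {
    Plan(DeterministicExecutionPlan),
    NotDeterministic,
}

/// Returns a plan for the first scene whose keywords all occur in the
/// instruction (assumed rule); scenes without keywords never match.
pub fn match_scene(instruction: &str, scenes: &[Scene]) -> MatchOutcome {
    for scene in scenes {
        let all_present = !scene.keywords.is_empty()
            && scene
                .keywords
                .iter()
                .all(|keyword| instruction.contains(*keyword));
        if all_present {
            return MatchOutcome::Plan(DeterministicExecutionPlan {
                scene: scene.name.to_string(),
                target_url: scene.target_url.to_string(),
            });
        }
    }
    MatchOutcome::NotDeterministic
}
```
</pre>
<div class="desc">
With a scene whose keywords are "线损率" and "本月", the instruction "帮我查本月线损率" yields a plan, while an unrelated instruction yields NotDeterministic and the task moves on to the next routing level.
</div>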
</div>
</div>

</div>
<div class="footer">sgClaw Intelligent Browser Automation Platform - Component Responsibilities and Flow Overview - April 2026</div>
<script>
mermaid.initialize({ startOnLoad:true, theme:'dark', securityLevel:'loose', logLevel:'warn' });
</script>
</body>
</html>