feat: add generated scene skill platform hardening
BIN
docs/2026-04-18-102-scenes-validation-overview.xlsx
Normal file
Binary file not shown.
309
docs/sgClaw技术路线总览.html
Normal file
@@ -0,0 +1,309 @@
|
|||||||
|
<!DOCTYPE html>
|
||||||
|
<html lang="zh-CN">
|
||||||
|
<head>
|
||||||
|
<meta charset="UTF-8">
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||||
|
<title>sgClaw Intelligent Browser Automation Platform - Technical Roadmap Overview</title>
|
||||||
|
<script src="https://cdn.jsdelivr.net/npm/mermaid@10.9.5/dist/mermaid.min.js"></script>
|
||||||
|
<style>
|
||||||
|
*{margin:0;padding:0;box-sizing:border-box}
|
||||||
|
body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI","PingFang SC","Hiragino Sans GB","Microsoft YaHei",sans-serif;background:#0d1117;color:#c9d1d9;line-height:1.8}
|
||||||
|
.header{background:linear-gradient(135deg,#0a1628,#16213e,#1a3a5c);padding:3rem 2rem;text-align:center;border-bottom:3px solid #e65100}
|
||||||
|
.header h1{font-size:2.2rem;color:#e6edf3;margin-bottom:.5rem}
|
||||||
|
.header .subtitle{color:#8b949e;font-size:1rem}
|
||||||
|
.container{max-width:1300px;margin:0 auto;padding:2rem}
|
||||||
|
.section{background:#161b22;border:1px solid #30363d;border-radius:12px;margin-bottom:2rem;overflow:hidden}
|
||||||
|
.section-header{background:linear-gradient(90deg,#1a1a2e,#16213e);padding:1rem 1.5rem;border-bottom:1px solid #30363d;display:flex;align-items:center;gap:.8rem}
|
||||||
|
.section-number{background:#e65100;color:#fff;width:32px;height:32px;border-radius:50%;display:flex;align-items:center;justify-content:center;font-weight:700;flex-shrink:0}
|
||||||
|
.section-title{font-size:1.2rem;color:#e6edf3;font-weight:600}
|
||||||
|
.section-body{padding:1.5rem;overflow-x:auto}
|
||||||
|
.mermaid{display:flex;justify-content:center;padding:1rem 0}
|
||||||
|
.mermaid svg{max-width:100%;height:auto}
|
||||||
|
.desc{background:#1a1a2e;border-left:3px solid #e65100;padding:1rem 1.2rem;margin:1rem 0;border-radius:0 8px 8px 0;font-size:.95rem;color:#8b949e}
|
||||||
|
.desc strong{color:#e6edf3}
|
||||||
|
.value-grid{display:grid;grid-template-columns:repeat(auto-fit,minmax(280px,1fr));gap:1rem;margin:1rem 0}
|
||||||
|
.value-card{background:#1a1a2e;border:1px solid #30363d;border-radius:10px;padding:1.2rem}
|
||||||
|
.value-card h3{color:#e65100;font-size:1rem;margin-bottom:.5rem}
|
||||||
|
.value-card p{color:#8b949e;font-size:.9rem}
|
||||||
|
.phase-list{display:flex;flex-direction:column;gap:.8rem;margin:1rem 0}
|
||||||
|
.phase-item{display:flex;gap:1rem;align-items:flex-start;background:#1a1a2e;border-radius:10px;padding:1rem;border-left:4px solid #e65100}
|
||||||
|
.phase-badge{background:#e65100;color:#fff;padding:.3rem .8rem;border-radius:20px;font-size:.85rem;font-weight:600;white-space:nowrap}
|
||||||
|
.phase-item h4{color:#e6edf3;font-size:1rem;margin-bottom:.2rem}
|
||||||
|
.phase-item p{color:#8b949e;font-size:.9rem}
|
||||||
|
.footer{text-align:center;padding:2rem;color:#484f58;font-size:.85rem;border-top:1px solid #21262d}
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<div class="header">
<h1>sgClaw Intelligent Browser Automation Platform</h1>
<div class="subtitle">Drive browser operations with natural language and let business processes run automatically</div>
</div>
|
||||||
|
<div class="container">
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">1</div>
<div class="section-title">sgClaw in One Sentence</div>
</div>
<div class="section-body">
<div class="desc">
<strong>sgClaw is an "intelligent browser assistant".</strong> The user states a request in natural language (for example, "check the line loss rate for this month"), and sgClaw carries out the clicks, input, queries, and exports in the browser, then presents the result to the user. No step-by-step manual operation of the browser is needed.
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">2</div>
<div class="section-title">End-to-End Business Flow - From User Instruction to Result</div>
</div>
<div class="section-body">
<div class="desc">
The flow below shows how a user works with sgClaw end to end. The user only needs to <strong>enter one sentence</strong>, and everything after that runs automatically.
</div>
<div class="mermaid">
graph LR
A["User enters a natural-language instruction\nExample: check the line loss rate for this month"] --> B["sgClaw interprets the intent\nand identifies the business scenario"]
B --> C{"Known scenario?"}
C -->|"Yes, known scenario"| D["Run the preset flow directly\nFast path, no AI needed"]
C -->|"No, new scenario"| E["LLM analyzes the request\nand breaks it into concrete steps"]
D --> F["Operate the browser automatically\nClick, type, query, export"]
E --> F
F --> G["Present the result to the user\nGenerate report, open Excel"]

classDef userInput fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef ai fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
classDef fast fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
classDef action fill:#4a2c17,stroke:#e65100,color:#e6edf3
classDef result fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3

class A userInput
class B ai
class C ai
class D fast
class E ai
class F action
class G result
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">3</div>
<div class="section-title">How the Platform Works with Existing Business Systems</div>
</div>
<div class="section-body">
<div class="desc">
sgClaw <strong>requires no changes to existing business systems</strong>. It works like an employee sitting at the computer, operating the browser directly to get the job done.
</div>
<div class="mermaid">
graph TB
User["Business staff\nBranch deputy director, line loss specialist, team leader"]

subgraph Platform["Unified business platform"]
S1["Line loss big data system\nLine loss rate queries, statistical analysis"]
S2["95598 customer service system\nFault reports, work order handling"]
S3["Other business subsystems\n..."]
end

subgraph sgClaw["sgClaw intelligent assistant"]
SG1["Understand natural-language instructions"]
SG2["Operate the browser automatically to complete the task"]
SG3["Security safeguards and access control"]
end

Result["Final result\nExcel report, Word document, data display"]

User -->|"States the request"| SG1
SG1 --> SG2
SG1 --> SG3
SG2 -->|"Clicks and queries automatically"| S1
SG2 -->|"Fills in forms automatically"| S2
SG2 -->|"Exports reports automatically"| S3
S1 -->|"Returns data"| SG2
S2 -->|"Returns data"| SG2
S3 -->|"Returns data"| SG2
SG2 -->|"Generates report file"| Result
Result -->|"Shown to the user"| User

classDef people fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef plat fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
classDef sg fill:#4a2c17,stroke:#e65100,color:#e6edf3
classDef out fill:#1f3d2d,stroke:#4caf50,color:#e6edf3

class User people
class Platform,S1,S2,S3 plat
class sgClaw,SG1,SG2,SG3 sg
class Result out
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">4</div>
<div class="section-title">Security Control System</div>
</div>
<div class="section-body">
<div class="desc">
sgClaw sets up <strong>three lines of security defense</strong> so that every operation stays within controlled bounds, even when driven by AI.
</div>
<div class="mermaid">
graph TB
A["First line of defense\nIdentity verification: both communicating parties must be trusted"] --> B["Second line of defense\nRule check: only permitted systems and pages can be accessed"]
B --> C["Third line of defense\nSecond review: legitimacy is re-checked before each operation"]
C --> D["Outcome\nEvery operation is traceable and auditable"]

classDef l1 fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef l2 fill:#4a2c17,stroke:#e65100,color:#e6edf3
classDef l3 fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
classDef ok fill:#1f3d2d,stroke:#4caf50,color:#e6edf3

class A l1
class B l2
class C l3
class D ok
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">5</div>
<div class="section-title">Two Run Modes</div>
</div>
<div class="section-body">
<div class="desc">
sgClaw supports two run modes to fit different usage scenarios.
</div>
<div class="mermaid">
graph LR
subgraph Mode1["Mode 1: embedded, browser child-process mode"]
M1A["Browser launches sgClaw"]
M1B["Request-response communication"]
M1C["Suited to one-off task execution"]
M1A --> M1B --> M1C
end

subgraph Mode2["Mode 2: standalone service mode"]
M2A["sgClaw runs as a persistent service"]
M2B["Front-end web pages connect at any time"]
M2C["Suited to frequent interactive use"]
M2A --> M2B --> M2C
end

classDef m1 fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef m2 fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3

class Mode1,M1A,M1B,M1C m1
class Mode2,M2A,M2B,M2C m2
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">6</div>
<div class="section-title">Technical Evolution Roadmap</div>
</div>
<div class="section-body">
<div class="phase-list">
<div class="phase-item">
<div class="phase-badge">Phase 1</div>
<div>
<h4>Building Core Capabilities</h4>
<p>Deliver basic browser operations (click, type, navigate, read page content), establish the security control system, and integrate with the existing business platform.</p>
</div>
</div>
<div class="phase-item">
<div class="phase-badge">Phase 2</div>
<div>
<h4>AI-Driven Operation</h4>
<p>Integrate a large language model for natural-language understanding: users describe what they need in everyday language, and the AI breaks the request into operation steps and executes them.</p>
</div>
</div>
<div class="phase-item">
<div class="phase-badge">Phase 3</div>
<div>
<h4>Consolidating Business Scenarios</h4>
<p>Package frequently used scenarios into standardized skill packs (such as line loss queries, fault statistics, and weekly report generation) for fast execution with less reliance on AI.</p>
</div>
</div>
<div class="phase-item">
<div class="phase-badge">Phase 4</div>
<div>
<h4>Platform Services</h4>
<p>Move from one-off task execution to a persistent service that supports concurrent multi-user use, with a full skill marketplace and task orchestration system.</p>
</div>
</div>
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">7</div>
<div class="section-title">Core Value</div>
</div>
<div class="section-body">
<div class="value-grid">
<div class="value-card">
<h3>Higher Efficiency</h3>
<p>Tasks that used to require step-by-step manual browser operation now take a single sentence. Querying, exporting, and report generation all run automatically.</p>
</div>
<div class="value-card">
<h3>Zero-Change Integration</h3>
<p>Existing business systems need no modification. sgClaw operates the browser directly, like an employee, and is non-intrusive to existing systems.</p>
</div>
<div class="value-card">
<h3>Secure and Controllable</h3>
<p>Three lines of defense keep every operation within permitted bounds: a domain allowlist, action control, and a second review, with full traceability.</p>
</div>
<div class="value-card">
<h3>Flexible Extension</h3>
<p>New business scenarios are onboarded quickly by writing skill packs, while known scenarios take the fast path without AI, balancing efficiency and flexibility.</p>
</div>
<div class="value-card">
<h3>Technology Independence</h3>
<p>The core code is developed and controlled in-house and built in Rust. It does not depend on external SaaS services, which helps keep data secure.</p>
</div>
<div class="value-card">
<h3>Continuous Evolution</h3>
<p>From single-task execution to a persistent service, and from manual instructions to AI-driven operation, the roadmap builds platform capabilities step by step.</p>
</div>
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">8</div>
<div class="section-title">Typical Usage Scenarios</div>
</div>
<div class="section-body">
<div class="desc">
The following are real-world scenarios in which business staff use sgClaw day to day.
</div>
<div class="mermaid">
graph TB
U1["Line loss specialist\nQueries line loss rate statistics every month"] -->|"Input: check the line loss rate for this month"| SG1["sgClaw completes automatically\nOpen line loss system, select month, query data, export Excel"]
U2["Power supply station team leader\nProduces a weekly line loss analysis report"] -->|"Input: generate the line loss weekly report for last week"| SG2["sgClaw completes automatically\nQuery weekly data, aggregate and analyze, generate Word report"]
U3["Customer service specialist\nCompiles 95598 fault work order statistics"] -->|"Input: count the fault work orders for this week"| SG3["sgClaw completes automatically\nLog in to customer service system, filter work orders, generate statistics table"]

classDef user fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef sg fill:#4a2c17,stroke:#e65100,color:#e6edf3

class U1,U2,U3 user
class SG1,SG2,SG3 sg
</div>
</div>
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
<div class="footer">sgClaw Intelligent Browser Automation Platform - Technical Roadmap Overview - April 2026</div>
|
||||||
|
<script>
|
||||||
|
mermaid.initialize({ startOnLoad:true, theme:'dark', securityLevel:'loose', logLevel:'warn' });
|
||||||
|
</script>
|
||||||
|
</body>
|
||||||
|
</html>
|
||||||
413
docs/sgClaw系统架构全景图.html
Normal file
@@ -0,0 +1,413 @@
|
|||||||
|
<!DOCTYPE html>
|
||||||
|
<html lang="zh-CN">
|
||||||
|
<head>
|
||||||
|
<meta charset="UTF-8">
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||||
|
<title>sgClaw System Architecture Panorama</title>
|
||||||
|
<script src="https://cdn.jsdelivr.net/npm/mermaid@10.9.5/dist/mermaid.min.js"></script>
|
||||||
|
<style>
|
||||||
|
*{margin:0;padding:0;box-sizing:border-box}
|
||||||
|
body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI","PingFang SC","Hiragino Sans GB","Microsoft YaHei",sans-serif;background:#0d1117;color:#c9d1d9;line-height:1.6}
|
||||||
|
.header{background:linear-gradient(135deg,#1a1a2e,#16213e,#0f3460);padding:3rem 2rem;text-align:center;border-bottom:3px solid #e65100}
|
||||||
|
.header h1{font-size:2.5rem;color:#e6edf3;margin-bottom:.5rem}
|
||||||
|
.header .subtitle{color:#8b949e;font-size:.95rem}
|
||||||
|
.container{max-width:1400px;margin:0 auto;padding:2rem}
|
||||||
|
.section{background:#161b22;border:1px solid #30363d;border-radius:12px;margin-bottom:2rem;overflow:hidden}
|
||||||
|
.section-header{background:linear-gradient(90deg,#1a1a2e,#16213e);padding:1.2rem 1.5rem;border-bottom:1px solid #30363d;display:flex;align-items:center;gap:1rem}
|
||||||
|
.section-number{background:#e65100;color:#fff;width:36px;height:36px;border-radius:50%;display:flex;align-items:center;justify-content:center;font-weight:700;font-size:1.1rem}
|
||||||
|
.section-title{font-size:1.3rem;color:#e6edf3;font-weight:600}
|
||||||
|
.section-body{padding:1.5rem;overflow-x:auto}
|
||||||
|
.mermaid{display:flex;justify-content:center;padding:1rem 0}
|
||||||
|
.mermaid svg{max-width:100%;height:auto}
|
||||||
|
.file-table{width:100%;border-collapse:collapse;margin-top:1rem}
|
||||||
|
.file-table th,.file-table td{padding:.75rem 1rem;text-align:left;border-bottom:1px solid #21262d}
|
||||||
|
.file-table th{background:#1a1a2e;color:#e65100;font-weight:600}
|
||||||
|
.file-table td:first-child{color:#58a6ff;font-family:"SF Mono",Monaco,Consolas,monospace;font-size:.85rem}
|
||||||
|
.file-table tr:hover{background:rgba(230,81,0,.05)}
|
||||||
|
.footer{text-align:center;padding:2rem;color:#484f58;font-size:.85rem;border-top:1px solid #21262d}
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<div class="header">
<h1>sgClaw System Architecture Panorama</h1>
<div class="subtitle">Browser host x Rust security control layer x ZeroClaw capability core - dual deployment modes, three-layer security defense, Skill system</div>
</div>
|
||||||
|
<div class="container">
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">1</div>
<div class="section-title">System Boundary Overview - Four Zones and Data Flow</div>
</div>
<div class="section-body">
<div class="mermaid">
graph TB
BH["Browser host\nProtected security boundary\nLaunches and hosts the sgClaw child process"]
SP["sgClaw process\nRust security control layer\nZeroClaw as the capability core"]
ZC["ZeroClaw core\nvendored crate\nTask decomposition, tool loop, LLM routing"]
ES["External services\nLLM API and business browser pages"]
BH <-- "STDIO JSON Line inter-process protocol" --> SP
SP <-- "Rust API calls into the vendored library" --> ZC
ZC <-- "HTTP API or internal calls" --> ES
SP <-- "Browser Backend over Pipe or WS" --> ES
classDef hostClass fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef sgclawClass fill:#4a2c17,stroke:#e65100,color:#e6edf3
classDef zcClass fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
classDef extClass fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
class BH hostClass
class SP sgclawClass
class ZC zcClass
class ES extClass
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">2</div>
<div class="section-title">Dual Deployment Modes - Pipe Mode, STDIO Request-Response</div>
</div>
<div class="section-body">
<div class="mermaid">
sequenceDiagram
participant Host as Browser host
participant Pipe as StdioTransport
participant MAC as MAC Policy
participant Agent as Agent/TaskRunner
participant ZC as ZeroClaw Runtime
participant Tool as BrowserPipeTool
participant Exec as Host command executor
Note over Host,Exec: Pipe Mode, request-response STDIO communication
Host->>Pipe: Init handshake with version, HMAC seed, capability list
Pipe->>Pipe: derive_session_key derives the session key
Pipe-->>Host: InitAck returns agent_id and supported actions
Host->>Agent: SubmitTask submits the task
Agent->>Agent: Detect deterministic submit mode
alt Deterministic submit
Agent->>Agent: Generate execution plan
Agent->>Tool: Execute Skill directly
else LLM-driven
Agent->>ZC: Construct ZeroClaw Agent
ZC->>Tool: tool loop call
end
Tool->>MAC: Validate domain and action
MAC-->>Tool: Allow or deny
Tool->>Pipe: Write Command JSON
Pipe-->>Host: Browser receives the command
Host->>Exec: Execute browser command
Exec-->>Host: Return execution result
Host->>Pipe: Response reply
Pipe-->>Tool: Result passed back
Tool-->>ZC: ToolResult
ZC-->>Agent: Continue or finish
Agent-->>Host: TaskComplete
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">3</div>
<div class="section-title">Dual Deployment Modes - Service Mode, TCP + WebSocket + Helper Page Bridge</div>
</div>
<div class="section-body">
<div class="mermaid">
sequenceDiagram
participant Console as Front-end console
participant WS as WebSocket Server
participant Agent as Agent/TaskRunner
participant CB as BrowserCallbackBackend
participant HTTP as Callback HTTP Server
participant Helper as Helper Page
participant Target as Target business page
Note over Console,Target: Service Mode, persistent service + Helper Page bridge
Console->>WS: WebSocket Connect
WS->>CB: Create session
Console->>WS: SubmitTask
WS->>Agent: Dispatch task
Agent->>CB: invoke
CB->>HTTP: POST Command to queue
HTTP-->>Helper: long-poll returns Command
Helper->>Target: sgBrowserExcuteJsCodeByDomain runs JS
Target-->>Helper: callBackJsToCpp callback
Helper->>HTTP: POST event back
HTTP-->>CB: Callback event
CB-->>Agent: CommandOutput
Agent-->>WS: TaskComplete
WS-->>Console: Push result
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">4</div>
<div class="section-title">sgClaw Internal Module Relationships</div>
</div>
<div class="section-body">
<div class="mermaid">
graph LR
E1["main.rs Pipe mode entry point"]
E2["Service mode entry point"]
P1["StdioTransport STDIO read/write"]
P2["Message enum definitions"]
P3["Handshake protocol"]
P4["BrowserPipeTool send and await response"]
P5["HMAC signing, tamper protection"]
M1["MacPolicy loading and parsing"]
M2["Domain allowlist, normalized comparison"]
M3["Action allow/deny lists, double filtering"]
A1["Message dispatch handle_browser_message"]
A2["TaskRunner task parsing"]
A3["Deterministic Submit instruction detection"]
C1["RuntimeEngine builds the Agent"]
C2["ToolPolicy tool permissions"]
C3["BrowserScriptSkillTool executor"]
C4["DeterministicSubmit line loss fast path"]
C5["BrowserToolAdapter tool adaptation"]
B1["BrowserBackend unified interface"]
B2["PipeBrowserBackend implementation"]
B3["WsBrowserBackend implementation"]
B4["BrowserCallbackBackend implementation"]
SV1["WebSocket Server listener"]
SV2["Session Manager, single client single task"]
SV3["Callback HTTP Server listener"]
CF1["SgClawSettings loading"]
CF2["Provider Config"]
CF3["Backend Selection"]
E1 --> P1 --> P2 --> P3 --> P4 --> P5 --> M1
M1 --> M2
M1 --> M3 --> A1 --> A2 --> A3
A3 --> C1 --> C2 --> C5 --> B1
A3 --> C4 --> B1
CF1 --> C1
B1 --> B2
B1 --> B3
B1 --> B4
E2 --> SV1 --> SV2 --> B4
SV1 --> SV3
CF1 --> CF2
CF1 --> CF3 --> A1
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">5</div>
<div class="section-title">Security Model - Three Layers of Defense</div>
</div>
<div class="section-body">
<div class="mermaid">
graph TB
L1A["Browser sends Init carrying hmac_seed"]
L1B["sgClaw replies InitAck and assigns agent_id"]
L1C["Derive Session Key, SHA256"]
L1D["Refuse to run until the handshake completes"]
L1A --> L1B --> L1C --> L1D
L2A["Load rules.json and parse rules"]
L2B["Domain allowlist check, stripping scheme, path, and port"]
L2C["Action allow/deny lists, double filtering"]
L2D["Special handling for the local dashboard"]
L2A --> L2B
L2A --> L2C
L2A --> L2D
L3A["Sequence number correlation check"]
L3B["HMAC-SHA256 signature verification"]
L3C["Domain matched against page context"]
L3D["Reject execution on illegal parameters"]
L3A --> L3B --> L3C --> L3D
L1D ==> L2A
L2B ==> L3A
L2C ==> L3A
L2D ==> L3A
classDef l1Class fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef l2Class fill:#4a2c17,stroke:#e65100,color:#e6edf3
classDef l3Class fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
class L1A,L1B,L1C,L1D l1Class
class L2A,L2B,L2C,L2D l2Class
class L3A,L3B,L3C,L3D l3Class
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">6</div>
<div class="section-title">Skill System and Execution Paths</div>
</div>
<div class="section-body">
<div class="mermaid">
graph TB
SD1["SKILL.toml metadata"]
SD2["tools array, kind definitions"]
SD3["prompts array, trigger conditions"]
SD4["scripts directory, JS scripts"]
SL1["ZeroClaw Skill Loader scan"]
SL2["BrowserScriptSkillTool creates the executor"]
SL3["Naming convention skill.tool"]
EP1["Path A, LLM-driven"]
EP2["Path B, Deterministic Submit"]
EP3["Path C, Direct Skill Runtime"]
BE1["Eval wrapper script injects args"]
BE2["Action Eval execution"]
BE3["Returns ToolResult as structured JSON"]
SD1 --> SD2 --> SD4
SD2 --> SD3
SD1 --> SL1 --> SL2 --> SL3
SL3 --> EP1
SL3 --> EP2
SL3 --> EP3
EP1 --> BE1 --> BE2 --> BE3
EP2 --> BE1
EP3 --> BE1
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">7</div>
<div class="section-title">Helper Page Mechanism - Core Bridge of Service Mode</div>
</div>
<div class="section-body">
<div class="mermaid">
graph TB
WS["WebSocket Server listening on 42321"]
HTTP["Callback HTTP Server listening on 17888"]
CB["BrowserCallbackBackend interaction"]
Helper["Helper Page tab"]
Target1["Business page 1, line loss system"]
Target2["Business page 2, platform page"]
HP1["WebSocket connection to privileged API"]
HP2["Command polling, long-poll"]
HP3["Event push, POST callback"]
HP4["Callback function registration"]
WS --> CB --> HTTP --> HP2
HP1 --> Target1
HP1 --> Target2
HP2 --> Target1
HP2 --> Target2
Target1 --> HP4 --> HP3 --> HTTP
HTTP --> CB --> WS
classDef svcClass fill:#4a2c17,stroke:#e65100,color:#e6edf3
classDef tabClass fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef hpClass fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
class WS,HTTP,CB svcClass
class Helper,Target1,Target2 tabClass
class HP1,HP2,HP3,HP4 hpClass
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">8</div>
<div class="section-title">Line Loss Deterministic Submit Flow - From User Input to Excel Export</div>
</div>
<div class="section-body">
<div class="mermaid">
sequenceDiagram
participant User as User
participant Host as Browser host
participant Agent as Agent/TaskRunner
participant DS as DeterministicSubmit
participant Skill as collect_lineloss
participant Backend as BrowserBackend
participant Browser as Line loss browser page
participant Rust as Rust xlsx export
User->>Host: Enter instruction, check the line loss rate for this month
Host->>Agent: SubmitTask
Agent->>DS: decide_deterministic_submit
Note over DS: Instruction ends with a full stop and contains a line loss keyword
DS-->>Agent: Execute with execution plan
Agent->>Skill: execute_browser_script
Skill->>Backend: Action Eval
Backend->>Browser: sgBrowserExcuteJsCodeByDomain
Browser->>Browser: validatePageContext
Browser->>Browser: buildRequest
Browser->>Browser: ajax query API
Browser-->>Backend: Return JSON
Backend-->>Skill: ToolResult
Skill-->>Agent: artifact
Agent->>Rust: export_lineloss_xlsx
Rust->>Rust: Generate xlsx file
Rust-->>Agent: Export complete
Agent-->>Host: TaskComplete
Host-->>User: Show result and open Excel
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">9</div>
<div class="section-title">Interaction Boundary Between the Platform Browser and sgClaw</div>
</div>
<div class="section-body">
<div class="mermaid">
graph TB
PlatformBrowser["Platform browser, Chromium"]
sgClawProcess["sgClaw process, Rust"]
PP1["Scenario page Vue instance window.mac"]
PP2["mutableSystemList subsystem account pool"]
PP3["getLogint login orchestration method"]
TP1["Line loss system 20.76.57.61"]
TP2["Other subsystems"]
BC1["sgBrowserExcuteJsCodeByDomain runs JS by domain"]
BC2["sgHideBrowerserOpenPage opens a hidden page"]
BC3["sgBrowserCallAfterLoaded runs JS after load"]
BC4["callBackJsToCpp JS-to-C++ callback"]
T1["Transport layer, STDIO transport"]
T2["MAC Policy plus HMAC security checks"]
T3["Agent/TaskRunner task dispatcher"]
T4["Compat layer, ZeroClaw compatibility"]
T5["Browser Backend"]
PP1 --> PP2
PP1 --> PP3
PP3 -.-> TP1
T1 --> PlatformBrowser
PlatformBrowser --> T1
T3 --> T4 --> T5
T5 --> BC1
T5 --> BC2
T5 --> BC3
BC4 -.-> T5
PlatformBrowser -.-> sgClawProcess
classDef browserSide fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef sgclawSide fill:#4a2c17,stroke:#e65100,color:#e6edf3
class PlatformBrowser,PP1,PP2,PP3,TP1,TP2,BC1,BC2,BC3,BC4 browserSide
class sgClawProcess,T1,T2,T3,T4,T5 sgclawSide
</div>
</div>
</div>
|
||||||
|
|
||||||
|
<div class="section">
<div class="section-header">
<div class="section-number">10</div>
<div class="section-title">Module-to-File Map</div>
</div>
<div class="section-body">
<table class="file-table">
<thead><tr><th>Module</th><th>Main source files</th><th>Responsibilities</th></tr></thead>
<tbody>
<tr><td>pipe transport layer</td><td>src/pipe/mod.rs transport.rs handshake.rs browser_tool.rs</td><td>STDIO read/write, handshake flow, message encoding/decoding, HMAC signing</td></tr>
<tr><td>security layer</td><td>src/security/mod.rs mac_policy.rs hmac.rs</td><td>MAC Policy loading, domain allowlist, action allow/deny lists, HMAC signing</td></tr>
<tr><td>agent message routing</td><td>src/agent/mod.rs task_runner.rs</td><td>Message dispatch, task parsing, Deterministic Submit detection</td></tr>
<tr><td>browser backend abstraction</td><td>src/browser/mod.rs callback_backend.rs callback_host.rs ws_protocol.rs</td><td>BrowserBackend interface, Pipe/WS/Callback implementations</td></tr>
<tr><td>compat layer</td><td>src/compat/mod.rs runtime.rs deterministic_submit.rs browser_script_skill_tool.rs</td><td>ZeroClaw runtime construction, line loss fast path, Skill execution</td></tr>
<tr><td>service mode</td><td>src/service/mod.rs session.rs</td><td>WS server, single-client single-task model</td></tr>
<tr><td>config</td><td>src/config/mod.rs settings.rs</td><td>Settings loading, Provider configuration, Backend selection</td></tr>
<tr><td>runtime engine</td><td>src/runtime/mod.rs engine.rs tool_policy.rs</td><td>Agent instance construction, ToolPolicy permission control</td></tr>
</tbody>
</table>
</div>
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
<div class="footer">sgClaw System Architecture Panorama - 2026-04-15 - Built with Mermaid.js 10.9.5</div>
|
||||||
|
<script>
|
||||||
|
mermaid.initialize({ startOnLoad:true, theme:'dark', securityLevel:'loose', logLevel:'warn' });
|
||||||
|
</script>
|
||||||
|
</body>
|
||||||
|
</html>
|
||||||
494
docs/sgClaw系统架构全景图.md
Normal file
@@ -0,0 +1,494 @@
|
|||||||
|
# sgClaw System Architecture Panorama

**Document version**: 1.0<br>
**Project**: sgClaw<br>
**Date prepared**: 2026-04-15
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. System Boundary Overview
|
||||||
|
|
||||||
|
```mermaid
graph TB
subgraph BrowserHost["Browser host (SuperRPA / Chromium)"]
direction TB
H1["Launch Config<br/>Startup configuration"]
H2["Chromium child-process management<br/>Launches and monitors sgClaw"]
H3["Browser Command executor<br/>click/type/navigate/eval/..."]
H4["HMAC re-check + domain validation<br/>Host-side security boundary"]
H5["Frontend Bundle<br/>Presentation layer (Vue 2 pages)"]

H1 --> H2
H2 --> H3
H3 --> H4
H4 -. display .-> H5
end

subgraph sgClawProcess["sgClaw process (Rust)"]
direction TB
S1["Transport layer<br/>STDIO / WebSocket"]
S2["Security layer<br/>MAC Policy + HMAC signing"]
S3["Agent layer<br/>Message routing + task dispatch"]
S4["Compat layer<br/>ZeroClaw runtime + Skill toolchain"]
S5["Browser Backend abstraction<br/>Pipe / WS / Callback / Bridge"]
S6["Config layer<br/>Runtime Config + environment variables"]

S1 --> S2
S2 --> S3
S3 --> S4
S4 --> S5
S6 -. config injection .-> S4
end

subgraph ZeroClawCore["ZeroClaw core (vendored)"]
direction TB
Z1["Planner / Executor<br/>Task decomposition and execution"]
Z2["Tool Loop<br/>Tool-call loop"]
Z3["Skills / Memory<br/>Skill loading and memory"]
Z4["Provider Dispatch<br/>LLM routing"]
Z5["Prompt Builder<br/>System Prompt assembly"]

Z1 --> Z2
Z2 --> Z3
Z3 --> Z4
Z5 --> Z1
end

subgraph ExternalServices["External services"]
direction TB
E1["LLM Provider<br/>DeepSeek / OpenAI / Claude"]
E2["Platform browser pages<br/>Business pages + hidden domain"]
end

BrowserHost <-->|"STDIO JSON Line<br/>AgentMessage / BrowserMessage"| sgClawProcess
sgClawProcess <-->|"Rust API calls / vendored"| ZeroClawCore
ZeroClawCore <-->|"HTTP API / internal calls"| ExternalServices
sgClawProcess <-->|"Pipe Mode: STDIO<br/>Service Mode: WS / Browser Backend"| ExternalServices
```
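
The `Browser Backend` box in the diagram above (Pipe / WS / Callback / Bridge) suggests that the upper layers talk to one interface regardless of transport. A minimal Rust sketch of that idea follows. Only the names `BrowserBackend`, `invoke`, and `CommandOutput` come from the architecture diagrams; the trait shape, the `CommandOutput` fields, `MockBackend`, and `run_steps` are illustrative assumptions, not the project's actual API.

```rust
use std::collections::VecDeque;

/// Result of one browser command. The name follows `CommandOutput` in the
/// diagrams; the fields are assumed for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct CommandOutput {
    pub success: bool,
    pub data: String,
}

/// Hypothetical unified interface that every backend would implement.
pub trait BrowserBackend {
    /// Short backend name, used here only in error messages.
    fn name(&self) -> &'static str;
    /// Runs one browser action such as click, type, navigate, or eval.
    fn invoke(&mut self, action: &str, params: &str) -> Result<CommandOutput, String>;
}

/// In-memory stand-in for a real Pipe / WS / Callback backend.
pub struct MockBackend {
    canned: VecDeque<CommandOutput>,
    /// Every (action, params) pair received, in order.
    pub sent: Vec<(String, String)>,
}

impl MockBackend {
    pub fn new(canned: Vec<CommandOutput>) -> Self {
        MockBackend { canned: canned.into(), sent: Vec::new() }
    }
}

impl BrowserBackend for MockBackend {
    fn name(&self) -> &'static str {
        "mock"
    }

    fn invoke(&mut self, action: &str, params: &str) -> Result<CommandOutput, String> {
        self.sent.push((action.to_string(), params.to_string()));
        self.canned
            .pop_front()
            .ok_or_else(|| format!("no canned response left for `{}`", action))
    }
}

/// Runs steps in order and stops at the first failure. It depends only on the
/// trait, so the same caller code would work with any backend.
pub fn run_steps(
    backend: &mut dyn BrowserBackend,
    steps: &[(&str, &str)],
) -> Result<Vec<CommandOutput>, String> {
    let mut outputs = Vec::with_capacity(steps.len());
    for &(action, params) in steps {
        let out = backend.invoke(action, params)?;
        if !out.success {
            return Err(format!("{}: `{}` failed: {}", backend.name(), action, out.data));
        }
        outputs.push(out);
    }
    Ok(outputs)
}
```

A caller would build a backend once, for example `MockBackend::new(vec![...])` in a test, and pass it to `run_steps(&mut backend, &[("navigate", "{}")])`. Swapping in a pipe-based or WebSocket-based implementation would not change the calling code, which is presumably the point of the abstraction layer.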
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. Dual Deployment Mode Architecture

### 2.1 Pipe Mode (STDIO): Traditional Embedded Mode
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
sequenceDiagram
|
||||||
|
participant Host as 浏览器宿主 (Chromium)
|
||||||
|
participant Pipe as StdioTransport
|
||||||
|
participant MAC as MAC Policy
|
||||||
|
participant Agent as Agent / TaskRunner
|
||||||
|
participant ZC as ZeroClaw Runtime
|
||||||
|
participant Backend as PipeBrowserBackend
|
||||||
|
participant Tool as BrowserPipeTool
|
||||||
|
participant HostExec as 宿主 Command 执行器
|
||||||
|
|
||||||
|
Note over Host,HostExec: Pipe Mode: 一问一答式 STDIO
|
||||||
|
|
||||||
|
Host->>Pipe: Init {version, hmac_seed, capabilities}
|
||||||
|
Pipe->>Pipe: derive_session_key(hmac_seed)
|
||||||
|
Pipe-->>Host: InitAck {version, agent_id, supported_actions}
|
||||||
|
|
||||||
|
Host->>Agent: SubmitTask {instruction, page_url, page_title}
|
||||||
|
Agent->>Agent: resolve_submit_instruction()
|
||||||
|
alt deterministic_submit (如 线损。。。)
|
||||||
|
Agent->>Agent: 生成 DeterministicExecutionPlan
|
||||||
|
Agent->>Tool: execute_browser_script_skill_raw_output
|
||||||
|
else 通用 LLM 驱动
|
||||||
|
Agent->>ZC: 构造 ZeroClaw Agent
|
||||||
|
ZC->>Tool: tool loop: browser_action
|
||||||
|
end
|
||||||
|
|
||||||
|
Tool->>MAC: validate(domain, action)
|
||||||
|
MAC-->>Tool: allow / deny
|
||||||
|
|
||||||
|
Tool->>Backend: invoke(action, params)
|
||||||
|
Backend->>Pipe: AgentMessage::Command {seq, action, params, hmac}
|
||||||
|
Pipe-->>Host: stdout: Command JSON
|
||||||
|
|
||||||
|
Host->>HostExec: 执行浏览器命令
|
||||||
|
HostExec-->>Host: 执行结果
|
||||||
|
Host->>Pipe: BrowserMessage::Response {seq, success, data}
|
||||||
|
Pipe-->>Backend: Response 回包
|
||||||
|
Backend-->>Tool: CommandOutput
|
||||||
|
Tool-->>ZC: ToolResult
|
||||||
|
ZC-->>Agent: tool loop 继续或完成
|
||||||
|
Agent-->>Host: TaskComplete {success, summary}
|
||||||
|
```
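
The diagram above pairs every `Command {seq, ...}` with a `Response {seq, ...}`. The following is a minimal Rust sketch of that bookkeeping, assuming the agent side keeps a map of in-flight sequence numbers. `PendingCommands`, `Response`, and their fields are hypothetical names chosen for illustration and are not the types in `src/pipe`; JSON encoding and HMAC signing are deliberately left out.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the Response message in the diagram.
#[derive(Debug, Clone, PartialEq)]
pub struct Response {
    pub seq: u64,
    pub success: bool,
    pub data: String,
}

/// Tracks commands that were written to stdout but not yet answered.
#[derive(Default)]
pub struct PendingCommands {
    next_seq: u64,
    in_flight: HashMap<u64, String>, // seq -> action name
}

impl PendingCommands {
    pub fn new() -> Self {
        Self::default()
    }

    /// Reserve the next sequence number for an outgoing command.
    pub fn register(&mut self, action: &str) -> u64 {
        self.next_seq += 1;
        self.in_flight.insert(self.next_seq, action.to_string());
        self.next_seq
    }

    /// Match a response to its command. Unknown or already-resolved
    /// sequence numbers are rejected instead of being silently accepted.
    pub fn resolve(&mut self, response: &Response) -> Result<String, String> {
        self.in_flight
            .remove(&response.seq)
            .ok_or_else(|| format!("unexpected response seq {}", response.seq))
    }
}
```

Rejecting a `seq` that is not in flight is one way to keep a stale or replayed Response from being attributed to the wrong command. Whether sgClaw handles this case the same way is not stated in this document.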
|
||||||
|
|
||||||
|
### 2.2 Service Mode (TCP + WebSocket): Standalone Service Mode
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
sequenceDiagram
|
||||||
|
participant Console as 前端控制台 (浏览器)
|
||||||
|
participant WSS as WebSocket Server<br/>(127.0.0.1:42321)
|
||||||
|
participant Agent as Agent / TaskRunner
|
||||||
|
participant Callback as BrowserCallbackBackend
|
||||||
|
participant HTTP as Callback HTTP Server<br/>(127.0.0.1:17888)
|
||||||
|
participant Helper as Helper Page<br/>(浏览器内嵌辅助页)
|
||||||
|
participant Target as 目标业务页面
|
||||||
|
|
||||||
|
Note over Console,Target: Service Mode: 持久化服务 + Helper Page 桥接
|
||||||
|
|
||||||
|
Console->>WSS: WebSocket Connect
|
||||||
|
WSS->>Callback: 创建会话
|
||||||
|
|
||||||
|
Console->>WSS: ClientMessage::SubmitTask
|
||||||
|
WSS->>Agent: 分发任务
|
||||||
|
Agent->>Callback: BrowserBackend::invoke()
|
||||||
|
|
||||||
|
Note over Callback,Target: Callback Backend 内部流程
|
||||||
|
Callback->>Helper: 通过 HTTP Server 推送 Command
|
||||||
|
Helper->>Target: sgBrowserExcuteJsCodeByDomain<br/>在目标域执行 JS
|
||||||
|
|
||||||
|
Target-->>Helper: callBackJsToCpp / XHR POST
|
||||||
|
Helper->>HTTP: POST /sgclaw/callback/events
|
||||||
|
HTTP-->>Callback: Callback 事件回传
|
||||||
|
|
||||||
|
Callback-->>Agent: CommandOutput
|
||||||
|
Agent-->>WSS: ServiceMessage::TaskComplete
|
||||||
|
WSS-->>Console: WebSocket 推送结果
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. sgClaw Internal Module Relationships
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph LR
|
||||||
|
subgraph EntryPoints["入口点"]
|
||||||
|
E1["src/main.rs<br/>sgclaw::run()"]
|
||||||
|
E2["src/service/mod.rs<br/>service::run()"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph PipeLayer["pipe 层 — 传输与协议"]
|
||||||
|
P1["StdioTransport<br/>STDIO 读写"]
|
||||||
|
P2["BrowserMessage / AgentMessage<br/>消息枚举定义"]
|
||||||
|
P3["Handshake<br/>握手协议"]
|
||||||
|
P4["BrowserPipeTool<br/>发送 Command / 等待 Response"]
|
||||||
|
P5["HMAC 签名<br/>sign_command"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph SecurityLayer["security 层 — 安全策略"]
|
||||||
|
M1["MacPolicy<br/>从 rules.json 加载规则"]
|
||||||
|
M2["Domain Allowlist<br/>域名白名单校验"]
|
||||||
|
M3["Action Allowlist/Blocklist<br/>动作黑白名单"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph AgentLayer["agent 层 — 消息路由与任务分发"]
|
||||||
|
A1["handle_browser_message_with_context<br/>消息分发"]
|
||||||
|
A2["TaskRunner<br/>任务解析与执行"]
|
||||||
|
A3["resolve_submit_instruction<br/>Deterministic Submit 检测"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph CompatLayer["compat 层 — ZeroClaw 兼容"]
|
||||||
|
C1["RuntimeEngine<br/>构建 Agent 实例"]
|
||||||
|
C2["ToolPolicy<br/>工具权限控制"]
|
||||||
|
C3["BrowserScriptSkillTool<br/>Skill browser_script 执行"]
|
||||||
|
C4["DeterministicSubmit<br/>线损确定性提交"]
|
||||||
|
C5["BrowserToolAdapter<br/>ZeroClaw 工具适配"]
|
||||||
|
C6["ConfigAdapter<br/>配置转换"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph BrowserLayer["browser 层 — 浏览器后端"]
|
||||||
|
B1["BrowserBackend trait<br/>统一接口"]
|
||||||
|
B2["PipeBrowserBackend<br/>Pipe Mode 实现"]
|
||||||
|
B3["WsBrowserBackend<br/>WebSocket 直接连接"]
|
||||||
|
B4["BrowserCallbackBackend<br/>Helper Page 桥接"]
|
||||||
|
B5["BridgeBrowserBackend<br/>网桥模式"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph ServiceLayer["service 层 — 服务模式"]
|
||||||
|
SV1["WebSocket Server<br/>TCP 监听"]
|
||||||
|
SV2["Session Manager<br/>单客户端单任务"]
|
||||||
|
SV3["Callback HTTP Server<br/>辅助页通信"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph ConfigLayer["config 层 — 运行时配置"]
|
||||||
|
CF1["SgClawSettings<br/>从 JSON / 环境变量加载"]
|
||||||
|
CF2["Provider Config<br/>API Key / Model"]
|
||||||
|
CF3["Backend Selection<br/>Pipe vs Service"]
|
||||||
|
end
|
||||||
|
|
||||||
|
E1 --> P1
|
||||||
|
E2 --> SV1
|
||||||
|
|
||||||
|
P1 --> P2
|
||||||
|
P2 --> P3
|
||||||
|
P3 --> P4
|
||||||
|
P4 --> P5
|
||||||
|
|
||||||
|
P5 --> M1
|
||||||
|
M1 --> M2
|
||||||
|
M1 --> M3
|
||||||
|
|
||||||
|
M3 --> A1
|
||||||
|
A1 --> A2
|
||||||
|
A2 --> A3
|
||||||
|
|
||||||
|
A3 --> C1
|
||||||
|
A3 --> C4
|
||||||
|
C1 --> C2
|
||||||
|
C1 --> C3
|
||||||
|
C2 --> C5
|
||||||
|
C6 --> C1
|
||||||
|
|
||||||
|
C3 --> B1
|
||||||
|
C4 --> B1
|
||||||
|
C5 --> B1
|
||||||
|
|
||||||
|
B1 --> B2
|
||||||
|
B1 --> B3
|
||||||
|
B1 --> B4
|
||||||
|
B1 --> B5
|
||||||
|
|
||||||
|
SV1 --> SV2
|
||||||
|
SV1 --> SV3
|
||||||
|
SV2 --> B4
|
||||||
|
|
||||||
|
CF1 --> CF2
|
||||||
|
CF1 --> CF3
|
||||||
|
CF3 --> A1
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. Security Model: Three Lines of Defense
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph TB
|
||||||
|
subgraph Layer1["第一层: 握手与会话完整性"]
|
||||||
|
L1A["Browser 发送 Init<br/>携带 hmac_seed"]
|
||||||
|
L1B["sgClaw 回 InitAck<br/>分配 agent_id"]
|
||||||
|
L1C["派生 Session Key<br/>SHA256(hmac_seed + salt)"]
|
||||||
|
L1D["未完成握手<br/>拒绝进入运行态"]
|
||||||
|
|
||||||
|
L1A --> L1B --> L1C --> L1D
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph Layer2["第二层: Rust 侧 MAC Policy"]
|
||||||
|
L2A["加载 rules.json<br/>version, domains, actions"]
|
||||||
|
L2B["Domain 白名单校验<br/>strip scheme/path/port"]
|
||||||
|
L2C["Action 黑白名单<br/>allowed + blocked 双重过滤"]
|
||||||
|
L2D["本地仪表盘特殊处理<br/>__sgclaw_local_dashboard__"]
|
||||||
|
|
||||||
|
L2A --> L2B
|
||||||
|
L2A --> L2C
|
||||||
|
L2A --> L2D
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph Layer3["第三层: 宿主侧命令执行约束"]
|
||||||
|
L3A["序列号关联校验"]
|
||||||
|
L3B["HMAC-SHA256 签名验证"]
|
||||||
|
L3C["域名与页面上下文匹配"]
|
||||||
|
L3D["非法参数拒绝执行"]
|
||||||
|
|
||||||
|
L3A --> L3B --> L3C --> L3D
|
||||||
|
end
|
||||||
|
|
||||||
|
Layer1 ==>|"Session Key"| Layer2
|
||||||
|
Layer2 ==>|"Command + HMAC"| Layer3
|
||||||
|
```
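
The second layer in the diagram combines a domain allowlist (after stripping scheme, path, and port) with an allowed-plus-blocked action filter. The following Rust sketch shows one way those two checks can fit together. `PolicySketch` and `normalize_domain` are hypothetical names, the rule values are passed in directly instead of being loaded from `rules.json`, and the normalization does not handle IPv6 literals or `user@host` forms.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory policy; the real rules come from rules.json.
pub struct PolicySketch {
    allowed_domains: HashSet<String>,
    allowed_actions: HashSet<String>,
    blocked_actions: HashSet<String>,
}

/// Reduce a URL or host string to a bare lowercase host:
/// strip the scheme, then the path/query/fragment, then the port.
pub fn normalize_domain(input: &str) -> String {
    let trimmed = input.trim();
    let without_scheme = match trimmed.find("://") {
        Some(idx) => &trimmed[idx + 3..],
        None => trimmed,
    };
    let host_port = without_scheme
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    let host = match host_port.rsplit_once(':') {
        Some((host, _port)) => host,
        None => host_port,
    };
    host.to_ascii_lowercase()
}

impl PolicySketch {
    pub fn new(domains: &[&str], allowed: &[&str], blocked: &[&str]) -> Self {
        let to_set = |items: &[&str]| items.iter().map(|s| s.to_string()).collect();
        Self {
            allowed_domains: to_set(domains),
            allowed_actions: to_set(allowed),
            blocked_actions: to_set(blocked),
        }
    }

    /// The blocklist is checked before the allowlist, so a blocked action
    /// is denied even if it also appears in the allowed set.
    pub fn validate(&self, url_or_domain: &str, action: &str) -> Result<(), String> {
        let domain = normalize_domain(url_or_domain);
        if !self.allowed_domains.contains(&domain) {
            return Err(format!("domain not allowed: {}", domain));
        }
        if self.blocked_actions.contains(action) {
            return Err(format!("action blocked: {}", action));
        }
        if !self.allowed_actions.contains(action) {
            return Err(format!("action not allowed: {}", action));
        }
        Ok(())
    }
}
```

Giving the blocklist precedence is an assumption made here to illustrate "double filtering"; the document does not say which list wins on conflict.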
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 5. Skill System and Execution Paths
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph TB
|
||||||
|
subgraph SkillDefinition["Skill 定义 (SKILL.toml)"]
|
||||||
|
SD1["skill metadata<br/>name, version, description"]
|
||||||
|
SD2["tools 数组<br/>kind: browser_script / http_request / ..."]
|
||||||
|
SD3["prompts 数组<br/>触发条件描述"]
|
||||||
|
SD4["scripts/ 目录<br/>JS 脚本文件"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph SkillLoading["Skill 加载"]
|
||||||
|
SL1["ZeroClaw Skill Loader<br/>从 skillsDir 扫描"]
|
||||||
|
SL2["BrowserScriptSkillTool<br/>为每个 tool 创建执行器"]
|
||||||
|
SL3["命名: {skill_name}.{tool_name}"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph ExecutionPaths["执行路径"]
|
||||||
|
EP1["路径 A: LLM 驱动<br/>Agent tool loop → browser_action"]
|
||||||
|
EP2["路径 B: Deterministic Submit<br/>指令匹配 → 直接执行 (无 LLM)"]
|
||||||
|
EP3["路径 C: Direct Skill Runtime<br/>配置指定 skill → 直接执行"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph BrowserExecution["浏览器侧执行"]
|
||||||
|
BE1["Eval 包装<br/>(function() { const args = {...}; ... })()"]
|
||||||
|
BE2["Action::Eval<br/>通过 BrowserBackend 执行"]
|
||||||
|
BE3["返回 ToolResult<br/>结构化结果"]
|
||||||
|
end
|
||||||
|
|
||||||
|
SD1 --> SD2 --> SD4
|
||||||
|
SD2 --> SD3
|
||||||
|
|
||||||
|
SD1 --> SL1 --> SL2 --> SL3
|
||||||
|
|
||||||
|
SL3 --> EP1
|
||||||
|
SL3 --> EP2
|
||||||
|
SL3 --> EP3
|
||||||
|
|
||||||
|
EP1 --> BE1
|
||||||
|
EP2 --> BE1
|
||||||
|
EP3 --> BE1
|
||||||
|
|
||||||
|
BE1 --> BE2 --> BE3
|
||||||
|
```
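
Two conventions from the diagram above are easy to show in code: tools are named `{skill_name}.{tool_name}`, and a skill script is wrapped in an immediately invoked function that receives its arguments as `args` before being sent as an Eval action. The Rust sketch below illustrates both. The function names are hypothetical, and the arguments are assumed to arrive as an already-serialized JSON string so that no JSON library is needed.

```rust
/// Build the exposed tool name from the skill and tool identifiers.
pub fn qualified_tool_name(skill_name: &str, tool_name: &str) -> String {
    format!("{}.{}", skill_name, tool_name)
}

/// Wrap a script body in an IIFE so that `args` is scoped to the script.
/// `args_json` must already be valid JSON; it is inserted verbatim.
pub fn wrap_for_eval(script_body: &str, args_json: &str) -> String {
    format!(
        "(function() {{ const args = {}; {} }})()",
        args_json, script_body
    )
}
```

Because `args_json` is inserted verbatim, a caller would need to make sure it comes from a trusted serializer and not from raw user input.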
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 6. Helper Page Mechanism (Service Mode)
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph TB
|
||||||
|
subgraph sgClawService["sgClaw Service 进程"]
|
||||||
|
WS["WebSocket Server<br/>:42321"]
|
||||||
|
HTTP["HTTP Server<br/>:17888"]
|
||||||
|
CB["BrowserCallbackBackend"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph BrowserTabs["浏览器 Tab 页"]
|
||||||
|
Helper["Helper Page Tab<br/>/sgclaw/browser-helper.html"]
|
||||||
|
Target1["业务页面 1<br/>20.76.57.61:18080/..."]
|
||||||
|
Target2["业务页面 2<br/>25.215.213.128:18080/..."]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph HelperPage["Helper Page 内部"]
|
||||||
|
HP1["WebSocket 连接<br/>ws://127.0.0.1:12345"]
|
||||||
|
HP2["轮询 Command<br/>GET /sgclaw/callback/commands/next"]
|
||||||
|
HP3["推送 Events<br/>POST /sgclaw/callback/events"]
|
||||||
|
HP4["回调函数注册<br/>sgclawOnClickProbe / sgclawOnEval / ..."]
|
||||||
|
end
|
||||||
|
|
||||||
|
WS -->|"WebSocket"| CB
|
||||||
|
CB -->|"推送 Command"| HTTP
|
||||||
|
HTTP -->|long-poll| HP2
|
||||||
|
|
||||||
|
HP1 -->|"浏览器 WebSocket API"| Target1
|
||||||
|
HP1 -->|"浏览器 WebSocket API"| Target2
|
||||||
|
|
||||||
|
HP2 -->|"执行 JS 命令<br/>sgBrowserExcuteJsCodeByDomain"| Target1
|
||||||
|
HP2 -->|"执行 JS 命令<br/>sgBrowserExcuteJsCodeByDomain"| Target2
|
||||||
|
|
||||||
|
Target1 -->|"callBackJsToCpp"| HP4
|
||||||
|
HP3 -->|"XHR POST"| HTTP
|
||||||
|
HP4 --> HP3
|
||||||
|
|
||||||
|
HTTP -->|"Callback 事件"| CB
|
||||||
|
CB -->|"ToolResult"| WS
|
||||||
|
```
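
In the diagram above, the Helper Page pulls work with `GET /sgclaw/callback/commands/next` and reports results with `POST /sgclaw/callback/events`. The Rust sketch below models only the in-memory state that such a pair of endpoints could share: a FIFO of waiting commands and a set of dispatched commands awaiting an event. `CallbackQueue` and `QueuedCommand` are hypothetical names, and the HTTP layer itself is omitted.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical command record handed to the Helper Page.
#[derive(Debug, Clone, PartialEq)]
pub struct QueuedCommand {
    pub id: u64,
    pub script: String,
}

#[derive(Default)]
pub struct CallbackQueue {
    next_id: u64,
    waiting: VecDeque<QueuedCommand>, // not yet fetched by the Helper Page
    dispatched: HashMap<u64, String>, // fetched, still awaiting an event
}

impl CallbackQueue {
    pub fn new() -> Self {
        Self::default()
    }

    /// Called by the backend when it wants the browser to run a script.
    pub fn enqueue(&mut self, script: &str) -> u64 {
        self.next_id += 1;
        self.waiting.push_back(QueuedCommand {
            id: self.next_id,
            script: script.to_string(),
        });
        self.next_id
    }

    /// Models the "commands/next" poll: hand out the oldest waiting command.
    pub fn next_command(&mut self) -> Option<QueuedCommand> {
        let cmd = self.waiting.pop_front()?;
        self.dispatched.insert(cmd.id, cmd.script.clone());
        Some(cmd)
    }

    /// Models the "events" post: accept an event only for a dispatched command.
    pub fn accept_event(&mut self, id: u64) -> bool {
        self.dispatched.remove(&id).is_some()
    }
}
```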
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 7. Deterministic Submit Flow for Line-Loss Queries
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
sequenceDiagram
|
||||||
|
participant User as 用户
|
||||||
|
participant Host as 浏览器宿主
|
||||||
|
participant Agent as Agent / TaskRunner
|
||||||
|
participant DS as DeterministicSubmit
|
||||||
|
participant Skill as BrowserScriptSkillTool<br/>(collect_lineloss)
|
||||||
|
participant Backend as BrowserBackend
|
||||||
|
participant Browser as 浏览器页面<br/>(线损域)
|
||||||
|
participant Rust as Rust 侧<br/>xlsx 导出
|
||||||
|
|
||||||
|
User->>Host: 输入: "帮我查本月线损率。。。"
|
||||||
|
Host->>Agent: SubmitTask {instruction}
|
||||||
|
|
||||||
|
Agent->>DS: decide_deterministic_submit()
|
||||||
|
Note over DS: 指令以 "。。。" 结尾<br/>且包含 "线损" 关键词
|
||||||
|
DS-->>Agent: Execute(DeterministicExecutionPlan)
|
||||||
|
|
||||||
|
Agent->>Skill: execute_browser_script_skill_raw_output()
|
||||||
|
Skill->>Backend: Action::Eval {script: collect_lineloss.js}
|
||||||
|
Backend->>Browser: sgBrowserExcuteJsCodeByDomain<br/>(20.76.57.61, js_code)
|
||||||
|
|
||||||
|
Browser->>Browser: validatePageContext(args)
|
||||||
|
Browser->>Browser: buildMonthRequest / buildWeekRequest
|
||||||
|
Browser->>Browser: $.ajax 查询线损 API
|
||||||
|
Browser-->>Backend: 返回 report-artifact JSON
|
||||||
|
Backend-->>Skill: ToolResult
|
||||||
|
Skill-->>Agent: artifact {status, rows, column_defs}
|
||||||
|
|
||||||
|
Agent->>Rust: export_lineloss_xlsx(artifact)
|
||||||
|
Rust->>Rust: 生成 .xlsx 文件
|
||||||
|
Rust-->>Agent: 导出完成
|
||||||
|
Agent-->>Host: TaskComplete {success: true}
|
||||||
|
Host-->>User: 展示结果 + 打开 Excel
|
||||||
|
```
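
The note in the diagram above states the trigger rule: the instruction ends with "。。。" and contains the keyword "线损". A minimal Rust sketch of that check follows. `SubmitDecision` and `decide_sketch` are hypothetical names, and the real `decide_deterministic_submit()` may take more context (for example the page URL) than is shown here.

```rust
/// Hypothetical result type for the routing decision.
#[derive(Debug, PartialEq)]
pub enum SubmitDecision {
    /// Run the preset script directly, without calling the LLM.
    Execute { instruction: String },
    /// Fall through to the LLM-driven path.
    NotDeterministic,
}

const TRIGGER_SUFFIX: &str = "。。。";
const KEYWORD: &str = "线损";

/// Apply the rule from the diagram: suffix "。。。" plus keyword "线损".
pub fn decide_sketch(instruction: &str) -> SubmitDecision {
    let trimmed = instruction.trim();
    match trimmed.strip_suffix(TRIGGER_SUFFIX) {
        Some(core) if core.contains(KEYWORD) => SubmitDecision::Execute {
            instruction: core.trim_end().to_string(),
        },
        _ => SubmitDecision::NotDeterministic,
    }
}
```

Stripping the suffix before passing the instruction on is a choice made for this sketch; the document does not say whether the trailing marker is kept.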
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 8. Interaction Boundary Between the Platform Browser and sgClaw
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph TB
|
||||||
|
subgraph PlatformBrowser["平台浏览器 (Chromium)"]
|
||||||
|
direction TB
|
||||||
|
subgraph PlatformPages["平台场景页面"]
|
||||||
|
PP1["场景页 Vue 实例<br/>window.mac"]
|
||||||
|
PP2["mutableSystemList<br/>子系统账号池"]
|
||||||
|
PP3["getLogint / loginStatusTing<br/>子系统登录编排"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph TargetPages["目标业务页面"]
|
||||||
|
TP1["线损系统<br/>20.76.57.61:18080"]
|
||||||
|
TP2["其他子系统"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph BrowserCapabilities["浏览器特权能力"]
|
||||||
|
BC1["sgBrowserExcuteJsCodeByDomain<br/>按域名执行 JS"]
|
||||||
|
BC2["sgHideBrowerserOpenPage<br/>打开隐藏页面"]
|
||||||
|
BC3["sgBrowserCallAfterLoaded<br/>页面加载后执行 JS"]
|
||||||
|
BC4["callBackJsToCpp<br/>JS → C++ 回调"]
|
||||||
|
end
|
||||||
|
|
||||||
|
PP1 --> PP2
|
||||||
|
PP1 --> PP3
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph sgClawProcess["sgClaw 进程"]
|
||||||
|
direction TB
|
||||||
|
subgClawTransport["Transport 层"]
|
||||||
|
subgClawSecurity["MAC Policy + HMAC"]
|
||||||
|
subgClawAgent["Agent / TaskRunner"]
|
||||||
|
subgClawCompat["Compat 层"]
|
||||||
|
subgClawBackend["Browser Backend"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgClawTransport <-->|"STDIO JSON Line<br/>AgentMessage / BrowserMessage"| PlatformBrowser
|
||||||
|
subgClawAgent --> subgClawCompat
|
||||||
|
subgClawCompat --> subgClawBackend
|
||||||
|
subgClawBackend -->|"BrowserAction<br/>sgBrowserExcuteJsCodeByDomain"| BC1
|
||||||
|
subgClawBackend -->|"BrowserAction<br/>sgHideBrowerserOpenPage"| BC2
|
||||||
|
subgClawBackend -->|"BrowserAction<br/>sgBrowserCallAfterLoaded"| BC3
|
||||||
|
|
||||||
|
BC4 -.回调.-> subgClawBackend
|
||||||
|
|
||||||
|
PlatformBrowser -.安全边界.-> sgClawProcess
|
||||||
|
|
||||||
|
classDef browserSide fill:#e3f2fd,stroke:#1565c0,color:#000
|
||||||
|
classDef sgclawSide fill:#fff3e0,stroke:#e65100,color:#000
|
||||||
|
classDef interaction fill:#f3e5f5,stroke:#7b1fa2,color:#000
|
||||||
|
|
||||||
|
class PlatformBrowser,PlatformPages,TargetPages,BrowserCapabilities browserSide
|
||||||
|
class sgClawProcess,subgClawTransport,subgClawSecurity,subgClawAgent,subgClawCompat,subgClawBackend sgclawSide
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 9. Module-to-File Mapping

| Module | Main files | Responsibilities |
|---|---|---|
| **pipe (transport layer)** | `src/pipe/mod.rs`, `src/pipe/transport.rs`, `src/pipe/handshake.rs`, `src/pipe/browser_tool.rs` | STDIO read/write, handshake, message encoding and decoding, HMAC signing, sending Commands and waiting for Responses |
| **security (security layer)** | `src/security/mod.rs`, `src/security/mac_policy.rs`, `src/security/hmac.rs` | Loading and enforcing the MAC Policy, deriving the session key, signing commands |
| **agent (message routing)** | `src/agent/mod.rs`, `src/agent/task_runner.rs` | Receiving and dispatching BrowserMessage, task resolution, Deterministic Submit detection |
| **browser (backend abstraction)** | `src/browser/mod.rs`, `src/browser/callback_backend.rs`, `src/browser/callback_host.rs`, `src/browser/ws_protocol.rs` | BrowserBackend trait definition and its four implementations: Pipe, WS, Callback, Bridge |
| **compat (compatibility layer)** | `src/compat/mod.rs`, `src/compat/runtime.rs`, `src/compat/deterministic_submit.rs`, `src/compat/browser_script_skill_tool.rs` | Building the ZeroClaw runtime, deterministic line-loss submit, executing Skill `browser_script` tools |
| **service (service mode)** | `src/service/mod.rs`, `src/service/session.rs` | WebSocket server, client session management, single-task concurrency model |
| **config (runtime configuration)** | `src/config/mod.rs`, `src/config/settings.rs` | Loading SgClawSettings, Provider configuration, backend selection |
| **runtime (runtime engine)** | `src/runtime/mod.rs`, `src/runtime/engine.rs`, `src/runtime/tool_policy.rs` | RuntimeEngine builds the Agent; ToolPolicy controls tool permissions |

645
docs/sgClaw组件职责与流转全景图.html
Normal file
@@ -0,0 +1,645 @@
|
|||||||
|
<!DOCTYPE html>
|
||||||
|
<html lang="zh-CN">
|
||||||
|
<head>
|
||||||
|
<meta charset="UTF-8">
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||||
|
<title>sgClaw 智能浏览器自动化平台 - 组件职责与流转全景图</title>
|
||||||
|
<script src="https://cdn.jsdelivr.net/npm/mermaid@10.9.5/dist/mermaid.min.js"></script>
|
||||||
|
<style>
|
||||||
|
*{margin:0;padding:0;box-sizing:border-box}
|
||||||
|
body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI","PingFang SC","Hiragino Sans GB","Microsoft YaHei",sans-serif;background:#0d1117;color:#c9d1d9;line-height:1.8}
|
||||||
|
.header{background:linear-gradient(135deg,#0a1628,#16213e,#1a3a5c);padding:3rem 2rem;text-align:center;border-bottom:3px solid #e65100}
|
||||||
|
.header h1{font-size:2.2rem;color:#e6edf3;margin-bottom:.5rem}
|
||||||
|
.header .subtitle{color:#8b949e;font-size:1rem}
|
||||||
|
.container{max-width:1400px;margin:0 auto;padding:2rem}
|
||||||
|
.section{background:#161b22;border:1px solid #30363d;border-radius:12px;margin-bottom:2rem;overflow:hidden}
|
||||||
|
.section-header{background:linear-gradient(90deg,#1a1a2e,#16213e);padding:1rem 1.5rem;border-bottom:1px solid #30363d;display:flex;align-items:center;gap:.8rem}
|
||||||
|
.section-number{background:#e65100;color:#fff;width:32px;height:32px;border-radius:50%;display:flex;align-items:center;justify-content:center;font-weight:700;flex-shrink:0}
|
||||||
|
.section-title{font-size:1.2rem;color:#e6edf3;font-weight:600}
|
||||||
|
.section-body{padding:1.5rem;overflow-x:auto}
|
||||||
|
.mermaid{display:flex;justify-content:center;padding:1rem 0}
|
||||||
|
.mermaid svg{max-width:100%;height:auto}
|
||||||
|
.desc{background:#1a1a2e;border-left:3px solid #e65100;padding:1rem 1.2rem;margin:1rem 0;border-radius:0 8px 8px 0;font-size:.95rem;color:#8b949e}
|
||||||
|
.desc strong{color:#e6edf3}
|
||||||
|
.component-grid{display:grid;grid-template-columns:repeat(auto-fit,minmax(300px,1fr));gap:1.2rem;margin:1.5rem 0}
|
||||||
|
.component-card{background:#1a1a2e;border:1px solid #30363d;border-radius:10px;padding:1.3rem;transition:all .2s}
|
||||||
|
.component-card:hover{border-color:#e65100;transform:translateY(-2px)}
|
||||||
|
.component-card h3{color:#e65100;font-size:1.05rem;margin-bottom:.6rem;display:flex;align-items:center;gap:.5rem}
|
||||||
|
.component-card .badge{background:#e65100;color:#fff;padding:.15rem .5rem;border-radius:12px;font-size:.75rem;font-weight:600}
|
||||||
|
.component-card .badge.external{background:#4a9eff}
|
||||||
|
.component-card p{color:#8b949e;font-size:.9rem;margin-bottom:.5rem}
|
||||||
|
.component-card .meta{display:flex;flex-direction:column;gap:.3rem;margin-top:.8rem;padding-top:.8rem;border-top:1px solid #30363d}
|
||||||
|
.component-card .meta-item{display:flex;gap:.5rem;font-size:.85rem}
|
||||||
|
.component-card .meta-label{color:#6e7681;white-space:nowrap;min-width:60px}
|
||||||
|
.component-card .meta-value{color:#c9d1d9}
|
||||||
|
.flow-container{background:#1a1a2e;border-radius:10px;padding:1.5rem;margin:1rem 0}
|
||||||
|
.flow-step{display:flex;gap:1rem;align-items:flex-start;margin-bottom:1rem;padding:.8rem;background:#161b22;border-radius:8px;border-left:3px solid #e65100}
|
||||||
|
.flow-step:last-child{margin-bottom:0}
|
||||||
|
.flow-step-number{background:#e65100;color:#fff;width:28px;height:28px;border-radius:50%;display:flex;align-items:center;justify-content:center;font-weight:700;flex-shrink:0}
|
||||||
|
.flow-step-content{flex:1}
|
||||||
|
.flow-step-content h4{color:#e6edf3;font-size:1rem;margin-bottom:.3rem}
|
||||||
|
.flow-step-content p{color:#8b949e;font-size:.9rem}
|
||||||
|
.flow-step-content .highlight{color:#4a9eff;font-weight:600}
|
||||||
|
.footer{text-align:center;padding:2rem;color:#484f58;font-size:.85rem;border-top:1px solid #21262d}
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<div class="header">
|
||||||
|
<h1>sgClaw 智能浏览器自动化平台</h1>
|
||||||
|
<div class="subtitle">核心组件职责与流转全景图 - 每个组件是什么 做什么 什么时候调用</div>
|
||||||
|
</div>
|
||||||
|
<div class="container">
|
||||||
|
|
||||||
|
<!-- Section 1: Overview -->
|
||||||
|
<div class="section">
|
||||||
|
<div class="section-header">
|
||||||
|
<div class="section-number">1</div>
|
||||||
|
<div class="section-title">全景概览 - 从用户指令到浏览器执行的完整链路</div>
|
||||||
|
</div>
|
||||||
|
<div class="section-body">
|
||||||
|
<div class="desc">
|
||||||
|
当用户说出"帮我查本月线损率"时,sgClaw 内部多个组件协同工作。以下是<strong>完整的执行链路</strong>,展示每个组件在哪个环节被调用、承担什么职责。
|
||||||
|
</div>
|
||||||
|
<div class="mermaid">
|
||||||
|
graph TB
|
||||||
|
U["用户\n输入自然语言指令"] -->|"1. SubmitTask"| GW["通信网关\nSTDIO Pipe / Service WS\n接收请求 建立会话"]
|
||||||
|
GW -->|"2. 加载配置"| CFG["SgClawSettings\n加载 sgclaw_config.json\nLLM Provider RuntimeProfile SkillsDir"]
|
||||||
|
CFG -->|"3. 四级路由决策"| RT["Agent Runtime\ntask_runner 任务调度"]
|
||||||
|
|
||||||
|
RT -->|"3a. 匹配场景"| DS["确定性执行\ndeterministic_submit\nscene_platform 匹配场景清单\n直接执行预设脚本 无需LLM"]
|
||||||
|
RT -->|"3b. 主编排"| PO["主编排路径\nzeroclaw_process_message_primary\n完整Agent工具循环 LLM自主规划"]
|
||||||
|
RT -->|"3c. 直连技能"| DSK["直连技能路径\ndirect_skill_primary\n配置指定skill.tool直接执行"]
|
||||||
|
RT -->|"3d. 标准LLM"| ZC["标准LLM路径\ncompat_llm_primary\nzeroclaw agent turn 默认回退"]
|
||||||
|
|
||||||
|
DS -->|"4. 执行操作"| BB["浏览器后端\nBrowserBackend trait\nPipeBrowser / WsBrowser"]
|
||||||
|
PO -->|"4. 调用工具"| BB
|
||||||
|
DSK -->|"4. 执行操作"| BB
|
||||||
|
ZC -->|"4. 调用工具"| BB
|
||||||
|
|
||||||
|
BB -->|"5. 安全校验"| SC["MAC Policy\n检查 rules.json\n域名白名单 动作白名单 HMAC"]
|
||||||
|
SC -->|"6. 执行命令"| EXT["SuperRPA Chromium\n执行实际DOM操作\n导航 点击 输入 读取"]
|
||||||
|
EXT -->|"7. 返回结果"| BB
|
||||||
|
BB -->|"8. 结果回传"| RT
|
||||||
|
RT -->|"9. 后处理"| PH["Report Artifact\nopenxml_office 生成Excel\nscreen_html_export 生成大屏"]
|
||||||
|
PH -->|"10. TaskComplete"| GW
|
||||||
|
GW -->|"11. 结果"| U
|
||||||
|
|
||||||
|
classDef userNode fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||||
|
classDef coreNode fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||||
|
classDef routeNode fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||||
|
classDef extNode fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||||
|
classDef cfgNode fill:#484f58,stroke:#8b949e,color:#e6edf3
|
||||||
|
|
||||||
|
class U userNode
|
||||||
|
class GW,RT,BB,PH coreNode
|
||||||
|
class DS,PO,DSK,ZC routeNode
|
||||||
|
class SC,EXT extNode
|
||||||
|
class CFG cfgNode
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Section 2: Core Components Detail -->
|
||||||
|
<div class="section">
|
||||||
|
<div class="section-header">
|
||||||
|
<div class="section-number">2</div>
|
||||||
|
<div class="section-title">核心组件详解 - 职责 调用时机 输入输出</div>
|
||||||
|
</div>
|
||||||
|
<div class="section-body">
|
||||||
|
<div class="desc">
|
||||||
|
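The cards below describe each core component in detail. Each card lists <strong>when the component is called</strong> and, where applicable, <strong>what it takes as input</strong> and <strong>what it produces as output</strong>.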
|
||||||
|
</div>
|
||||||
|
<div class="component-grid">
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">内部</span>通信网关</h3>
|
||||||
|
<p>负责接收用户请求、建立会话、返回最终结果。支持两种模式:STDIO Pipe(默认,与浏览器宿主通过 stdin/stdout JSON Line 通信)和 Service WS(WebSocket 服务模式,接受外部客户端连接)。</p>
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">何时调用:</span><span class="meta-value">用户发起请求时第一时间响应</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">输入:</span><span class="meta-value">SubmitTask 消息(指令 conversationId pageUrl pageTitle)</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">输出:</span><span class="meta-value">TaskComplete LogEntry StatusChanged 消息</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">内部</span>Agent Runtime 任务调度</h3>
|
||||||
|
<p>run_submit_task() 是任务执行入口。依次执行四级路由决策:① deterministic_submit 确定性场景匹配 ② primary_orchestration 主编排 ③ direct_submit_skill 直连技能 ④ compat_llm_primary 标准LLM回退。</p>
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">何时调用:</span><span class="meta-value">SubmitTask 消息到达后</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">输入:</span><span class="meta-value">指令 AgentRuntimeContext BrowserPipeTool</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">输出:</span><span class="meta-value">AgentMessage::TaskComplete</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">内部</span>场景平台 Scene Platform</h3>
|
||||||
|
<p>扫描 skills/ 目录下的场景清单(scene.toml),解析 deterministic 段落的关键词规则。当用户指令匹配时,构建 DeterministicExecutionPlan(含 target_url org_code period_mode 等执行参数),直接执行预设脚本。</p>
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">何时调用:</span><span class="meta-value">四级路由决策第一步</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">输入:</span><span class="meta-value">用户指令 pageUrl pageTitle skills目录</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">输出:</span><span class="meta-value">DeterministicExecutionPlan 或 NotDeterministic</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">内部</span>SgClawSettings 配置管理</h3>
|
||||||
|
<p>从 JSON 配置文件或环境变量加载运行时配置:多 Provider 管理(apiKey baseUrl model)、Runtime Profile、SkillsDir、BrowserBackend 类型、OfficeBackend、Service WS 监听地址等。</p>
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">何时调用:</span><span class="meta-value">每次任务提交时加载</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">输入:</span><span class="meta-value">sgclaw_config.json 或环境变量</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">输出:</span><span class="meta-value">SgClawSettings 结构体</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">内部</span>Runtime Engine 运行时引擎</h3>
|
||||||
|
<p>根据 Runtime Profile(BrowserAttached/BrowserHeavy/GeneralAssistant)构建 Tool Policy 白名单,加载技能包,注入 Memory,构建 Agent 实例。同时负责指令增强(附加浏览器合约提示、检测特定任务类型)。</p>
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">何时调用:</span><span class="meta-value">主编排路径和标准LLM路径构建Agent时</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">核心方法:</span><span class="meta-value">build_agent() build_instruction()</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">Profile:</span><span class="meta-value">BrowserAttached / BrowserHeavy / GeneralAssistant</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge external">外部</span>ZeroClaw Core 智能体核心</h3>
|
||||||
|
<p>位于 third_party/zeroclaw/ 的 vendored Agent 核心库。提供 Agent 构建、Provider 管理、工具循环、Memory 接口、技能加载、Prompt 组装等核心能力。sgClaw 在其基础上叠加安全信封层。</p>
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">何时调用:</span><span class="meta-value">主编排和标准LLM路径中</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">位置:</span><span class="meta-value">third_party/zeroclaw/</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">核心能力:</span><span class="meta-value">Agent Provider ToolLoop Memory Skills</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">内部</span>Browser Backend 浏览器后端</h3>
|
||||||
|
<p>统一的浏览器操作接口(BrowserBackend trait)。两种实现:PipeBrowserBackend(通过 STDIO 与宿主通信)和 WsBrowserBackend(通过 WebSocket 直连 DevTools)。支持 SuperRpa/AgentBrowser/RustNative/ComputerUse 多种后端类型。</p>
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">何时调用:</span><span class="meta-value">需要操作浏览器时</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">支持操作:</span><span class="meta-value">navigate click type getText eval select scrollTo 等15种</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">内部</span>MAC Policy 安全策略</h3>
|
||||||
|
<p>从 resources/rules.json 加载安全规则。三层安全模型:①握手时 HMAC seed 交换和会话密钥派生 ②Rust 侧域名+动作白名单校验 ③宿主侧 HMAC 二次验证。拒绝不在白名单的域名和被禁用的动作。</p>
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">何时调用:</span><span class="meta-value">每次浏览器操作执行前</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">检查项:</span><span class="meta-value">域名白名单 动作类型 HMAC验证</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge external">外部</span>SuperRPA Chromium 浏览器宿主</h3>
|
||||||
|
<p>实际执行 DOM 操作的外部系统。接收 sgClaw 的 Command(含 HMAC),验证后执行 navigate/click/type/getText 等操作,返回 Response(含操作结果 + HMAC)。STDIO 模式下与 sgClaw 进程通过管道通信。</p>
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">何时调用:</span><span class="meta-value">BrowserBackend 发送命令时</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">通信协议:</span><span class="meta-value">STDIO JSON Line 或 WebSocket</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Section 3: LLM Detail Flow -->
|
||||||
|
<div class="section">
|
||||||
|
<div class="section-header">
|
||||||
|
<div class="section-number">3</div>
|
||||||
|
<div class="section-title">LLM 大模型工作全流程 - 从语义识别到任务规划</div>
|
||||||
|
</div>
|
||||||
|
<div class="section-body">
|
||||||
|
<div class="desc">
|
||||||
|
当用户指令无法匹配已知技能时,LLM 大模型开始工作。以下是<strong>大模型从理解用户意图到生成可执行计划的完整过程</strong>。
|
||||||
|
</div>
|
||||||
|
<div class="flow-container">
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">1</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>语义识别 - 理解用户说了什么</h4>
|
||||||
|
<p>LLM 接收用户自然语言指令,识别用户的<strong>真实意图</strong>。例如"帮我查本月线损率" → 识别为"查询线损率数据"。</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">2</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>场景匹配 - 判断是否为已知场景</h4>
|
||||||
|
<p>结合 <span class="highlight">Memory(记忆模块)</span>中存储的历史任务记录,判断该指令是否与已有技能匹配。如果匹配,转交快速通道执行。</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">3</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>任务拆解 - 将大目标分解为小步骤</h4>
|
||||||
|
<p>如果是新场景,LLM 将用户目标拆解为具体的、可操作的步骤序列。例如:打开系统 → 选择月份 → 点击查询 → 读取数据 → 导出Excel。</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">4</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>工具选择 - 决定用什么能力完成任务</h4>
|
||||||
|
<p>LLM 根据步骤需求,从可用工具库中选择合适的工具。例如:需要打开网页选择"导航工具",需要点击按钮选择"点击工具",需要读取数据选择"读取工具"。</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">5</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>参数填充 - 确定每个工具的具体参数</h4>
|
||||||
|
<p>LLM 为每个工具填充具体参数。例如点击工具需要知道"点击哪个按钮",导航工具需要知道"打开哪个URL"。这些参数从用户指令和上下文中提取。</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">6</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>执行计划生成 - 输出可执行的JSON/结构化指令</h4>
|
||||||
|
<p>LLM 将拆解的步骤、选择的工具、填充的参数整合为<strong>结构化的执行计划</strong>,交由工具执行引擎依次执行。</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">7</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>循环迭代 - 根据执行结果动态调整</h4>
|
||||||
|
<p>如果某一步执行失败或结果不符合预期,LLM 会收到反馈,重新规划后续步骤。例如页面打不开则尝试备用URL,元素找不到则换选择器。</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Section 4: Memory, Skills & Runtime Engine -->
|
||||||
|
<div class="section">
|
||||||
|
<div class="section-header">
|
||||||
|
<div class="section-number">4</div>
|
||||||
|
<div class="section-title">Memory 技能管理 与 Runtime Engine - 运行时核心引擎</div>
|
||||||
|
</div>
|
||||||
|
<div class="section-body">
|
||||||
|
<div class="desc">
|
||||||
|
sgClaw 的运行时核心由三大引擎协同工作:<strong>Memory(记忆模块)</strong>负责持久化存储对话历史与任务状态,<strong>技能管理系统</strong>负责加载和注入技能包到 Agent,<strong>Runtime Engine</strong>负责根据 Runtime Profile 构建完整的 Agent 运行环境(工具策略 + 技能加载 + 指令增强)。
|
||||||
|
</div>
|
||||||
|
<div class="mermaid">
|
||||||
|
graph TB
|
||||||
|
subgraph Memory["Memory module zeroclaw::memory"]
|
||||||
|
M1["SQLite storage brain.db\nconversation history, task state, execution results"]
|
||||||
|
M2["Memory trait interface\ncreateMemoryWithStorage\nmultiple backends: SQLite / file"]
|
||||||
|
M1 -.->|"read/write"| M2
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph SkillMgmt["Skills Management"]
|
||||||
|
S1["Skill loader\nloadSkillsFromDirectory\nscans directories for skill packages"]
|
||||||
|
S2["Skill filter\nfilters by browser availability\nprunes browser_script tools"]
|
||||||
|
S3["ReadSkill Tool\nreads skill details on demand at runtime\nsupports open_skills config"]
|
||||||
|
S4["Skill directory resolution\nskills/ default directory\ncustom skillsDir"]
|
||||||
|
S1 --> S2
|
||||||
|
S4 --> S1
|
||||||
|
S1 --> S3
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph RuntimeEngine["Runtime Engine"]
|
||||||
|
R1["Runtime Profile\nBrowserAttached / BrowserHeavy / GeneralAssistant"]
|
||||||
|
R2["Tool Policy\nper-Profile tool allowlist\nallowed_tools list"]
|
||||||
|
R3["Agent Builder\nassembles Provider + Tools + Memory + Skills\nbuilds the complete Agent instance"]
|
||||||
|
R4["Instruction augmenter\nappends browser contract prompts\ndetects Zhihu hot list / Excel export / dashboard tasks"]
|
||||||
|
R1 -->|"determines"| R2
|
||||||
|
R2 -->|"constrains"| R3
|
||||||
|
R3 -->|"uses"| R4
|
||||||
|
end
|
||||||
|
|
||||||
|
Memory -->|"injected into"| RuntimeEngine
|
||||||
|
SkillMgmt -->|"injected into"| RuntimeEngine
|
||||||
|
|
||||||
|
classDef memFill fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||||
|
classDef skillFill fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||||
|
classDef runtimeFill fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||||
|
|
||||||
|
class Memory,M1,M2 memFill
|
||||||
|
class SkillMgmt,S1,S2,S3,S4 skillFill
|
||||||
|
class RuntimeEngine,R1,R2,R3,R4 runtimeFill
|
||||||
|
</div>
|
||||||
|
<div class="component-grid" style="margin-top:1.5rem">
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">Internal</span>Memory module</h3>
|
||||||
|
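<p><strong>Responsibility:</strong> Persists conversation history, task state, and execution results in SQLite (brain.db). Exposes a unified interface through the zeroclaw::memory::Memory trait and supports multiple storage backends.</p>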
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">When invoked:</span><span class="meta-value">Created when the Agent is built; read and written before and after each LLM call</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">Caller:</span><span class="meta-value">Runtime Engine (build_agent method)</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">Storage path:</span><span class="meta-value">workspace/memory/brain.db</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">Internal</span>Skill management system</h3>
|
||||||
|
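<p><strong>Responsibility:</strong> Scans the skills/ directory (or a custom skillsDir) and loads skill packages, filters browser_script tools according to whether a browser is available, and lets the Agent read skill details on demand through the ReadSkill Tool. Supports configuring a separate skill directory via open_skills.</p>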
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">When invoked:</span><span class="meta-value">The skill list is loaded each time an Agent is built</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">Caller:</span><span class="meta-value">Runtime Engine (load_skills_for_surface)</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">Skill source:</span><span class="meta-value">workspace/skills/ or the skillsDir setting</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">Internal</span>Runtime Engine</h3>
|
||||||
|
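<p><strong>Responsibility:</strong> The core runtime orchestrator. Based on the Runtime Profile it determines the tool allowlist, loads skills, injects Memory, and builds the Agent instance. It is also responsible for instruction augmentation (appending browser contract prompts and detecting specific task types such as Zhihu hot list, Excel export, and dashboard display).</p>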
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">When invoked:</span><span class="meta-value">On every task submission, before the Agent is built</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">Core methods:</span><span class="meta-value">build_agent() build_instruction()</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">Profile:</span><span class="meta-value">BrowserAttached / BrowserHeavy / GeneralAssistant</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Section 5: Task Routing - 4 Execution Paths -->
|
||||||
|
<div class="section">
|
||||||
|
<div class="section-header">
|
||||||
|
<div class="section-number">5</div>
|
||||||
|
<div class="section-title">Task routing - decision tree over four execution paths</div>
|
||||||
|
</div>
|
||||||
|
<div class="section-body">
|
||||||
|
<div class="desc">
|
||||||
|
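After a task is submitted to sgClaw, the <strong>Agent Runtime</strong> checks the execution paths in priority order to decide which one to take. This is not a simple "fast vs. AI" binary choice but a <strong>four-level decision tree</strong>.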
|
||||||
|
</div>
|
||||||
|
<div class="mermaid">
|
||||||
|
graph TB
|
||||||
|
A["SubmitTask user instruction arrives"] --> B["1. deterministic_submit\nscene platform matching"]
|
||||||
|
B -->|"matches a known deterministic scene"| C["Deterministic execution path\ndeterministic_submit\nruns the preset scene script directly"]
|
||||||
|
B -->|"no match, non-deterministic"| D["2. Primary Orchestration\nzeroclaw process_message"]
|
||||||
|
|
||||||
|
D -->|"browser_surface_enabled\nand should_use_primary"| E["Primary orchestration path\nzeroclaw_process_message_primary\nfull Agent tool loop"]
|
||||||
|
D -->|"conditions not met"| F["3. direct_submit_skill\ndirect skill configured"]
|
||||||
|
|
||||||
|
F -->|"directSubmitSkill configured"| G["Direct skill path\ndirect_skill_primary\nbypasses the Agent and executes directly"]
|
||||||
|
F -->|"not configured"| H["4. compat_llm_primary\nstandard LLM path\nzeroclaw agent turn"]
|
||||||
|
|
||||||
|
C --> I["TaskComplete returns the result"]
|
||||||
|
E --> I
|
||||||
|
G --> I
|
||||||
|
H --> I
|
||||||
|
|
||||||
|
classDef routeFill fill:#e65100,stroke:#ff6d00,color:#fff
|
||||||
|
classDef path1Fill fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||||
|
classDef path2Fill fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||||
|
classDef path3Fill fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||||
|
classDef path4Fill fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||||
|
classDef endFill fill:#484f58,stroke:#8b949e,color:#e6edf3
|
||||||
|
|
||||||
|
class B,D,F routeFill
|
||||||
|
class C path1Fill
|
||||||
|
class E path2Fill
|
||||||
|
class G path3Fill
|
||||||
|
class H path4Fill
|
||||||
|
class I endFill
|
||||||
|
</div>
|
||||||
|
<div class="flow-container" style="margin-top:1.5rem">
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">1</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>Deterministic scene matching - deterministic_submit</h4>
|
||||||
|
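<p>The <span class="highlight">scene_platform</span> module scans the scene manifests (scene.toml) under the skills/ directory and matches on instruction keywords, URL, and page title. On a match it builds a <span class="highlight">DeterministicExecutionPlan</span> and runs the scene's preset browser script directly, <strong>with no LLM involved</strong>. Typical scenes are fixed workflows such as line-loss queries and report exports.</p>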
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">2</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>Primary orchestration path - zeroclaw_process_message_primary</h4>
|
||||||
|
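<p>When the Runtime Profile enables browser tools (browser_surface_enabled) and <span class="highlight">orchestration::should_use_primary</span> decides on primary orchestration, sgClaw invokes zeroclaw's process_message for the full Agent loop. The LLM can call every allowed tool (browser operations, skill tools, and so on), with support for multi-turn tool calls and dynamic planning.</p>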
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">3</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>Direct skill path - direct_skill_primary</h4>
|
||||||
|
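<p>When <span class="highlight">directSubmitSkill</span> is set in the configuration (format: skillName.toolName), the normal Agent loop is bypassed and the specified skill tool is executed directly. This suits the middle ground where a fixed workflow is needed but a deterministic scene is not a good fit.</p>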
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">4</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4>Standard LLM path - compat_llm_primary</h4>
|
||||||
|
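<p>The default fallback when none of the three paths above applies. It creates a standard zeroclaw Agent turn, and the LLM decides on its own which tools to use based on the instruction. This is the most flexible path but also the slowest.</p>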
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Section 6: Browser Execution Full Process -->
|
||||||
|
<div class="section">
|
||||||
|
<div class="section-header">
|
||||||
|
<div class="section-number">6</div>
|
||||||
|
<div class="section-title">End-to-end browser execution - command transport from sgClaw to the SuperRPA browser</div>
|
||||||
|
</div>
|
||||||
|
<div class="section-body">
|
||||||
|
<div class="desc">
|
||||||
|
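sgClaw has two browser backend modes: <strong>STDIO Pipe mode</strong> (the sgClaw process talks to the browser host over stdin/stdout) and <strong>WebSocket mode</strong> (a direct connection to the browser's DevTools WebSocket). In both modes, security checks are handled by the MAC Policy layer.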
|
||||||
|
</div>
|
||||||
|
<div class="mermaid">
|
||||||
|
graph TB
|
||||||
|
subgraph PipeMode["STDIO Pipe mode (embedded in SuperRPA)"]
|
||||||
|
TE1["ZeroClawBrowserTool\nimplements zeroclaw::tools::Tool trait\nexposes browser_action / superrpa_browser"]
|
||||||
|
SC1["MAC Policy security policy\nchecks the rules.json domain allowlist\naction allowlist, HMAC verification"]
|
||||||
|
BC1["BrowserPipeTool\nassigns seq, computes HMAC\nsends Command, waits for Response"]
|
||||||
|
TP1["StdioTransport\nJSON Line protocol\nstdin/stdout, 1MB limit"]
|
||||||
|
HOST1["Browser host process\nSuperRPA Chromium\nverifies HMAC, performs DOM operations"]
|
||||||
|
|
||||||
|
TE1 -->|"tool call"| SC1
|
||||||
|
SC1 -->|"check passed"| BC1
|
||||||
|
BC1 -->|"Command + HMAC"| TP1
|
||||||
|
TP1 -->|"JSON Line"| HOST1
|
||||||
|
HOST1 -->|"Response + HMAC"| TP1
|
||||||
|
TP1 -->|"return matched by seq"| BC1
|
||||||
|
BC1 -->|"result"| TE1
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph WsMode["WebSocket mode (standalone)"]
|
||||||
|
TE2["ZeroClawBrowserTool\nsame Tool interface"]
|
||||||
|
SC2["MAC Policy: same security checks"]
|
||||||
|
BC2["WsBrowserBackend\nWebSocket connection\nDevTools Protocol"]
|
||||||
|
WS1["WebSocket protocol layer\ntungstenite library"]
|
||||||
|
HOST2["Browser DevTools\nChrome DevTools Protocol"]
|
||||||
|
|
||||||
|
TE2 -->|"tool call"| SC2
|
||||||
|
SC2 -->|"check passed"| BC2
|
||||||
|
BC2 -->|"CDP Command"| WS1
|
||||||
|
WS1 -->|"ws://host:port"| HOST2
|
||||||
|
HOST2 -->|"CDP Response"| WS1
|
||||||
|
WS1 -->|"result"| BC2
|
||||||
|
BC2 -->|"result"| TE2
|
||||||
|
end
|
||||||
|
|
||||||
|
classDef teFill fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||||
|
classDef scFill fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||||
|
classDef bcFill fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||||
|
classDef tpFill fill:#484f58,stroke:#8b949e,color:#e6edf3
|
||||||
|
classDef hostFill fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||||
|
|
||||||
|
class TE1,TE2 teFill
|
||||||
|
class SC1,SC2 scFill
|
||||||
|
class BC1,BC2 bcFill
|
||||||
|
class TP1,WS1 tpFill
|
||||||
|
class HOST1,HOST2 hostFill
|
||||||
|
</div>
|
||||||
|
<div class="component-grid" style="margin-top:1.5rem">
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">Internal</span>ZeroClawBrowserTool</h3>
|
||||||
|
<p><strong>Responsibility:</strong> Implements the zeroclaw::tools::Tool trait, adapting a BrowserBackend into a tool the LLM can call. Exposes two tool names: browser_action (legacy alias) and superrpa_browser (SuperRPA-specific, preferred).</p>
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">When invoked:</span><span class="meta-value">When the LLM decides to operate the browser</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">File location:</span><span class="meta-value">compat/browser_tool_adapter.rs</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">Internal</span>MAC Policy security policy</h3>
|
||||||
|
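<p><strong>Responsibility:</strong> Loads security rules from resources/rules.json. Three layers of security checks: (1) HMAC seed exchange during the handshake, (2) domain and action allowlist checks on the Rust side, (3) a second HMAC verification on the host side. Requests to domains outside the allowlist and disabled actions are rejected.</p>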
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">When invoked:</span><span class="meta-value">Before every browser tool call</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">Rules file:</span><span class="meta-value">resources/rules.json</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">Internal</span>BrowserBackend</h3>
|
||||||
|
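<p><strong>Responsibility:</strong> The unified browser operation interface (BrowserBackend trait). Two implementations: PipeBrowserBackend (talks to the host through StdioTransport) and WsBrowserBackend (connects directly to DevTools over WebSocket). The BrowserBackend configuration decides which one is used.</p>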
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">Backend types:</span><span class="meta-value">SuperRpa / AgentBrowser / RustNative / ComputerUse / Auto</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">File location:</span><span class="meta-value">browser/pipe_backend.rs browser/ws_backend.rs</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="component-card">
|
||||||
|
<h3><span class="badge">Internal</span>BrowserPipeTool</h3>
|
||||||
|
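<p><strong>Responsibility:</strong> The privileged browser tool for STDIO Pipe mode. It assigns a monotonically increasing seq to each command, computes an HMAC with the derived session key, sends the Command message, and then blocks until the matching Response arrives, with timeout support.</p>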
|
||||||
|
<div class="meta">
|
||||||
|
<div class="meta-item"><span class="meta-label">When invoked:</span><span class="meta-value">On every browser operation in Pipe mode</span></div>
|
||||||
|
<div class="meta-item"><span class="meta-label">File location:</span><span class="meta-value">pipe/browser_tool.rs</span></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Section 7: External Systems -->
|
||||||
|
<div class="section">
|
||||||
|
<div class="section-header">
|
||||||
|
<div class="section-number">7</div>
|
||||||
|
<div class="section-title">External systems map - who sgClaw interacts with</div>
|
||||||
|
</div>
|
||||||
|
<div class="section-body">
|
||||||
|
<div class="desc">
|
||||||
|
sgClaw does not run in isolation; it works together with several <strong>external systems</strong>. The diagram below shows how sgClaw interacts with them.
|
||||||
|
</div>
|
||||||
|
<div class="mermaid">
|
||||||
|
graph TB
|
||||||
|
subgraph External["External systems - not controlled by sgClaw"]
|
||||||
|
E1["LLM providers\nDeepSeek OpenAI Claude\nHTTP API calls"]
|
||||||
|
E2["SuperRPA Chromium\nbrowser host process\nSTDIO or WebSocket"]
|
||||||
|
E3["Business systems\nline-loss system, customer service system\naccessed through the browser"]
|
||||||
|
E4["Client\nsg_claw_client CLI\nService WebSocket connection"]
|
||||||
|
end
|
||||||
|
|
||||||
|
subgraph sgClawInternal["sgClaw internals"]
|
||||||
|
S1["Communication gateway\nSTDIO Pipe / Service WS"]
|
||||||
|
S2["Agent Runtime\ntask_runner task scheduling"]
|
||||||
|
S3["Runtime Engine\nbuilds the Agent, tool policy"]
|
||||||
|
S4["ZeroClaw Core\nthird_party/zeroclaw\nAgent loop, tool loop"]
|
||||||
|
S5["MAC Policy\nsecurity policy rules.json"]
|
||||||
|
S6["Browser Backend\nPipeBrowser / WsBrowser"]
|
||||||
|
end
|
||||||
|
|
||||||
|
E4 -->|"SubmitTask"| S1
|
||||||
|
S1 -->|"TaskComplete / LogEntry"| E4
|
||||||
|
|
||||||
|
S2 -->|"build Agent"| S3
|
||||||
|
S3 -->|"build_agent"| S4
|
||||||
|
|
||||||
|
S4 -->|"send prompt, receive response"| E1
|
||||||
|
S4 -->|"tool call"| S5
|
||||||
|
S5 -->|"check passed"| S6
|
||||||
|
S6 -->|"browser command"| E2
|
||||||
|
E2 -->|"DOM operations"| E3
|
||||||
|
E3 -->|"page data"| E2
|
||||||
|
E2 -->|"command result"| S6
|
||||||
|
S6 -->|"result"| S4
|
||||||
|
S4 -->|"event bridge log_entry"| S1
|
||||||
|
|
||||||
|
classDef extFill fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||||
|
classDef intFill fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||||
|
|
||||||
|
class External,E1,E2,E3,E4 extFill
|
||||||
|
class sgClawInternal,S1,S2,S3,S4,S5,S6 intFill
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Section 8: Complete Lifecycle -->
|
||||||
|
<div class="section">
|
||||||
|
<div class="section-header">
|
||||||
|
<div class="section-number">8</div>
|
||||||
|
<div class="section-title">Full lifecycle - one task from start to finish</div>
|
||||||
|
</div>
|
||||||
|
<div class="section-body">
|
||||||
|
<div class="desc">
|
||||||
|
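Taking a real scenario as an example, <strong>"Look up this month's line-loss rate and export it to Excel"</strong>, this section walks through sgClaw's full lifecycle from receiving the instruction to returning the result.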
|
||||||
|
</div>
|
||||||
|
<div class="flow-container">
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">1</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4><span class="highlight">Communication gateway</span> receives the instruction</h4>
|
||||||
|
<p>The browser host process sends a SubmitTask message over STDIO (JSON Line protocol). sgClaw creates a session and parses the instruction, page_url, page_title, and conversation_id.</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">2</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4><span class="highlight">Load configuration</span> SgClawSettings</h4>
|
||||||
|
<p>Configuration is loaded from sgclaw_config.json or environment variables: the LLM provider (apiKey/baseUrl/model), runtimeProfile, skillsDir, directSubmitSkill, and so on.</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">3</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4><span class="highlight">Deterministic scene matching</span> deterministic_submit</h4>
|
||||||
|
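<p>sgClaw scans the scene manifests (scene.toml) under skills/, finds that the instruction contains the keywords "线损率" (line-loss rate) and "本月" (this month), and matches the "line-loss query" scene. It then builds a DeterministicExecutionPlan (with parameters such as target_url, org_code, and period_mode).</p>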
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">4</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4><span class="highlight">MAC Policy</span> security check</h4>
|
||||||
|
<p>Checks whether the target domain is on the rules.json allowlist → passes. Checks whether the operation types (navigate, click, getText) are on the action allowlist → passes.</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">5</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4><span class="highlight">BrowserPipeTool</span> executes browser commands</h4>
|
||||||
|
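<p>Each command is assigned a monotonically increasing seq, and an HMAC is computed with the derived session key. Command messages are sent to the browser host through StdioTransport. Execution: navigate to the line-loss system → select the month → click Query → read the table data.</p>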
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">6</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4><span class="highlight">SuperRPA Chromium</span> performs DOM operations</h4>
|
||||||
|
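<p>The browser host receives the Command, verifies the HMAC, performs the actual DOM operations (navigating, selecting from a dropdown, clicking a button, reading table contents), and returns a Response (operation result + HMAC).</p>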
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">7</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4><span class="highlight">Report Artifact</span> post-processing</h4>
|
||||||
|
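<p>The table data returned by the browser is parsed into a structured format. Following the scene's postprocess configuration, the openxml_office tool generates an .xlsx file. The result includes the local file path.</p>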
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="flow-step">
|
||||||
|
<div class="flow-step-number">8</div>
|
||||||
|
<div class="flow-step-content">
|
||||||
|
<h4><span class="highlight">Communication gateway</span> returns the result</h4>
|
||||||
|
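<p>A TaskComplete message is sent to the browser host through StdioTransport, containing success=true and an execution summary (including the path of the generated .xlsx file). The browser host notifies the user that the download is complete.</p>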
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
<div class="footer">sgClaw Intelligent Browser Automation Platform - Component Responsibilities and Flow Overview - April 2026</div>
|
||||||
|
<script>
|
||||||
|
mermaid.initialize({ startOnLoad:true, theme:'dark', securityLevel:'loose', logLevel:'warn' });
|
||||||
|
</script>
|
||||||
|
</body>
|
||||||
|
</html>
|
||||||
BIN
docs/sgClaw组件职责与流转全景图.pdf
Normal file
Binary file not shown.
418
docs/superpowers/plans/2026-04-14-request-url-resolution-plan.md
Normal file
@@ -0,0 +1,418 @@
|
|||||||
|
# Request URL Resolution Implementation Plan
|
||||||
|
|
||||||
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||||
|
|
||||||
|
**Goal:** Replace the temporary line-loss request URL hardcode in `src/service/server.rs` with a unified bootstrap-target resolver that prefers current page context, then deterministic submit plans, then skill metadata, and finally `about:blank`.
|
||||||
|
|
||||||
|
**Architecture:** Add a small service-owned resolver that returns a narrow `SubmitBootstrapTarget` result and centralizes precedence rules. Reuse `DeterministicExecutionPlan.target_url` as the authoritative source for deterministic line-loss scenes, then add minimal skill metadata fallback for configured direct browser-script skills, while keeping callback-host behavior unchanged.
|
||||||
|
|
||||||
|
**Tech Stack:** Rust, serde/serde_json, tungstenite, zeroclaw skill loader, staged `SKILL.toml` manifests, cargo test
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 1: Add resolver-focused red tests for precedence
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `src/service/server.rs:422-467`
|
||||||
|
- Test: `src/service/server.rs` (crate-local resolver tests)
|
||||||
|
- Test: `tests/service_ws_session_test.rs`
|
||||||
|
|
||||||
|
- [ ] **Step 1: Write the failing page-context precedence test**
|
||||||
|
|
||||||
|
In a crate-local unit test inside `src/service/server.rs`, add a focused resolver test that exercises the request-url resolver with:
|
||||||
|
- non-empty `page_url = "https://already-open.example.com/page"`
|
||||||
|
- an instruction that would otherwise match deterministic line-loss logic
|
||||||
|
- configured direct skill metadata present
|
||||||
|
|
||||||
|
Assert the resolved bootstrap target uses the explicit non-empty `page_url` and reports `PageContext` source.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Run the test to verify it fails**
|
||||||
|
|
||||||
|
Run: `cargo test page_context_bootstrap_target_wins_over_deterministic_and_skill_fallback --lib -- --nocapture`
|
||||||
|
Expected: FAIL because no unified resolver/source enum exists yet.
|
||||||
|
|
||||||
|
- [ ] **Step 3: Write the failing deterministic-precedence test**
|
||||||
|
|
||||||
|
In `src/service/server.rs` crate-local tests, add a focused test for a deterministic line-loss instruction with no `page_url`.
|
||||||
|
|
||||||
|
Use the same instruction shape already accepted by `decide_deterministic_submit(...)`, and assert:
|
||||||
|
- resolver source is `DeterministicPlan`
|
||||||
|
- resolved `request_url` equals `DeterministicExecutionPlan.target_url`
|
||||||
|
- no raw `instruction.contains("线损")` fallback is needed
|
||||||
|
|
||||||
|
- [ ] **Step 4: Run the test to verify it fails**
|
||||||
|
|
||||||
|
Run: `cargo test deterministic_bootstrap_target_uses_plan_target_url --lib -- --nocapture`
|
||||||
|
Expected: FAIL because service still uses `derive_request_url_from_instruction(...)`.
|
||||||
|
|
||||||
|
- [ ] **Step 5: Write the failing skill-fallback test**
|
||||||
|
|
||||||
|
In `src/service/server.rs` crate-local tests, add a focused test for:
|
||||||
|
- no `page_url`
|
||||||
|
- instruction not deterministic
|
||||||
|
- configured direct-submit skill metadata provides `bootstrap_url`
|
||||||
|
|
||||||
|
Assert resolver source is `SkillConfig` and `request_url` matches metadata.
|
||||||
|
|
||||||
|
- [ ] **Step 6: Run the test to verify it fails**
|
||||||
|
|
||||||
|
Run: `cargo test skill_metadata_bootstrap_url_is_used_when_no_page_context_or_plan_exists --lib -- --nocapture`
|
||||||
|
Expected: FAIL because skill metadata is not read today.
|
||||||
|
|
||||||
|
- [ ] **Step 7: Write the failing malformed-metadata fallback test**
|
||||||
|
|
||||||
|
In `src/service/server.rs` crate-local tests, add a focused test for malformed `bootstrap_url` metadata, with no page context and no deterministic plan.
|
||||||
|
|
||||||
|
Assert the resolver:
|
||||||
|
- ignores malformed metadata
|
||||||
|
- returns `Fallback`
|
||||||
|
- resolves to `about:blank`
|
||||||
|
|
||||||
|
- [ ] **Step 8: Run the test to verify it fails**
|
||||||
|
|
||||||
|
Run: `cargo test malformed_skill_bootstrap_url_falls_back_to_about_blank --lib -- --nocapture`
|
||||||
|
Expected: FAIL because malformed metadata is not handled by a resolver yet.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 2: Introduce the bootstrap-target resolver in service code
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `src/service/server.rs:280-467`
|
||||||
|
- Modify: `src/service/mod.rs:17-22`
|
||||||
|
- Test: `src/service/server.rs` (crate-local resolver tests)
|
||||||
|
|
||||||
|
- [ ] **Step 1: Add the narrow resolver types in service code**
|
||||||
|
|
||||||
|
In `src/service/server.rs`, add:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||||
|
pub(crate) struct SubmitBootstrapTarget {
|
||||||
|
pub request_url: String,
|
||||||
|
pub expected_domain: Option<String>,
|
||||||
|
pub source: BootstrapTargetSource,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
||||||
|
pub(crate) enum BootstrapTargetSource {
|
||||||
|
PageContext,
|
||||||
|
DeterministicPlan,
|
||||||
|
SkillConfig,
|
||||||
|
Fallback,
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Keep them scoped to service code. Do not create a generic cross-runtime planning object.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Add a minimal resolver entry point**
|
||||||
|
|
||||||
|
Implement a service-owned function in `src/service/server.rs`, conceptually:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub(crate) fn resolve_submit_bootstrap_target(
|
||||||
|
request: &crate::agent::SubmitTaskRequest,
|
||||||
|
workspace_root: &Path,
|
||||||
|
settings: &SgClawSettings,
|
||||||
|
) -> SubmitBootstrapTarget
|
||||||
|
```
|
||||||
|
|
||||||
|
Initial behavior for this step:
|
||||||
|
- return `PageContext` only when `request.page_url` exists and is non-empty after trimming
|
||||||
|
- add a crate-local regression test asserting that an empty or whitespace-only `page_url` does not short-circuit later precedence tiers
|
||||||
|
- otherwise fall through to existing behavior temporarily so the new tests can compile incrementally
|
||||||
|
|
||||||
|
- [ ] **Step 3: Update service startup to call the resolver**
|
||||||
|
|
||||||
|
At the callback-host startup call site in `serve_client(...)`, replace:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let bootstrap_url = initial_request_url_for_submit_task(&request);
|
||||||
|
```
|
||||||
|
|
||||||
|
with resolver usage:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let bootstrap_target = resolve_submit_bootstrap_target(&request, context.workspace_root(), &settings);
|
||||||
|
let bootstrap_url = bootstrap_target.request_url;
|
||||||
|
```
|
||||||
|
|
||||||
|
Use the current settings-loading seam already used elsewhere in service code. Keep callback-host startup behavior otherwise unchanged.
|
||||||
|
|
||||||
|
- [ ] **Step 4: Keep resolver visibility crate-local**
|
||||||
|
|
||||||
|
Do not make the resolver types broadly public for integration tests. Keep the resolver and `BootstrapTargetSource` crate-local, and keep source-level assertions in `src/service/server.rs` unit tests.
|
||||||
|
|
||||||
|
Keep re-exporting the existing `initial_request_url_for_submit_task(...)` seam through `src/service/mod.rs` only if production callers still require that wiring; otherwise remove it.
|
||||||
|
|
||||||
|
- [ ] **Step 5: Run the first precedence test to verify it passes**
|
||||||
|
|
||||||
|
Run: `cargo test page_context_bootstrap_target_wins_over_deterministic_and_skill_fallback --lib -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
- [ ] **Step 6: Commit**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git add src/service/server.rs src/service/mod.rs
|
||||||
|
git commit -m "refactor(service): add submit bootstrap target resolver scaffold"
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 3: Make deterministic submit the authoritative source for line-loss bootstrap URLs
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `src/service/server.rs:422-467`
|
||||||
|
- Modify: `src/compat/deterministic_submit.rs:13-101`
|
||||||
|
- Test: `src/service/server.rs` (crate-local resolver tests)
|
||||||
|
- Test: `tests/service_ws_session_test.rs`
|
||||||
|
|
||||||
|
- [ ] **Step 1: Write a small service-side seam for deterministic resolution**
|
||||||
|
|
||||||
|
In `src/service/server.rs`, update the resolver so that when `page_url` is absent (or empty after trimming) it calls:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
crate::compat::deterministic_submit::decide_deterministic_submit(
|
||||||
|
&request.instruction,
|
||||||
|
request.page_url.as_deref(),
|
||||||
|
request.page_title.as_deref(),
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
Only `DeterministicSubmitDecision::Execute(plan)` should produce a deterministic bootstrap target.
|
||||||
|
|
||||||
|
Treat `NotDeterministic` and `Prompt { .. }` as “no deterministic bootstrap target” for service startup.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Use `plan.target_url` directly**
|
||||||
|
|
||||||
|
Map `DeterministicSubmitDecision::Execute(plan)` to:
|
||||||
|
- `request_url = plan.target_url.clone()`
|
||||||
|
- `expected_domain = Some(plan.expected_domain.clone())`
|
||||||
|
- `source = BootstrapTargetSource::DeterministicPlan`
|
||||||
|
|
||||||
|
Do not reconstruct the URL in `server.rs`.
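
The mapping in this step can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual implementation: the types are simplified stand-ins (the real `DeterministicExecutionPlan` and `DeterministicSubmitDecision` carry more fields), the `String` type of `expected_domain` and the `message` field on `Prompt` are assumptions, and the helper name `deterministic_bootstrap_target` is hypothetical.

```rust
// Stand-in types for illustration only; field types are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
struct DeterministicExecutionPlan {
    target_url: String,
    expected_domain: String,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum DeterministicSubmitDecision {
    Execute(DeterministicExecutionPlan),
    NotDeterministic,
    Prompt { message: String }, // hypothetical field
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BootstrapTargetSource {
    PageContext,
    DeterministicPlan,
    SkillConfig,
    Fallback,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct SubmitBootstrapTarget {
    request_url: String,
    expected_domain: Option<String>,
    source: BootstrapTargetSource,
}

/// Hypothetical helper: only `Execute(plan)` yields a bootstrap target.
/// The URL is copied from the plan as-is and never rebuilt here.
fn deterministic_bootstrap_target(
    decision: &DeterministicSubmitDecision,
) -> Option<SubmitBootstrapTarget> {
    match decision {
        DeterministicSubmitDecision::Execute(plan) => Some(SubmitBootstrapTarget {
            request_url: plan.target_url.clone(),
            expected_domain: Some(plan.expected_domain.clone()),
            source: BootstrapTargetSource::DeterministicPlan,
        }),
        // Both non-executing decisions mean "no deterministic bootstrap target".
        DeterministicSubmitDecision::NotDeterministic
        | DeterministicSubmitDecision::Prompt { .. } => None,
    }
}
```

Returning `Option` lets the resolver fall through to the next precedence tier with a plain `if let` or `or_else` chain.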
|
||||||
|
|
||||||
|
- [ ] **Step 3: Remove the temporary line-loss hardcode**
|
||||||
|
|
||||||
|
Delete this branch from `derive_request_url_from_instruction(...)` or remove the function entirely if it is no longer needed:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
if instruction.contains("线损") || instruction.contains("lineloss") {
|
||||||
|
return Some("http://20.76.57.61:18080".to_string());
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Keep any still-needed legacy Zhihu fallback only if the resolver still requires it after deterministic integration.
|
||||||
|
|
||||||
|
- [ ] **Step 4: Add/adjust a deterministic regression test**
|
||||||
|
|
||||||
|
In `src/service/server.rs` crate-local tests, add a focused assertion that line-loss bootstrap URL now comes from `DeterministicExecutionPlan.target_url`, not raw text matching.
|
||||||
|
|
||||||
|
A good assertion shape is:
|
||||||
|
- call resolver with deterministic line-loss instruction
|
||||||
|
- assert `request_url == "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor"`
|
||||||
|
- assert `source == DeterministicPlan`
|
||||||
|
|
||||||
|
- [ ] **Step 5: Run deterministic tests to verify they pass**
|
||||||
|
|
||||||
|
Run: `cargo test deterministic_bootstrap_target_uses_plan_target_url --lib -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
- [ ] **Step 6: Run service websocket coverage for the same precedence**
|
||||||
|
|
||||||
|
Run: `cargo test callback_host --test service_ws_session_test -- --nocapture`
|
||||||
|
Expected: PASS with no line-loss hardcode dependency.
|
||||||
|
|
||||||
|
- [ ] **Step 7: Commit**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git add src/service/server.rs src/compat/deterministic_submit.rs tests/service_ws_session_test.rs
|
||||||
|
git commit -m "refactor(service): derive line-loss bootstrap URL from deterministic plan"
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 4: Add skill-metadata fallback for configured direct-submit skills
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `src/compat/direct_skill_runtime.rs:114-153`
|
||||||
|
- Modify: `src/service/server.rs:422-467`
|
||||||
|
- Optionally modify: `src/config/settings.rs` only if a tiny metadata pointer is required
|
||||||
|
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
|
||||||
|
- Optionally modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/95598-weekly-monitor-report/SKILL.toml`
|
||||||
|
- Test: `src/service/server.rs` (crate-local resolver tests)
|
||||||
|
- Test: `tests/service_ws_session_test.rs`
|
||||||
|
|
||||||
|
- [ ] **Step 1: Define the minimal skill metadata shape**
|
||||||
|
|
||||||
|
Extend staged `SKILL.toml` parsing expectations to support a narrow metadata seam for browser-script direct skills.
|
||||||
|
|
||||||
|
The plan target fields are:
|
||||||
|
- `bootstrap_url`
|
||||||
|
- `expected_domain`
|
||||||
|
|
||||||
|
Keep the metadata minimal. Do not add a broad dispatch registry or scene-policy schema.
|
||||||
|
|
||||||
|
Recommended TOML shape in the skill manifest:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[tools.metadata]
|
||||||
|
bootstrap_url = "https://example.com/path"
|
||||||
|
expected_domain = "example.com"
|
||||||
|
```
|
||||||
|
|
||||||
|
If the actual skill loader only supports per-tool custom fields in another location, use that established seam instead. Do not invent a parallel config file.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Add a helper that reads fallback metadata for the configured direct skill**
|
||||||
|
|
||||||
|
In `src/compat/direct_skill_runtime.rs`, add a helper like:
|
||||||
|
|
||||||
|
```rust
pub(crate) fn resolve_direct_submit_bootstrap_metadata(
    configured_tool: &str,
    workspace_root: &Path,
    settings: &SgClawSettings,
) -> Result<Option<DirectSubmitBootstrapMetadata>, PipeError>
```
|
||||||
|
|
||||||
|
Recommended shape:
|
||||||
|
|
||||||
|
```rust
pub(crate) struct DirectSubmitBootstrapMetadata {
    pub bootstrap_url: String,
    pub expected_domain: Option<String>,
}
```
|
||||||
|
|
||||||
|
Reuse the existing `resolve_browser_script_skill(...)` lookup path so the service resolver does not duplicate staged-skill discovery logic.
|
||||||
|
|
||||||
|
- [ ] **Step 3: Validate metadata conservatively**
|
||||||
|
|
||||||
|
When reading fallback metadata:
|
||||||
|
- accept only non-empty `bootstrap_url`
|
||||||
|
- require it to parse as a valid absolute URL
|
||||||
|
- normalize or preserve `expected_domain` only if non-empty
|
||||||
|
- on malformed metadata, return `Ok(None)` for resolver purposes instead of failing service startup
|
||||||
|
|
||||||
|
This keeps malformed fallback data from breaking submits and matches the approved spec.
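
The validation rules above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `validate_bootstrap_metadata` and `is_absolute_http_url` are hypothetical names, the URL check is a simplified standard-library-only stand-in (a real implementation would more likely delegate to a URL parsing crate), and the sketch returns a plain `Option` where the Step 2 helper returns `Result<Option<...>, PipeError>`.

```rust
/// Hypothetical mirror of the struct recommended in Step 2.
#[derive(Debug, PartialEq)]
pub struct DirectSubmitBootstrapMetadata {
    pub bootstrap_url: String,
    pub expected_domain: Option<String>,
}

/// Simplified absolute-URL check (assumption: only http/https with a
/// non-empty host count as valid bootstrap targets).
fn is_absolute_http_url(candidate: &str) -> bool {
    let rest = match candidate
        .strip_prefix("https://")
        .or_else(|| candidate.strip_prefix("http://"))
    {
        Some(rest) => rest,
        None => return false,
    };
    // The host is everything before the first path, query, or fragment delimiter.
    let host = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    !host.is_empty() && !candidate.chars().any(char::is_whitespace)
}

/// Returns `None` for missing or malformed metadata so the caller can fall
/// through to `about:blank` instead of failing.
pub fn validate_bootstrap_metadata(
    bootstrap_url: Option<&str>,
    expected_domain: Option<&str>,
) -> Option<DirectSubmitBootstrapMetadata> {
    let url = bootstrap_url?.trim();
    if url.is_empty() || !is_absolute_http_url(url) {
        return None;
    }
    // Keep `expected_domain` only when it is non-empty after trimming.
    let domain = expected_domain
        .map(str::trim)
        .filter(|d| !d.is_empty())
        .map(str::to_owned);
    Some(DirectSubmitBootstrapMetadata {
        bootstrap_url: url.to_owned(),
        expected_domain: domain,
    })
}
```

Mapping malformed input to `None` rather than an error keeps the "never break submits" behavior local to one function, which is the property the malformed-metadata test in Step 7 is meant to lock.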
|
||||||
|
|
||||||
|
- [ ] **Step 4: Wire skill metadata into the service resolver**
|
||||||
|
|
||||||
|
Update `resolve_submit_bootstrap_target(...)` to:
|
||||||
|
- check skill metadata only after page context and deterministic parsing fail
|
||||||
|
- use `SkillConfig` as the source when metadata resolves
|
||||||
|
- fall through to `about:blank` when metadata is missing or malformed
|
||||||
|
|
||||||
|
- [ ] **Step 5: Add a staged-skill fixture update**
|
||||||
|
|
||||||
|
Update at least one configured direct skill fixture, likely `fault-details-report`, to include valid fallback metadata.
|
||||||
|
|
||||||
|
Use concrete values appropriate for that skill’s target page; do not reuse the line-loss URL.
|
||||||
|
|
||||||
|
- [ ] **Step 6: Run the skill-fallback test to verify it passes**
|
||||||
|
|
||||||
|
Run: `cargo test skill_metadata_bootstrap_url_is_used_when_no_page_context_or_plan_exists --lib -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
- [ ] **Step 7: Run the malformed-metadata test to verify it passes**
|
||||||
|
|
||||||
|
Run: `cargo test malformed_skill_bootstrap_url_falls_back_to_about_blank --lib -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
- [ ] **Step 8: Commit**
|
||||||
|
|
||||||
|
```bash
git add src/compat/direct_skill_runtime.rs src/service/server.rs D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml tests/service_ws_session_test.rs
git commit -m "feat(service): add direct skill bootstrap URL fallback metadata"
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 5: Remove obsolete request-url glue and lock the final precedence contract
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `src/service/server.rs:422-467`
|
||||||
|
- Modify: `src/service/mod.rs:20-22`
|
||||||
|
- Test: `src/service/server.rs` (crate-local resolver tests)
|
||||||
|
- Test: `tests/service_ws_session_test.rs`
|
||||||
|
|
||||||
|
- [ ] **Step 1: Delete obsolete helper logic**
|
||||||
|
|
||||||
|
If `derive_request_url_from_instruction(...)` is no longer needed after resolver landing, delete it completely.
|
||||||
|
|
||||||
|
If a tiny legacy Zhihu-only seam still remains, keep it private behind the resolver and remove the old public shape from `service::browser_ws_client` if no longer needed.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Lock the precedence contract with one final matrix test**
|
||||||
|
|
||||||
|
In `src/service/server.rs` crate-local tests, add one table-driven or clearly segmented test that verifies all four final outcomes:
|
||||||
|
- non-empty page context wins
|
||||||
|
- deterministic plan wins when page context is absent or empty
|
||||||
|
- skill metadata wins when page context and deterministic plan are absent
|
||||||
|
- fallback becomes `about:blank` when nothing resolves
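
The four-way precedence contract listed above could be expressed along these lines. This is a sketch under assumptions: only `SkillConfig` is named as a source in this plan, so the other variant names, the `BootstrapTarget` struct, and the flat `Option<&str>` inputs are illustrative and do not reflect the real signature of `resolve_submit_bootstrap_target(...)`.

```rust
/// Where the bootstrap target came from. Only `SkillConfig` appears in the
/// plan; the remaining variant names are hypothetical.
#[derive(Debug, PartialEq)]
pub enum BootstrapSource {
    PageContext,
    DeterministicPlan,
    SkillConfig,
    Fallback,
}

#[derive(Debug, PartialEq)]
pub struct BootstrapTarget {
    pub url: String,
    pub source: BootstrapSource,
}

/// Treats empty or whitespace-only strings as absent.
fn non_empty(value: Option<&str>) -> Option<&str> {
    value.map(str::trim).filter(|v| !v.is_empty())
}

pub fn resolve_bootstrap_target(
    page_url: Option<&str>,
    plan_target_url: Option<&str>,
    skill_bootstrap_url: Option<&str>,
) -> BootstrapTarget {
    // Candidates are listed in precedence order; the first non-empty one wins.
    let candidates = [
        (page_url, BootstrapSource::PageContext),
        (plan_target_url, BootstrapSource::DeterministicPlan),
        (skill_bootstrap_url, BootstrapSource::SkillConfig),
    ];
    for (candidate, source) in candidates {
        if let Some(url) = non_empty(candidate) {
            return BootstrapTarget { url: url.to_owned(), source };
        }
    }
    BootstrapTarget {
        url: "about:blank".to_owned(),
        source: BootstrapSource::Fallback,
    }
}
```

Keeping the order in a single array makes the matrix test a direct mirror of the implementation: each row of the test table corresponds to one position in `candidates` plus the final fallback.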
|
||||||
|
|
||||||
|
- [ ] **Step 3: Run the focused resolver suite**
|
||||||
|
|
||||||
|
Run: `cargo test bootstrap_target --lib -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
- [ ] **Step 4: Run service websocket regression coverage**
|
||||||
|
|
||||||
|
Run: `cargo test callback_host --test service_ws_session_test -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
|
```bash
git add src/service/server.rs src/service/mod.rs tests/service_ws_session_test.rs
git commit -m "refactor(service): finalize bootstrap target precedence"
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 6: Full verification and implementation handoff check
|
||||||
|
|
||||||
|
**Files:** None (verification only)
|
||||||
|
|
||||||
|
- [ ] **Step 1: Run focused deterministic and direct-skill tests**
|
||||||
|
|
||||||
|
Run: `cargo test deterministic_submit -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
Run: `cargo test direct_submit -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Run service submit regression coverage**
|
||||||
|
|
||||||
|
Run: `cargo test --test service_task_flow_test -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
Run: `cargo test --test service_ws_session_test -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
- [ ] **Step 3: Run targeted config/settings coverage if touched**
|
||||||
|
|
||||||
|
Run: `cargo test service_protocol_update_config_test -- --nocapture`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
- [ ] **Step 4: Build the project**
|
||||||
|
|
||||||
|
Run: `cargo build --bin sg_claw`
|
||||||
|
Expected: PASS.
|
||||||
|
|
||||||
|
- [ ] **Step 5: Manual behavior checklist**
|
||||||
|
|
||||||
|
Verify manually:
|
||||||
|
1. Existing page-attached submits still bootstrap against the current page URL.
|
||||||
|
2. A deterministic line-loss submit without page context bootstraps the helper against the line-loss target page taken from `DeterministicExecutionPlan.target_url`.
|
||||||
|
3. Non-deterministic configured direct skill without page context uses skill metadata bootstrap URL if present.
|
||||||
|
4. Missing or malformed skill metadata does not crash startup and falls back to `about:blank`.
|
||||||
|
5. No service code remains that hardcodes line-loss request URL by checking raw instruction text.
|
||||||
|
|
||||||
|
- [ ] **Step 6: Final commit (only if verification revealed required follow-up fixes)**
|
||||||
|
|
||||||
|
```bash
git add -A
git commit -m "test: lock request URL resolution precedence"
```
|
||||||
|
|
||||||
|
Only create this commit if verification required an additional code or test fix.
|
||||||
@@ -0,0 +1,441 @@
|
|||||||
|
# Generated Scene Rectification Implementation Plan
|
||||||
|
|
||||||
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||||
|
|
||||||
|
**Goal:** Rectify the generated-scene pipeline so it stops emitting false-positive runnable skills for complex internal scenes, specifically by fixing `sceneId` degeneration, bootstrap pollution, incomplete workflow reconstruction, and readiness fail-open behavior.
|
||||||
|
|
||||||
|
**Architecture:** Keep the current `Scene IR` pipeline, but add four hard control chains around it: naming validation, bootstrap evidence stratification, workflow evidence reconstruction, and readiness gating. Generation must fail closed whenever these chains are incomplete.
|
||||||
|
|
||||||
|
**Tech Stack:** Rust, Node.js, HTML/CSS/JavaScript, serde_json, OpenAI-compatible LLM API
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Scope Check
|
||||||
|
|
||||||
|
This plan implements the design in:
|
||||||
|
|
||||||
|
- `docs/superpowers/specs/2026-04-17-generated-scene-rectification-design.md`
|
||||||
|
|
||||||
|
This plan builds on the existing generated-scene foundation already described in:
|
||||||
|
|
||||||
|
- `docs/superpowers/specs/2026-04-17-scene-skill-compiler-design.md`
|
||||||
|
- `docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md`
|
||||||
|
- `docs/superpowers/specs/2026-04-17-enhanced-llm-extraction-schema-design.md`
|
||||||
|
|
||||||
|
This plan does not attempt to solve:
|
||||||
|
|
||||||
|
- login or authentication recovery
|
||||||
|
- Chromium host integration or browser embedding changes
|
||||||
|
- full runtime resolver expansion beyond what this rectification needs
|
||||||
|
- arbitrary historical scene compatibility outside the reference regression cases
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## File Map
|
||||||
|
|
||||||
|
### Frontend scene generator
|
||||||
|
|
||||||
|
| File | Action | Purpose |
|------|--------|---------|
| `frontend/scene-generator/generator-runner.js` | Modify | Implement naming fallback control, URL evidence stratification, workflow evidence cleanup, and pre-generation gate inputs |
| `frontend/scene-generator/llm-client.js` | Modify | Tighten sceneId semantic constraints and reject low-entropy LLM naming output |
| `frontend/scene-generator/server.js` | Modify | Aggregate readiness gates, block unsafe generation, and return rectification diagnostics |
| `frontend/scene-generator/sg_scene_generator.html` | Modify | Show invalid `sceneId`, bootstrap role breakdown, workflow evidence completeness, and generation block reasons |
|
||||||
|
|
||||||
|
### Rust generated-scene pipeline
|
||||||
|
|
||||||
|
| File | Action | Purpose |
|------|--------|---------|
| `src/generated_scene/analyzer.rs` | Modify | Add endpoint denoising, evidence role typing, and stricter archetype preconditions |
| `src/generated_scene/ir.rs` | Modify | Extend IR to carry candidate roles, gate states, and workflow evidence completeness |
| `src/generated_scene/generator.rs` | Modify | Prevent compiler routing when gates fail and surface fail-closed diagnostics |
|
||||||
|
|
||||||
|
### Tests and fixtures
|
||||||
|
|
||||||
|
| File | Action | Purpose |
|------|--------|---------|
| `tests/scene_generator_test.rs` | Modify | Cover naming, bootstrap, workflow, and readiness regression cases |
| `tests/scene_generator_html_test.rs` | Modify | Cover HTML/UI risk and blocking output |
| `tests/fixtures/generated_scene/paginated_enrichment/*` | Modify | Preserve marketing-like reference coverage |
| `tests/fixtures/generated_scene/multi_mode/*` | Modify | Preserve tq-like multi-mode coverage |
| Additional fixture files as needed | Create | Add low-entropy naming and localhost-pollution regression inputs |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
- Do not broaden this work into a generic scene-generator redesign.
|
||||||
|
- Do not remove the existing `Scene IR` structure; extend and constrain it.
|
||||||
|
- Do not let `localhost` or helper/export endpoints participate in bootstrap selection.
|
||||||
|
- Do not silently coerce invalid `sceneId` values into accepted ids.
|
||||||
|
- Do not route into `paginated_enrichment` unless its minimum workflow evidence is complete.
|
||||||
|
- Do not emit a default runnable skill when any rectification gate fails.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 1: Rectify Naming Chain
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `frontend/scene-generator/generator-runner.js`
|
||||||
|
- Modify: `frontend/scene-generator/llm-client.js`
|
||||||
|
- Modify: `frontend/scene-generator/server.js`
|
||||||
|
- Modify: `src/generated_scene/ir.rs`
|
||||||
|
|
||||||
|
**Goal:** Stop Chinese-source scenes from degrading into low-information ids such as `2-0`, and turn `sceneId` into a validated business identifier instead of a raw slug fallback.
|
||||||
|
|
||||||
|
- [ ] **Step 1: Classify sceneId candidate sources**
|
||||||
|
|
||||||
|
Define explicit candidate tiers for `sceneId`:
|
||||||
|
|
||||||
|
1. LLM semantic business id
|
||||||
|
2. deterministic keyword-derived id
|
||||||
|
3. controlled alias/transliteration fallback
|
||||||
|
4. invalid fallback candidate
|
||||||
|
|
||||||
|
Expected result: the pipeline can explain where the chosen id came from.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Add low-entropy sceneId validation**
|
||||||
|
|
||||||
|
Implement shared validation rules that reject ids which are:
|
||||||
|
|
||||||
|
- numeric-only or numeric-dominant
|
||||||
|
- too short to be business-readable
|
||||||
|
- generic placeholders such as `scene` or `report`
|
||||||
|
- semantically detached from the extracted `sceneName`
|
||||||
|
|
||||||
|
Expected result: ids like `2-0`, `1-0`, `scene`, `report` are blocked.
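
One possible shape for the shared validation rule, written in Rust for the `src/generated_scene` side. This is a hedged sketch: the function name, the rejection enum, the minimum length of 6, and the placeholder list are all assumptions, and the "semantically detached from `sceneName`" check is deliberately left out because it needs the extracted name and a similarity rule that this plan does not define.

```rust
#[derive(Debug, PartialEq)]
pub enum SceneIdRejection {
    TooShort,
    NumericDominant,
    GenericPlaceholder,
}

// Illustrative threshold and placeholder list; the plan does not fix either.
const MIN_SCENE_ID_LEN: usize = 6;
const GENERIC_IDS: [&str; 2] = ["scene", "report"];

pub fn validate_scene_id(scene_id: &str) -> Result<(), SceneIdRejection> {
    let id = scene_id.trim().to_ascii_lowercase();
    if GENERIC_IDS.iter().any(|generic| *generic == id.as_str()) {
        return Err(SceneIdRejection::GenericPlaceholder);
    }
    let letters = id.chars().filter(|c| c.is_alphabetic()).count();
    let digits = id.chars().filter(|c| c.is_ascii_digit()).count();
    // Numeric-only or numeric-dominant: digits are at least as frequent as letters.
    if digits > 0 && digits >= letters {
        return Err(SceneIdRejection::NumericDominant);
    }
    if id.chars().count() < MIN_SCENE_ID_LEN {
        return Err(SceneIdRejection::TooShort);
    }
    Ok(())
}
```

Returning a typed rejection instead of a boolean gives Step 4 a ready-made invalidation reason to surface in the server response and UI.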
|
||||||
|
|
||||||
|
- [ ] **Step 3: Fail closed on invalid sceneId**
|
||||||
|
|
||||||
|
Update generation flow so invalid `sceneId` produces:
|
||||||
|
|
||||||
|
- `invalid_scene_id` gate failure
|
||||||
|
- readiness downgrade
|
||||||
|
- analysis/report output only unless explicitly overridden later by a separate approved flow
|
||||||
|
|
||||||
|
Expected result: invalid ids never create a formal generated skill directory by default.
|
||||||
|
|
||||||
|
- [ ] **Step 4: Surface naming diagnostics in server/UI**
|
||||||
|
|
||||||
|
Return and display:
|
||||||
|
|
||||||
|
- chosen `sceneId`
|
||||||
|
- candidate source
|
||||||
|
- validation result
|
||||||
|
- invalidation reason if blocked
|
||||||
|
|
||||||
|
- [ ] **Step 5: Add regression tests**
|
||||||
|
|
||||||
|
Cover at least:
|
||||||
|
|
||||||
|
- Chinese source name that previously degraded to `2-0`
|
||||||
|
- valid semantic id chosen over slug fallback
|
||||||
|
- invalid low-entropy id blocked from generation
|
||||||
|
|
||||||
|
- [ ] **Step 6: Commit**
|
||||||
|
|
||||||
|
```bash
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/llm-client.js frontend/scene-generator/server.js src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "fix(generator): block degenerate generated scene ids"
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 2: Rectify Bootstrap Chain
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `frontend/scene-generator/generator-runner.js`
|
||||||
|
- Modify: `frontend/scene-generator/server.js`
|
||||||
|
- Modify: `src/generated_scene/analyzer.rs`
|
||||||
|
- Modify: `src/generated_scene/ir.rs`
|
||||||
|
|
||||||
|
**Goal:** Separate business bootstrap candidates from localhost/export/helper URLs so internal-network entry domains resolve correctly.
|
||||||
|
|
||||||
|
- [ ] **Step 1: Add URL evidence role stratification**
|
||||||
|
|
||||||
|
Classify URL candidates into:
|
||||||
|
|
||||||
|
- `business_entry`
|
||||||
|
- `business_api`
|
||||||
|
- `gateway_api`
|
||||||
|
- `export_service`
|
||||||
|
- `local_helper`
|
||||||
|
- `static_asset`
|
||||||
|
- `template_noise`
|
||||||
|
|
||||||
|
Expected result: every URL candidate is typed before bootstrap selection.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Add deterministic localhost and noise rejection**
|
||||||
|
|
||||||
|
Ensure that:
|
||||||
|
|
||||||
|
- `localhost`
|
||||||
|
- `127.0.0.1`
|
||||||
|
- `SurfaceServices`
|
||||||
|
- `ReportServices`
|
||||||
|
- `.js` / `.css` assets
|
||||||
|
- template placeholders and format strings
|
||||||
|
|
||||||
|
are routed away from bootstrap candidates.
|
||||||
|
|
||||||
|
Expected result: helper/export/static/template strings can remain as evidence but can never win bootstrap.
|
||||||
|
|
||||||
|
- [ ] **Step 3: Redefine bootstrap resolution order**
|
||||||
|
|
||||||
|
Bootstrap selection may only consume:
|
||||||
|
|
||||||
|
1. `business_entry`
|
||||||
|
2. `business_api`
|
||||||
|
3. `gateway_api`
|
||||||
|
|
||||||
|
When only helper/noise roles exist, set bootstrap to unresolved and downgrade readiness.
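
Steps 1 through 3 can be pictured together as "type every URL, then let only three roles compete". The sketch below assumes a purely string-based classifier: the role names come from Step 1, but every matching rule (for example treating `/api/` as a business API and `/gateway/` as a gateway API, or filing `SurfaceServices` and `ReportServices` under `ExportService`) is an illustrative guess rather than the analyzer's actual logic.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum UrlRole {
    BusinessEntry,
    BusinessApi,
    GatewayApi,
    ExportService,
    LocalHelper,
    StaticAsset,
    TemplateNoise,
}

/// Heuristic classifier; all matching rules here are assumptions.
pub fn classify_url(url: &str) -> UrlRole {
    let lower = url.to_ascii_lowercase();
    if lower.contains("${") || lower.contains("{0}") || lower.contains("%s") {
        return UrlRole::TemplateNoise;
    }
    if lower.contains("localhost") || lower.contains("127.0.0.1") {
        return UrlRole::LocalHelper;
    }
    if lower.contains("surfaceservices") || lower.contains("reportservices") {
        return UrlRole::ExportService;
    }
    // Ignore query string and fragment when checking the file extension.
    let path = lower
        .split(|c: char| c == '?' || c == '#')
        .next()
        .unwrap_or("");
    if path.ends_with(".js") || path.ends_with(".css") {
        return UrlRole::StaticAsset;
    }
    if lower.contains("/gateway/") {
        return UrlRole::GatewayApi;
    }
    if lower.contains("/api/") {
        return UrlRole::BusinessApi;
    }
    UrlRole::BusinessEntry
}

/// Returns the bootstrap URL, or `None` when bootstrap stays unresolved.
pub fn select_bootstrap<'a>(urls: &[&'a str]) -> Option<&'a str> {
    // Priority order from Step 3; roles not listed here can never win.
    let priority = [
        UrlRole::BusinessEntry,
        UrlRole::BusinessApi,
        UrlRole::GatewayApi,
    ];
    priority.iter().find_map(|role| {
        urls.iter()
            .copied()
            .find(|url| classify_url(url) == *role)
    })
}
```

A `None` result is the "unresolved" state: the caller would downgrade readiness rather than fall back to a helper or export URL.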
|
||||||
|
|
||||||
|
- [ ] **Step 4: Preserve export/helper evidence separately**
|
||||||
|
|
||||||
|
Retain localhost/export endpoints as downstream evidence for workflow/reporting, but isolate them from `expectedDomain` and `targetUrl`.
|
||||||
|
|
||||||
|
- [ ] **Step 5: Add regression tests**
|
||||||
|
|
||||||
|
Cover at least:
|
||||||
|
|
||||||
|
- marketing-like source choosing `yx.gs.sgcc.com.cn` over `localhost`
|
||||||
|
- mixed business + gateway scene preserving business target page
|
||||||
|
- scene with only localhost/noise ending in unresolved bootstrap
|
||||||
|
|
||||||
|
- [ ] **Step 6: Commit**
|
||||||
|
|
||||||
|
```bash
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "fix(generator): stratify bootstrap evidence and exclude localhost"
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 3: Rectify Workflow Chain
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `frontend/scene-generator/generator-runner.js`
|
||||||
|
- Modify: `frontend/scene-generator/server.js`
|
||||||
|
- Modify: `src/generated_scene/analyzer.rs`
|
||||||
|
- Modify: `src/generated_scene/ir.rs`
|
||||||
|
- Modify: `src/generated_scene/generator.rs`
|
||||||
|
|
||||||
|
**Goal:** Reconstruct workflow from request-chain evidence instead of generic field names, so `paginated_enrichment` is only emitted when its true workflow exists.
|
||||||
|
|
||||||
|
- [ ] **Step 1: Split workflow evidence into typed layers**
|
||||||
|
|
||||||
|
Represent workflow evidence as:
|
||||||
|
|
||||||
|
- request evidence
|
||||||
|
- pagination evidence
|
||||||
|
- secondary request evidence
|
||||||
|
- post-process evidence
|
||||||
|
|
||||||
|
Expected result: archetype decisions operate on structured workflow signals instead of a flat endpoint list.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Denoise endpoint and method evidence**
|
||||||
|
|
||||||
|
Normalize and filter out:
|
||||||
|
|
||||||
|
- `${apiUrl}`
|
||||||
|
- template placeholders
|
||||||
|
- exception strings
|
||||||
|
- log text fragments
|
||||||
|
- localhost export endpoints
|
||||||
|
|
||||||
|
Expected result: workflow reconstruction only consumes business-relevant requests.
|
||||||
|
|
||||||
|
- [ ] **Step 3: Tighten archetype routing rules**
|
||||||
|
|
||||||
|
Require `paginated_enrichment` to have at minimum:
|
||||||
|
|
||||||
|
1. one main list request
|
||||||
|
2. one pagination variable set
|
||||||
|
3. one secondary request or explicit per-item enrichment function
|
||||||
|
4. one post-process action among `filter`, `transform`, `export`
|
||||||
|
|
||||||
|
If only part of this exists, preserve it as candidate evidence but do not route into the compiler.
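
The four-part minimum for `paginated_enrichment` lends itself to a checker that reports which pieces are missing. The following is a minimal sketch: `WorkflowEvidence`, its field names, and the string labels in the returned list are hypothetical and are not taken from `src/generated_scene/ir.rs`.

```rust
/// Hypothetical summary of reconstructed workflow evidence.
#[derive(Debug, Default)]
pub struct WorkflowEvidence {
    pub has_main_list_request: bool,
    pub has_pagination_vars: bool,
    pub has_secondary_request: bool,
    pub has_per_item_enrichment_fn: bool,
    pub post_process_actions: Vec<String>,
}

/// Lists the minimum-evidence pieces that are still missing.
pub fn missing_for_paginated_enrichment(evidence: &WorkflowEvidence) -> Vec<&'static str> {
    let mut missing = Vec::new();
    if !evidence.has_main_list_request {
        missing.push("main_list_request");
    }
    if !evidence.has_pagination_vars {
        missing.push("pagination_variables");
    }
    // Either a secondary request or an explicit per-item enrichment function counts.
    if !(evidence.has_secondary_request || evidence.has_per_item_enrichment_fn) {
        missing.push("secondary_request");
    }
    let has_post_process = evidence
        .post_process_actions
        .iter()
        .any(|action| matches!(action.as_str(), "filter" | "transform" | "export"));
    if !has_post_process {
        missing.push("post_process");
    }
    missing
}

/// Routing into the compiler is allowed only when nothing is missing.
pub fn can_route_paginated_enrichment(evidence: &WorkflowEvidence) -> bool {
    missing_for_paginated_enrichment(evidence).is_empty()
}
```

Returning the missing pieces, rather than a bare boolean, lets partial evidence be preserved and reported as candidate evidence while still blocking compiler routing.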
|
||||||
|
|
||||||
|
- [ ] **Step 4: Narrow multi_mode detection**
|
||||||
|
|
||||||
|
Allow `multi_mode_request` only when mode switching materially changes at least one of:
|
||||||
|
|
||||||
|
- request body
|
||||||
|
- endpoint shape
|
||||||
|
- response path
|
||||||
|
- column definition
|
||||||
|
|
||||||
|
Expected result: generic `type/tab/mode/status` fields alone no longer misclassify marketing-like scenes.
|
||||||
|
|
||||||
|
- [ ] **Step 5: Block compiler routing on incomplete workflow**
|
||||||
|
|
||||||
|
Update generator-side routing so incomplete evidence cannot produce a formal `paginated_enrichment` skill package.
|
||||||
|
|
||||||
|
- [ ] **Step 6: Add regression tests**
|
||||||
|
|
||||||
|
Cover at least:
|
||||||
|
|
||||||
|
- marketing-like scene must expose `paginate` + `secondary_request` + post-process evidence
|
||||||
|
- generic mode fields without real mode divergence must not force `multi_mode_request`
|
||||||
|
- noisy endpoint lists must still reconstruct the correct business request chain
|
||||||
|
|
||||||
|
- [ ] **Step 7: Commit**
|
||||||
|
|
||||||
|
```bash
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs src/generated_scene/generator.rs tests/scene_generator_test.rs
git commit -m "fix(generator): require complete workflow evidence before archetype routing"
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 4: Rectify Readiness Chain
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `frontend/scene-generator/server.js`
|
||||||
|
- Modify: `frontend/scene-generator/sg_scene_generator.html`
|
||||||
|
- Modify: `src/generated_scene/ir.rs`
|
||||||
|
- Modify: `src/generated_scene/generator.rs`
|
||||||
|
- Modify: `tests/scene_generator_html_test.rs`
|
||||||
|
|
||||||
|
**Goal:** Turn readiness into a hard gate that distinguishes analysis output from runnable skill output.
|
||||||
|
|
||||||
|
- [ ] **Step 1: Add explicit rectification gates**
|
||||||
|
|
||||||
|
Track at minimum:
|
||||||
|
|
||||||
|
- `scene_id_valid`
|
||||||
|
- `bootstrap_resolved`
|
||||||
|
- `workflow_complete_for_archetype`
|
||||||
|
- `runtime_contract_compatible`
|
||||||
|
|
||||||
|
Expected result: readiness is derived from named gates rather than from a loose score alone.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Enforce fail-closed readiness rules**
|
||||||
|
|
||||||
|
Require:
|
||||||
|
|
||||||
|
- all core gates pass for readiness `A` or `B`
|
||||||
|
- any core gate failure forces readiness `C`
|
||||||
|
- generation endpoint blocks runnable output on gate failure
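
A compact way to read these rules: the gates can only lower the readiness grade, never raise it. The sketch below assumes a `RectificationGates` struct holding the four gates from Step 1 and a separately computed grade (`scored`); how `A` is distinguished from `B` is not specified in this plan, so the sketch simply passes that grade through when every gate passes.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Readiness {
    A,
    B,
    C,
}

/// Gate names follow Step 1; the struct itself is hypothetical.
#[derive(Debug, Clone, Copy)]
pub struct RectificationGates {
    pub scene_id_valid: bool,
    pub bootstrap_resolved: bool,
    pub workflow_complete_for_archetype: bool,
    pub runtime_contract_compatible: bool,
}

impl RectificationGates {
    /// Names of the gates that failed, in declaration order.
    pub fn failed_gates(&self) -> Vec<&'static str> {
        [
            ("scene_id_valid", self.scene_id_valid),
            ("bootstrap_resolved", self.bootstrap_resolved),
            ("workflow_complete_for_archetype", self.workflow_complete_for_archetype),
            ("runtime_contract_compatible", self.runtime_contract_compatible),
        ]
        .iter()
        .filter(|(_, passed)| !*passed)
        .map(|(name, _)| *name)
        .collect()
    }
}

/// Any core gate failure forces `C`; otherwise the scored grade is kept.
pub fn derive_readiness(gates: &RectificationGates, scored: Readiness) -> Readiness {
    if gates.failed_gates().is_empty() {
        scored
    } else {
        Readiness::C
    }
}

/// The generation endpoint would consult this before emitting a runnable skill.
pub fn may_emit_runnable_skill(readiness: Readiness) -> bool {
    readiness != Readiness::C
}
```

The list returned by `failed_gates` doubles as the block-reason payload that Steps 3 and 4 ask the server and UI to expose.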
|
||||||
|
|
||||||
|
- [ ] **Step 3: Separate analysis result from generation result**
|
||||||
|
|
||||||
|
When gates fail, allow:
|
||||||
|
|
||||||
|
- analysis preview
|
||||||
|
- evidence report
|
||||||
|
- block reasons
|
||||||
|
|
||||||
|
But do not default to:
|
||||||
|
|
||||||
|
- full skill emission
|
||||||
|
- compiler success messaging
|
||||||
|
|
||||||
|
- [ ] **Step 4: Expose readiness breakdown in UI**
|
||||||
|
|
||||||
|
Display:
|
||||||
|
|
||||||
|
- gate names
|
||||||
|
- pass/fail state
|
||||||
|
- missing workflow pieces
|
||||||
|
- bootstrap resolution reason
|
||||||
|
- invalid sceneId reason
|
||||||
|
|
||||||
|
- [ ] **Step 5: Add regression tests**
|
||||||
|
|
||||||
|
Cover at least:
|
||||||
|
|
||||||
|
- invalid `sceneId` forcing readiness `C`
|
||||||
|
- unresolved bootstrap forcing readiness `C`
|
||||||
|
- incomplete paginated workflow forcing readiness `C`
|
||||||
|
- fully valid reference fixture remaining eligible for generation
|
||||||
|
|
||||||
|
- [ ] **Step 6: Commit**
|
||||||
|
|
||||||
|
```bash
git add frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/ir.rs src/generated_scene/generator.rs tests/scene_generator_html_test.rs tests/scene_generator_test.rs
git commit -m "fix(generator): enforce readiness fail-closed gating"
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 5: Reference Regression Verification
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `tests/scene_generator_test.rs`
|
||||||
|
- Modify: `tests/scene_generator_html_test.rs`
|
||||||
|
- Modify/Create: relevant fixtures under `tests/fixtures/generated_scene/`
|
||||||
|
|
||||||
|
**Goal:** Lock the rectification against the two reference scene families and ensure future changes do not reintroduce the same false positives.
|
||||||
|
|
||||||
|
- [ ] **Step 1: Regress marketing-like fixture**
|
||||||
|
|
||||||
|
Verify the marketing reference path now satisfies:
|
||||||
|
|
||||||
|
- non-degenerate `sceneId`
|
||||||
|
- bootstrap rooted in `yx.gs.sgcc.com.cn` family
|
||||||
|
- workflow includes `paginate`
|
||||||
|
- workflow includes `secondary_request`
|
||||||
|
- readiness does not pass if any of the above are missing
|
||||||
|
|
||||||
|
- [ ] **Step 2: Regress tq-like fixture**
|
||||||
|
|
||||||
|
Verify the tq reference path still satisfies:
|
||||||
|
|
||||||
|
- stable semantic `sceneId`
|
||||||
|
- valid non-localhost bootstrap
|
||||||
|
- genuine `multi_mode_request` detection
|
||||||
|
- no downgrade caused by the stricter marketing rectification rules
|
||||||
|
|
||||||
|
- [ ] **Step 3: Run verification commands**
|
||||||
|
|
||||||
|
Run:
|
||||||
|
|
||||||
|
```bash
cargo check
cargo test --test scene_generator_test -- --nocapture
cargo test --test scene_generator_html_test -- --nocapture
node --check frontend/scene-generator/llm-client.js
node --check frontend/scene-generator/generator-runner.js
node --check frontend/scene-generator/server.js
```
|
||||||
|
|
||||||
|
Expected result: rectification passes both Rust and Node validation plus regression coverage.
|
||||||
|
|
||||||
|
- [ ] **Step 4: Record outcomes in generated reports if needed**
|
||||||
|
|
||||||
|
If the implementation emits readiness or analysis JSON reports, ensure the test fixtures assert the key blocked/passed states directly.
|
||||||
|
|
||||||
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
|
```bash
git add tests/scene_generator_test.rs tests/scene_generator_html_test.rs tests/fixtures/generated_scene frontend/scene-generator/llm-client.js frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs src/generated_scene/generator.rs
git commit -m "test(generator): lock generated scene rectification regressions"
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Acceptance Criteria
|
||||||
|
|
||||||
|
This plan is complete when all of the following are true:
|
||||||
|
|
||||||
|
1. Chinese-source scene names no longer degrade into low-entropy ids like `2-0`.
|
||||||
|
2. `localhost`, `127.0.0.1`, export services, and helper URLs no longer compete for bootstrap resolution.
|
||||||
|
3. `paginated_enrichment` routing only occurs when pagination, secondary request, and post-process evidence are all present.
|
||||||
|
4. Incomplete evidence paths fail closed with explicit readiness gate failures instead of generating false-positive runnable skills.
|
||||||
|
5. The marketing-like and tq-like reference scenes both remain covered by automated regression tests.
|
||||||
|
|
||||||
|
## Rollback Strategy
|
||||||
|
|
||||||
|
If this rectification causes unacceptable regressions:
|
||||||
|
|
||||||
|
1. Revert the latest rectification task commit only, not unrelated generated-scene work.
|
||||||
|
2. Keep the previous `Scene IR` and compiler structure intact.
|
||||||
|
3. Preserve newly added fixtures and tests where possible, then relax only the specific gate or classifier that caused the regression.
|
||||||
|
|
||||||
|
## Notes For Executors
|
||||||
|
|
||||||
|
- Implement this plan strictly in order: naming, bootstrap, workflow, readiness, verification.
|
||||||
|
- Do not skip ahead to UI polish before the gating logic is in place.
|
||||||
|
- Do not add speculative resolver or login work under this plan.
|
||||||
|
- Any need for user override or forced draft generation must be handled as a separate follow-up spec, not smuggled into this rectification plan.
|
||||||
@@ -0,0 +1,382 @@
|
|||||||
|
# sgClaw Scene Skill 60-to-90 Roadmap Plan
|
||||||
|
|
||||||
|
> **Status:** Draft
|
||||||
|
> **Date:** 2026-04-17
|
||||||
|
> **Author:** Codex
|
||||||
|
> **Upstream Spec:** [2026-04-17-scene-skill-60-to-90-roadmap-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-17-scene-skill-60-to-90-roadmap-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
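This plan breaks the design for "raising scene skill auto-generation capability from a score of 60 to 90" into executable delivery phases, task boundaries, acceptance conditions, and an implementation order. The plan strictly follows the upstream `spec`: it does not widen the problem space and does not introduce implementation goals that the `spec` has not confirmed.

The core goals covered by this plan are limited to:

1. Build an adjudicable semantic evidence layer
2. Build minimal compilable business contracts
3. Freeze the canonical answers for the P0 reference samples
4. Following the P0-to-P1 route, move scene skill auto-conversion from structure recognition up to business-semantics recovery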
|
||||||
|
|
||||||
|
## Success Criteria Baseline
|
||||||
|
|
||||||
|
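This plan adopts by default the success criteria already settled in the upstream `spec`: phase success is no longer judged solely by whether the generated output resembles the structure of some reference skill as closely as possible. The primary criterion is whether a skill generated for a general scene can run directly in the intranet environment, fetch correct data, and produce a correct report.

Implementation acceptance therefore checks all three of the following closed loops by default:

1. Execution loop: the generated skill completes execution in the intranet environment hosted by the in-house browser
2. Data loop: the data is correct and complete after query, pagination, and extraction
3. Artifact loop: the generated Excel file or other report conforms to business rules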
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
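The following boundaries stay fixed while this plan is executed:

1. Do not aim to cover all 102 scenes in one pass
2. Do not implement automatic recovery for unified-platform login or for backend login to target business systems under this plan
3. Do not complete the end-to-end BrowserAction abstraction in one go
4. Do not include complex document rendering, template upload, or attachment parsing scenes in P0
5. Do not substitute "more prompt tuning first" for building the evidence layer, the contract layer, and the canonical answers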
|
||||||
|
|
||||||
|
## Scene Family Baseline
|
||||||
|
|
||||||
|
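This plan inherits by default the upstream `spec`'s family grouping of the `102` scenes:

1. `G1` general single-page report group: `68`
2. `G2` multi-mode report group: `11`
3. `G3` paginated detail enrichment group: `10`
4. `G4` tool-detection prerequisite group: `8`
5. `G5` low-priority noise group: `5`

The mainline implementation scope is primarily `G1 + G2 + G3`, which together cover `89` scenes, about `87%` of all samples. `G4` is held back as a prerequisite for later detection-type expansion, and `G5` is deprioritized by default and stays out of the first mainline round.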
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
This plan is split into four main workstreams:

1. `WS1` Build the semantic evidence layer
2. `WS2` Build minimal compilable business contracts
3. `WS3` Build the P0 canonical answers and calibration baseline
4. `WS4` Roll out and verify the P0/P1 reference-sample route

The dependencies among the four workstreams are:

`WS1 + WS2 + WS3 -> WS4`
|
||||||
|
|
||||||
|
## Phase Overview
|
||||||
|
|
||||||
|
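The plan proceeds in five phases:

1. Phase 0: Freeze boundaries and reference samples
2. Phase 1: Build the semantic evidence layer
3. Phase 2: Build minimal compilable business contracts
4. Phase 3: Freeze the P0 canonical answers
5. Phase 4: Verify the 60-to-90 capability improvement step by step along the P0/P1 route

Phase 4 advances by scene family rather than by business department, in this fixed order:

1. First tackle the `G2` multi-mode report group to verify the ceiling of semantic recovery
2. Then tackle the `G1` general single-page report group to verify migration at scale
3. Then tackle the `G3` paginated detail enrichment group to verify complex workflows and fail-closed behavior
4. `G4` is deferred to the later detection-type expansion
5. `G5` is deprioritized by default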
|
||||||
|
|
||||||
|
## Phase 0: Freeze Boundaries and Reference Samples
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
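Before the build phases begin, freeze the problem boundaries, the P0 reference samples, the P1 families, and the benchmark baseline so they do not drift repeatedly during implementation.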
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
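1. Lock the P0 reference-sample list
2. Lock the P1 family list
3. Lock the canonical mapping `台区线损大数据-月_周累计线损率统计分析 -> tq-lineloss-report`
4. Lock the semantic classification rules for the host browser execution context and `localhost:*`
5. Lock the layering constraints across the business semantics layer, the host browser capability layer, and the login and local bridge layer
6. Lock the five groups of the `102` scenes and the grouping criteria
7. Lock the mapping from each group to archetype, phase, and acceptance focus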
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. Frozen reference-sample list
2. Sample-to-archetype mapping table
3. Note on host/business layering constraints
4. Note on the canonical benchmark mapping
5. List of the five scene groups
6. Note on the group-to-implementation mapping
|
||||||
|
|
||||||
|
### Exit Criteria
|
||||||
|
|
||||||
|
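1. The P0/P1 reference samples no longer change
2. `tq-lineloss-report` is explicitly designated as the canonical benchmark for P0-1
3. `localhost:*` is explicitly defined as host-bridge evidence rather than a default business domain
4. The five groups of the `102` scenes and their implementation criteria no longer drift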
|
||||||
|
|
||||||
|
## Phase 1: Build the Semantic Evidence Layer
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
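Upgrade the generation path from "source code summarized directly into Scene IR" to "source code first yields adjudicable semantic evidence, which is then reduced to Scene IR".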
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
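1. Define a unified evidence-object schema
2. Define the tiers of evidence sources
3. Define evidence merging and conflict-resolution rules
4. Define the mapping boundary from evidence to `Scene IR`
5. Establish the core set of evidence types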
|
||||||
|
|
||||||
|
### Required Evidence Types
|
||||||
|
|
||||||
|
The minimal set of evidence types for the first version is fixed as:
|
||||||
|
|
||||||
|
1. `bootstrap_candidate`
|
||||||
|
2. `endpoint_candidate`
|
||||||
|
3. `mode_candidate`
|
||||||
|
4. `request_template_candidate`
|
||||||
|
5. `response_path_candidate`
|
||||||
|
6. `column_defs_candidate`
|
||||||
|
7. `normalize_rules_candidate`
|
||||||
|
8. `workflow_candidate`
|
||||||
|
9. `localhost_dependency_candidate`
|
||||||
|
10. `browser_action_candidate`
|
||||||
|
11. `export_candidate`
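
One way to picture the evidence types above is a closed enum plus a small evidence record. This is only a sketch under assumptions: `EvidenceKind`, `Evidence`, `is_host_side`, and the `source` field are hypothetical names, and treating browser actions as host-side is an assumption rather than something the plan states.

```rust
/// The eleven evidence types from the plan, as a closed set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EvidenceKind {
    BootstrapCandidate,
    EndpointCandidate,
    ModeCandidate,
    RequestTemplateCandidate,
    ResponsePathCandidate,
    ColumnDefsCandidate,
    NormalizeRulesCandidate,
    WorkflowCandidate,
    LocalhostDependencyCandidate,
    BrowserActionCandidate,
    ExportCandidate,
}

impl EvidenceKind {
    /// Parse the snake_case name used in the plan; unknown names are rejected.
    pub fn parse(name: &str) -> Option<Self> {
        use EvidenceKind::*;
        Some(match name {
            "bootstrap_candidate" => BootstrapCandidate,
            "endpoint_candidate" => EndpointCandidate,
            "mode_candidate" => ModeCandidate,
            "request_template_candidate" => RequestTemplateCandidate,
            "response_path_candidate" => ResponsePathCandidate,
            "column_defs_candidate" => ColumnDefsCandidate,
            "normalize_rules_candidate" => NormalizeRulesCandidate,
            "workflow_candidate" => WorkflowCandidate,
            "localhost_dependency_candidate" => LocalhostDependencyCandidate,
            "browser_action_candidate" => BrowserActionCandidate,
            "export_candidate" => ExportCandidate,
            _ => return None,
        })
    }

    /// Host-side evidence is kept apart from business evidence.
    /// Assumption: browser actions count as host-side alongside localhost dependencies.
    pub fn is_host_side(self) -> bool {
        matches!(
            self,
            EvidenceKind::LocalhostDependencyCandidate | EvidenceKind::BrowserActionCandidate
        )
    }
}

/// Hypothetical evidence record: what was found and where it came from.
#[derive(Debug, Clone, PartialEq)]
pub struct Evidence {
    pub kind: EvidenceKind,
    pub value: String,
    /// File and fragment the value was extracted from, for traceability.
    pub source: String,
}
```

Keeping the set closed means an unrecognized evidence name is rejected at parse time instead of flowing silently into `Scene IR`.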

### Deliverables

1. Evidence object schema document
2. Evidence type dictionary
3. Evidence merge rules document
4. Evidence-to-`Scene IR` mapping rules document
5. Sample evidence extraction results for the P0 samples

### Acceptance Criteria

1. Every P0 sample can output a structured evidence set
2. `localhost:*`, host JS injection, and hidden-field behavior each go into a dedicated evidence slot
3. Every core field of `Scene IR` can be traced back to its evidence source
4. When evidence conflicts, there is an explicit adjudication path, so the conflict is not silently swallowed by the final summary

## Phase 2: Establish the Minimal Compilable Business Contract

### Objective

Upgrade archetype determination from "keyword hit" to "does the minimal business contract hold", so that the compiler only accepts inputs whose evidence is closed.

### Tasks

1. Define the minimal compilable contract for each archetype
2. Define the unified gate list
3. Define the blocking rules applied when a gate fails
4. Define the minimal output contract for each archetype
5. Establish readiness criteria that put fail-closed first

### Required Gates

The unified gate names include at least:

1. `bootstrap_resolved`
2. `request_contract_complete`
3. `response_contract_complete`
4. `workflow_contract_complete`
5. `runtime_contract_compatible`
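
A fail-closed evaluation over these gates could look like the sketch below. The type and function names (`GateResults`, `GateVerdict`, `evaluate`) are hypothetical, and the split into four business gates plus one runtime gate is an assumption made for illustration.

```rust
/// Result of checking the five gates for one scene.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct GateResults {
    pub bootstrap_resolved: bool,
    pub request_contract_complete: bool,
    pub response_contract_complete: bool,
    pub workflow_contract_complete: bool,
    pub runtime_contract_compatible: bool,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GateVerdict {
    Runnable,
    /// Business evidence is incomplete; carries the names of the failed gates.
    BlockedByBusinessEvidence(Vec<&'static str>),
    /// The business contract is closed, but the host runtime cannot execute it.
    BlockedByRuntime,
}

/// Fail closed: any failed gate blocks, and business gaps are reported first.
pub fn evaluate(g: GateResults) -> GateVerdict {
    // Assumption: the first four gates describe business evidence.
    let business = [
        ("bootstrap_resolved", g.bootstrap_resolved),
        ("request_contract_complete", g.request_contract_complete),
        ("response_contract_complete", g.response_contract_complete),
        ("workflow_contract_complete", g.workflow_contract_complete),
    ];
    let failed: Vec<&'static str> = business
        .iter()
        .filter(|(_, ok)| !*ok)
        .map(|(name, _)| *name)
        .collect();

    if !failed.is_empty() {
        GateVerdict::BlockedByBusinessEvidence(failed)
    } else if !g.runtime_contract_compatible {
        GateVerdict::BlockedByRuntime
    } else {
        GateVerdict::Runnable
    }
}
```

Separating the two blocked variants is one way to keep "insufficient business evidence" distinguishable from "unmet host runtime dependency" in the readiness result.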

### Deliverables

1. Archetype minimal contract table
2. Gate decision table
3. Blocker / readiness rules table
4. Sample archetype output contracts

### Acceptance Criteria

1. `multi_mode_request`, `single_request_table`, and `paginated_enrichment` each have an explicit minimal contract
2. A scene that has not passed the gates can no longer pass itself off as a runnable skill
3. The readiness result distinguishes "insufficient business evidence" from "unmet host runtime dependency"
4. The compiler input boundary is clear, and unclosed IR is no longer accepted

## Phase 3: Freeze P0 Canonical Answers

### Objective

Establish stable canonical answers, key evidence lists, and acceptance baselines for the three main P0 samples, to serve as the single calibration source for later regression and migration.

### Tasks

1. Freeze the canonical `Scene IR` of the three P0 samples
2. Freeze the key evidence lists of the three P0 samples
3. Freeze the acceptance criteria of the three P0 samples
4. Freeze the failure taxonomy of the three P0 samples
5. Establish how canonical answers are compared against actual generated results

### P0 Canonical Targets

1. `台区线损大数据-月_周累计线损率统计分析`
   Reference: `tq-lineloss-report`
2. `用户日电量监测`
   Benchmark for the single-request mass-production sample
3. `95598工单明细表`
   Benchmark for the paginated-enrichment recognition and blocking sample

### Deliverables

1. Canonical `Scene IR` for the three P0 samples
2. Key semantic evidence baselines for the three P0 samples
3. Acceptance tables for the three P0 samples
4. Failure type tables for the three P0 samples

### Acceptance Criteria

1. P0-1 explicitly uses `tq-lineloss-report` as a high-quality reference sample, not as the only hard canonical answer
2. All three P0 samples have a defined way to align "generated result vs canonical answer"
3. Every later capability upgrade can be regression-tested for deviation from the P0 canonical answers
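
## Phase 4: Validate the 60-to-90 Improvement Step by Step Along the P0/P1 Route

### Objective

Following the priorities already defined in the `spec`, with P0 as the core and P1 as the extension, validate step by step how the automatic converter progresses from structure recognition to business-semantics recovery.

This phase advances by scene family, not by business department. The first-round goal is not to cover all `102` scenes, but to break through the mainstream report-type scenes first and then expand gradually.

### Track A: P0-1 `tq` Main Sample

#### Goal

Deliver the main-sample capability for `multi_mode_request.month_week_table`, and bring the result to the same level as `tq-lineloss-report` in key business semantics, internal-network executability, and report correctness.

#### Tasks

1. Recover the complete `month / week` mode matrix
2. Recover the request contract and response contract of each mode
3. Recover column definitions, normalization rules, and export semantics
4. Verify the bootstrap and the context constraints of the target system
5. Establish a key-semantics comparison between the automatic result and `tq-lineloss-report`

#### Acceptance Criteria

1. The `mode matrix` is recovered reliably
2. The key request / response contracts are recovered reliably
3. The generated result reaches the high-quality reference level in key business semantics and internal-network report results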

### Track B: P0-2 Single-Request Mass-Production Sample

#### Goal

Demonstrate that the single-request report family can yield a general conversion template with a high pass rate.

#### Tasks

1. Recover the request / response / normalize triple
2. Shrink the pseudo-generic fallback main path
3. Validate migration across samples of the same family

#### Acceptance Criteria

1. The `single_request_table` sample passes reliably
2. Samples in the same family are reusable
3. Result judgment no longer over-relies on a whole-text summary

### Track C: P0-3 Paginated-Enrichment Sample

#### Goal

Correctly identify the problem space of complex paginated-enrichment scenes, and block reliably when evidence is insufficient.

#### Tasks

1. Separate the main request chain, the enrichment chain, and the export chain
2. Establish the minimal compilable evidence set for `paginated_enrichment`
3. Distinguish business workflow from host-bridge behavior
4. Implement the fail-closed decision

#### Acceptance Criteria

1. The paginated-enrichment workflow is decomposed correctly
2. Generation fails closed reliably when evidence is insufficient
3. Host chains and `localhost:*` are no longer misclassified as the main business chain
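
### Track D: P1 Family Expansion

#### Goal

Once the P0 samples are stable, migrate the capability to the defined P1 families to validate that the route can be replicated at scale.

#### Tasks

1. Migrate the line-loss / electricity-consumption multi-mode family
2. Migrate the single-request report family
3. Migrate the paginated-enrichment family
4. Record the reuse success rate and failure types for each family

#### Acceptance Criteria

1. Each P1 family completes at least one round of migration validation on a representative scene
2. P1 validation relies mainly on the evidence, contracts, and canonical-answer system already established in P0
3. If a scene exceeds the current archetype or contract capability boundary, the result must explicitly fail closed

### Track E: Scene Family Expansion Policy

#### Goal

Using the five scene groups as units, make explicit which families enter the main line and which are only reserved or downgraded.

#### Tasks

1. Establish a mass-production migration cadence for the `G1` general single-page report group
2. Establish a deep-dive sample cadence for the `G2` multi-mode report group
3. Establish a complex-chain recognition cadence for the `G3` paginated detail-enrichment group
4. For the `G4` tool-detection prerequisite group, keep only the architectural entry point and the criteria for later extension
5. Establish default downgrade criteria for the `G5` low-priority noise group

#### Acceptance Criteria

1. `G1 + G2 + G3` form the first-round main-line scope
2. `G4` does not compete for current main-line resources, but keeps an entry point for later detection-oriented extension
3. `G5` does not pollute the main-line archetypes or acceptance criteria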
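
## Milestone Order

The overall prerequisite milestones occur in this fixed order:

1. Complete the semantic evidence layer first
2. Then complete the minimal compilable business contract
3. Then freeze the P0 canonical answers

Large-scale family expansion does not begin until all three milestones are complete.

## File-Level Planning Targets

This plan requires subsequent implementation to cover at least the following asset types:

1. Upstream design drafts in `docs/superpowers/specs/`
2. Phase plans and progress plans in `docs/superpowers/plans/`
3. The evidence layer, contract layer, and readiness / blocker implementation in the scene generation chain
4. Fixtures, golden IR, acceptance baselines, or equivalent calibration assets for the P0 samples

## Completion Criteria

This plan is complete when:

1. The `tq` main sample reliably recovers core business semantics and reaches the high-quality reference level in internal-network execution and report results
2. The single-request main sample yields a replicable, high-pass-rate template that covers mainstream general report scenes
3. The paginated-enrichment main sample reliably identifies complex workflows and fails closed when evidence is insufficient
4. An adjudicable evidence layer exists in front of `Scene IR`
5. Explicit contract gates exist in front of archetype selection
6. The P0 canonical answers have become the unified calibration baseline for later migration and regression
7. The implementation main line stays focused on `G1 + G2 + G3` and is no longer pulled off course by edge scenes

## Risks and Control Points

1. If the evidence layer is too thin from the start, the later contracts and canonical answers lose their support
2. If the contract gates are defined too loosely, the system keeps fabricating runnable skills
3. If the P0 canonical answers are not frozen, later optimization loses its alignment baseline
4. If P1 expansion starts too early, host noise is likely to leak back in before layering is complete

## Out of Plan

The following items are explicitly outside the direct delivery scope of this plan:

1. Automatic recovery of the unified platform login flow
2. Implementation details of back-end login for the target business systems
3. Full abstraction of browser host capabilities
4. A one-shot guarantee that every scene runs end to end
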
663
docs/superpowers/plans/2026-04-17-scene-skill-compiler-plan.md
Normal file
@@ -0,0 +1,663 @@

# Scene Skill Compiler Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Upgrade `sg_scene_generate` from a scene metadata extractor plus template filler into a reusable scene skill compiler that can understand workflow semantics, classify scene archetypes, and generate runnable skills for both `tq-lineloss-report`-style and `marketing-zero-consumer-report`-style internal scenes.

**Architecture:** Introduce a unified `Scene IR`, switch extraction to a hybrid deterministic-plus-LLM pipeline, route generation by `workflowArchetype`, align runtime resolver contracts, and add readiness gates so users can tell whether a generated skill is safe to trial on the internal network.

**Tech Stack:** Rust, Node.js, HTML/CSS/JavaScript, serde_json, OpenAI-compatible LLM API

---

## Scope Check

This plan implements the design in:

- `docs/superpowers/specs/2026-04-17-scene-skill-compiler-design.md`

This plan builds on the existing generator work already described in:

- `docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md`
- `docs/superpowers/specs/2026-04-17-enhanced-llm-extraction-schema-design.md`
- `docs/superpowers/specs/2026-04-17-progressive-template-enhancement-design.md`
- `docs/superpowers/specs/2026-04-16-multi-scene-kind-generator-design.md`

This plan does not attempt to solve:

- full login and authentication reconstruction
- all historical scene patterns in one pass
- 100% no-touch generation without human review

---

## File Map

### Core generator pipeline

| File | Action | Purpose |
|------|--------|---------|
| `frontend/scene-generator/llm-client.js` | Modify | Replace truncation-only extraction with chunked workflow-aware extraction and `Scene IR` schema output |
| `frontend/scene-generator/generator-runner.js` | Modify | Add deterministic scene scanning, key-fragment selection, and IR support |
| `frontend/scene-generator/server.js` | Modify | Expose analysis, preview, readiness, and generation endpoints for `Scene IR` |
| `frontend/scene-generator/sg_scene_generator.html` | Modify | Show extraction preview, archetype classification, bootstrap, risks, and readiness |

### Rust backend

| File | Action | Purpose |
|------|--------|---------|
| `src/generated_scene/analyzer.rs` | Modify | Add deterministic extraction helpers and archetype support |
| `src/generated_scene/generator.rs` | Modify | Route generation by archetype and compile from `Scene IR` instead of ad hoc fields |
| `src/generated_scene/ir.rs` | Create | Define unified `Scene IR` structs and serde contracts |
| `src/bin/sg_scene_generate.rs` | Modify | Accept `Scene IR` JSON or file input and pass it into generator |
| `src/compat/scene_platform/resolvers.rs` | Modify | Align runtime parameter resolution with generated contracts |

### Tests and fixtures

| File | Action | Purpose |
|------|--------|---------|
| `tests/scene_generator_test.rs` | Modify | Cover new analysis, archetype classification, and generation routing |
| `tests/generated_scene_*` or related fixtures | Modify/Create | Add representative fixtures for single-request, multi-mode, and paginated-enrichment scenes |

---

## Scope Guardrails

- Do not break existing `--scene-id`, `--scene-name`, or `--scene-kind` compatibility.
- Do not require all scenes to provide complete metadata in HTML meta tags.
- Do not force the runtime to support new resolver contracts unless generation is updated to gate incompatible output.
- Do not assume all report scenes share `org + period` params.
- Do not silently generate low-confidence skills as if they were runnable.

---

### Task 1: Fix Current Hard Failures Before Compiler Refactor

**Files:**

- Modify: `frontend/scene-generator/llm-client.js`
- Modify: `frontend/scene-generator/generator-runner.js`
- Modify: `frontend/scene-generator/server.js`
- Modify: `frontend/scene-generator/sg_scene_generator.html`
- Modify: `src/generated_scene/generator.rs`

**Goal:** Stop the most obvious wrong outputs that currently make generated skills fail on the internal network even before the full compiler architecture lands.

- [ ] **Step 1: Remove report-scene hardcoded parameter assumptions**

Audit `scene.toml` generation in `src/generated_scene/generator.rs` and remove default injection of generic report params such as:

- fixed `org`
- fixed `period`
- default dictionary entity for a specific city
- generic page title keywords like `["报表", "线损"]`

Expected result: generated params come from extracted scene semantics or are omitted when not confidently known.

- [ ] **Step 2: Rework bootstrap source priority**

Change bootstrap derivation so `expected_domain` and `target_url` are resolved using this order:

1. explicit deep extraction result
2. deterministic extraction from business entry points
3. HTML meta tags if trustworthy
4. fallback empty with warning

Explicitly prevent script-host URLs such as static JS includes from becoming the business domain by mistake.
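
The priority order and the script-host guard can be sketched as follows. All names here (`BootstrapSource`, `BootstrapCandidate`, `resolve_bootstrap`) are hypothetical, and the file-extension check is only a stand-in heuristic for "static JS include"; the real detection rule is not specified in this plan.

```rust
/// Where a bootstrap candidate came from. Variant order encodes priority:
/// earlier variants win.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum BootstrapSource {
    DeepExtraction,     // 1. explicit deep extraction result
    BusinessEntryPoint, // 2. deterministic extraction from business entry points
    HtmlMeta,           // 3. HTML meta tags, only if trustworthy
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BootstrapCandidate {
    pub source: BootstrapSource,
    pub url: String,
    pub trustworthy: bool,
}

/// Heuristic (assumption): a URL whose path ends in a static-asset extension
/// is a script host, not a business entry.
fn looks_like_static_asset(url: &str) -> bool {
    let path = url.split(|c: char| c == '?' || c == '#').next().unwrap_or(url);
    [".js", ".css", ".map"].iter().any(|ext| path.ends_with(*ext))
}

/// Returns the chosen URL, or `None` together with a warning (step 4 of the order).
pub fn resolve_bootstrap(
    candidates: &[BootstrapCandidate],
) -> (Option<String>, Option<&'static str>) {
    let best = candidates
        .iter()
        .filter(|c| !looks_like_static_asset(&c.url))
        .filter(|c| c.source != BootstrapSource::HtmlMeta || c.trustworthy)
        .min_by_key(|c| c.source);
    match best {
        Some(c) => (Some(c.url.clone()), None),
        None => (None, Some("bootstrap unresolved: no trustworthy candidate")),
    }
}
```

Returning the warning alongside an empty result keeps the "fallback empty with warning" case visible to the preview instead of producing a guessed domain.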

- [ ] **Step 3: Replace naive truncation with chunked extraction input**

Update `frontend/scene-generator/llm-client.js` and `frontend/scene-generator/generator-runner.js` so they no longer send only the first `15000/3000` characters. Replace with:

1. directory tree summary
2. `index.html` chunking
3. URL-bearing fragments
4. request-construction fragments
5. branching logic fragments
6. export-related fragments

- [ ] **Step 4: Add analysis preview and risk banner in Web UI**

Update `frontend/scene-generator/sg_scene_generator.html` and `frontend/scene-generator/server.js` to preview:

- detected archetype
- bootstrap
- key endpoints
- extracted params
- workflow steps
- confidence and risk notes

- [ ] **Step 5: Verify with marketing and tq reference scenes**

Run local analysis against the two reference scenes and confirm:

- `marketing-zero-consumer-report` no longer resolves the wrong domain
- `tq-lineloss-report` still identifies mode-related structures
- generated preview no longer shows generic hardcoded report params

- [ ] **Step 6: Commit**

```bash
git add frontend/scene-generator/llm-client.js frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/generator.rs
git commit -m "fix(generator): remove hardcoded report defaults and improve bootstrap extraction"
```

---

### Task 2: Introduce Unified Scene IR

**Files:**

- Create: `src/generated_scene/ir.rs`
- Modify: `src/generated_scene/generator.rs`
- Modify: `src/bin/sg_scene_generate.rs`
- Modify: `frontend/scene-generator/server.js`
- Modify: `frontend/scene-generator/llm-client.js`

**Goal:** Introduce a single intermediate representation that all extraction and compilation stages use.

- [ ] **Step 1: Add Rust `Scene IR` structs**

Create `src/generated_scene/ir.rs` with serde-enabled structs for:

- `SceneIr`
- `BootstrapIr`
- `ParamIr`
- `ModeIr`
- `WorkflowStepIr`
- `ArtifactContractIr`
- `NormalizeRulesIr`
- `ReadinessIr`
- `EvidenceIr`

Minimum top-level fields:

```json
{
  "sceneId": "",
  "sceneName": "",
  "sceneKind": "",
  "workflowArchetype": "",
  "bootstrap": {},
  "params": [],
  "modes": [],
  "workflowSteps": [],
  "requestTemplate": {},
  "responsePath": "",
  "normalizeRules": {},
  "artifactContract": {},
  "validationHints": {},
  "evidence": []
}
```

- [ ] **Step 2: Wire `Scene IR` into generator entrypoints**

Update `src/bin/sg_scene_generate.rs` to accept either:

- `--scene-info-json` upgraded to the new IR contract, or
- a new `--scene-ir-json` / `--scene-ir-file` parameter

Keep backward compatibility by translating old scene info into partial IR where needed.

- [ ] **Step 3: Refactor generator to compile from IR**

Update `src/generated_scene/generator.rs` so its internal interfaces no longer directly depend on loosely grouped fields like `expectedDomain`, `staticParams`, and `columnDefs` alone. It should compile from unified `SceneIr`.

- [ ] **Step 4: Update Node server to pass IR through generation**

Modify `frontend/scene-generator/server.js` so analyze endpoints return IR-shaped JSON and generate endpoints pass the same structure into Rust without flattening.

- [ ] **Step 5: Verify serde and CLI compatibility**

Run:

```bash
cargo check
node --check frontend/scene-generator/server.js
node --check frontend/scene-generator/llm-client.js
```

Expected: Rust and Node compile cleanly with the new IR contract.

- [ ] **Step 6: Commit**

```bash
git add src/generated_scene/ir.rs src/generated_scene/generator.rs src/bin/sg_scene_generate.rs frontend/scene-generator/server.js frontend/scene-generator/llm-client.js
git commit -m "feat(generator): introduce unified scene ir for analysis and compilation"
```

---

### Task 3: Build Hybrid Extraction Pipeline

**Files:**

- Modify: `src/generated_scene/analyzer.rs`
- Modify: `frontend/scene-generator/generator-runner.js`
- Modify: `frontend/scene-generator/llm-client.js`
- Modify: `frontend/scene-generator/server.js`

**Goal:** Split extraction into deterministic signal collection plus LLM semantic completion.

- [ ] **Step 1: Implement deterministic extraction helpers**

Add helper logic in `src/generated_scene/analyzer.rs` or adjacent extraction code to detect:

- URLs and request methods
- `contentType`
- request payload builders
- pagination variables such as `page`, `rows`, `pageSize`
- branch variables such as `period_mode`, `reportType`
- entry methods
- export methods
- obvious filter expressions such as `charge !== 0`
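
As a rough illustration of what such helpers might look like, the sketch below scans source text for URLs, pagination variables, and branch variables using only the standard library. The names `DeterministicSignals`, `scan`, and `has_identifier` are hypothetical, and a real implementation would likely need a proper tokenizer rather than this string matching.

```rust
/// Signals collected without an LLM. Field names are illustrative only.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct DeterministicSignals {
    pub urls: Vec<String>,
    pub pagination_vars: Vec<&'static str>,
    pub branch_vars: Vec<&'static str>,
}

fn is_ident_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_' || c == '$'
}

/// True when `name` occurs in `src` as a whole identifier,
/// so `page` does not match inside `homepage` or `pageSize`.
pub fn has_identifier(src: &str, name: &str) -> bool {
    src.match_indices(name).any(|(start, m)| {
        let before = src[..start].chars().next_back();
        let after = src[start + m.len()..].chars().next();
        !before.map_or(false, is_ident_char) && !after.map_or(false, is_ident_char)
    })
}

/// Collect tokens that start with an http(s) scheme, splitting on
/// whitespace, quotes, and brackets.
fn extract_urls(src: &str) -> Vec<String> {
    src.split(|c: char| {
        c.is_whitespace() || matches!(c, '"' | '\'' | '`' | '(' | ')' | '<' | '>')
    })
    .filter(|t| t.starts_with("http://") || t.starts_with("https://"))
    .map(|t| t.to_string())
    .collect()
}

pub fn scan(src: &str) -> DeterministicSignals {
    // Variable names taken from the examples in the step above.
    const PAGINATION: [&str; 3] = ["page", "rows", "pageSize"];
    const BRANCH: [&str; 2] = ["period_mode", "reportType"];
    DeterministicSignals {
        urls: extract_urls(src),
        pagination_vars: PAGINATION
            .iter()
            .copied()
            .filter(|v| has_identifier(src, v))
            .collect(),
        branch_vars: BRANCH
            .iter()
            .copied()
            .filter(|v| has_identifier(src, v))
            .collect(),
    }
}
```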

- [ ] **Step 2: Create key-fragment selection in Node runner**

Update `frontend/scene-generator/generator-runner.js` to extract and package:

- directory summary
- URL fragments
- branch fragments
- request-body fragments
- response normalization fragments
- export fragments

for LLM analysis.

- [ ] **Step 3: Redesign LLM prompt for workflow understanding**

Update `frontend/scene-generator/llm-client.js` so the prompt explicitly asks for:

- `workflowArchetype`
- `bootstrap`
- `params`
- `modes`
- `workflowSteps`
- `requestTemplate`
- `responsePath`
- `normalizeRules`
- `artifactContract`
- `confidence`
- `uncertainties`

- [ ] **Step 4: Merge deterministic and LLM results**

Implement merge logic in `frontend/scene-generator/server.js` or a dedicated helper:

- deterministic extraction wins for hard facts
- LLM fills missing semantics
- conflicts are surfaced in preview as warnings
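
The three merge rules can be expressed per field, as in this sketch. It is written in Rust for consistency with the backend, although the plan places the merge in the Node server or a dedicated helper; `Merged` and `merge_field` are hypothetical names.

```rust
use std::fmt::Debug;

/// Outcome of merging one field from the two extractors.
#[derive(Debug, PartialEq)]
pub struct Merged<T> {
    pub value: Option<T>,
    /// Set when the two sources disagreed, so the preview can show it.
    pub warning: Option<String>,
}

pub fn merge_field<T: PartialEq + Debug>(
    field: &str,
    deterministic: Option<T>,
    llm: Option<T>,
) -> Merged<T> {
    match (deterministic, llm) {
        // Both present and different: the deterministic value wins, with a warning.
        (Some(d), Some(l)) if d != l => Merged {
            warning: Some(format!(
                "{}: deterministic {:?} overrides llm {:?}",
                field, d, l
            )),
            value: Some(d),
        },
        // Deterministic present (and the LLM agrees or is silent).
        (Some(d), _) => Merged { value: Some(d), warning: None },
        // Only the LLM has something (or neither does): the LLM fills the gap.
        (None, l) => Merged { value: l, warning: None },
    }
}
```

For example, `merge_field("responsePath", Some("data.rows"), Some("result.list"))` keeps `data.rows` and records a warning, while a field found only by the LLM passes through unchanged.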

- [ ] **Step 5: Verify against reference workflows**

Check that:

- `marketing-zero-consumer-report` emits workflow steps including `paginate`, `secondary_request`, `filter`, and `export`
- `tq-lineloss-report` emits `modes`, `defaultMode`, and `modeSwitchField`

- [ ] **Step 6: Commit**

```bash
git add src/generated_scene/analyzer.rs frontend/scene-generator/generator-runner.js frontend/scene-generator/llm-client.js frontend/scene-generator/server.js
git commit -m "feat(generator): add hybrid deterministic and llm workflow extraction"
```

---

### Task 4: Add Workflow Archetype Classification

**Files:**

- Modify: `src/generated_scene/analyzer.rs`
- Modify: `src/generated_scene/ir.rs`
- Modify: `frontend/scene-generator/server.js`
- Modify: `frontend/scene-generator/sg_scene_generator.html`

**Goal:** Reliably classify scenes so the correct compiler path is chosen.

- [ ] **Step 1: Add archetype enum support**

Define and support these initial archetypes:

- `single_request_table`
- `multi_mode_request`
- `paginated_enrichment`
- `page_state_eval`

- [ ] **Step 2: Implement classification rules**

Classification logic should prefer:

1. `multi_mode_request` when explicit mode-switch branching exists
2. `paginated_enrichment` when paginated list fetch plus secondary requests are detected
3. `page_state_eval` when page-state judgment dominates
4. `single_request_table` as fallback with lower confidence
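
The preference order above maps naturally onto an if/else chain. In this sketch, `ArchetypeSignals` and its boolean fields are hypothetical inputs assumed to come from the deterministic scan, and the confidence numbers are placeholders, not calibrated values.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkflowArchetype {
    SingleRequestTable,
    MultiModeRequest,
    PaginatedEnrichment,
    PageStateEval,
}

/// Boolean signals assumed to be produced by earlier extraction (hypothetical names).
#[derive(Debug, Clone, Copy, Default)]
pub struct ArchetypeSignals {
    pub has_mode_switch_branch: bool,
    pub has_paginated_list: bool,
    pub has_secondary_requests: bool,
    pub page_state_dominates: bool,
}

/// Returns the archetype and a coarse confidence in [0, 1].
pub fn classify(s: ArchetypeSignals) -> (WorkflowArchetype, f32) {
    if s.has_mode_switch_branch {
        // Rule 1: explicit mode-switch branching takes precedence.
        (WorkflowArchetype::MultiModeRequest, 0.9)
    } else if s.has_paginated_list && s.has_secondary_requests {
        // Rule 2: both pagination and secondary requests are required.
        (WorkflowArchetype::PaginatedEnrichment, 0.9)
    } else if s.page_state_dominates {
        // Rule 3.
        (WorkflowArchetype::PageStateEval, 0.8)
    } else {
        // Rule 4: fallback, deliberately lower confidence.
        (WorkflowArchetype::SingleRequestTable, 0.5)
    }
}
```

Because the rules are ordered, a scene that shows both mode switching and pagination is classified as `multi_mode_request`; whether that tie-break is right for every scene is a question the reference fixtures would need to settle.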

- [ ] **Step 3: Expose classification confidence**

Add confidence and evidence fields to the preview payload so UI can show why a scene was classified into an archetype.

- [ ] **Step 4: Add manual override support in UI**

Allow users to override archetype in `frontend/scene-generator/sg_scene_generator.html` before final generation, but preserve the original detected result and confidence.

- [ ] **Step 5: Verify reference classifications**

Expected:

- `marketing-zero-consumer-report` => `paginated_enrichment`
- `tq-lineloss-report` => `multi_mode_request`

- [ ] **Step 6: Commit**

```bash
git add src/generated_scene/analyzer.rs src/generated_scene/ir.rs frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html
git commit -m "feat(generator): classify scenes by workflow archetype with confidence"
```

---

### Task 5: Split Generator Into Archetype Compilers

**Files:**

- Modify: `src/generated_scene/generator.rs`
- Optionally create: `src/generated_scene/compiler_single_request.rs`
- Optionally create: `src/generated_scene/compiler_multi_mode.rs`
- Optionally create: `src/generated_scene/compiler_paginated_enrichment.rs`
- Optionally create: `src/generated_scene/compiler_page_state.rs`

**Goal:** Replace the single generic report template with explicit compiler paths.

- [ ] **Step 1: Add compiler routing by archetype**

Update `src/generated_scene/generator.rs` so generation dispatches on `workflowArchetype`.
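
A dispatch of this kind might be shaped like the following sketch. The `Archetype` enum, `from_ir`, and the four `compile_*` stand-ins are hypothetical; the stand-ins return placeholder strings where real compilers would emit scripts and `scene.toml` content.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Archetype {
    SingleRequestTable,
    MultiModeRequest,
    PaginatedEnrichment,
    PageStateEval,
}

impl Archetype {
    /// Parse the `workflowArchetype` string from the IR. Unknown values are
    /// an error rather than a silent fallback to a generic template.
    pub fn from_ir(value: &str) -> Result<Self, String> {
        match value {
            "single_request_table" => Ok(Self::SingleRequestTable),
            "multi_mode_request" => Ok(Self::MultiModeRequest),
            "paginated_enrichment" => Ok(Self::PaginatedEnrichment),
            "page_state_eval" => Ok(Self::PageStateEval),
            other => Err(format!("unsupported workflowArchetype: {}", other)),
        }
    }
}

// Placeholder compilers: each would generate archetype-specific output.
fn compile_single_request() -> String {
    "// single request script".to_string()
}
fn compile_multi_mode() -> String {
    "// multi mode script".to_string()
}
fn compile_paginated_enrichment() -> String {
    "// paginated enrichment script".to_string()
}
fn compile_page_state() -> String {
    "// page state script".to_string()
}

/// Route generation by archetype; the exhaustive match forces every new
/// archetype to get its own compiler path.
pub fn compile(workflow_archetype: &str) -> Result<String, String> {
    Ok(match Archetype::from_ir(workflow_archetype)? {
        Archetype::SingleRequestTable => compile_single_request(),
        Archetype::MultiModeRequest => compile_multi_mode(),
        Archetype::PaginatedEnrichment => compile_paginated_enrichment(),
        Archetype::PageStateEval => compile_page_state(),
    })
}
```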

- [ ] **Step 2: Implement `single_request_table` compiler**

Generate:

- minimal `scene.toml`
- direct request browser script
- artifact output for simple table/list data

- [ ] **Step 3: Implement `multi_mode_request` compiler**

Generate:

- mode detection
- mode-specific request builders
- mode-specific column definitions
- mode-specific response extraction
- unified artifact output

Reference target: `tq-lineloss-report`

- [ ] **Step 4: Implement `paginated_enrichment` compiler**

Generate:

- paginated list loop
- per-item or batched secondary requests
- aggregation and transform steps
- business filters
- final artifact or export output

Reference target: `marketing-zero-consumer-report`

- [ ] **Step 5: Implement `page_state_eval` compiler**

Generate:

- state-check script skeleton
- light artifact semantics for monitoring or status checks

- [ ] **Step 6: Verify generated outputs by archetype**

Validate that generated scripts no longer:

- define multiple API endpoints but use only the first
- collapse mode-aware scenes into one request body
- flatten paginated enrichment scenes into one-step normalization

- [ ] **Step 7: Commit**

```bash
git add src/generated_scene/generator.rs src/generated_scene/compiler_*.rs
git commit -m "feat(generator): split scene generation into workflow archetype compilers"
```

---

### Task 6: Align Runtime Resolver Contracts

**Files:**

- Modify: `src/compat/scene_platform/resolvers.rs`
- Modify: `src/generated_scene/generator.rs`
- Modify: `src/generated_scene/ir.rs`
- Modify: `tests/scene_generator_test.rs`

**Goal:** Ensure generated parameter contracts are either executable by the runtime or explicitly flagged as unsupported.

- [ ] **Step 1: Audit current resolver coverage**

Document which current contracts are already supported, including:

- `dictionary_entity`
- `month_week_period`
- `fixed_enum`
- `literal_passthrough`

- [ ] **Step 2: Add missing resolver types or gate them**

Choose one of these paths per parameter type:

1. implement new runtime resolver support
2. downgrade generation to an existing supported resolver
3. block generation with explicit readiness warning

Recommended additions:

- `mode_enum`
- `date_range`
- `org_tree`
- `page_size`
- `hidden_static`
- `derived_param`
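
The "support, downgrade, or block" choice can be made explicit per resolver type, as sketched below. `ResolverDecision`, `decide`, and `can_claim_runnable` are hypothetical names, and the two downgrade mappings are illustrative assumptions only; the plan does not say which new types can be downgraded to which existing resolvers.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ResolverDecision {
    /// The runtime already resolves this type.
    Supported,
    /// Generation falls back to an existing resolver with this name.
    DowngradedTo(&'static str),
    /// Generation must not claim the skill is runnable.
    Blocked(String),
}

/// Resolver types the plan lists as currently supported.
const SUPPORTED: [&str; 4] = [
    "dictionary_entity",
    "month_week_period",
    "fixed_enum",
    "literal_passthrough",
];

pub fn decide(resolver: &str) -> ResolverDecision {
    if SUPPORTED.iter().any(|s| *s == resolver) {
        return ResolverDecision::Supported;
    }
    match resolver {
        // Assumed downgrades, for illustration only.
        "mode_enum" => ResolverDecision::DowngradedTo("fixed_enum"),
        "hidden_static" => ResolverDecision::DowngradedTo("literal_passthrough"),
        // Everything else is blocked until runtime support exists.
        other => ResolverDecision::Blocked(format!(
            "resolver `{}` is not supported by the runtime",
            other
        )),
    }
}

/// A skill may claim runnable readiness only if no param is blocked.
pub fn can_claim_runnable(resolvers: &[&str]) -> bool {
    resolvers
        .iter()
        .all(|r| !matches!(decide(r), ResolverDecision::Blocked(_)))
}
```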

- [ ] **Step 3: Reflect runtime compatibility in generated metadata**

Generated output should clearly indicate:

- supported params
- unresolved params
- manual-completion requirements

- [ ] **Step 4: Add tests for resolver alignment**

Extend tests to ensure a generated skill cannot claim runnable readiness when its params require unsupported resolver behavior.

- [ ] **Step 5: Commit**

```bash
git add src/compat/scene_platform/resolvers.rs src/generated_scene/generator.rs src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "feat(runtime): align generated scene contracts with resolver support"
```

---

### Task 7: Add Readiness Gates And Generation Report

**Files:**

- Modify: `frontend/scene-generator/server.js`
- Modify: `frontend/scene-generator/sg_scene_generator.html`
- Modify: `src/generated_scene/ir.rs`
- Modify: `src/generated_scene/generator.rs`

**Goal:** Make generation output self-describing so users know whether a skill is ready for internal-network trial.

- [ ] **Step 1: Add static readiness checks**

Implement checks for:

- entrypoint detection
- request-chain completeness
- bootstrap plausibility
- param/runtime compatibility
- archetype compiler completeness

- [ ] **Step 2: Add readiness levels**

Define:

- `A` = ready for direct internal-network trial
- `B` = structurally correct, human review recommended
- `C` = draft only, manual completion required
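
One possible mapping from the five static checks to the three levels is sketched here. The names `StaticChecks`, `ReadinessLevel`, and `grade` are hypothetical, and the grading rule itself (structural checks decide between `C` and the rest; bootstrap and param compatibility decide between `A` and `B`) is an assumption that the plan leaves open.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReadinessLevel {
    A, // ready for direct internal-network trial
    B, // structurally correct, human review recommended
    C, // draft only, manual completion required
}

/// Results of the static readiness checks from Step 1.
#[derive(Debug, Clone, Copy)]
pub struct StaticChecks {
    pub entrypoint_detected: bool,
    pub request_chain_complete: bool,
    pub bootstrap_plausible: bool,
    pub params_runtime_compatible: bool,
    pub compiler_complete: bool,
}

pub fn grade(c: StaticChecks) -> ReadinessLevel {
    // Assumed rule: these three checks establish structural correctness.
    let structural =
        c.entrypoint_detected && c.request_chain_complete && c.compiler_complete;
    if !structural {
        ReadinessLevel::C
    } else if c.bootstrap_plausible && c.params_runtime_compatible {
        ReadinessLevel::A
    } else {
        ReadinessLevel::B
    }
}
```

Under this rule a scene never reaches `A` with an incompatible param contract, which matches the intent of not labeling a skill runnable when the runtime cannot execute it.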
|
||||||
|
|
||||||
|
- [ ] **Step 3: Generate human-readable report**
|
||||||
|
|
||||||
|
Each analysis or generation result should include:
|
||||||
|
|
||||||
|
- archetype
|
||||||
|
- confidence
|
||||||
|
- key evidence
|
||||||
|
- detected risks
|
||||||
|
- missing pieces
|
||||||
|
- readiness level
|
||||||
|
|
||||||
|
- [ ] **Step 4: Display readiness in Web UI**
|
||||||
|
|
||||||
|
Show the readiness grade before generation and after generation, with explicit warnings for internal-network execution risk.
|
||||||
|
|
||||||
|
- [ ] **Step 5: Verify readiness outcomes**
|
||||||
|
|
||||||
|
Expected baseline:
|
||||||
|
|
||||||
|
- `tq-lineloss-report` should reach `A` or high-confidence `B`
|
||||||
|
- `marketing-zero-consumer-report` should not be labeled runnable unless pagination and secondary-request logic are correctly represented

- [ ] **Step 6: Commit**

```bash
git add frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/ir.rs src/generated_scene/generator.rs
git commit -m "feat(generator): add readiness grading and generation risk reporting"
```

---

### Task 8: Add Regression Coverage For Reference Scenes

**Files:**
- Modify: `tests/scene_generator_test.rs`
- Create/Modify: scene generator fixtures as needed

**Goal:** Lock in the two reference scenes as ongoing regression cases.

- [ ] **Step 1: Add marketing classification fixture coverage**

Test that the marketing source scene is classified as `paginated_enrichment` and contains evidence for:

- paginated list request
- secondary request
- filter rule
- export step

- [ ] **Step 2: Add tq classification fixture coverage**

Test that the tq source scene is classified as `multi_mode_request` and contains evidence for:

- month mode
- week mode
- distinct request templates
- distinct column definitions

- [ ] **Step 3: Add generation-shape assertions**

Assert that generated outputs differ by archetype and do not collapse to a single generic template shape.
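
One way to phrase the generation-shape assertion is to compare the ordered step kinds each archetype produces. This is only a sketch: the archetype names come from this plan, but `workflow_shape` and most of its step names are hypothetical stand-ins for whatever the real generator emits.

```rust
use std::collections::HashSet;

/// The four archetypes named in this plan's acceptance criteria.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Archetype {
    SingleRequestTable,
    MultiModeRequest,
    PaginatedEnrichment,
    PageStateEval,
}

/// Hypothetical "shape" of a generated workflow: its ordered step kinds.
pub fn workflow_shape(a: Archetype) -> Vec<&'static str> {
    match a {
        Archetype::SingleRequestTable => vec!["request", "export"],
        Archetype::MultiModeRequest => vec!["select_mode", "request", "export"],
        Archetype::PaginatedEnrichment => {
            vec!["paginate", "secondary_request", "filter", "export"]
        }
        Archetype::PageStateEval => vec!["eval_page_state", "export"],
    }
}

/// True when no two of the given (distinct) archetypes share a shape.
pub fn shapes_are_distinct(archetypes: &[Archetype]) -> bool {
    let shapes: HashSet<Vec<&'static str>> =
        archetypes.iter().map(|a| workflow_shape(*a)).collect();
    shapes.len() == archetypes.len()
}
```

A regression test could call `shapes_are_distinct` over every archetype so that a refactor which funnels all scenes through one generic template fails immediately.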

- [ ] **Step 4: Run verification**

```bash
cargo test --test scene_generator_test -- --nocapture
```

Expected: both reference cases pass and guard against regression.

- [ ] **Step 5: Commit**

```bash
git add tests/scene_generator_test.rs tests/fixtures
git commit -m "test(generator): add regression coverage for marketing and tq reference scenes"
```

---

## Delivery Sequence

Recommended implementation order:

1. Task 1: hard failure fixes
2. Task 2: `Scene IR`
3. Task 3: hybrid extraction
4. Task 4: archetype classification
5. Task 5: compiler split
6. Task 6: resolver alignment
7. Task 7: readiness gates
8. Task 8: regression coverage

Rationale:

- Task 1 stops current bad outputs early.
- Tasks 2 to 5 establish the new compiler backbone.
- Tasks 6 and 7 prevent false claims of runnability.
- Task 8 locks the new architecture against regression.

---

## Verification Strategy

### Static Verification

- `cargo check`
- `cargo test --test scene_generator_test -- --nocapture`
- `node --check frontend/scene-generator/llm-client.js`
- `node --check frontend/scene-generator/generator-runner.js`
- `node --check frontend/scene-generator/server.js`

### Functional Verification

For `marketing-zero-consumer-report`:

- detected as `paginated_enrichment`
- bootstrap resolves to business domain, not static script host
- generated workflow includes pagination and secondary requests
- generation is not labeled runnable if those steps are missing

For `tq-lineloss-report`:

- detected as `multi_mode_request`
- month and week logic remain distinct
- request templates and column definitions are mode-specific

### UI Verification

Confirm the scene generator UI now shows:

- detected archetype
- confidence
- bootstrap
- key params
- readiness grade
- risk notes

---

## Acceptance Criteria

This plan is complete when all of the following are true:

1. `sg_scene_generate` consumes a unified `Scene IR`.
2. The analysis pipeline can distinguish at least `single_request_table`, `multi_mode_request`, `paginated_enrichment`, and `page_state_eval`.
3. `tq-lineloss-report` is generated through the multi-mode compiler path.
4. `marketing-zero-consumer-report` is generated through the paginated-enrichment compiler path.
5. Generated `scene.toml` no longer injects unrelated default org/period assumptions.
6. Bootstrap resolution no longer mistakes external script hosts for business target domains.
7. Runtime resolver compatibility is explicit, not implicit.
8. Generation results include readiness grading and risk reporting before internal-network trial.

@@ -0,0 +1,193 @@
# G1 Boundary Convergence and Family Reassignment Implementation Plan

> Date: 2026-04-18
> Status: Draft
> Source:
> - `docs/superpowers/specs/2026-04-17-scene-skill-60-to-90-roadmap-design.md`
> - `examples/g1_batch_round1/`

## 1. Plan Intent

This plan addresses the overly broad boundary of the `G1` generic single-page report group.

Measured runs and structural analysis of the following 4 boundary samples confirm that the current `G1` classification admits scenes that do not belong in it:

1. `高低压新增报装容量月度统计表` (monthly statistics of newly installed high/low-voltage capacity)
2. `电能表现场检验完成率指标报表` (on-site electricity meter inspection completion-rate report)
3. `计量资产库存统计` (metering asset inventory statistics)
4. `95598供电服务月报` (95598 power supply service monthly report)

The conclusion is not "keep observing" but "rectification is required":

1. The `G1` definition must be tightened.
2. These 4 samples must be reassigned.
3. Subsequent implementation must follow the new boundary; the 4 samples can no longer be lumped into one class.

## 2. Rectification Objective

The objectives of this rectification round are fixed as:

1. Tighten the `G1` definition so that it stops polluting `single_request_table`.
2. Reassign the 4 boundary samples to the correct families.
3. Give subsequent implementation an explicit order, and stop treating boundary samples as "generic reports".

## 3. Final Reassignment Decision

While this plan is executed, the formal classification of the 4 samples is fixed as follows:

1. `高低压新增报装容量月度统计表`
   - Stays in `G1`
   - Subtype: `G1-E`, light-enrichment summary type
2. `电能表现场检验完成率指标报表`
   - Split out of `G1`
   - New family: `G6`, host-bridged multi-step query type
3. `计量资产库存统计`
   - Split out of `G1`
   - New family: `G7`, multi-endpoint inventory summary type
4. `95598供电服务月报`
   - Split out of `G1`
   - New family: `G8`, scrape, persist, analyze, and document-output type

## 4. Scope Guardrails

The boundaries of this plan are fixed as follows:

1. Do not modify the line-loss family `G2`.
2. Do not extend the reassignment to all `102` scenes at once.
3. Handle only the `G1` boundary definition and these 4 boundary samples.
4. Do not implement the full `G6/G7/G8` capabilities inside this plan.
5. Prioritize producing "boundary convergence + family reassignment + implementation order".

## 5. Phase Overview

The execution order is fixed as:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3`

### Phase 0: Freeze the Rectification Criteria

Goals:

1. Freeze the revised `G1` definition.
2. Freeze the formal reassignment decision for the 4 samples.

Exit criteria:

1. The 4 samples are no longer discussed together as `G1` candidates.

### Phase 1: Tighten the G1 Boundary

Goals:

1. Narrow `G1` explicitly to "generic single-page reports".
2. List the structural features that do not belong to `G1` as explicit exclusion conditions.

Required deliverables:

1. Revised `G1` definition
2. `G1` entry conditions
3. `G1` exclusion conditions
4. A description of `G1-E` as the upper-boundary subtype

Exit criteria:

1. `single_request_table` no longer absorbs host-bridged, inventory, or persist-and-analyze scenes.

### Phase 2: Sample Reassignment and Family Registration

Goals:

1. Formally move the 4 samples to their families.
2. Establish minimal definitions for `G6/G7/G8`.

Required deliverables:

1. Sample reassignment table
2. Minimal `G6` definition
3. Minimal `G7` definition
4. Minimal `G8` definition

Exit criteria:

1. None of the 4 samples remains in an "ambiguous G1 candidate" state.

### Phase 3: Fix the Follow-Up Implementation Order

Goals:

1. Decide the follow-up development order.
2. Avoid spreading work across several families concurrently.

Fixed order:

1. First, continue with `高低压新增报装容量月度统计表`
   - as `G1-E`
2. Then open `G6` on its own
   - `电能表现场检验完成率指标报表`
3. Then evaluate `G7`
   - `计量资产库存统计`
4. Finally, evaluate `G8`
   - `95598供电服务月报`

Exit criteria:

1. The order of follow-up tasks is explicit.
2. `G1` no longer absorbs new boundary samples.
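
## 6. Family-Level Rectification Rules

### 6.1 Revised G1 Rules

`G1` retains only scenes that meet all of the following:

1. Single system, hosted on a single page
2. A reasonably clear main request chain exists
3. The request template and response path can be recovered directly
4. The final result is a single table or a single statistical summary
5. No dependence on complex host bridging
6. No dependence on local persistence or SQL analysis

### 6.2 G1 Exclusion Rules

A scene is no longer assigned to `G1` if any one of the following features is present:

1. `BrowserAction / sgBrowserExcuteJsCode` drives the business requests
2. The workflow is visibly chained through multiple rounds of callbacks
3. Several business endpoints are scanned by category within the same scene
4. The report requires local persistence, secondary analysis, or SQL aggregation beforehand
5. The output is primarily a Word document pipeline rather than a direct table result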
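
The five exclusion rules in section 6.2 amount to an any-of check. A minimal sketch follows, assuming the analyzer can reduce each rule to a boolean signal; the struct, its field names, and the reason strings are hypothetical and not taken from the codebase.

```rust
/// Hypothetical boolean signals, one per G1 exclusion rule in section 6.2.
#[derive(Debug, Clone, Copy, Default)]
pub struct G1ExclusionSignals {
    /// `BrowserAction / sgBrowserExcuteJsCode` drives the business requests.
    pub host_bridge_drives_requests: bool,
    /// Workflow is chained through multiple rounds of callbacks.
    pub chained_callback_workflow: bool,
    /// Several business endpoints are scanned by category.
    pub multi_endpoint_typed_scan: bool,
    /// Local persistence, secondary analysis, or SQL aggregation is required.
    pub needs_local_persistence_or_sql: bool,
    /// Output is primarily a Word document pipeline.
    pub word_document_pipeline_output: bool,
}

/// Lists every exclusion rule that fired; an empty list means "still eligible for G1".
pub fn g1_exclusion_reasons(s: &G1ExclusionSignals) -> Vec<&'static str> {
    let checks = [
        (s.host_bridge_drives_requests, "host bridge drives business requests"),
        (s.chained_callback_workflow, "multi-round callback workflow"),
        (s.multi_endpoint_typed_scan, "multiple endpoints scanned by category"),
        (s.needs_local_persistence_or_sql, "local persistence or SQL analysis required"),
        (s.word_document_pipeline_output, "Word document pipeline output"),
    ];
    checks
        .iter()
        .filter(|(hit, _)| *hit)
        .map(|(_, reason)| *reason)
        .collect()
}
```

Returning the reasons rather than a bare boolean keeps the reassignment explainable, which fits the plan's goal of ending repeated debates about boundary membership.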

## 7. Implementation Priority

Priorities are fixed as follows:

1. `P0`
   - `高低压新增报装容量月度统计表`
   - Goal: verify whether `G1-E` holds up as a stable upper boundary of `G1`
2. `P1`
   - `电能表现场检验完成率指标报表`
   - Goal: verify the minimal workflow definition of `G6`
3. `P2`
   - `计量资产库存统计`
   - Goal: verify the multi-endpoint aggregation boundary of `G7`
4. `P3`
   - `95598供电服务月报`
   - Goal: verify the boundary of the scrape, persist, and analyze chain of `G8`

## 8. Deliverables

On completion, this plan produces at least:

1. Revised `G1` boundary text
2. Reassignment table for the 4 boundary samples
3. Minimal family definitions for `G6/G7/G8`
4. Follow-up implementation priority list

## 9. Completion Criteria

This plan is complete when:

1. The `G1` definition has been formally tightened.
2. The 4 boundary samples have been formally reassigned.
3. `高低压新增报装容量月度统计表` has been confirmed as `G1-E`.
4. `电能表现场检验完成率指标报表`, `计量资产库存统计`, and `95598供电服务月报` are no longer used as `G1` samples.
5. The follow-up development order is fixed, and boundary membership is no longer debated repeatedly.

@@ -0,0 +1,212 @@
# G1-E Light Enrichment Report Plan

> Date: 2026-04-18
> Status: Draft
> Source:
> - `docs/superpowers/specs/2026-04-18-g1-e-light-enrichment-report-design.md`
> - `docs/superpowers/specs/2026-04-17-scene-skill-60-to-90-roadmap-design.md`
> - `docs/superpowers/reports/2026-04-18-g1-boundary-reassignment-report.md`

## 1. Plan Intent

This plan moves `G1-E`, the light-enrichment summary type, from a conceptual boundary to an implementable state.

This round solves exactly one problem:

1. For scenes shaped as "one main request + a few enrichment lookups + a single summary output", let the generator recover compilable three-part business semantics.

This plan does not handle `G6/G7/G8`, and does not extend to other families.

## 2. Scope

Only three kinds of work are in scope:

1. Completing the `G1-E` evidence layer
2. Landing the three-part `Scene IR` and compiler gates for `G1-E`
3. P0 sample validation of `高低压新增报装容量月度统计表` (monthly statistics of newly installed high/low-voltage capacity)

Explicitly out of scope:

1. `G6`, host-bridged multi-step query type
2. `G7`, multi-endpoint inventory summary type
3. `G8`, scrape, persist, analyze, and document-output type
4. Large-scale family expansion across the `102` scenes

## 3. Fixed Sample

The only P0 sample for this plan is fixed as:

1. `高低压新增报装容量月度统计表`

The frozen target for this sample is:

1. Main request: `getWkorderAll`
2. Enrichment requests:
   - `queryElectCustInfo`
   - `queryBusAcpt`
   - `getBatchPerCust97`
3. Final recovery into a three-part structure: main request, enrichment requests, and merge-back rules

No second `G1-E` sample is added until this plan is complete.

## 4. Phase Overview

The execution order is fixed as:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3`

### Phase 0: Freeze Contract

Goals:

1. Freeze the minimal `G1-E` definition.
2. Freeze the target definitions of the P0 sample's main chain, enrichment chain, and merge-back chain.

Required deliverables:

1. `G1-E` spec
2. Description of the P0 sample's target structure
3. Failure classification criteria

Exit criteria:

1. Subsequent implementation no longer falls back to plain `G1 single_request_table`.

### Phase 1: Evidence Layer Completion

Goals:

1. Let the extraction pipeline emit explicit `main_request` evidence.
2. Let the extraction pipeline emit explicit `enrichment_request` evidence.
3. Let the extraction pipeline emit explicit `merge_plan` evidence.

Required deliverables:

1. `main_request` evidence schema
2. `enrichment_request` evidence schema
3. `merge_plan` evidence schema
4. The corresponding out-of-scope detection signals

Exit criteria:

1. The P0 sample no longer lands only in `page_state_eval`.
2. Candidates for the main request, enrichment requests, and merge-back rules are visible in the extraction result.

### Phase 2: Scene IR And Compiler Gates

Goals:

1. Carry the three-part structure in `Scene IR`.
2. Add `G1-E`-specific gates to the compiler.
3. Prevent results that lack an enrichment contract from being misjudged as plain `G1` successes.

Required deliverables:

1. `main_request`
2. `enrichment_requests[]`
3. `merge_plan`
4. `main_request_resolved`
5. `enrichment_requests_resolved`
6. `merge_plan_resolved`
7. `g1e_scope_compatible`

Exit criteria:

1. `G1-E` can be determined independently of `single_request_table`.
2. Out-of-scope samples are blocked rather than reported as false successes.
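
The Phase 2 deliverables can be sketched as one IR struct plus four gates. The field and gate names below follow the list above; everything else (the `RequestSpec` and `MergePlan` shapes, the `out_of_scope_signals` field, and the rule that at least one enrichment request must be present) is an assumption made for illustration.

```rust
/// Hypothetical minimal request description.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RequestSpec {
    pub endpoint: String,
}

/// Hypothetical merge-back rule: the key used to join enrichment rows onto main rows.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MergePlan {
    pub join_key: String,
}

/// Three-part `G1-E` structure as it might be carried in the Scene IR.
#[derive(Debug, Clone, Default)]
pub struct G1eSceneIr {
    pub main_request: Option<RequestSpec>,
    pub enrichment_requests: Vec<RequestSpec>,
    pub merge_plan: Option<MergePlan>,
    /// Assumed field: signals that point at G6/G7/G8 instead of G1-E.
    pub out_of_scope_signals: Vec<String>,
}

/// The four gates named in the Phase 2 deliverables.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct G1eGates {
    pub main_request_resolved: bool,
    pub enrichment_requests_resolved: bool,
    pub merge_plan_resolved: bool,
    pub g1e_scope_compatible: bool,
}

impl G1eGates {
    pub fn evaluate(ir: &G1eSceneIr) -> Self {
        G1eGates {
            main_request_resolved: ir.main_request.is_some(),
            enrichment_requests_resolved: !ir.enrichment_requests.is_empty(),
            merge_plan_resolved: ir.merge_plan.is_some(),
            g1e_scope_compatible: ir.out_of_scope_signals.is_empty(),
        }
    }

    /// Fail-closed: compilation proceeds only when every gate passes.
    pub fn all_pass(&self) -> bool {
        self.main_request_resolved
            && self.enrichment_requests_resolved
            && self.merge_plan_resolved
            && self.g1e_scope_compatible
    }
}
```

Keeping the gates as a separate value makes it straightforward to report which gate blocked a sample, instead of silently degrading to plain `single_request_table`.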

### Phase 3: P0 Validation

Goals:

1. Use `高低压新增报装容量月度统计表` to validate the minimal `G1-E` closed loop.
2. Freeze the first acceptance baseline.

Required deliverables:

1. P0 sample generation result
2. P0 sample validation record
3. P0 sample failure attribution record

Exit criteria:

1. The main request, enrichment requests, and merge-back rules are all recovered reliably.
2. The result is no longer an empty shell of `params=[] / requestEntries=[] / columnDefs=[]`.
3. The generator fails closed when evidence is missing.

## 5. Work Breakdown

### Task Group A: G1-E Evidence Modeling

Task goals:

1. Define the main-request evidence object.
2. Define the enrichment-request evidence object.
3. Define the merge-back rule evidence object.
4. Specify the signals that indicate a scene has crossed into `G6/G7/G8`.

Done when:

1. `G1-E` no longer depends on a fuzzy judgment of "does the full text look like a report".

### Task Group B: G1-E IR / Compiler Integration

Task goals:

1. Build the three-part `Scene IR` for `G1-E`.
2. Add the `G1-E` gates.
3. Close the path where "enrichment is missing but the scene still succeeds as plain G1".

Done when:

1. The success conditions of `G1-E` and `G1` are formally separated.

### Task Group C: P0 Sample Verification

Task goals:

1. Regenerate `高低压新增报装容量月度统计表`.
2. Check that the main request, enrichment requests, and merge-back rules are complete.
3. Produce a validation report.

Done when:

1. `高低压新增报装容量月度统计表` becomes the first standard `G1-E` sample.

## 6. Deliverables

On completion, this plan produces at least:

1. `G1-E` evidence layer implementation
2. Three-part `Scene IR` for `G1-E`
3. `G1-E` compiler gates
4. P0 generation and validation results for `高低压新增报装容量月度统计表`
5. The corresponding rectification or validation report

## 7. Acceptance Criteria

This plan is complete when:

1. `G1-E` has moved from a documented definition to an implementable, verifiable state.
2. `高低压新增报装容量月度统计表` is no longer mistakenly generated as an empty-shell plain `G1` skill.
3. The generator can explicitly recover:
   - the main request
   - the enrichment requests
   - the merge-back rules
4. When evidence is insufficient or the structure is out of scope, the system blocks generation and explains why.

## 8. Execution Guardrails

The following boundaries must be respected during execution:

1. Do not mix `G6/G7/G8` capabilities into `G1-E` ahead of time.
2. Do not add a second `G1-E` sample.
3. Do not loosen the gates just to "get a skill generated first".
4. Do not let `G1-E` degrade back into plain `single_request_table`.

## 9. Next Plan

After this plan is complete, the follow-up order is fixed as:

1. If the `G1-E` P0 validation passes, decide whether to add a second `G1-E` sample.
2. Then move on to a separate spec and plan for `G6`.

304
docs/superpowers/plans/2026-04-18-g2-family-expansion-plan.md
Normal file
@@ -0,0 +1,304 @@
# G2 Family Expansion Rectification Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Inputs:**
> [2026-04-18-g2-remediation-plan.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/plans/2026-04-18-g2-remediation-plan.md)
> [2026-04-18-g2-second-round-remediation-report.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/reports/2026-04-18-g2-second-round-remediation-report.md)

## Plan Intent

This plan picks up the results of the previous `G2` primary-sample rectification and moves the goal from "get the `tq` primary sample working" to "extend coverage of `G2` family variants".

The previous round established:

1. `台区线损大数据-月_周累计线损率统计分析` (station-area line-loss big data, monthly/weekly cumulative line-loss rate analysis) can enter the candidate-validation list.
2. The `G2` primary-sample chain is now compilable.
3. The two remaining real samples still consistently `fail-close`:
   - `白银线损周报` (Baiyin line-loss weekly report)
   - `线损同期差异报表` (line-loss same-period difference report)

The core goal of this plan is therefore not to redo the previous primary-sample rectification, but to add recognition and contract recovery for the two remaining variant classes inside the `G2` family.

## Success Baseline

The minimum success criteria after this plan completes are fixed as:

1. `白银线损周报` is no longer blocked outright because the `G2` contract is missing.
2. `线损同期差异报表` is no longer forced into the `tq` primary-report template.
3. The generator can clearly distinguish at least two additional `G2` family subtypes.
4. Each new subtype has its own minimal explainable contract.
5. Samples with insufficient evidence continue to `fail-close`.
6. Readiness stays consistent with "whether the sample reached the candidate-validation list".
7. A third-round `G2` family expansion regression report is produced.

## Scope Guardrails

The following boundaries stay unchanged during execution:

1. Do not switch to `G1`.
2. Do not switch to `G3`.
3. Do not open up unified login, hidden-domain login, or host transport refactoring.
4. Do not extend to all 102 scenes.
5. Do not let this plan grow into a rewrite of the generic scene skill platform.
6. Do not overturn the criteria already settled for the `tq` primary sample in the previous round.

## Target Samples

This plan is executed only around the following three real `G2` family samples:

1. `台区线损大数据-月_周累计线损率统计分析`
2. `白银线损周报`
3. `线损同期差异报表`

Their roles are:

1. `台区线损大数据-月_周累计线损率统计分析`
   Role: `G2-A` primary-sample baseline; must not regress
2. `白银线损周报`
   Role: `G2-B` weekly-report, single-sided mode variant
3. `线损同期差异报表`
   Role: `G2-C` hybrid linkage variant

## Family Expansion Hypothesis

Based on the previous round's report, this plan first narrows the `G2` family to three classes:

1. `G2-A`
   Definition: `tq` primary-report type, with stable `month/week + cols1/cols2 + mode-specific request/response`
   Current status: already on the candidate-validation list
2. `G2-B`
   Definition: weekly-report variant leaning on a single mode; it has `week/tjzq` and the main line-loss endpoint, but lacks a dual-mode column contract on par with the primary sample
   Current representative: `白银线损周报`
3. `G2-C`
   Definition: hybrid variant in which the main line-loss chain is linked with an external system, so line-loss endpoints and linkage endpoints coexist
   Current representative: `线损同期差异报表`
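
The three-way split above can be sketched as a small classifier over boolean signals. This is an illustration under stated assumptions: the `G2Signals` fields are hypothetical reductions of the definitions above, and checking the hybrid case first is a design choice of this sketch, not something the plan prescribes.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum G2Subtype {
    /// `tq` primary-report type with a full dual-mode contract.
    A,
    /// Weekly-report variant leaning on a single mode.
    B,
    /// Hybrid variant: line-loss chain plus external linkage endpoints.
    C,
}

/// Hypothetical signals extracted from a G2 scene.
#[derive(Debug, Clone, Copy, Default)]
pub struct G2Signals {
    pub has_lineloss_endpoint: bool,
    pub has_month_mode: bool,
    /// `week` or `tjzq` evidence.
    pub has_week_mode: bool,
    /// Distinct column sets per mode, e.g. `cols1/cols2`.
    pub has_dual_column_sets: bool,
    pub has_external_linkage_endpoint: bool,
}

/// Returns `None` when the evidence fits no subtype, so the caller can fail closed.
pub fn classify_g2(s: &G2Signals) -> Option<G2Subtype> {
    if !s.has_lineloss_endpoint {
        return None;
    }
    // Checked first so that linkage endpoints never leak into the G2-A template.
    if s.has_external_linkage_endpoint {
        return Some(G2Subtype::C);
    }
    if s.has_month_mode && s.has_week_mode && s.has_dual_column_sets {
        return Some(G2Subtype::A);
    }
    if s.has_week_mode {
        return Some(G2Subtype::B);
    }
    None
}
```

Returning `Option` keeps the fail-close behavior explicit: a scene that matches none of the three definitions is not silently treated as `G2-A`.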

The rectification principles of this plan are:

1. Do not force `G2-B/G2-C` into the shape of `G2-A`.
2. First make the boundaries between the three subtypes hold.
3. Then give each subtype its own minimal contract.

## Workstreams

This plan is split into five workstreams:

1. `WS1` G2 subtype layering and classification convergence
2. `WS2` G2-B weekly-report variant contract completion
3. `WS3` G2-C hybrid linkage variant isolation
4. `WS4` G2 family readiness grading rework
5. `WS5` Third-round real-sample regression and report

## Phase Overview

This plan proceeds in four phases:

1. Phase 0: Freeze the family expansion targets
2. Phase 1: Establish the `G2-A/G2-B/G2-C` subtype boundaries
3. Phase 2: Complete the minimal contracts for `G2-B/G2-C`
4. Phase 3: Regress the three real samples and produce the expansion report

The execution order is fixed as:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3`

## Phase 0: Freeze Expansion Targets

### Objective

Freeze the family facts settled in the previous round, so the problem is not once again framed as "the primary sample was not fixed".

### Tasks

1. Lock in the criteria that `G2-A` already meets.
2. Lock in the direct blockers of `G2-B` and `G2-C`.
3. Lock in that this plan only adds family expansion and does not roll back the primary-sample chain.

### Exit Criteria

1. The `tq` primary sample is treated as the baseline and is no longer a rectification target.
2. The family expansion problem is stated explicitly as "missing variant support".

## Phase 1: Establish Subtype Boundaries

### Objective

Let the system tell `G2-A/G2-B/G2-C` apart instead of sending every `G2` scene down the same path.

### WS1: G2 Subtype Layering and Classification Convergence

#### Task 1

Audit the signal differences among the current real `G2` samples and determine the following boundaries:

1. Which signals belong to `G2-A`
2. Which signals belong to `G2-B`
3. Which signals belong to `G2-C`

#### Task 2

Add subtype classification rules to `G2` that can distinguish at least:

1. Dual-mode primary-report type
2. Weekly-report single-sided mode type
3. Hybrid linkage type

#### Task 3

Add fixtures and regression tests proving that:

1. `G2-A` does not regress.
2. `G2-B` is no longer wrongly fitted to `G2-A`.
3. `G2-C` is no longer wrongly fitted to `G2-A`.

### Phase 1 Exit Criteria

1. The `G2` family can be layered internally.
2. The generation path no longer assumes every `G2` scene is a `tq` primary report.

## Phase 2: Complete Minimal Variant Contracts

### Objective

Establish a "small enough but explainable" contract for each of `G2-B` and `G2-C`.

### WS2: G2-B Weekly-Report Variant Contract Completion

#### Task 4

Define the minimal `G2-B` contract, including at least:

1. The main mode or main period field
2. The corresponding request template
3. The corresponding response path
4. The corresponding column/required fields

#### Task 5

Modify the analyzer / generator / scene IR assembly logic so that `白银线损周报` emits a non-empty contract instead of being blocked outright for a missing contract.

#### Task 6

Add or update tests proving that `G2-B` stands on its own and does not depend on the full `month/week` dual-mode structure.

### WS3: G2-C Hybrid Linkage Variant Isolation

#### Task 7

Audit the following in `线损同期差异报表`:

1. The main line-loss chain
2. The linkage chain to the same-period (同期) system
3. Which part belongs to the primary-report contract

#### Task 8

Establish isolation rules for `G2-C` so that hybrid linkage endpoints do not pollute primary-report generation.

#### Task 9

Define the minimal compilable contract for `G2-C`, allowing:

1. The main chain to enter candidate validation
2. The linkage chain to be kept as risk or extension evidence

rather than mixing everything together and failing outright.

#### Task 10

Add or update tests proving that `G2-C` can at least reliably emit a structured "main chain + linkage risk" result.

### WS4: G2 Family Readiness Grading Rework

#### Task 11

Add subtype-level readiness gates for `G2-A/G2-B/G2-C`.

#### Task 12

Adjust the readiness grading logic to guarantee that:

1. `G2-A` can reach `A` when the full dual-mode contract is satisfied.
2. `G2-B` can reach a candidate-validation grade when its minimal contract is satisfied.
3. `G2-C` still receives an explainable grade when only the main chain closes.
4. Insufficient evidence continues to `fail-close`.

#### Task 13

Add tests proving that readiness no longer judges every `G2` subtype by the `G2-A` standard.

### Phase 2 Exit Criteria

1. `G2-B` has a minimal contract.
2. `G2-C` has a minimal contract after isolation.
3. Readiness is consistent with the subtype criteria.

## Phase 3: Third-Round Real-Sample Regression

### Objective

With the expanded family capabilities in place, regress the three real samples again and issue a formal conclusion.

### WS5: Third-Round Real-Sample Regression and Report

#### Task 14

Regenerate the following three real samples:

1. `台区线损大数据-月_周累计线损率统计分析`
2. `白银线损周报`
3. `线损同期差异报表`

#### Task 15

Compare the following under one consistent set of criteria:

1. Subtype classification
2. bootstrap
3. request contract
4. response / column / normalize contract
5. readiness
6. Whether the sample enters the candidate-validation list

#### Task 16

Produce the third-round `G2` family expansion rectification report, stating at least:

1. Whether `G2-A` remains stable
2. Whether `G2-B` enters the candidate-validation list
3. Whether `G2-C` enters the candidate-validation list or still needs to fail-close
4. Whether the remaining blockers have shifted from "primary sample cannot be generated" to "a few variants still need expansion"

### Deliverables

1. `G2` family expansion regression tests
2. Fixtures for `G2-B/G2-C`
3. Third-round real-sample generation results
4. Third-round `G2` family expansion rectification report

### Acceptance Criteria

1. `G2-A` does not regress.
2. `G2-B` reaches at least an explainable contract or a candidate-validation grade.
3. `G2-C` at least achieves main-chain isolation and is no longer polluted by noise from the whole package.
4. The three samples are no longer forced through a single `G2-A` model.

## File-Level Targets

Executing this plan touches at least the following asset types:

1. analyzer / generator / readiness implementations under `src/generated_scene/`
2. `G2-B/G2-C` fixtures under `tests/fixtures/generated_scene/`
3. Tests under `tests/` related to scene generator / readiness / family regression
4. The third-round family expansion report under `docs/superpowers/reports/`

## Completion Criteria

This plan is complete when:

1. `G2` has moved from "a single primary sample works" to "at least three subtypes are distinguishable".
2. `白银线损周报` and `线损同期差异报表` no longer merely fail-close passively.
3. Any decision to extend to more line-loss variants can be based on the third-round family expansion report.

331
docs/superpowers/plans/2026-04-18-g2-remediation-plan.md
Normal file
@@ -0,0 +1,331 @@
# G2 Family Rectification Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Spec:** [2026-04-18-g2-remediation-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-18-g2-remediation-design.md)

## Plan Intent

This plan breaks the `G2` family rectification design into executable tasks. The goal is to move the line-loss multi-mode report family from "signals are captured but main-chain reconstruction fails" to "at least the first sample reaches the candidate-validation threshold".

This plan is strictly limited to `G2` family rectification and does not extend to:

1. `G1`
2. `G3`
3. Broader migration of real scenes
4. Login recovery, host protocol refactoring, or runtime transport changes

## Success Baseline

The minimum success criteria for the rectification phase are fixed as:

1. `台区线损大数据-月_周累计线损率统计分析` (station-area line-loss big data, monthly/weekly cumulative line-loss rate analysis) no longer collapses into `paginated_enrichment`.
2. At least this sample can generate a `multi_mode_request` structure.
3. `bootstrap` lands on the main line-loss business surface.
4. `modes` recovers at least `month` and `week`.
5. The mode-specific `request/response/column/normalize` contracts are no longer empty.
6. Readiness no longer reports an inflated `A` when core contracts are missing.
7. The sample result reaches the "ready for candidate validation" threshold.

## Scope Guardrails

The following boundaries stay unchanged during execution:

1. Do not switch to executing `G1/G3` samples.
2. Do not add more `G2` observation samples of the same kind.
3. Do not carry out internal-network manual validation in this plan.
4. Do not handle unified login or hidden-domain login recovery in this plan.
5. Do not drift into a generic refactor of the scene skill platform.

## Target Samples

Rectification and regression in this plan cover only the following three `G2` samples:

1. `台区线损大数据-月_周累计线损率统计分析`
2. `白银线损周报` (Baiyin line-loss weekly report)
3. `线损同期差异报表` (line-loss same-period difference report)

Corresponding artifact paths:

1. `examples/real_scene_batch_round1/skills/real-tq-lineloss-report-r1`
2. `examples/real_scene_batch_round1/skills/real-baiyin-lineloss-weekly-r1`
3. `examples/real_scene_batch_round1/skills/real-lineloss-period-diff-r1`

## Workstreams

This plan is split into five workstreams that map one-to-one to the upstream `spec`:

1. `WS1` G2 archetype rectification
2. `WS2` bootstrap rectification
3. `WS3` mode contract reconstruction
4. `WS4` endpoint purification
5. `WS5` readiness tightening

## Phase Overview

This plan proceeds in four phases:

1. Phase 0: Freeze the rectification baseline
2. Phase 1: Fix recognition and selection
3. Phase 2: Rebuild the `G2` contract
4. Phase 3: Regress the real samples and produce the rectification report

The execution order is fixed as:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3`

`Phase 1` comes before `Phase 2` so that template logic is not piled on top of a wrong archetype and a wrong bootstrap.

## Phase 0: Freeze the Rectification Baseline

### Objective

Freeze the current first-round blockers of the `G2` family, the comparison criteria, and the acceptance threshold, so that the boundary does not drift during rectification.

### Tasks

1. Lock in the current failure profile of the three `G2` samples.
2. Lock in `tq-lineloss-report` as the primary anchor reference for `G2`.
3. Lock in the `G2` candidate-validation threshold.
4. Lock in that the rectification phase covers only `G2`.

### Deliverables

1. This plan
2. The existing `G2` blocker summary
3. The existing first-round migration and candidate-validation report

### Exit Criteria

1. No more `G2` observation samples of the same kind are added during execution.
2. "Let's just try it on the internal network first" is no longer used as a substitute for closing the rectification loop.

## Phase 1: Fix Recognition and Selection

### Objective

Fix the `G2` main-chain determination first, resolving the three upstream problems: archetype, bootstrap, and endpoint pollution.

### WS1: G2 Archetype Rectification

#### Task 1

Audit the sources of the current `G2` archetype misclassification and confirm:

1. Which pagination signals are taking over the decision
2. Which mode signals never reach the main decision
3. Where the priority conflict between `multi_mode_request` and `paginated_enrichment` currently lies

#### Task 2

Modify the `G2` archetype decision logic so that the following signals carry more weight within `G2`:

1. `month/week`
2. `mode`
3. `tjzq`
4. Multiple groups of line-loss endpoints within the same scene
5. Mode-switch branch fields
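
One possible way to give the mode signals above more weight is a simple score with a cap on pagination evidence. The sketch below is purely illustrative: the weight of `3`, the cap of `3`, the tie-breaking rule, and the `ArchetypeSignals` struct are assumptions, not the project's decision logic.

```rust
/// Hypothetical signal counts and flags gathered from a G2 scene.
#[derive(Debug, Clone, Copy, Default)]
pub struct ArchetypeSignals {
    pub month_week: bool,
    pub mode: bool,
    pub tjzq: bool,
    pub multiple_lineloss_endpoints: bool,
    pub mode_switch_field: bool,
    /// Number of pagination hints found (page index, page size, and so on).
    pub pagination_hits: u32,
}

/// Each mode signal counts with an assumed weight of 3.
pub fn multi_mode_score(s: &ArchetypeSignals) -> u32 {
    const MODE_WEIGHT: u32 = 3;
    let hits = [
        s.month_week,
        s.mode,
        s.tjzq,
        s.multiple_lineloss_endpoints,
        s.mode_switch_field,
    ];
    hits.iter().filter(|h| **h).count() as u32 * MODE_WEIGHT
}

/// Pagination evidence is capped so that many paging hints cannot outvote mode evidence.
pub fn paginated_score(s: &ArchetypeSignals) -> u32 {
    const PAGINATION_CAP: u32 = 3;
    s.pagination_hits.min(PAGINATION_CAP)
}

/// Ties go to `multi_mode_request`; no evidence at all stays unresolved.
pub fn pick_archetype(s: &ArchetypeSignals) -> &'static str {
    let mm = multi_mode_score(s);
    let pg = paginated_score(s);
    if mm > 0 && mm >= pg {
        "multi_mode_request"
    } else if pg > 0 {
        "paginated_enrichment"
    } else {
        "unresolved"
    }
}
```

With these assumed weights, a scene with `month/week` and `tjzq` evidence stays `multi_mode_request` even when it contains many pagination hints, which is the behavior Task 2 asks for.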

#### Task 3

Add or update regression tests proving that:

1. The current `G2` fixture is no longer classified as `paginated_enrichment`.
2. The `G2` fixes do not break the existing `G3` fixture.

### WS2: Bootstrap Rectification

#### Task 4

Audit the current bootstrap selection logic and determine why all three samples consistently land on `20.77.115.36:31051`.

#### Task 5

Introduce stricter bootstrap selection constraints for `G2`:

1. Prefer the real page that hosts the line-loss business.
2. Exclude page-shell entry points and wrong primary domains.
3. Continue to exclude `localhost:*`, third-party library URLs, and static resource URLs.
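
The third constraint is mechanical enough to sketch. The function below is a minimal illustration under assumptions: `is_bootstrap_candidate`, the `third_party_hosts` deny-list parameter, and the extension list are hypothetical, and the example hosts in the usage are invented. It does not attempt the first two constraints, which need business knowledge about which page really hosts the line-loss workflow, and it does not handle IPv6 literals.

```rust
/// Returns false for URLs that should never be chosen as a bootstrap target:
/// localhost, hosts on a caller-supplied third-party deny list, and static resources.
pub fn is_bootstrap_candidate(url: &str, third_party_hosts: &[&str]) -> bool {
    let lower = url.to_ascii_lowercase();
    // Drop the scheme if present.
    let rest = lower
        .strip_prefix("https://")
        .or_else(|| lower.strip_prefix("http://"))
        .unwrap_or(lower.as_str());
    let host_port = rest.split('/').next().unwrap_or("");
    let host = host_port.split(':').next().unwrap_or("");

    if host.is_empty() || host == "localhost" || host == "127.0.0.1" {
        return false;
    }
    if third_party_hosts.iter().any(|h| h.eq_ignore_ascii_case(host)) {
        return false;
    }

    // Path without query string or fragment.
    let path = rest[host_port.len()..]
        .split(|c: char| c == '?' || c == '#')
        .next()
        .unwrap_or("");
    const STATIC_EXTENSIONS: [&str; 7] =
        [".js", ".css", ".png", ".jpg", ".gif", ".svg", ".woff"];
    !STATIC_EXTENSIONS.iter().any(|ext| path.ends_with(*ext))
}
```

Excluded `localhost:*` URLs would still be worth recording elsewhere as host-dependency evidence, since the plan keeps them for that purpose rather than discarding them.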

#### Task 6

Add or update tests proving that:

1. The `G2` primary sample's bootstrap no longer lands on the wrong entry point.
2. `localhost:*` is still retained only as host-dependency evidence.

### WS4: Endpoint Purification

#### Task 7

Audit the sources of pollution in current endpoint extraction and determine how the following categories end up among the business candidates:

1. Third-party dependency libraries
2. External documentation links
3. Static resource URLs
4. Leftover endpoints from other business systems

#### Task 8

Tighten the filtering and ranking rules for endpoint candidates so that, in `G2` samples:

1. The main line-loss business endpoints rank at the top.
2. External links and dependency library URLs no longer enter the main business candidates.
3. Endpoints from other business systems can no longer easily take over the main chain.

#### Task 9

Add tests proving that:

1. The ranking of the main `G2` endpoint improves noticeably.
2. Noise endpoints no longer pollute the generated main script.

### Phase 1 Exit Criteria

1. `G2` fixture archetype classification is fixed.
2. `G2` bootstrap selection is fixed.
3. `G2` endpoint candidate ranking is fixed.

## Phase 2: Rebuild the G2 Contract

### Objective

Once the main-chain determination is correct, restore the mode-specific contracts that `G2` requires, together with stricter readiness.

### WS3: Mode Contract Reconstruction

#### Task 10

Define the minimal mode contract for `G2`, including at least:

1. `modes[]`
2. `defaultMode`
3. `modeSwitchField`
4. per-mode `requestTemplate`
5. per-mode `responsePath`
6. per-mode `columnDefs`
7. per-mode `normalizeRules`
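
The seven fields of the minimal mode contract map naturally onto two structs and a completeness check. This is a sketch only: the contract field names mirror the list above in snake case, while the `String`-based field types, the `missing` helper, and the choice to let `normalizeRules` be empty without blocking are assumptions of this example.

```rust
/// Per-mode part of the contract (`requestTemplate`, `responsePath`, `columnDefs`, `normalizeRules`).
#[derive(Debug, Clone, Default)]
pub struct ModeContract {
    pub name: String,
    pub request_template: Option<String>,
    pub response_path: Option<String>,
    pub column_defs: Vec<String>,
    /// Assumed to be allowed to stay empty in this sketch.
    pub normalize_rules: Vec<String>,
}

/// Scene-level part of the contract (`modes[]`, `defaultMode`, `modeSwitchField`).
#[derive(Debug, Clone, Default)]
pub struct G2ModeContract {
    pub modes: Vec<ModeContract>,
    pub default_mode: Option<String>,
    pub mode_switch_field: Option<String>,
}

impl G2ModeContract {
    /// Names every missing piece; an empty result means the contract is closed.
    pub fn missing(&self) -> Vec<String> {
        let mut out = Vec::new();
        if self.modes.is_empty() {
            out.push("modes".to_string());
        }
        if self.default_mode.is_none() {
            out.push("defaultMode".to_string());
        }
        if self.mode_switch_field.is_none() {
            out.push("modeSwitchField".to_string());
        }
        for m in &self.modes {
            if m.request_template.is_none() {
                out.push(format!("{}.requestTemplate", m.name));
            }
            if m.response_path.is_none() {
                out.push(format!("{}.responsePath", m.name));
            }
            if m.column_defs.is_empty() {
                out.push(format!("{}.columnDefs", m.name));
            }
        }
        out
    }
}
```

A readiness gate could then refuse a high grade whenever `missing()` is non-empty, and surface the returned names as risk notes.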

#### Task 11

Modify the `Scene IR` assembly or generation logic so that `G2` samples actually emit `modes[]` when evidence is sufficient, instead of keeping only empty-shell default fields.

#### Task 12

Modify the `G2` generated-script template or compile path so that it no longer degrades into the generic:

- `paginate -> secondary_request -> filter`

The generated result must reflect:

1. The `month` mode
2. The `week` mode
3. Request differences between modes
4. Column differences between modes

#### Task 13

Add or update tests proving that:

1. `台区线损大数据-月_周累计线损率统计分析` can emit non-empty `modes`.
2. At least one `G2` fixture recovers a mode-specific contract.

### WS5: Readiness Tightening

#### Task 14

Add or tighten gates for `G2`, covering at least:

1. `g2_archetype_resolved`
2. `g2_bootstrap_resolved`
3. `g2_modes_present`
4. `g2_request_contract_complete`
5. `g2_response_contract_complete`

#### Task 15

Adjust the readiness grading logic so that the following cases no longer receive a high grade:

1. `modes = []`
2. `requestTemplate = null`
3. `columnDefs = []`
4. Misclassified archetype

#### Task 16

Add or update tests proving that:

1. `G2` samples that do not close are downgraded or blocked.
2. Readiness is consistent with the candidate-validation threshold.

### Phase 2 Exit Criteria

1. At least the `G2` primary sample has an explainable mode contract.
2. Readiness is no longer inflated.
3. The `G2` generation result is structurally capable of reaching the candidate threshold.

## Phase 3: Regress Real Samples and Produce the Rectification Report

### Objective

After rectification, regenerate the three real `G2` samples and issue the formal second-round conclusion.

### Tasks

#### Task 17

Regenerate the following three `G2` samples:

1. `台区线损大数据-月_周累计线损率统计分析`
2. `白银线损周报`
3. `线损同期差异报表`

#### Task 18

Using exactly the same criteria as the first round, compare the following items:

1. archetype
2. bootstrap
3. modes
4. request contract
5. response / column / normalize contract
6. readiness

#### Task 19

Produce the post-rectification second-round report, containing at least:

1. Which blockers were fixed
2. Which blockers remain
3. Which samples enter the candidate-validation list
4. Which samples still need to fail closed

### Deliverables

1. Second-round `G2` real-sample generation results
2. Second-round `G2` rectification regression report
3. Updated candidate-validation list

### Acceptance Criteria

1. `台区线损大数据-月_周累计线损率统计分析` at least enters the candidate-validation list.
2. The three samples no longer uniformly collapse into `paginated_enrichment`.
3. Readiness is broadly consistent with how far the real business flow actually closes.

## File-Level Targets

Executing this plan touches at least the following asset types:

1. analyzer / generator / readiness implementations under `src/generated_scene/`
2. `G2` fixtures or canonical assets under `tests/fixtures/generated_scene/`
3. Regression tests under `tests/` related to scene generator / canonical / readiness
4. The second-round rectification report under `docs/superpowers/reports/`

## Completion Criteria

This plan is complete when:

1. The `G2` primary sample reaches the candidate-validation threshold.
2. The `G2` family blockers have moved from "reproduced consistently" to "partially fixed and explainable".
3. Any decision to switch to `G1/G3` can be based on the post-rectification second-round report rather than on the first-round failure profile.

@@ -0,0 +1,458 @@
# G3 Paginated Enrichment Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Spec:** [2026-04-18-g3-paginated-enrichment-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-18-g3-paginated-enrichment-design.md)

## Plan Intent

This plan breaks the `G3` paginated-enrichment family design into executable tasks. The goal is to move the current `paginated_enrichment` from a "broad label for complex workflows" to a formal mainline archetype that has an evidence layer, a minimal contract, a canonical baseline, and fail-closed adjudication.

This plan is strictly limited to landing `G3 / P0-3` and does not extend to:

1. `G6/G7/G8`
2. Concurrent remediation of the entire `95598` family
3. Login recovery or host transport refactoring
4. Large-scale rollout across the 102 scenes

## Success Baseline

After this plan completes, the minimum success criteria are fixed as:

1. `95598工单明细表` (95598 work-order detail table) is no longer just a vague "complex work-order" sample
2. The generation chain can explicitly recover:
   - `main request`
   - `pagination plan`
   - `enrichment requests`
   - `export plan`
3. `localhost:*`, host injection, and BrowserAction are no longer misjudged as the business main chain
4. `G3` has a minimal compilable contract and an independent gate
5. When evidence is insufficient, the result is reliably `fail-closed`
6. `95598、12398、流程超期风险工单明细` can enter reuse validation as the first extension sample

## Scope Guardrails

The following boundaries stay unchanged throughout execution:

1. Do not regress `G3` to ordinary paginated-table recognition
2. Do not mix host-bridge capabilities into the `G3` contract prematurely
3. Do not loosen gates just to get a skill generated sooner
4. Do not open `G6/G7/G8` in parallel
5. Do not perform real intranet manual validation in this plan

## Target Samples

Remediation and regression in this plan focus only on the following two samples:

1. `95598工单明细表`
2. `95598、12398、流程超期风险工单明细`

Their roles are fixed as:

1. `95598工单明细表`
   - Role: `P0-3` primary sample
   - Goal: freeze the `G3 canonical`
2. `95598、12398、流程超期风险工单明细`
   - Role: first extension sample
   - Goal: verify whether the `G3` contract and evidence layer are reusable

## Workstreams

This plan is split into five workstreams:

1. `WS1` G3 boundary freeze and sample filing
2. `WS2` G3 evidence-layer modeling
3. `WS3` G3 Scene IR / compiler gate / readiness construction
4. `WS4` G3 P0 canonical and failure taxonomy freeze
5. `WS5` G3 real-sample regression and reporting

## Phase Overview

This plan proceeds in five phases:

1. Phase 0: Freeze the `G3` boundary and samples
2. Phase 1: Build the `G3` evidence layer
3. Phase 2: Build the `G3` minimal contract and gate
4. Phase 3: Freeze the `P0-3 canonical`
5. Phase 4: Regress real samples and output the first-round report

The execution order is fixed as:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3 -> Phase 4`

## Phase 0: Freeze the G3 Boundary and Samples

### Objective

First pin down the `G3` problem boundary, the primary sample, and the extension sample, so that work-order scenes, host-bridge scenes, and export-analysis scenes are not mixed together again during development.

### WS1: G3 Boundary Freeze and Sample Filing

#### Task 1

Freeze the formal definition of `G3`:

1. It is not an ordinary paginated table
2. It is not a host-bridge type
3. It is a complex workflow report in which a "main query chain + pagination chain + enrichment chain + export chain" coexist

#### Task 2

Freeze `95598工单明细表` as the sole `P0-3` primary sample.

#### Task 3

Freeze `95598、12398、流程超期风险工单明细` as the first extension sample.

#### Task 4

Fix the `G3` entry conditions:

1. A main query chain candidate exists
2. Pagination control evidence exists
3. An enrichment or linked-detail chain exists
4. The final result depends on paging through all records, enrichment, export, or aggregation

#### Task 5

Fix the `G3` exclusion conditions:

1. Ordinary reports that a single request can complete
2. Scenes driven only by BrowserAction, with no stable business main chain
3. Scenes whose main body is local database analysis or document output
4. Scenes where `localhost:*` or host dependencies clearly outweigh business evidence
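
The entry conditions in Task 4 and the exclusion conditions in Task 5 can be read as a single admission check. The following is a minimal sketch only: `SceneFacts`, `G3Admission`, and `admit_g3` are hypothetical names that do not come from the codebase, and "clearly outweighs" is approximated here by a plain count comparison, which is an assumption rather than the agreed rule.

```rust
/// Facts an analyzer might report for one scene.
/// All field names are hypothetical, chosen only for this sketch.
#[derive(Debug, Clone, Default)]
pub struct SceneFacts {
    pub has_main_query_candidate: bool,
    pub has_pagination_evidence: bool,
    pub has_enrichment_chain: bool,
    pub result_depends_on_full_workflow: bool,
    pub single_request_suffices: bool,
    pub browser_action_only: bool,
    pub local_analysis_or_document_dominant: bool,
    pub business_evidence_count: usize,
    pub host_evidence_count: usize,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum G3Admission {
    Admitted,
    /// An exclusion condition (Task 5) matched.
    Excluded(&'static str),
    /// An entry condition (Task 4) is missing.
    NotEligible(&'static str),
}

pub fn admit_g3(f: &SceneFacts) -> G3Admission {
    // Exclusions are checked first, so a host-dominated scene is never
    // admitted on the strength of weak business evidence.
    if f.single_request_suffices {
        return G3Admission::Excluded("single-request report");
    }
    if f.browser_action_only {
        return G3Admission::Excluded("BrowserAction only, no stable business main chain");
    }
    if f.local_analysis_or_document_dominant {
        return G3Admission::Excluded("local analysis or document output dominates");
    }
    // Assumption: "clearly outweighs" is modeled as a strict count comparison.
    if f.host_evidence_count > f.business_evidence_count {
        return G3Admission::Excluded("host dependency outweighs business evidence");
    }
    // All four entry conditions must hold.
    if !f.has_main_query_candidate {
        return G3Admission::NotEligible("no main query chain candidate");
    }
    if !f.has_pagination_evidence {
        return G3Admission::NotEligible("no pagination control evidence");
    }
    if !f.has_enrichment_chain {
        return G3Admission::NotEligible("no enrichment or linked-detail chain");
    }
    if !f.result_depends_on_full_workflow {
        return G3Admission::NotEligible("result does not depend on paging, enrichment, export, or aggregation");
    }
    G3Admission::Admitted
}
```

Checking exclusions before entry conditions keeps the decision fail-closed: a scene that satisfies every entry condition is still rejected if any exclusion applies.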

### Deliverables

1. `G3` family definition
2. `G3` sample list
3. `G3` entry and exclusion conditions
4. Description of the boundary between `G3` and other families

### Exit Criteria

1. `95598工单明细表` is no longer discussed as a vague work-order sample
2. `G3` is no longer confused with host-bridge or document-output scenes

## Phase 1: Build the G3 Evidence Layer

### Objective

Replace the path that compresses source code directly into `Scene IR` with a two-step path: first form adjudicable `G3` evidence, then reduce it into `Scene IR`.

### WS2: G3 Evidence-Layer Modeling

#### Task 6

Define `main_request_candidate`, which carries:

1. The main query endpoint
2. The query parameter template
3. The time range or primary filter conditions

#### Task 7

Define `pagination_candidate`, which carries:

1. The page-number field
2. The pageSize field
3. The paging termination condition
4. The rolling-window or interval-advance rule

#### Task 8

Define `enrichment_request_candidate`, which carries:

1. Detail follow-up queries
2. Secondary endpoints
3. Linked enrichment

#### Task 9

Define `join_key_candidate`, which carries:

1. Work-order number
2. Process number
3. User number
4. Device number
5. Other keys that link the main chain to the enrichment chain

#### Task 10

Define `export_candidate`, which carries:

1. The export endpoint
2. Export parameters
3. Pre-export actions
4. Artifact type

#### Task 11

Define `workflow_step_candidate`, which carries the ordering relationship among:

1. Main query
2. Paging
3. Enrichment query
4. Aggregation
5. Export

#### Task 12

Define `dedupe_or_merge_rule_candidate`, which carries:

1. Deduplication rules
2. Rules for merging detail rows back into main rows
3. Cross-page accumulation rules

#### Task 13

Define `host_bridge_candidate` and `localhost_dependency_candidate`, ensuring that the host chain is retained only as independent evidence.

#### Task 14

Establish evidence merging and conflict adjudication rules that make explicit:

1. Which evidence belongs to the business main chain
2. Which evidence belongs to the host bridge
3. Which evidence belongs to the result export chain
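
To make the slotting idea in Tasks 6 to 14 concrete, the sketch below models a few of the evidence candidates as an enum and sorts them into business, export, and host slots. It is illustrative only: the type names, field names, and the rule that any `localhost` endpoint goes to the host slot are assumptions, not the evidence schema this phase is expected to deliver.

```rust
/// A reduced set of evidence candidates (hypothetical shapes).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Evidence {
    MainRequest { endpoint: String },
    Pagination { page_field: String, page_size_field: String },
    EnrichmentRequest { endpoint: String },
    JoinKey { field: String },
    Export { endpoint: String },
    HostBridge { action: String },
    LocalhostDependency { url: String },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Slot {
    BusinessChain,
    ExportChain,
    HostChain,
}

#[derive(Debug, Default)]
pub struct EvidenceSlots {
    pub business: Vec<Evidence>,
    pub export: Vec<Evidence>,
    pub host: Vec<Evidence>,
}

/// True when the URL's host part is `localhost` or `127.0.0.1`.
pub fn is_localhost(url: &str) -> bool {
    let rest = url
        .strip_prefix("http://")
        .or_else(|| url.strip_prefix("https://"))
        .unwrap_or(url);
    let host = rest.split(&['/', ':'][..]).next().unwrap_or("");
    host == "localhost" || host == "127.0.0.1"
}

/// Assumed adjudication rule: anything that talks to localhost is host
/// evidence, even if it looks like a request or an export.
pub fn slot_of(ev: &Evidence) -> Slot {
    match ev {
        Evidence::HostBridge { .. } | Evidence::LocalhostDependency { .. } => Slot::HostChain,
        Evidence::Export { endpoint } => {
            if is_localhost(endpoint) { Slot::HostChain } else { Slot::ExportChain }
        }
        Evidence::MainRequest { endpoint } | Evidence::EnrichmentRequest { endpoint } => {
            if is_localhost(endpoint) { Slot::HostChain } else { Slot::BusinessChain }
        }
        Evidence::Pagination { .. } | Evidence::JoinKey { .. } => Slot::BusinessChain,
    }
}

pub fn sort_into_slots(candidates: Vec<Evidence>) -> EvidenceSlots {
    let mut slots = EvidenceSlots::default();
    for ev in candidates {
        match slot_of(&ev) {
            Slot::BusinessChain => slots.business.push(ev),
            Slot::ExportChain => slots.export.push(ev),
            Slot::HostChain => slots.host.push(ev),
        }
    }
    slots
}
```

Keeping host evidence in its own slot, instead of discarding it, matches the intent of Task 13: the host chain stays visible for readiness decisions without leaking into the business main chain.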

### Deliverables

1. `G3` evidence schema
2. `G3` evidence type dictionary
3. Evidence merging rules
4. First-version evidence sample for `95598工单明细表`

### Exit Criteria

1. The main chain, pagination chain, enrichment chain, export chain, and host chain can each be presented in its own slot
2. `localhost:*` is no longer mixed into the business main chain

## Phase 2: Build the G3 Minimal Contract and Gate

### Objective

Upgrade the `G3` decision standard from "looks like a paginated-enrichment scene" to "does the minimal business contract hold".

### WS3: G3 Scene IR / Compiler Gate / Readiness Construction

#### Task 15

Define the `G3` minimal contract, which includes at least:

1. `main_request`
2. `pagination_plan`
3. `enrichment_requests[]`
4. `join_keys[]`
5. `export_plan`
6. `merge_or_dedupe_rules`

#### Task 16

Carry `G3`-specific structures in `Scene IR`, instead of degrading them into empty placeholder fields of a generic `paginated_enrichment`.

#### Task 17

Add `G3` gates, including at least:

1. `g3_main_request_resolved`
2. `g3_pagination_contract_complete`
3. `g3_enrichment_contract_complete`
4. `g3_join_key_resolved`
5. `g3_export_path_identified`
6. `g3_runtime_scope_compatible`
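
One possible way to relate the minimal contract in Task 15 to the gates in Task 17 is sketched below. The gate names are taken from the list above; the struct layout, the `Option`/`Vec` representation, and the rule that a gate passes when its part of the contract is present are assumptions made for illustration.

```rust
/// Hypothetical representation of a pagination plan.
#[derive(Debug, Clone, Default)]
pub struct PaginationPlan {
    pub page_field: Option<String>,
    pub page_size_field: Option<String>,
    pub stop_condition: Option<String>,
}

/// Hypothetical representation of the G3 minimal contract.
#[derive(Debug, Clone, Default)]
pub struct G3Contract {
    pub main_request: Option<String>,
    pub pagination_plan: Option<PaginationPlan>,
    pub enrichment_requests: Vec<String>,
    pub join_keys: Vec<String>,
    pub export_plan: Option<String>,
    pub merge_or_dedupe_rules: Vec<String>,
    pub runtime_scope_compatible: bool,
}

/// Evaluates every gate and returns `(gate name, passed)` pairs.
pub fn evaluate_gates(c: &G3Contract) -> Vec<(&'static str, bool)> {
    // A pagination plan only counts as complete when the page field,
    // the page-size field, and the stop condition are all known.
    let pagination_complete = c.pagination_plan.as_ref().map_or(false, |p| {
        p.page_field.is_some() && p.page_size_field.is_some() && p.stop_condition.is_some()
    });
    vec![
        ("g3_main_request_resolved", c.main_request.is_some()),
        ("g3_pagination_contract_complete", pagination_complete),
        ("g3_enrichment_contract_complete", !c.enrichment_requests.is_empty()),
        ("g3_join_key_resolved", !c.join_keys.is_empty()),
        ("g3_export_path_identified", c.export_plan.is_some()),
        ("g3_runtime_scope_compatible", c.runtime_scope_compatible),
    ]
}

/// Names of the gates that did not pass; empty means the contract is closed.
pub fn failed_gates(c: &G3Contract) -> Vec<&'static str> {
    evaluate_gates(c)
        .into_iter()
        .filter(|(_, passed)| !*passed)
        .map(|(name, _)| name)
        .collect()
}
```

Under this sketch, a compiler gate would refuse any `G3` IR for which `failed_gates` is non-empty, which is one way to keep unclosed contracts out of the compiler.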

#### Task 18

Define the blocker / readiness adjudication criteria, which must be able to distinguish:

1. Insufficient business evidence
2. An unclosed pagination contract
3. An unclosed enrichment contract
4. An export chain that depends on the host
5. Unsatisfied runtime dependencies

#### Task 19

Land the `fail-closed` rules:

1. Main request chain missing: block
2. Pagination chain present but termination condition unknown: block
3. Enrichment chain present but join key unknown: block
4. Only export actions and no business main chain: block
5. Host-bridge evidence clearly exceeds business evidence: block

#### Task 20

Add tests proving that unclosed `G3` samples cannot masquerade as runnable skills.

### Deliverables

1. `G3` minimal contract table
2. `G3` gate table
3. `G3` blocker / readiness table
4. `G3` Scene IR example

### Exit Criteria

1. `G3` has its own independent gate
2. Unclosed results are blocked accurately
3. The `compiler` no longer accepts unclosed `G3 IR`

## Phase 3: Freeze the P0-3 Canonical

### Objective

Turn `95598工单明细表` into the first-version reference answer, key evidence baseline, and failure taxonomy baseline for `G3`.

### WS4: G3 P0 Canonical and Failure Taxonomy Freeze

#### Task 21

Freeze the canonical `Scene IR` of `95598工单明细表`.

#### Task 22

Freeze the key evidence list, including at least:

1. Main request chain
2. Pagination chain
3. Enrichment chain
4. join key
5. Export chain
6. Host dependencies

#### Task 23

Freeze the acceptance checklist, checking at least:

1. Whether the main chain is recovered
2. Whether the pagination chain is recovered
3. Whether the enrichment chain is recovered
4. Whether the join key is recovered
5. Whether the export chain is recovered
6. Whether the host chain is isolated
7. Whether readiness matches the actual degree of closure

#### Task 24

Freeze the failure taxonomy, including at least:

1. `main_chain_missing`
2. `pagination_incomplete`
3. `enrichment_incomplete`
4. `join_key_missing`
5. `export_only_without_business_chain`
6. `host_bridge_pollution`
7. `runtime_dependency_unresolved`
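
The fail-closed rules in Task 19 and the taxonomy codes in Task 24 could be tied together roughly as follows. This is a sketch under stated assumptions: the mapping from each rule to a taxonomy code is a guess made for illustration, `G3Facts` is a hypothetical input type, and "clearly exceeds" is again reduced to a count comparison.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum G3Failure {
    MainChainMissing,
    PaginationIncomplete,
    EnrichmentIncomplete,
    JoinKeyMissing,
    ExportOnlyWithoutBusinessChain,
    HostBridgePollution,
    RuntimeDependencyUnresolved,
}

impl G3Failure {
    /// Stable code for reports, matching the taxonomy names in Task 24.
    pub fn code(self) -> &'static str {
        match self {
            G3Failure::MainChainMissing => "main_chain_missing",
            G3Failure::PaginationIncomplete => "pagination_incomplete",
            G3Failure::EnrichmentIncomplete => "enrichment_incomplete",
            G3Failure::JoinKeyMissing => "join_key_missing",
            G3Failure::ExportOnlyWithoutBusinessChain => "export_only_without_business_chain",
            G3Failure::HostBridgePollution => "host_bridge_pollution",
            G3Failure::RuntimeDependencyUnresolved => "runtime_dependency_unresolved",
        }
    }
}

/// Hypothetical summary of what the evidence layer recovered.
#[derive(Debug, Clone, Default)]
pub struct G3Facts {
    pub has_main_chain: bool,
    pub has_pagination: bool,
    pub pagination_stop_known: bool,
    pub has_enrichment: bool,
    pub join_key_known: bool,
    pub has_export: bool,
    pub business_evidence: usize,
    pub host_evidence: usize,
}

/// Applies the five fail-closed rules; an empty result means "not blocked".
/// `EnrichmentIncomplete` and `RuntimeDependencyUnresolved` are not produced
/// here because Task 19 does not define a blocking rule for them.
pub fn blockers(f: &G3Facts) -> Vec<G3Failure> {
    let mut out = Vec::new();
    if !f.has_main_chain {
        // Rule 4 is treated as the more specific form of rule 1.
        out.push(if f.has_export {
            G3Failure::ExportOnlyWithoutBusinessChain
        } else {
            G3Failure::MainChainMissing
        });
    }
    if f.has_pagination && !f.pagination_stop_known {
        out.push(G3Failure::PaginationIncomplete);
    }
    if f.has_enrichment && !f.join_key_known {
        out.push(G3Failure::JoinKeyMissing);
    }
    if f.host_evidence > f.business_evidence {
        out.push(G3Failure::HostBridgePollution);
    }
    out
}
```

Returning every matching blocker, not just the first, keeps failure reports explainable when a sample violates several rules at once.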

#### Task 25

Establish the method for aligning "generated result vs canonical".

### Deliverables

1. `G3` P0 canonical `Scene IR`
2. `G3` P0 evidence baseline
3. `G3` acceptance checklist
4. `G3` failure taxonomy table

### Exit Criteria

1. `95598工单明细表` becomes the first-version unified calibration source for `G3`
2. All subsequent `G3` regressions can be checked against the fixed taxonomy

## Phase 4: Real-Sample Regression and First-Round Report

### Objective

First close the loop with the `P0` primary sample, then use one extension sample to verify whether the `G3` contract is reusable.

### WS5: G3 Real-Sample Regression and Reporting

#### Task 26

Regenerate `95598工单明细表`.

#### Task 27

Check against unified criteria:

1. archetype
2. bootstrap
3. main request
4. pagination plan
5. enrichment requests
6. join keys
7. export plan
8. localhost / host bridge separation
9. readiness / blocker

#### Task 28

Output the `G3 P0 validation report`. Only the following three conclusions are allowed:

1. `Pass`
2. `Fail-closed with accurate reasons`
3. `Misjudged, remediation required`

#### Task 29

Regenerate `95598、12398、流程超期风险工单明细`.

#### Task 30

Compare it with the `P0` sample to determine:

1. Which contracts are reusable
2. Which blockers are common to the family
3. Which complexity is specific to the extension sample

#### Task 31

Output the `G3 first-round family expansion report`.

### Deliverables

1. `G3` P0 sample generation results
2. `G3` P0 validation report
3. `G3` extension sample generation results
4. `G3` first-round family expansion report

### Acceptance Criteria

1. `95598工单明细表` reaches at least "structure fully recovered" or "fail-closed with accurate reasons"
2. The extension sample is no longer crudely flattened into an ordinary paginated table
3. `G3` failure results are explainable
4. `G3` has at least a first-version family reuse standard

## File-Level Targets

Executing this plan will touch at least the following asset types:

1. `docs/superpowers/specs/`
2. `docs/superpowers/plans/`
3. `docs/superpowers/reports/`
4. Implementations under `src/generated_scene/` related to the evidence layer, contract layer, and readiness
5. `tests/fixtures/generated_scene/`
6. `tests/`

## Completion Criteria

This plan is complete when:

1. `G3` has a formal boundary definition
2. `G3` has a minimal evidence layer and a minimal contract
3. `G3` has an independent gate and fail-closed criteria
4. `95598工单明细表` has become the `P0-3 canonical`
5. The first round of `G3` real-sample regression has produced a formal conclusion

## Next Step

After this plan completes, the subsequent order is fixed as:

1. If the `G3` `P0` sample and the first-round extension sample are stable, decide whether to add a second `G1-E` sample
2. Then decide whether to move into independent design and planning for `G6`
@@ -0,0 +1,77 @@
# G6 Host Bridge Workflow Plan

> Date: 2026-04-18
> Status: Initial implementation slice

## Plan Intent

Start the `G6` line after `G1-E` second-sample reuse has been validated.

This plan implements the first safe slice only: classification, evidence separation, readiness gates, and fail-closed behavior.

## Phase 0: Boundary Freeze

Tasks:

1. keep `电能表现场检验完成率指标报表` as the P0 boundary sample
2. define the repo-local representative fixture
3. keep `G6` separate from `G1`, `G1-E`, `G3`, `G7`, and `G8`

Deliverables:

1. `G6` design doc
2. `G6` plan doc
3. repo-local representative fixture

Acceptance criteria:

1. `G6` is no longer discussed as a `G1` candidate
2. `G6` is not treated as a generic localhost-pollution case

## Phase 1: Analyzer Classification

Tasks:

1. add `host_bridge_workflow` as a workflow archetype
2. detect explicit host bridge actions
3. keep `localhost:*` as supporting host-runtime evidence
4. ensure explicit host bridge signals outrank `G1-E`
5. ensure ordinary localhost export noise does not become `G6`

Acceptance criteria:

1. `g6_host_bridge_workflow` fixture classifies as `host_bridge_workflow`
2. `bootstrap_localhost_pollution` remains a non-G6 business scene
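
The precedence described in Phase 1 can be sketched as a small decision function. The names `SceneSignals`, `Family`, and `classify` are hypothetical, and how the analyzer actually counts host bridge actions is not specified in this plan; the sketch only shows the ordering: explicit host bridge signals win over `G1-E`, and localhost references alone never promote a scene to `G6`.

```rust
/// Hypothetical signal counts produced by an analyzer pass.
#[derive(Debug, Clone, Copy, Default)]
pub struct SceneSignals {
    pub explicit_host_bridge_actions: usize,
    /// Kept as supporting host-runtime evidence; never decides the family.
    pub localhost_refs: usize,
    pub looks_like_g1e: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Family {
    G6HostBridgeWorkflow,
    G1E,
    Other,
}

pub fn classify(s: &SceneSignals) -> Family {
    // Explicit host bridge actions outrank G1-E.
    if s.explicit_host_bridge_actions > 0 {
        return Family::G6HostBridgeWorkflow;
    }
    if s.looks_like_g1e {
        return Family::G1E;
    }
    // Ordinary localhost noise falls through and stays non-G6.
    Family::Other
}

/// Archetype string for families that have one in this plan.
pub fn archetype_name(family: Family) -> Option<&'static str> {
    match family {
        Family::G6HostBridgeWorkflow => Some("host_bridge_workflow"),
        Family::G1E | Family::Other => None,
    }
}
```

For example, a scene with many `localhost:*` references but no explicit host bridge action is classified by its other signals, which is the behavior the `bootstrap_localhost_pollution` acceptance criterion asks for.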

## Phase 2: Fail-Closed Gate

Tasks:

1. add readiness risks for missing or unsupported G6 contract
2. add `g6_host_bridge_detected`
3. add `g6_fail_closed`
4. block generation before runnable output

Acceptance criteria:

1. `G6` generation returns a controlled error
2. error message includes `host_bridge_workflow`
3. no pseudo-runnable skill is produced
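
A controlled fail-closed error of the kind Phase 2 asks for might look like the sketch below. The `FailClosed` type and the `generate_skill` function are hypothetical stand-ins, not the generator's real API; only the archetype string and the two gate names come from this plan.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical controlled error returned when generation is blocked.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FailClosed {
    pub archetype: &'static str,
    pub gates: Vec<&'static str>,
}

impl fmt::Display for FailClosed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "generation blocked (fail-closed) for archetype `{}`; gates: {}",
            self.archetype,
            self.gates.join(", ")
        )
    }
}

impl Error for FailClosed {}

/// Stand-in for the generator entry point: blocks G6 before any
/// runnable output is produced and returns a plain string otherwise.
pub fn generate_skill(archetype: &str) -> Result<String, FailClosed> {
    if archetype == "host_bridge_workflow" {
        return Err(FailClosed {
            archetype: "host_bridge_workflow",
            gates: vec!["g6_host_bridge_detected", "g6_fail_closed"],
        });
    }
    Ok(format!("skill package for {}", archetype))
}
```

Because the error is returned before any output is built, there is no partially generated skill that could be mistaken for a runnable one.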

## Phase 3: Regression

Tasks:

1. run scene generator regression
2. run family regression
3. run family policy regression
4. run canonical regression

Acceptance criteria:

1. all target regressions pass
2. no `G1-E/G3/G2` behavior regresses

## Next Step

After this safe G6 slice, continue to `G7 多接口盘点汇总型` boundary assessment unless G6 runtime implementation becomes the selected priority.
@@ -0,0 +1,68 @@
# G7 Multi Endpoint Inventory Plan

> Date: 2026-04-18
> Status: Initial implementation slice

## Plan Intent

Start `G7` after the safe `G6` classification slice.

This plan only establishes boundary classification and fail-closed behavior. It does not implement runnable multi-endpoint inventory aggregation.

## Phase 0: Boundary Freeze

Tasks:

1. use `计量资产库存统计` as the P0 boundary sample
2. define a repo-local representative fixture
3. keep `G7` separate from `G1`, `G1-E`, `G6`, and `G8`

Acceptance criteria:

1. `G7` is no longer a `G1` candidate
2. `G7` is not confused with host bridge workflow

## Phase 1: Analyzer Classification

Tasks:

1. add `multi_endpoint_inventory` as a workflow archetype
2. detect inventory endpoint families
3. classify scenes with three or more inventory endpoints as `G7`

Acceptance criteria:

1. `g7_multi_endpoint_inventory` fixture classifies as `multi_endpoint_inventory`
2. inventory endpoint names include `assetStatsQueryMeter` and `assetStatsQueryJlGnModule`
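
The "three or more inventory endpoints" rule can be illustrated with a short sketch. Treating the shared `assetStatsQuery` prefix of the two named endpoints as the family marker is an assumption made here for illustration; the plan does not say how endpoint families are detected, and `inventory_endpoints` and `is_multi_endpoint_inventory` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Assumed family marker, inferred from the two endpoint names in the
/// acceptance criteria. The real detection rule may differ.
const INVENTORY_PREFIX: &str = "assetStatsQuery";

/// Collects the distinct inventory endpoint names found in a list of URLs
/// or paths, using the last path segment as the endpoint name.
pub fn inventory_endpoints(endpoints: &[&str]) -> BTreeSet<String> {
    endpoints
        .iter()
        .copied()
        .filter_map(|url| url.rsplit('/').next())
        .filter(|name| name.starts_with(INVENTORY_PREFIX))
        .map(|name| name.to_string())
        .collect()
}

/// A scene counts as G7 when it has three or more distinct inventory endpoints.
pub fn is_multi_endpoint_inventory(endpoints: &[&str]) -> bool {
    inventory_endpoints(endpoints).len() >= 3
}
```

Using a set means repeated calls to the same endpoint count once, so a scene that polls a single inventory endpoint many times is not promoted to `G7`.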

## Phase 2: Fail-Closed Gate

Tasks:

1. add `g7_inventory_endpoints_detected`
2. add `g7_fail_closed`
3. block generation before runnable output

Acceptance criteria:

1. generation returns a controlled error
2. error message includes `multi_endpoint_inventory`
3. no pseudo-runnable skill is produced

## Phase 3: Regression

Tasks:

1. run scene generator regression
2. run family regression
3. run family policy regression
4. run canonical regression

Acceptance criteria:

1. all target regressions pass
2. no existing family baseline regresses

## Next Step

After this safe G7 slice, continue to `G8 抓取落库分析出文档型` boundary assessment.
@@ -0,0 +1,70 @@
# G8 Local Document Pipeline Plan

> Date: 2026-04-18
> Status: Initial implementation slice

## Plan Intent

Start `G8` after the safe `G7` classification slice.

This plan only establishes boundary classification and fail-closed behavior. It does not implement runnable local storage, SQL, or document generation orchestration.

## Phase 0: Boundary Freeze

Tasks:

1. use `95598供电服务月报` as the P0 boundary sample
2. define a repo-local representative fixture
3. keep `G8` separate from `G1`, `G1-E`, `G6`, `G7`, and `G3`

Acceptance criteria:

1. `G8` is no longer a `G1` candidate
2. `G8` is not collapsed into generic host bridge workflow

## Phase 1: Analyzer Classification

Tasks:

1. add `local_doc_pipeline` as a workflow archetype
2. detect `definedSqlQuery`
3. detect `docExport`
4. detect `selectData` / local config service persistence
5. prioritize `G8` over `G6` when both signals exist

Acceptance criteria:

1. `g8_local_doc_pipeline` fixture classifies as `local_doc_pipeline`
2. local pipeline actions are visible in deterministic facts
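
The detection and priority rules in Phase 1 could be sketched as below. Plain substring matching and the rule "any one local pipeline action is enough for `G8`" are assumptions for illustration; the plan names the three markers and the `G8`-over-`G6` priority but not the exact detection logic. `LocalPipelineFacts` and `classify` are hypothetical names.

```rust
/// Deterministic facts about local pipeline actions found in a scene's source.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct LocalPipelineFacts {
    pub defined_sql_query: bool,
    pub doc_export: bool,
    pub select_data: bool,
}

impl LocalPipelineFacts {
    /// Assumed detection: plain substring search for the three marker names.
    pub fn scan(source: &str) -> Self {
        LocalPipelineFacts {
            defined_sql_query: source.contains("definedSqlQuery"),
            doc_export: source.contains("docExport"),
            select_data: source.contains("selectData"),
        }
    }

    /// The detected actions, in a fixed order so the output is deterministic.
    pub fn detected_actions(&self) -> Vec<&'static str> {
        let mut actions = Vec::new();
        if self.defined_sql_query {
            actions.push("definedSqlQuery");
        }
        if self.doc_export {
            actions.push("docExport");
        }
        if self.select_data {
            actions.push("selectData");
        }
        actions
    }
}

/// G8 is checked before G6, so a scene with both signals becomes G8.
pub fn classify(source: &str, host_bridge_detected: bool) -> Option<&'static str> {
    if !LocalPipelineFacts::scan(source).detected_actions().is_empty() {
        return Some("local_doc_pipeline");
    }
    if host_bridge_detected {
        return Some("host_bridge_workflow");
    }
    None
}
```

Exposing `detected_actions` separately from `classify` keeps the individual pipeline actions visible as facts, independent of the final archetype decision.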

## Phase 2: Fail-Closed Gate

Tasks:

1. add `g8_local_doc_pipeline_detected`
2. add `g8_fail_closed`
3. block generation before runnable output

Acceptance criteria:

1. generation returns a controlled error
2. error message includes `local_doc_pipeline`
3. no pseudo-runnable skill is produced

## Phase 3: Regression

Tasks:

1. run scene generator regression
2. run family regression
3. run family policy regression
4. run canonical regression

Acceptance criteria:

1. all target regressions pass
2. no existing family baseline regresses

## Next Step

After this safe G8 slice, the boundary-reassignment sequence has a code-backed fail-closed guard for `G1-E`, `G6`, `G7`, and `G8`.
@@ -0,0 +1,215 @@
# Line-Loss Family Variant Expansion Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Inputs:**
> [2026-04-18-g2-family-expansion-plan.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/plans/2026-04-18-g2-family-expansion-plan.md)
> [2026-04-18-g2-family-expansion-third-round-report.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/reports/2026-04-18-g2-family-expansion-third-round-report.md)

## Plan Intent

This plan picks up from the three line-loss subtypes that have already converged, `G2-A/G2-B/G2-C`, and moves the goal from "getting three representative samples to work" to "replicating and extending to more line-loss variants".

What has been proven so far:

1. `G2-A`, the dual-mode main-report type, can be generated
2. `G2-B`, the weekly-report single-sided mode type, can be generated
3. `G2-C`, the mixed-linkage type, can be generated

The next stage therefore stops fine-tuning these three samples and instead builds the replication chain "line-loss scene -> subtype -> minimal contract -> candidate validation".

## Success Baseline

After this plan completes, the minimum success criteria are fixed as:

1. A new batch of real line-loss scenes can be assigned to existing subtypes or to new subtypes
2. Every new subtype has a minimal contract standard
3. Each new subtype has at least 2 to 3 real samples that complete migration validation
4. Samples that cannot be classified or lack a sufficient contract remain `fail-closed`
5. A "line-loss family implementation mapping table" is produced
6. A new round of the line-loss family expansion report is output

## Scope Guardrails

The following boundaries stay unchanged throughout execution:

1. Do not extend to non-line-loss report families
2. Do not handle unified login, hidden-domain login, or host transport refactoring
3. Do not perform real intranet manual validation in this plan
4. Do not let this plan sprawl into a one-shot rollout across all 102 scenes
5. Do not go back and overturn the already converged `G2-A/G2-B/G2-C` results

## Phase Overview

This plan proceeds in five phases:

1. Phase 0: Freeze the line-loss expansion baseline
2. Phase 1: Build the line-loss variant grouping list
3. Phase 2: Establish minimal contract standards for new variants
4. Phase 3: Extend fixtures / classification / generation chain group by group
5. Phase 4: Regress real samples and output the expansion report

The execution order is fixed as:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3 -> Phase 4`

## Phase 0: Freeze the Expansion Baseline

### Objective

Freeze the line-loss family baseline achieved so far as the starting point for subsequent horizontal replication.

### Tasks

1. Fix the current criteria for `G2-A/G2-B/G2-C`
2. Fix the minimal contract facts for these three subtypes
3. Fix the decision that this plan does not return to the "get the primary sample working" stage

### Exit Criteria

1. `G2-A/G2-B/G2-C` are treated as the established family baseline
2. Expansion work is explicitly framed as "replicating more line-loss variants"

## Phase 1: Build the Line-Loss Variant Grouping List

### Objective

Group the line-loss scenes that still need to be covered first, instead of adding samples piecemeal.

### Tasks

1. From the existing line-loss scenes, select the candidate samples closest to the current family
2. Group by structure rather than by name, into at least:
   - Dual-mode main-report type
   - Weekly/daily single-sided mode type
   - Ranking/detail main-chain type
   - Line-loss main chain + external system linkage type
   - Anomaly diagnosis / detail drill-down type
3. Pick 2 to 3 representative samples for each group first

### Deliverables

1. Line-loss variant grouping list
2. Representative sample list for each group

### Exit Criteria

1. Work no longer advances scene by scene in a piecemeal way
2. Subsequent remediation advances with the "group" as its unit

## Phase 2: Establish Minimal Contract Standards for New Variants

### Objective

For each new group of line-loss variants, define what "minimally usable" means before touching the generator.

### Tasks

1. Define the minimal contract for each group, making explicit at least:
   - Main endpoint
   - request template
   - response path
   - Key fields or column defs
   - normalize / required fields
2. Make explicit which chains belong to the main contract
3. Make explicit which chains count as extension evidence or risk evidence
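
The minimal contract items listed in Phase 2 can be sketched as a struct with a completeness check. `LineLossContract`, `missing_parts`, and `is_validation_candidate` are hypothetical names, and treating "all five parts present" as the candidate validation threshold is an assumption; the real threshold is to be defined per group in this phase.

```rust
/// Hypothetical minimal contract for one line-loss variant group.
#[derive(Debug, Clone, Default)]
pub struct LineLossContract {
    pub main_endpoint: Option<String>,
    pub request_template: Option<String>,
    pub response_path: Option<String>,
    pub column_defs: Vec<String>,
    pub required_fields: Vec<String>,
}

/// Lists the contract parts that are still missing, in a fixed order.
pub fn missing_parts(c: &LineLossContract) -> Vec<&'static str> {
    let mut missing = Vec::new();
    if c.main_endpoint.is_none() {
        missing.push("main endpoint");
    }
    if c.request_template.is_none() {
        missing.push("request template");
    }
    if c.response_path.is_none() {
        missing.push("response path");
    }
    if c.column_defs.is_empty() {
        missing.push("column defs");
    }
    if c.required_fields.is_empty() {
        missing.push("normalize / required fields");
    }
    missing
}

/// Assumed threshold: a sample is a validation candidate only when nothing
/// is missing; otherwise it stays fail-closed.
pub fn is_validation_candidate(c: &LineLossContract) -> bool {
    missing_parts(c).is_empty()
}
```

Reporting the missing parts by name, not just a boolean, gives the later regression report something concrete to list for each sample that stays fail-closed.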

### Deliverables

1. Line-loss variant minimal contract table
2. Candidate validation threshold for each group

### Exit Criteria

1. Every group has a unified adjudication standard
2. Subsequent development no longer relies on ad hoc, per-sample guesswork

## Phase 3: Extend the Generation Chain Group by Group

### Objective

Connect the new variant groups to analyzer / generator / readiness one group at a time.

### Tasks

1. For each group, add fixtures first
2. For each group, add tests first
3. Then add subtype classification
4. Then add minimal contract recovery
5. Then add readiness grading

### Rules

1. Every group must have a fixture before its logic is changed
2. Multiple groups must not expand in parallel without boundaries
3. Finish one group before moving on to the next

### Deliverables

1. New line-loss variant fixtures
2. New family regression tests
3. Corresponding analyzer / generator / readiness extensions

### Exit Criteria

1. At least 1 to 2 new line-loss variant types can be generated
2. The existing `G2-A/G2-B/G2-C` do not regress

## Phase 4: Real-Sample Regression and Expansion Report

### Objective

Validate the extended line-loss family capability against real samples, instead of stopping at the fixture level.

### Tasks

1. Regenerate the representative samples of each group
2. Compare:
   - Subtype classification
   - bootstrap
   - request contract
   - response / column / normalize contract
   - readiness
   - Whether the sample enters the candidate validation list
3. Output the line-loss family expansion regression report

### Deliverables

1. Real-sample generation results
2. Line-loss family expansion regression report
3. Updated candidate validation list

### Acceptance Criteria

1. At least 2 new line-loss variant groups enter the candidate validation stage
2. Scenes that cannot be classified remain `fail-closed`
3. The three existing types `G2-A/G2-B/G2-C` do not regress

## Workstream Breakdown

This plan is recommended to land along the following workstreams:

1. `WS1` Line-loss variant inventory and grouping
2. `WS2` Minimal contract design for new variants
3. `WS3` fixture / regression test expansion
4. `WS4` analyzer / generator / readiness expansion
5. `WS5` Real-sample regression and reporting

## File-Level Targets

Executing this plan is expected to touch the following asset types:

1. `docs/superpowers/plans/`
2. `docs/superpowers/reports/`
3. `tests/fixtures/generated_scene/`
4. `tests/`
5. `src/generated_scene/`

## Completion Criteria

This plan is complete when:

1. The line-loss family is no longer explainable through only three representative samples
2. A "replicate by group" expansion method is in place, instead of patching one sample at a time
3. The decision on whether to expand to broader scenes can rest on the results of this line-loss family expansion
@@ -0,0 +1,237 @@
# Scene Generator Ops Console Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Spec:** [2026-04-18-scene-generator-ops-console-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-18-scene-generator-ops-console-design.md)

## Plan Intent

This plan converges the scene generator page from a "developer debugging console" into an "operations-facing scene Skill generation workbench", and breaks the information architecture, Chinese localization, visibility layering, and interaction flow already specified in the upstream `spec` into executable implementation steps.

This plan covers only the convergence of the front-end page layer and page interaction layer. It does not extend to changes in the scene generator back-end analysis logic or generation protocol.

## Scope Guardrails

The following boundaries stay unchanged while this plan is executed:

1. Do not modify the scene generator back-end interface protocol
2. Do not rewrite the analysis algorithm or the Skill generation logic
3. Do not delete existing debugging information; only adjust default visibility and presentation layering
4. Do not expand this plan into building a new front-end design system

## Primary Outcome

The direct goal of this plan is that operations staff can complete the following without understanding low-level terms such as `Scene IR`, `workflowArchetype`, or `requestTemplate`:

1. Select the scene directory
2. Start analysis
3. Judge whether generation is possible
4. Start generation
5. View the result directory or the failure reason

## Workstreams

This plan is split into four workstreams:

1. `WS1` Information architecture and page layering convergence
2. `WS2` Chinese localization and business-state mapping
3. `WS3` Log, result, and risk summary convergence
4. `WS4` Debug information folding and two-layer experience wrap-up

## Phase Overview

The plan proceeds in five phases:

1. Phase 0: Freeze page goals and criteria
2. Phase 1: Complete the information architecture reorganization
3. Phase 2: Complete Chinese localization and business-state mapping
4. Phase 3: Complete log and result area convergence
5. Phase 4: Complete debug-layer folding and overall acceptance

## Phase 0: Freeze Page Goals and Criteria

### Objective

First freeze the page's target users, default usage mode, primary state expression, and the boundaries between first-, second-, and third-level information, to avoid changing the positioning while changing the layout during implementation.

### Tasks

1. Fix the page role definition: operations executors first, developers / debuggers second
2. Fix the page positioning: an operations workbench, not a developer debugging console
3. Fix the default mode: operations mode by default, with technical details folded
4. Fix the boundaries between first-, second-, and third-level information
5. Fix the criteria for state expression, scene type mapping, and executability mapping

### Deliverables

1. Page role description
2. Information hierarchy boundary description
3. State and scene type mapping table
4. Visibility strategy description

### Exit Criteria

1. The page's default target user no longer wavers
2. The boundary between first-level information and technical details no longer wavers
3. The Chinese state and type mapping criteria are frozen

## Phase 1: Complete the Information Architecture Reorganization

### Objective

Reorganize the current page structure, in which "configuration area + analysis area + generation log + technical fields" are mixed together, into a workbench structure that operations staff can understand.

### Tasks

1. Reorganize the top overview area
2. Reorganize the left main operation area
3. Reorganize the right result summary area
4. Reorganize the bottom execution process area
5. Reserve a technical details area, folded by default

### Required Sections

The first-screen structure is fixed as:

1. Top overview area
2. Left main operation area
3. Right result summary area
4. Bottom execution process area
5. Technical details area

### Deliverables

1. Page block structure implementation
2. Block title and block order implementation
3. Visual main path for the first-level flow

### Acceptance Criteria

1. The first screen no longer exposes a large amount of technical detail at once
2. The default operations flow can be completed as "select directory -> analyze -> generate -> view result"
3. The page structure shifts from a "debug panel" to a "workbench"

## Phase 2: Complete Chinese Localization and Business-State Mapping

### Objective

Replace the many English titles, buttons, and technical terms on the current page with operations-facing Chinese wording, and map low-level technical states to business-readable states.

### Tasks

1. Replace the page title, subtitle, and block titles
2. Replace button text and input placeholder text
3. Replace log label text
4. Build the `Readiness` Chinese mapping
5. Build the archetype Chinese mapping

### Required Mappings

The minimal mapping set includes:

1. `Readiness A/B/C -> 可直接生成 / 可生成但需确认 / 暂不建议生成` (ready to generate / can generate but needs confirmation / generation not recommended for now)
2. `single_request_table -> 单页报表` (single-page report)
3. `multi_mode_request -> 多模式报表` (multi-mode report)
4. `paginated_enrichment -> 分页明细` (paginated detail)
5. `page_state_eval -> 页面检测` (page check)
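
The minimal mapping set above amounts to two small lookup tables. The sketch below writes them in Rust to show the intended shape; the function names are hypothetical, the real page would most likely implement the same tables in front-end script, and returning `None` for unknown values (so the caller can fall back to the raw identifier) is an assumption.

```rust
/// Maps a readiness level to its operations-facing Chinese label.
pub fn readiness_label(level: char) -> Option<&'static str> {
    match level {
        'A' => Some("可直接生成"),
        'B' => Some("可生成但需确认"),
        'C' => Some("暂不建议生成"),
        _ => None,
    }
}

/// Maps a workflow archetype identifier to its Chinese scene-type label.
pub fn archetype_label(archetype: &str) -> Option<&'static str> {
    match archetype {
        "single_request_table" => Some("单页报表"),
        "multi_mode_request" => Some("多模式报表"),
        "paginated_enrichment" => Some("分页明细"),
        "page_state_eval" => Some("页面检测"),
        _ => None,
    }
}

/// Display text for the first screen: the Chinese label when one exists,
/// otherwise the raw archetype identifier (assumed fallback).
pub fn archetype_display(archetype: &str) -> String {
    archetype_label(archetype).unwrap_or(archetype).to_string()
}
```

Keeping both tables in one place is one way to address the risk that inconsistent mappings leave low-level technical terms showing under Chinese headings.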
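
### Deliverables

1. Chinese title and button implementation
2. Chinese state mapping implementation
3. Chinese scene type mapping implementation
4. Chinese risk and result text implementation

### Acceptance Criteria

1. The first screen no longer shows large areas of untranslated English
2. Operations staff can directly understand the main states and scene types
3. Technical terms are no longer used as primary text on the home screen

## Phase 3: Complete Log and Result Area Convergence

### Objective

Make the page's log and result areas serve "execution and troubleshooting" first, rather than raw streaming debug output.

### Tasks

1. Rename `Generation Log` to `执行过程` (Execution Process)
2. Localize the `status / log / complete / error` labels into Chinese
3. Converge the raw stream log into a Chinese summary log as the default view
4. Refine the success/failure state display in the `生成结果` (Generation Result) area
5. Make the output directory and result file entry points more prominent

### Deliverables

1. Chinese summary log
2. Generation result card
3. Failure reason summary
4. Output directory entry point

### Acceptance Criteria

1. Operations staff can understand the execution process without reading low-level SSE technical messages
2. On success, the result directory can be found quickly
3. On failure, the Chinese failure reason can be seen quickly

## Phase 4: Complete Debug-Layer Folding and Overall Acceptance

### Objective

Keep the development and troubleshooting capabilities, but sink them into a debug layer by default so they do not interfere with first-screen use by operations staff.

### Tasks

1. Move `Scene IR`, `requestTemplate`, `evidence`, `workflow steps`, and similar items into the technical details area
2. Move `scene-id`, `scene-kind`, `targetUrl override`, and `workflow archetype override` into advanced settings
3. Verify the default visibility logic
4. Verify the experience boundary between operations mode and debug mode
5. Complete the final page criteria acceptance

### Deliverables

1. Advanced settings folding area
2. Technical details folding area
3. Final page visibility strategy implementation

### Acceptance Criteria

1. The operations home screen carries only status summary, operations, and results
2. Developers can still view the complete technical information through the folding areas
3. The experience of "the default first screen is a technical debug panel" no longer occurs

## File-Level Planning Targets

Subsequent implementation of this plan covers at least the following assets:

1. [sg_scene_generator.html](D:/data/ideaSpace/rust/sgClaw/claw-new/frontend/scene-generator/sg_scene_generator.html)
2. Front-end scripts related to page display text and visibility logic
3. Front-end styles and rendering logic related to page titles, block structure, and state mapping

## Completion Criteria

This plan is complete when:

1. The page's default form has shifted from a "developer debugging console" to an "operations workbench"
2. The first screen has completed Chinese localization and business-state mapping
3. The default operations flow can be completed on the first screen without relying on the technical details area
4. Debug information is retained but no longer floods the home screen by default
5. Failure reasons, risk hints, and the result directory are directly understandable to operations staff

## Risks and Control Points

1. If only the text is changed and the information architecture is not, the page will remain bloated
2. If fields are only hidden and the result summary is not redone, operations staff still cannot quickly judge whether generation is possible
3. If too much technical information is deleted, development and troubleshooting efficiency will suffer
4. If state mappings are inconsistent, the page will feel disjointed, with low-level technical semantics still showing under Chinese headings

## Out of Plan

The following items are outside the direct delivery scope of this plan:

1. Refactoring the scene generator back-end analysis logic
2. Changes to the Skill generation protocol
3. New page server-side interfaces
4. Operations permissions, the account system, or multi-role permission control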
@@ -0,0 +1,277 @@
|
|||||||
|
# sgClaw Scene Skill Post-Roadmap Execution Plan
|
||||||
|
|
||||||
|
> **Status:** Draft
|
||||||
|
> **Date:** 2026-04-18
|
||||||
|
> **Author:** Codex
|
||||||
|
> **Upstream Spec:** [2026-04-18-scene-skill-post-roadmap-execution-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-18-scene-skill-post-roadmap-execution-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan starts after the closure of the current `60-to-90 roadmap`.
|
||||||
|
|
||||||
|
Its purpose is not to reopen `G1/G2/G3` implementation, but to:
|
||||||
|
|
||||||
|
1. unify current execution state
|
||||||
|
2. start real-sample validation
|
||||||
|
3. plan the next bounded roadmap
|
||||||
|
|
||||||
|
## Scope Guardrails

1. Do not reopen completed `G1/G2/G3` repo-local baseline implementation.
2. Do not keep expanding fixtures as the primary mode of progress.
3. Do not silently pull `G4/G5` into implementation.
4. Do not directly implement unified login recovery in this plan.
5. Do not treat the old roadmap as still open-ended.
6. Phase 1 execution-board work must stay minimal and exist only to support Phase 2 real-sample validation.
7. Once `G2`, `G1-E`, and `G3` each have at least one mappable real sample, execution must move immediately into Phase 2.
8. Any new asset that does not directly support real-sample validation is deferred to Phase 3 or Phase 4.

## Workstreams

1. `WS1` Current Execution Board Unification
2. `WS2` Real Sample Validation
3. `WS3` Boundary and Runtime Gap Planning
4. `WS4` Next Roadmap Definition

## Phase Overview

1. Phase 0: Freeze Handover Boundary
2. Phase 1: Build Current Execution Board
3. Phase 2: Start Real Sample Validation
4. Phase 3: Define Boundary and Runtime Entry Rules
5. Phase 4: Publish the Next Roadmap

Execution order is fixed as:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3 -> Phase 4`

## Phase 0: Freeze Handover Boundary

### Objective

Freeze the boundary between the completed roadmap and the next-stage work.

### Tasks

1. Freeze current roadmap completion status.
2. Freeze current mainline family status for `G2`, `G1-E`, and `G3`.
3. Freeze current boundary family status for `G6/G7/G8`.
4. Freeze current deferred status for `G4/G5`.

### Deliverables

1. roadmap handover snapshot
2. next-stage scope statement
3. current family-state matrix

### Acceptance Criteria

1. old and new roadmap boundaries are explicit
2. next-stage work is no longer mixed into the old roadmap

## Phase 1: Build Current Execution Board

### Objective

Create the minimum authoritative execution board required to start real-sample validation for the current `102-scene` status.

### WS1

#### Task 1

Build one `102-scene current execution board`.

#### Task 2

Define the stable scene status vocabulary:

1. `promoted-baseline`
2. `promoted-expansion`
3. `boundary-family`
4. `deferred`
5. `degraded`
6. `unvalidated`
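
A closed vocabulary like the one above is easiest to keep stable when it is a single enum with one canonical string per label. The following is a minimal sketch, not the project's actual type: the name `SceneStatus` and its methods are assumptions made for illustration, and only the six label strings come from this plan.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical type: one variant per label in the Task 2 vocabulary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SceneStatus {
    PromotedBaseline,
    PromotedExpansion,
    BoundaryFamily,
    Deferred,
    Degraded,
    Unvalidated,
}

impl SceneStatus {
    /// Every label, in the order the plan lists them.
    pub const ALL: [SceneStatus; 6] = [
        SceneStatus::PromotedBaseline,
        SceneStatus::PromotedExpansion,
        SceneStatus::BoundaryFamily,
        SceneStatus::Deferred,
        SceneStatus::Degraded,
        SceneStatus::Unvalidated,
    ];

    /// The canonical label string used in board records.
    pub fn as_str(self) -> &'static str {
        match self {
            SceneStatus::PromotedBaseline => "promoted-baseline",
            SceneStatus::PromotedExpansion => "promoted-expansion",
            SceneStatus::BoundaryFamily => "boundary-family",
            SceneStatus::Deferred => "deferred",
            SceneStatus::Degraded => "degraded",
            SceneStatus::Unvalidated => "unvalidated",
        }
    }
}

impl fmt::Display for SceneStatus {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for SceneStatus {
    type Err = String;

    /// Rejects any string outside the six-label vocabulary.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        SceneStatus::ALL
            .iter()
            .copied()
            .find(|status| status.as_str() == s)
            .ok_or_else(|| format!("unknown scene status: {}", s))
    }
}
```

With this shape, a board loader could call `"deferred".parse::<SceneStatus>()` and fail loudly on any label that drifts from the vocabulary.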

#### Task 3

Map current `G2/G1-E/G3` scene promotions into the board.

#### Task 4

Generate a snapshot-vs-current diff asset.

#### Task 5

Stop Phase 1 immediately after `G2`, `G1-E`, and `G3` each have at least one mappable real sample entry in the board.

### Deliverables

1. `102-scene current execution board`
2. snapshot-vs-current diff report
3. scene-to-family status mapping

### Acceptance Criteria

1. every scene has one current-state label
2. promoted states are visible without reading multiple assets
3. board status matches current family assets
4. the board is limited to the minimum fields needed by Phase 2 validation records
5. no Phase 1 asset is added unless it directly supports real-sample validation

## Phase 2: Start Real Sample Validation

### Objective

Create the next quality layer above fixture success.

### WS2

#### Task 5

Choose the first real-sample validation set for:

1. `G2`
2. `G1-E`
3. `G3`

#### Task 6

Freeze validation criteria:

1. compile success
2. readiness correctness
3. data correctness
4. output correctness
5. fail-closed correctness

#### Task 7

Create a real-sample validation record template.
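
One possible shape for such a record template is sketched below. It is a hedged illustration only: the struct name `ValidationRecord` and every field name are assumptions, and the only things taken from this plan are the five frozen criteria and the three mainline families.

```rust
/// Hypothetical record template: one flag per frozen validation criterion.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ValidationRecord {
    pub scene_name: String,
    /// Mainline family of the sample: "G2", "G1-E", or "G3".
    pub family: String,
    pub compile_success: bool,
    pub readiness_correct: bool,
    pub data_correct: bool,
    pub output_correct: bool,
    pub fail_closed_correct: bool,
}

impl ValidationRecord {
    /// Names of the criteria that did not hold, in the order the plan lists them.
    pub fn failed_criteria(&self) -> Vec<&'static str> {
        let checks = [
            ("compile success", self.compile_success),
            ("readiness correctness", self.readiness_correct),
            ("data correctness", self.data_correct),
            ("output correctness", self.output_correct),
            ("fail-closed correctness", self.fail_closed_correct),
        ];
        checks
            .iter()
            .filter(|check| !check.1)
            .map(|check| check.0)
            .collect()
    }

    /// A record passes only when all five criteria hold.
    pub fn passed(&self) -> bool {
        self.failed_criteria().is_empty()
    }
}
```

Keeping the failed criteria as named strings, rather than a single pass/fail flag, is what would let mismatch reasons be reused when they are written back into the execution board.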

#### Task 8

Record first-round real-sample results.

#### Task 9

Write mismatches back into the execution board.

#### Task 10

Reject requests for new board-only assets that do not unblock current validation execution.

### Deliverables

1. real-sample validation plan
2. real-sample record template
3. first-round validation records
4. mismatch taxonomy

### Acceptance Criteria

1. each mainline family has at least one real-sample record
2. real-sample status is separated from fixture status
3. mismatch reasons are explicit and reusable
4. Phase 2 begins as soon as `G2`, `G1-E`, and `G3` each have one mappable real sample

## Phase 3: Define Boundary and Runtime Entry Rules

### Objective

Prepare the next bounded execution scope instead of drifting into it.

### WS3

#### Task 11

Assess `G6/G7/G8` boundary-family readiness for future expansion.

#### Task 12

Define formal entry criteria for `G4/G5`.

#### Task 13

Build a runtime-gap matrix for:

1. login recovery
2. host-runtime integration
3. transport/runtime gaps
4. local document and attachment workflows

#### Task 14

Separate:

1. archetype-family gaps
2. runtime-platform gaps

### Deliverables

1. boundary readiness note
2. deferred family entry criteria
3. runtime gap matrix
4. prioritization note

### Acceptance Criteria

1. `G4/G5` do not enter the next build round without documented criteria
2. runtime gaps are tracked separately from family expansion
3. next implementation scope has an explicit reason

## Phase 4: Publish the Next Roadmap

### Objective

Replace open-ended continuation with a new bounded roadmap.

### WS4

#### Task 15

Write the next-stage design.

#### Task 16

Write the next-stage plan.

#### Task 17

Define milestone ordering.

#### Task 18

Define next-stage completion criteria.

### Deliverables

1. post-roadmap design
2. post-roadmap plan
3. milestone table
4. completion criteria

### Acceptance Criteria

1. new implementation work has a new roadmap
2. the old roadmap is no longer implicitly extended
3. next-stage completion can be judged independently

## Milestone Order

1. Freeze the handover boundary
2. Unify the execution board
3. Start real-sample validation
4. Freeze boundary/runtime entry rules
5. Publish the next roadmap

No new implementation round should begin before milestones 1 to 4 are complete.
No Phase 1 expansion should continue after the minimum board needed for milestone 3 is available.

## Completion Criteria

This plan is complete when:

1. the current roadmap is explicitly closed
2. the execution board is unified
3. real-sample validation is formally underway
4. a new bounded roadmap exists for post-roadmap work
@@ -0,0 +1,128 @@
# sgClaw Scene Skill Real Sample Validation Roadmap Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Spec:** [2026-04-18-scene-skill-real-sample-validation-roadmap-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-18-scene-skill-real-sample-validation-roadmap-design.md)

## Plan Intent

This plan starts after the post-roadmap execution board and first-round validation layer are in place.

Its purpose is to:

1. execute selected real samples for `G2`, `G1-E`, and `G3`
2. use validation outcomes to decide the next bounded implementation scope
3. avoid drifting back into fixture-first or asset-first work

## Scope Guardrails

1. Do not reopen completed repo-local baseline implementation for `G1/G2/G3`.
2. Do not create new board-only assets unless they unblock current validation execution.
3. Do not open `G4/G5` implementation before formal entry decisions are documented.
4. Do not pull `G6/G7/G8` into the next build round without explicit validation pressure.

## Workstreams

1. `WS1` Mainline Real Sample Execution
2. `WS2` Validation Result Triage
3. `WS3` Boundary Runtime Entry Decision
4. `WS4` Deferred Family Entry Decision

## Phase 0: Execute Mainline Real Samples

### Objective

Convert selected `G2`, `G1-E`, and `G3` anchors into executed real-sample records.

### Tasks

1. Execute `G2` anchor validation updates from the current mismatch baseline.
2. Keep `G1-E` real pass anchor as the current positive baseline.
3. Execute the pending `G3` real sample.
4. Write all outcomes into the validation record layer.

### Deliverables

1. updated real-sample validation records
2. updated mismatch taxonomy usage
3. updated execution-board validation statuses

### Acceptance Criteria

1. `G2`, `G1-E`, and `G3` each have executed real-sample records
2. `selected-not-yet-run` no longer remains for current mainline anchors

## Phase 1: Triage Results Into Scope Decisions

### Objective

Use validation results, not fixture status, to choose the next bounded scope.

### Tasks

1. classify each mainline family result as `stable`, `mismatch-driven`, or `blocked-by-runtime`
2. identify which problems are compiler-family gaps and which are runtime gaps
3. define the next recommended scope from validation evidence

### Deliverables

1. validation triage report
2. next-scope recommendation

### Acceptance Criteria

1. the next scope is justified by executed validation evidence
2. repo-local success no longer acts as the sole decision signal

## Phase 2: Boundary Runtime Entry Decision

### Objective

Decide whether `G6/G7/G8` should stay boundary-only or enter a runtime-focused roadmap.

### Tasks

1. compare boundary-family runtime gaps against executed validation pressure
2. decide whether any boundary family should enter the next roadmap
3. document non-entry decisions explicitly when scope stays closed

### Deliverables

1. boundary runtime decision note
2. next-roadmap inclusion or exclusion list

### Acceptance Criteria

1. `G6/G7/G8` entry decisions are explicit
2. no boundary family enters by drift

## Phase 3: Deferred Family Entry Decision

### Objective

Decide whether `G4/G5` should remain closed or enter a later roadmap.

### Tasks

1. compare deferred-family criteria against current validation pressure
2. confirm whether `G4/G5` remain deferred or degraded
3. record the decision before any new implementation starts

### Deliverables

1. deferred family decision note
2. updated next-roadmap scope boundary

### Acceptance Criteria

1. `G4/G5` entry decisions are explicit
2. deferred families do not enter implementation implicitly

## Completion Criteria

This plan is complete when:

1. all selected mainline anchors have executed real-sample records
2. the next implementation scope is selected from validation outcomes
3. boundary and deferred family entry decisions are documented
@@ -0,0 +1,51 @@
# 102 Final Coverage Status Rollup Plan

> Date: 2026-04-19
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Layer: `Layer E`
> Status: Active

## Plan Intent

Create the final 102-scene coverage rollup after residual 13 closure. This plan publishes a candidate/status view only.

## Fixed Inputs

1. `tests/fixtures/generated_scene/full_coverage_reconciliation_candidates_2026-04-19.json`
2. `tests/fixtures/generated_scene/residual_13_reconciliation_candidates_2026-04-19.json`
3. `tests/fixtures/generated_scene/boundary_residual_hold_decision_2026-04-19.json`
4. `tests/fixtures/generated_scene/bootstrap_target_residual_isolation_2026-04-19.json`
5. `tests/fixtures/generated_scene/promotion_board_reconciliation_policy_2026-04-19.json`

## Allowed Files

1. `tests/fixtures/generated_scene/final_coverage_status_rollup_2026-04-19.json`
2. `docs/superpowers/reports/2026-04-19-102-final-coverage-status-rollup-report.md`

## Forbidden Files

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. `src/generated_scene/analyzer.rs`
3. `src/generated_scene/generator.rs`

## Tasks

1. Load the 102-scene full coverage reconciliation candidate view.
2. Load the residual 13 reconciliation candidate view.
3. Replace matching residual scenes in the 102 view with residual follow-up candidate statuses.
4. Attach boundary/bootstrap overlay decisions where present.
5. Produce final coverage summary.
6. Publish the rollup JSON.
7. Publish the rollup report.

## Completion Criteria

1. Final rollup contains `102` scenes.
2. Final summary has `95` framework auto-pass candidates and `7` structured fail-closed candidates.
3. There are `0` source-unreadable, unsupported-family, missing-source, and misclassified-unresolved records.
4. Official execution board is not modified.
5. Report names the next bounded step.
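
The overlay in Task 3 and the count checks in criteria 1 to 3 can be sketched as follows. This is an illustration under stated assumptions: both candidate views are modeled as plain scene-name to status maps, the status strings `framework-auto-pass` and `structured-fail-closed` are hypothetical labels, and JSON loading is left out. Because `95 + 7 = 102`, checking those two counts together with the total also implies that every other bucket is empty.

```rust
use std::collections::BTreeMap;

/// Overlay residual follow-up statuses onto the full 102-scene view (Task 3).
/// Only scenes already present in `full_view` are replaced, so the rollup
/// never grows beyond the fixed scene set.
pub fn apply_residual_overlay(
    full_view: &BTreeMap<String, String>,
    residual_view: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut rollup = full_view.clone();
    for (scene, status) in residual_view {
        if let Some(slot) = rollup.get_mut(scene) {
            *slot = status.clone();
        }
    }
    rollup
}

/// Count scenes per status label (Task 5).
pub fn summarize(rollup: &BTreeMap<String, String>) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for status in rollup.values() {
        *counts.entry(status.clone()).or_insert(0) += 1;
    }
    counts
}

/// Check completion criteria 1 to 3 with the hypothetical labels.
pub fn meets_completion_criteria(rollup: &BTreeMap<String, String>) -> bool {
    let counts = summarize(rollup);
    let count_of = |label: &str| counts.get(label).copied().unwrap_or(0);
    rollup.len() == 102
        && count_of("framework-auto-pass") == 95
        && count_of("structured-fail-closed") == 7
}
```

Returning a new map instead of mutating the input keeps the loaded candidate views untouched, which matches the plan's candidate-only intent.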

## Stop Statement

Stop after the final coverage rollup JSON and report are published. Do not update the official execution board under this plan.
@@ -0,0 +1,42 @@
# 102 Framework Closure Rollup Plan

> Date: 2026-04-19
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Sequence: `2026-04-19-final-2-residual-child-plan-sequence-plan.md`
> Status: Draft

## Plan Intent

Publish the final 102-scene framework closure rollup after the final-2 residual roadmaps and board refresh are complete.

## Fixed Inputs

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. optional `tests/fixtures/generated_scene/final_2_official_board_reconciliation_refresh_2026-04-19.json`

## Allowed Files

1. `tests/fixtures/generated_scene/scene_skill_102_framework_closure_rollup_2026-04-19.json`
2. `docs/superpowers/reports/2026-04-19-scene-skill-102-framework-closure-rollup-report.md`

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`

## Tasks

1. Load official board.
2. Count framework statuses.
3. List any remaining structured fail-closed scenes and their named next actions.
4. Verify unresolved count is zero.
5. Publish closure rollup JSON and report.

## Expected Delta

No implementation delta. This is the final reporting layer.

## Stop Statement

Stop after publishing the 102 framework closure rollup. Do not start another runtime roadmap under this plan.
@@ -0,0 +1,62 @@
# 102 Full Coverage Follow-Up Sweep And Reconciliation Plan

> Date: 2026-04-19
> Status: Draft
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Layer: `Layer E`
> Upstream Design: `docs/superpowers/specs/2026-04-19-102-full-coverage-followup-sweep-and-reconciliation-design.md`

## Plan Intent

Run one fixed full 102-scene follow-up sweep after Route 2 through Route 6 have closed, then publish a policy-governed reconciliation candidate view.

## Fixed Inputs

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. `tests/fixtures/generated_scene/g3_enrichment_request_closure_followup_2026-04-19.json`
3. `tests/fixtures/generated_scene/g3_export_plan_closure_followup_2026-04-19.json`
4. `tests/fixtures/generated_scene/g3_residual_contract_closure_2026-04-19.json`
5. `tests/fixtures/generated_scene/g2_remaining_fail_closed_closure_followup_2026-04-19.json`
6. `tests/fixtures/generated_scene/g1e_remaining_fail_closed_closure_followup_2026-04-19.json`
7. `tests/fixtures/generated_scene/boundary_fail_closed_decision_2026-04-19.json`
8. `tests/fixtures/generated_scene/promotion_board_reconciliation_policy_2026-04-19.json`

## Allowed Files

1. follow-up sweep JSON asset
2. reconciliation candidate JSON asset
3. follow-up sweep report
4. reconciliation candidate report

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
4. family implementation assets

## Tasks

1. run fixed 102-scene follow-up sweep
2. classify raw sweep result
3. apply Route 5 route decisions where applicable
4. apply Route 6 promotion policy to build reconciliation candidate view
5. publish coverage delta and remaining-gap report

## Expected Coverage Delta

The plan should quantify cumulative delta after Routes 2, 3, and 4.

## Completion Criteria

1. total scene count is 102
2. every scene has one raw sweep status
3. every scene has one reconciliation candidate status
4. coverage delta is reported
5. official execution board is not modified

## Stop Statement

Stop after publishing the follow-up sweep and reconciliation candidate reports.

Do not start a new implementation route under this plan.
197
docs/superpowers/plans/2026-04-19-102-full-sweep-dry-run-plan.md
Normal file
@@ -0,0 +1,197 @@
# 102 Full Sweep Dry-Run Plan

> Date: 2026-04-19
> Status: Draft
> Upstream Spec: [2026-04-19-102-full-sweep-dry-run-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-102-full-sweep-dry-run-design.md)

## Plan Intent

Run one bounded, read-only full sweep over the `102` scene ledger to measure actual generic `scene -> skill` coverage.

The plan answers:

`how many of the 102 scenes can the current generic analyzer/generator handle today?`

## Scope Guardrails

1. do not change analyzer logic
2. do not change generator logic
3. do not promote scenes into `scene_execution_board_2026-04-18.json`
4. do not add new family baselines
5. do not create new family implementation plans
6. do not fix failures during this dry-run
7. do not run outside the fixed `102` scene set

## Fixed Inputs

1. execution board: `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. scene root: `D:/desk/智能体资料/全量业务场景/一平台场景`
3. generator command: `cargo run --bin sg_scene_generate`

## Fixed Outputs

1. dry-run result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
2. dry-run output root: `examples/full_sweep_dry_run_2026-04-19`
3. report: `docs/superpowers/reports/2026-04-19-102-full-sweep-dry-run-report.md`

## Workstreams

1. `WS1` Build Scene Inventory
2. `WS2` Run Analyzer/Generator Dry-Run
3. `WS3` Classify Results
4. `WS4` Publish Coverage Report

## Phase 0: Freeze Dry-Run Boundary

### Objective

Make the dry-run a measurement exercise only.

### Tasks

1. freeze the execution board input
2. freeze the local scene root
3. freeze the dry-run output paths
4. explicitly mark the run as read-only with respect to generator behavior and board status

### Deliverables

1. fixed input statement
2. fixed output statement
3. dry-run no-promotion statement

### Acceptance Criteria

1. no analyzer/generator implementation file is edited for this dry-run
2. `scene_execution_board_2026-04-18.json` is not modified by dry-run results
3. failures are recorded, not fixed

## Phase 1: Build Scene Inventory

### Objective

Construct a deterministic inventory of all `102` scene names and expected source directories.

### Tasks

1. read `scene_execution_board_2026-04-18.json`
2. extract all scene entries
3. map each scene name to `D:/desk/智能体资料/全量业务场景/一平台场景/<sceneName>`
4. check whether each source directory exists
5. assign initial inventory status:
   - `source-present`
   - `missing-source`
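
Steps 3 to 5 amount to a pure mapping from scene names to directories plus one existence check. A minimal sketch follows, assuming the scene names have already been extracted from the board; the type and function names (`InventoryEntry`, `build_inventory`) are hypothetical, and only the two status labels come from this plan.

```rust
use std::path::{Path, PathBuf};

/// The two initial inventory statuses from task 5.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InventoryStatus {
    SourcePresent,
    MissingSource,
}

/// Hypothetical inventory row: every scene gets a path, even when it is missing.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InventoryEntry {
    pub scene_name: String,
    pub source_path: PathBuf,
    pub status: InventoryStatus,
}

/// Map each scene name to `<scene_root>/<sceneName>` and record whether the
/// directory exists. A missing directory is recorded, never treated as an
/// error, so it cannot stop the sweep.
pub fn build_inventory(scene_root: &Path, scene_names: &[String]) -> Vec<InventoryEntry> {
    scene_names
        .iter()
        .map(|name| {
            let source_path = scene_root.join(name);
            let status = if source_path.is_dir() {
                InventoryStatus::SourcePresent
            } else {
                InventoryStatus::MissingSource
            };
            InventoryEntry {
                scene_name: name.clone(),
                source_path,
                status,
            }
        })
        .collect()
}
```

Because the output preserves the input order and length, the acceptance check "inventory count equals `102`" reduces to comparing the result length with the board's scene count.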

### Deliverables

1. inventory section inside `full_sweep_dry_run_2026-04-19.json`
2. missing-source list

### Acceptance Criteria

1. inventory count equals `102`
2. every scene has a source path
3. missing source does not stop the sweep

## Phase 2: Run Analyzer/Generator Dry-Run

### Objective

Attempt current generic generation for every source-present scene without fixing failures.

### Tasks

1. generate a stable safe scene id for each scene
2. invoke `sg_scene_generate` for each source-present scene
3. write outputs under `examples/full_sweep_dry_run_2026-04-19`
4. for successful generation, read `references/generation-report.json`
5. for failed generation, capture stderr/stdout and exit code
6. continue until all `102` scenes are processed
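
The plan does not say how the "stable safe scene id" is derived, so the sketch below is only one possible scheme: a 1-based ledger index plus a 64-bit FNV-1a hash of the scene name, which yields an ASCII-only id even for non-ASCII scene names. The id format, the `sweep` helper, and the `run_one` callback are all assumptions; the real `sg_scene_generate` invocation is deliberately not modeled because its arguments are not given here.

```rust
/// 64-bit FNV-1a. Used instead of `std`'s `DefaultHasher`, whose algorithm is
/// unspecified and may change between releases, which would break id stability.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical id format: `scene-<index>-<hash>`, ASCII only, so it is safe
/// to use as an output directory name.
pub fn safe_scene_id(index: usize, scene_name: &str) -> String {
    format!("scene-{:03}-{:016x}", index, fnv1a64(scene_name.as_bytes()))
}

/// Run `run_one(id, scene_name)` for every scene and keep every outcome.
/// A failure is recorded and the loop continues, so no single scene can
/// abort the full sweep.
pub fn sweep<F>(scene_names: &[String], mut run_one: F) -> Vec<(String, Result<(), String>)>
where
    F: FnMut(&str, &str) -> Result<(), String>,
{
    scene_names
        .iter()
        .enumerate()
        .map(|(index, name)| {
            let id = safe_scene_id(index + 1, name);
            let outcome = run_one(&id, name);
            (id, outcome)
        })
        .collect()
}
```

In a real run, `run_one` would wrap the generator process and return the captured error text on failure; here it is a plain closure so the continue-on-failure behavior can be exercised in isolation.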

### Deliverables

1. per-scene dry-run execution record
2. generated output root for successful scenes
3. captured error messages for failed scenes

### Acceptance Criteria

1. every source-present scene has a generator result
2. no failure aborts the full sweep
3. generator results are isolated under the dry-run output root

## Phase 3: Classify Results

### Objective

Turn raw dry-run output into actionable coverage categories.

### Tasks

1. classify generated `A/B` readiness with no blocker as `auto-pass`
2. classify generator blocking with known gate/contract reason as `fail-closed-known`
3. classify obvious family mismatch as `misclassified`
4. classify evidence outside current families as `unsupported-family`
5. classify absent directories as `missing-source`
6. classify read/analyze failures as `source-unreadable`
7. compute top blockers by frequency
8. compute counts by inferred archetype
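
The six classification rules and the blocker ranking can be sketched as below. The `Observation` struct and its fields are hypothetical stand-ins for whatever the dry-run records per scene. The order in which the rules are tried, and the fallback to `source-unreadable` when there is neither an `A/B` readiness nor a blocker, are assumptions, since this plan lists the categories without fixing a precedence.

```rust
use std::collections::BTreeMap;

/// The six final dry-run statuses named in the task list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DryRunStatus {
    AutoPass,
    FailClosedKnown,
    Misclassified,
    UnsupportedFamily,
    MissingSource,
    SourceUnreadable,
}

/// Hypothetical per-scene observations gathered by the sweep.
#[derive(Debug, Clone)]
pub struct Observation {
    pub source_exists: bool,
    pub analyze_ok: bool,
    pub in_current_families: bool,
    pub matches_board_family: bool,
    /// Readiness grade from the generation report, e.g. 'A' or 'B'.
    pub readiness: Option<char>,
    /// Known gate/contract reason, if the generator blocked.
    pub blocker: Option<String>,
}

/// Assign exactly one status per scene (assumed precedence, see lead-in).
pub fn classify(obs: &Observation) -> DryRunStatus {
    if !obs.source_exists {
        return DryRunStatus::MissingSource;
    }
    if !obs.analyze_ok {
        return DryRunStatus::SourceUnreadable;
    }
    if !obs.in_current_families {
        return DryRunStatus::UnsupportedFamily;
    }
    if !obs.matches_board_family {
        return DryRunStatus::Misclassified;
    }
    match (obs.readiness, obs.blocker.as_deref()) {
        (Some('A'), None) | (Some('B'), None) => DryRunStatus::AutoPass,
        (_, Some(_)) => DryRunStatus::FailClosedKnown,
        _ => DryRunStatus::SourceUnreadable,
    }
}

/// Rank blocker reasons by frequency, most frequent first; ties sort by name.
pub fn top_blockers(reasons: &[&str]) -> Vec<(String, usize)> {
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for reason in reasons {
        *counts.entry(*reason).or_insert(0) += 1;
    }
    let mut ranked: Vec<(String, usize)> = counts
        .into_iter()
        .map(|(reason, n)| (reason.to_string(), n))
        .collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}
```

Because `classify` always returns one variant, the acceptance criterion "every scene has exactly one final status" holds by construction.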

### Deliverables

1. final dry-run status per scene
2. summary counts
3. by-archetype counts
4. top-blocker list

### Acceptance Criteria

1. every scene has exactly one final status
2. total classified count equals `102`
3. every non-pass scene has a reason

## Phase 4: Publish Report

### Objective

Answer the coverage question without changing project state.

### Tasks

1. write `full_sweep_dry_run_2026-04-19.json`
2. write `2026-04-19-102-full-sweep-dry-run-report.md`
3. report these four headline numbers:
   - `real-sample executed pass`
   - `code-backed ledger coverage`
   - `dry-run auto-pass`
   - `dry-run actionable coverage`
4. list next recommended blocker, but do not start implementation

### Deliverables

1. dry-run JSON
2. dry-run report

### Acceptance Criteria

1. report can answer actual generic coverage over `102` scenes
2. report separates proven coverage from predicted/dry-run coverage
3. report does not promote scene status

## Completion Criteria

This plan is complete when:

1. all `102` scenes are included in the dry-run result
2. the dry-run result has stable summary counts
3. the report explains the gap between `5/102`, `23/102`, and dry-run coverage
4. no generator logic or execution board status is modified

## Non-Negotiable Stop Rule

After this dry-run starts:

1. do not fix generator failures inside the sweep
2. do not create new family implementation plans from a single failure
3. do not update the execution board automatically
4. stop after publishing the dry-run result and report
@@ -0,0 +1,240 @@
# 102 Full Sweep Dry-Run Triage Plan

> Date: 2026-04-19
> Status: Draft
> Upstream Spec: `docs/superpowers/specs/2026-04-19-102-full-sweep-dry-run-triage-design.md`

## Plan Intent

Turn the `62` non-pass records from the full sweep into concrete triage buckets while staying measurement-only.

The plan must not fix generator failures. It only explains them.

## Fixed Inputs

1. dry-run result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
2. dry-run output root: `examples/full_sweep_dry_run_2026-04-19`
3. execution board: `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
4. scene root: `D:/desk/智能体资料/全量业务场景/一平台场景`

## Fixed Outputs

1. triage result: `tests/fixtures/generated_scene/full_sweep_dry_run_triage_2026-04-19.json`
2. triage report: `docs/superpowers/reports/2026-04-19-102-full-sweep-dry-run-triage-report.md`

## Non-Negotiable Scope Guardrails

1. do not edit analyzer implementation
2. do not edit generator implementation
3. do not update `scene_execution_board_2026-04-18.json`
4. do not promote any scene
5. do not add new family baselines
6. do not start implementation correction during triage
7. do not expand beyond the fixed `102` scene set

## Workstreams

1. `WS1` Timeout Triage
2. `WS2` Misclassification Triage
3. `WS3` No-Report Failure Triage
4. `WS4` Publish Triage Result

## Phase 0: Freeze Triage Boundary

### Objective

Make the triage a classification exercise only.

### Tasks

1. read the upstream dry-run result
2. verify the upstream result has `102` scenes
3. verify non-pass buckets are:
   - `31` timeout records
   - `5` misclassified records
   - `25` no-report records
   - `1` bootstrap-target record
4. freeze the triage order:
   - timeout first
   - misclassification second
   - no-report third
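
The four bucket counts add up to the `62` non-pass records named in the plan intent (`31 + 5 + 25 + 1 = 62`). A small guard like the following sketch could make tasks 2 and 3 mechanical; the struct, constants, and function name are hypothetical, and how the observed counts are read from the dry-run JSON is left out.

```rust
/// Non-pass bucket counts taken from the upstream dry-run result.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NonPassBuckets {
    pub timeout: usize,
    pub misclassified: usize,
    pub no_report: usize,
    pub bootstrap_target: usize,
}

impl NonPassBuckets {
    /// Total number of non-pass records across all four buckets.
    pub fn total(&self) -> usize {
        self.timeout + self.misclassified + self.no_report + self.bootstrap_target
    }
}

/// The frozen counts stated in this plan.
pub const EXPECTED_BUCKETS: NonPassBuckets = NonPassBuckets {
    timeout: 31,
    misclassified: 5,
    no_report: 25,
    bootstrap_target: 1,
};

/// The fixed scene-set size.
pub const EXPECTED_SCENES: usize = 102;

/// Fail fast if the upstream result drifted from the frozen triage input.
pub fn verify_upstream(scene_count: usize, observed: &NonPassBuckets) -> Result<(), String> {
    if scene_count != EXPECTED_SCENES {
        return Err(format!(
            "expected {} scenes, found {}",
            EXPECTED_SCENES, scene_count
        ));
    }
    if *observed != EXPECTED_BUCKETS {
        return Err(format!("non-pass bucket counts drifted: {:?}", observed));
    }
    Ok(())
}
```

Running this check before any triage work starts is one way to back the "triage input count is stable" acceptance criterion.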
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. frozen triage input statement
|
||||||
|
2. frozen non-pass bucket counts
|
||||||
|
3. frozen triage order
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. triage input count is stable
|
||||||
|
2. no code is changed
|
||||||
|
3. no board status is updated
|
||||||
|
|
||||||
|
## Phase 1: Timeout Triage
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Split the `31` timeout records into second-level reasons.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. select records where `dryRunStatus = source-unreadable`
|
||||||
|
2. verify reason is `generator timeout after 30s`
|
||||||
|
3. collect source directory metadata:
|
||||||
|
- source directory exists
|
||||||
|
- file count
|
||||||
|
- total source bytes
|
||||||
|
- largest file path
|
||||||
|
- largest file bytes
|
||||||
|
4. collect dry-run artifact metadata:
|
||||||
|
- generated skill directory exists
|
||||||
|
- references directory exists
|
||||||
|
- generation report exists
|
||||||
|
5. preserve board context:
|
||||||
|
- current group
|
||||||
|
- current status
|
||||||
|
- current source asset
|
||||||
|
- real sample record id
|
||||||
|
6. optionally run one diagnostic longer-timeout attempt for classification only
|
||||||
|
7. assign one timeout label:
|
||||||
|
- `timeout-known-family-sample`
|
||||||
|
- `timeout-unvalidated-source`
|
||||||
|
- `timeout-large-source`
|
||||||
|
- `timeout-command-hang`
|
||||||
|
- `timeout-generator-slow-but-progressing`
|
||||||
|
- `timeout-undetermined`
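
The source directory metadata in task 3 could be gathered with a plain directory walk. The sketch below is illustrative only: the function name and the output keys (`fileCount`, `totalBytes`, and so on) are hypothetical and are not taken from the triage JSON schema.

```python
import os


def collect_source_metadata(source_dir):
    """Collect size metadata for one scene source directory without reading file contents."""
    meta = {
        "exists": os.path.isdir(source_dir),
        "fileCount": 0,
        "totalBytes": 0,
        "largestFilePath": None,  # stays None when the directory is empty or missing
        "largestFileBytes": 0,
    }
    if not meta["exists"]:
        return meta
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable entries are skipped, not treated as fatal
            meta["fileCount"] += 1
            meta["totalBytes"] += size
            if size > meta["largestFileBytes"]:
                meta["largestFilePath"] = path
                meta["largestFileBytes"] = size
    return meta
```

Because it only stats files, the walk stays cheap even for the directories that made the generator time out. Choosing between `timeout-large-source` and the other labels remains a human triage decision on top of this metadata.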
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `timeoutTriage[]` records in the triage JSON
|
||||||
|
2. timeout label summary
|
||||||
|
3. timeout size/source metadata summary
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `31` timeout records have a second-level label
|
||||||
|
2. no timeout is treated as an unsupported family by default
|
||||||
|
3. no long-timeout rerun result promotes a scene
|
||||||
|
|
||||||
|
## Phase 2: Misclassification Triage
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Explain the `5` board-vs-archetype conflicts.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. select records where `dryRunStatus = misclassified`
|
||||||
|
2. preserve:
|
||||||
|
- board expected group
|
||||||
|
- expected archetype
|
||||||
|
- inferred archetype
|
||||||
|
- current source asset
|
||||||
|
- real sample layer status
|
||||||
|
3. inspect existing dry-run report path when present
|
||||||
|
4. collect route-conflict evidence:
|
||||||
|
- whether host bridge evidence dominates
|
||||||
|
- whether G3 or G1-E evidence is still present
|
||||||
|
- whether current board expectation came from baseline or expansion
|
||||||
|
5. assign one routing triage label:
|
||||||
|
- `route-overprefer-host-bridge`
|
||||||
|
- `board-expectation-stale`
|
||||||
|
- `mixed-workflow-host-bridge-valid`
|
||||||
|
- `scene-family-split-needed`
|
||||||
|
- `misclassification-undetermined`
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `misclassificationTriage[]` records in the triage JSON
|
||||||
|
2. routing conflict summary
|
||||||
|
3. high-priority routing risk list
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `5` misclassified records have a routing label
|
||||||
|
2. no routing code is changed
|
||||||
|
3. the report identifies whether implementation correction is justified later
|
||||||
|
|
||||||
|
## Phase 3: No-Report Failure Triage
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Split the `25` generic no-report failures into concrete failure stages.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. select records where:
|
||||||
|
- `dryRunStatus = fail-closed-known`
|
||||||
|
- `reason = generator failed without generation report`
|
||||||
|
2. collect command artifacts:
|
||||||
|
- exit code
|
||||||
|
- stdout tail
|
||||||
|
- stderr tail
|
||||||
|
3. inspect output artifacts:
|
||||||
|
- skill directory exists
|
||||||
|
- references directory exists
|
||||||
|
- any report file exists
|
||||||
|
4. infer one failure stage:
|
||||||
|
- `source-scan`
|
||||||
|
- `analyzer`
|
||||||
|
- `ir-assembly`
|
||||||
|
- `readiness-before-report`
|
||||||
|
- `compiler-package-write`
|
||||||
|
- `panic-or-process-error`
|
||||||
|
- `unknown-no-report`
|
||||||
|
5. keep `bootstrap_target` failure separate
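
One possible shape for a `noReportFailureTriage[]` entry is sketched below. The stage labels are the ones listed in task 4. The key names, the 20-line tail length, and the helper functions are assumptions made for illustration.

```python
# Failure stages allowed by this plan (task 4).
FAILURE_STAGES = (
    "source-scan",
    "analyzer",
    "ir-assembly",
    "readiness-before-report",
    "compiler-package-write",
    "panic-or-process-error",
    "unknown-no-report",
)


def tail(text, max_lines=20):
    """Return the last max_lines lines of captured command output."""
    return "\n".join(text.splitlines()[-max_lines:])


def build_no_report_record(scene_id, exit_code, stdout, stderr,
                           failure_stage="unknown-no-report"):
    """Build one triage entry and reject stage labels outside the fixed list."""
    if failure_stage not in FAILURE_STAGES:
        raise ValueError(f"unknown failure stage: {failure_stage}")
    return {
        "sceneId": scene_id,
        "exitCode": exit_code,
        "stdoutTail": tail(stdout),
        "stderrTail": tail(stderr),
        "failureStage": failure_stage,
    }
```

Defaulting to `unknown-no-report` means a record with no evidence is still labeled, so the acceptance criterion can be checked mechanically.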
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `noReportFailureTriage[]` records in the triage JSON
|
||||||
|
2. `bootstrapTargetFailures[]` records in the triage JSON
|
||||||
|
3. failure-stage summary
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `25` no-report failures have an inferred failure stage
|
||||||
|
2. the `bootstrap_target` case is not hidden in the no-report bucket
|
||||||
|
3. every non-pass record remains explainable without implementation changes
|
||||||
|
|
||||||
|
## Phase 4: Publish Triage Result
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Publish a bounded triage result and stop.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. write `full_sweep_dry_run_triage_2026-04-19.json`
|
||||||
|
2. write `2026-04-19-102-full-sweep-dry-run-triage-report.md`
|
||||||
|
3. include:
|
||||||
|
- timeout triage summary
|
||||||
|
- misclassification triage summary
|
||||||
|
- no-report triage summary
|
||||||
|
- recommended next blocker
|
||||||
|
4. explicitly state that the triage does not promote scenes or start fixes
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. triage JSON
|
||||||
|
2. triage report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `62` non-pass records are covered
|
||||||
|
2. every non-pass record has a second-level explanation
|
||||||
|
3. the report identifies the next blocker without implementing it
|
||||||
|
4. no generator/analyzer file is modified
|
||||||
|
5. `scene_execution_board_2026-04-18.json` is not modified
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. `31` timeout records have timeout labels
|
||||||
|
2. `5` misclassified records have routing labels
|
||||||
|
3. `25` no-report failures have failure stages
|
||||||
|
4. `1` bootstrap-target failure is separately tracked
|
||||||
|
5. the triage JSON and report are published
|
||||||
|
6. execution stops without implementation work
|
||||||
|
|
||||||
@@ -0,0 +1,305 @@
|
|||||||
|
# 102 Full Sweep Improvement Roadmap Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: `docs/superpowers/specs/2026-04-19-102-full-sweep-improvement-roadmap-design.md`
|
||||||
|
> Upstream Dry-Run Result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
|
||||||
|
> Upstream Triage Result: `tests/fixtures/generated_scene/full_sweep_dry_run_triage_2026-04-19.json`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Turn the `102` scene dry-run and triage findings into a governed improvement roadmap.
|
||||||
|
|
||||||
|
This plan is intentionally broad, like the earlier `60-to-90` roadmap. It coordinates multiple bounded implementation tracks rather than starting isolated fixes from individual failures.
|
||||||
|
|
||||||
|
## Baseline
|
||||||
|
|
||||||
|
Current measured baseline:
|
||||||
|
|
||||||
|
| Metric | Count |
|
||||||
|
| --- | ---: |
|
||||||
|
| Real-sample executed pass | 5 / 102 |
|
||||||
|
| Code-backed ledger coverage | 23 / 102 |
|
||||||
|
| Dry-run auto-pass | 40 / 102 |
|
||||||
|
| Dry-run actionable coverage | 66 / 102 |
|
||||||
|
|
||||||
|
Current triage baseline:
|
||||||
|
|
||||||
|
| Bucket | Count | Triage conclusion |
|
||||||
|
| --- | ---: | --- |
|
||||||
|
| Timeout | 31 | `19 timeout-unvalidated-source`, `8 timeout-large-source`, `4 timeout-known-family-sample` |
|
||||||
|
| Misclassified | 5 | all `route-overprefer-host-bridge` |
|
||||||
|
| No-report failure | 25 | all `readiness-before-report` |
|
||||||
|
| Bootstrap target | 1 | separate `bootstrap_target` |
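
The two baseline tables can be cross-checked with simple arithmetic. The numbers below are copied from the tables. The variable and function names are illustrative only.

```python
TOTAL_SCENES = 102
DRY_RUN_AUTO_PASS = 40

# Second-level timeout labels from the triage baseline.
TIMEOUT_LABELS = {
    "timeout-unvalidated-source": 19,
    "timeout-large-source": 8,
    "timeout-known-family-sample": 4,
}

# Non-pass buckets from the triage baseline.
NON_PASS_BUCKETS = {"timeout": 31, "misclassified": 5, "no-report": 25, "bootstrap-target": 1}


def check_baseline():
    """Return the non-pass total after checking that the tables agree with each other."""
    assert sum(TIMEOUT_LABELS.values()) == NON_PASS_BUCKETS["timeout"]  # 19 + 8 + 4 = 31
    non_pass = sum(NON_PASS_BUCKETS.values())  # 31 + 5 + 25 + 1 = 62
    assert DRY_RUN_AUTO_PASS + non_pass == TOTAL_SCENES  # 40 + 62 = 102
    return non_pass
```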
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not add new scene families
|
||||||
|
2. do not update `scene_execution_board_2026-04-18.json` inside this roadmap
|
||||||
|
3. do not promote scenes directly from diagnostic or dry-run results
|
||||||
|
4. do not reopen completed real-sample passes except as regression checks
|
||||||
|
5. do not start `G4/G5`
|
||||||
|
6. do not implement full login recovery
|
||||||
|
7. do not implement full host runtime transport
|
||||||
|
8. do not implement local document attachment runtime
|
||||||
|
9. do not create unbounded micro-plans from a single failure
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Timeout Diagnostics and Scan Budget
|
||||||
|
2. `WS2` Routing Boundary Correction
|
||||||
|
3. `WS3` Structured Fail-Closed Reporting
|
||||||
|
4. `WS4` Follow-Up Sweep and Coverage Delta
|
||||||
|
|
||||||
|
## Phase 0: Freeze Improvement Baseline
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Freeze the dry-run and triage outputs as the only accepted inputs to this roadmap.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `full_sweep_dry_run_2026-04-19.json`
|
||||||
|
2. freeze `full_sweep_dry_run_triage_2026-04-19.json`
|
||||||
|
3. freeze the four headline metrics:
|
||||||
|
- `5/102` real-sample pass
|
||||||
|
- `23/102` code-backed ledger coverage
|
||||||
|
- `40/102` dry-run auto-pass
|
||||||
|
- `66/102` dry-run actionable coverage
|
||||||
|
4. freeze the problem buckets:
|
||||||
|
- `4` known-family timeouts
|
||||||
|
- `8` large-source timeouts
|
||||||
|
- `19` unvalidated-source timeouts
|
||||||
|
- `5` host-bridge over-preference cases
|
||||||
|
- `25` readiness-before-report failures
|
||||||
|
- `1` bootstrap-target failure
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. baseline statement
|
||||||
|
2. frozen blocker inventory
|
||||||
|
3. roadmap entry criteria
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no additional scene is added to scope
|
||||||
|
2. no implementation starts before the baseline is frozen
|
||||||
|
3. dry-run and triage assets are treated as immutable inputs
|
||||||
|
|
||||||
|
## Phase 1: Known-Family Timeout Diagnostics
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Resolve the highest-priority ambiguity: known-family scenes that timed out in the full sweep.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. select only records labeled `timeout-known-family-sample`
|
||||||
|
2. capture source scale metrics and previous family context
|
||||||
|
3. run bounded diagnostic attempts if needed
|
||||||
|
4. classify each record as:
|
||||||
|
- `known-family-rerun-pass`
|
||||||
|
- `known-family-source-scale-timeout`
|
||||||
|
- `known-family-generator-hotspot`
|
||||||
|
- `known-family-contract-blocked-after-long-run`
|
||||||
|
- `known-family-timeout-unresolved`
|
||||||
|
5. publish diagnostic result
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. known-family timeout diagnostic JSON
|
||||||
|
2. known-family timeout diagnostic report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `4` known-family timeout records are classified
|
||||||
|
2. no scene is promoted from diagnostic success
|
||||||
|
3. no generator logic is changed in the diagnostic step
|
||||||
|
|
||||||
|
## Phase 2: Source-Scale and Scan-Budget Improvement
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Reduce timeout noise caused by oversized source directories and obvious vendor/library files.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. analyze `timeout-large-source` and `timeout-unvalidated-source`
|
||||||
|
2. define source scan budget policy
|
||||||
|
3. define vendor/library ignore policy
|
||||||
|
4. implement only bounded source scanning or timeout reporting changes
|
||||||
|
5. verify no canonical or real-sample regression is introduced
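
A scan budget combined with a vendor ignore list might look like the following. This sketch does not describe the existing scanner in `src/generated_scene`. The ignored directory names, the default limits, and the status strings are all hypothetical placeholders for whatever the policy in tasks 2 and 3 ends up defining.

```python
import os

# Hypothetical ignore list; the real policy is a deliverable of this phase.
VENDOR_DIR_NAMES = {"node_modules", "vendor", "dist"}


def scan_with_budget(source_dir, max_files=2000, max_bytes=50_000_000):
    """Walk source_dir, skip vendor directories, and stop once the budget is spent.

    Returns (selected_paths, status), where status is "complete" or "budget-exceeded".
    """
    selected = []
    total_bytes = 0
    for root, dirs, files in os.walk(source_dir):
        # Pruning dirs in place stops os.walk from descending into vendor trees.
        dirs[:] = sorted(d for d in dirs if d not in VENDOR_DIR_NAMES)
        for name in sorted(files):
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            if len(selected) >= max_files or total_bytes + size > max_bytes:
                return selected, "budget-exceeded"
            selected.append(path)
            total_bytes += size
    return selected, "complete"
```

Returning an explicit `budget-exceeded` status, rather than silently truncating, keeps the timeout-reporting side of task 4 machine-readable.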
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. source scan budget policy
|
||||||
|
2. bounded scan implementation if approved by Phase 1 evidence
|
||||||
|
3. timeout reporting regression tests
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. large source directories no longer dominate full-sweep timeouts through accidental vendor-file scanning
|
||||||
|
2. known-family samples are not made worse
|
||||||
|
3. archetype semantics are unchanged
|
||||||
|
|
||||||
|
## Phase 3: Host-Bridge Route Over-Preference Correction
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Correct or formally adjudicate the five cases where `host_bridge_workflow` over-absorbed `G3` or `G1-E` expected scenes.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. select the `5` `route-overprefer-host-bridge` records
|
||||||
|
2. compare business-chain evidence against host-bridge evidence
|
||||||
|
3. define routing precedence rules for:
|
||||||
|
- `G3` vs `G6`
|
||||||
|
- `G1-E` vs `G6`
|
||||||
|
4. implement bounded routing correction only if evidence supports it
|
||||||
|
5. preserve regressions for:
|
||||||
|
- `G3` real-sample pass
|
||||||
|
- `G1-E` real-sample pass
|
||||||
|
- `G6` real-sample pass
|
||||||
|
6. classify each case as:
|
||||||
|
- `route-corrected-to-g3`
|
||||||
|
- `route-corrected-to-g1e`
|
||||||
|
- `board-expectation-reclassified`
|
||||||
|
- `valid-host-bridge-workflow`
|
||||||
|
- `route-conflict-unresolved`
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. route over-preference correction report
|
||||||
|
2. routing regression tests
|
||||||
|
3. updated dry-run classification for the five fixed records
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `5` route conflicts are adjudicated
|
||||||
|
2. `host_bridge_workflow` no longer wins solely because host evidence exists
|
||||||
|
3. existing `G6` pass remains stable
|
||||||
|
4. no broad routing rewrite is introduced
|
||||||
|
|
||||||
|
## Phase 4: Structured Fail-Closed Reporting
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Convert `readiness-before-report` failures into structured failure reports instead of process-level no-report failures.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. select the `25` `readiness-before-report` records
|
||||||
|
2. identify where generation exits before report emission
|
||||||
|
3. define a minimal failure-report schema for pre-package fail-closed
|
||||||
|
4. emit structured failure records with:
|
||||||
|
- inferred archetype
|
||||||
|
- failed gate
|
||||||
|
- blocker reason
|
||||||
|
- missing contract pieces
|
||||||
|
- stderr summary if any
|
||||||
|
5. keep scenes failing unless their contracts are actually complete
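
A minimal failure-report record carrying the five fields from task 4 could be modeled as below. The class name, field names, and the `scene_id` field are assumptions. The actual schema is a deliverable of this phase and may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class PreReportFailure:
    """One structured fail-closed record emitted when generation stops before the report."""
    scene_id: str
    inferred_archetype: Optional[str]
    failed_gate: str
    blocker_reason: str
    missing_contract_pieces: List[str] = field(default_factory=list)
    stderr_summary: Optional[str] = None  # "if any": left as None when stderr was empty

    def to_json(self):
        # ensure_ascii=False keeps non-ASCII scene names readable in the report file.
        return json.dumps(asdict(self), ensure_ascii=False, sort_keys=True)
```

The record only describes why a scene failed. It carries no pass flag, which is consistent with task 5: emitting the record must not change the fail-closed outcome.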
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. pre-report fail-closed schema
|
||||||
|
2. implementation of structured failure report emission
|
||||||
|
3. regression covering at least one `paginated_enrichment`, one `local_doc_pipeline`, one `multi_mode_request`, and one `single_request_enrichment` pre-report failure
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no-report failures are reduced or eliminated as a category
|
||||||
|
2. failing scenes still fail closed
|
||||||
|
3. failure reasons become machine-readable
|
||||||
|
4. auto-pass count is not inflated by looser gates
|
||||||
|
|
||||||
|
## Phase 5: Bootstrap Target Isolation
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Keep the single `bootstrap_target` failure isolated and decide whether it belongs to later bootstrap normalization work.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. preserve `用户停电频次分析监测` as a separate bootstrap failure
|
||||||
|
2. inspect whether the failure is caused by missing target URL, domain mismatch, or unsupported bootstrap pattern
|
||||||
|
3. produce a bootstrap isolation note
|
||||||
|
4. do not implement login or bootstrap auto-recovery
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. bootstrap target isolation note
|
||||||
|
2. decision whether the case enters a later bootstrap-normalization roadmap
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the bootstrap case does not pollute readiness-before-report work
|
||||||
|
2. no login recovery implementation is started
|
||||||
|
|
||||||
|
## Phase 6: Follow-Up Full Sweep and Coverage Delta
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Measure whether the bounded improvements actually increased generic coverage.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. rerun the fixed `102` scene full sweep with the same scene set
|
||||||
|
2. produce a new dry-run result
|
||||||
|
3. compare against the baseline:
|
||||||
|
- auto-pass delta
|
||||||
|
- actionable coverage delta
|
||||||
|
- timeout delta
|
||||||
|
- misclassification delta
|
||||||
|
- no-report delta
|
||||||
|
4. publish coverage delta report
|
||||||
|
5. decide whether to move to execution-board status sync or another bounded improvement cycle
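
The comparison in task 3 reduces to a per-metric subtraction once both sweeps are summarized with the same keys. In the sketch below the summary keys and the function name are hypothetical. Only the baseline numbers in the usage note come from this plan.

```python
DELTA_METRICS = ("autoPass", "actionable", "timeout", "misclassified", "noReport")


def coverage_delta(baseline, followup):
    """Return follow-up minus baseline for each metric; refuse to compare different scene sets."""
    if baseline["sceneCount"] != followup["sceneCount"]:
        raise ValueError("baseline and follow-up cover different scene counts")
    return {m: followup[m] - baseline[m] for m in DELTA_METRICS}
```

With the frozen baseline (`sceneCount` 102, `autoPass` 40, `actionable` 66, `timeout` 31, `misclassified` 5, `noReport` 25), a positive `autoPass` delta and negative `timeout` and `noReport` deltas would indicate improvement. The scene-count guard enforces the first acceptance criterion before any number is reported.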
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. follow-up full sweep JSON
|
||||||
|
2. coverage delta report
|
||||||
|
3. remaining blocker decision board
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. scene set remains exactly `102`
|
||||||
|
2. baseline and follow-up are comparable
|
||||||
|
3. improvements are quantified, not assumed
|
||||||
|
4. no execution board status is changed automatically
|
||||||
|
|
||||||
|
## Milestone Order
|
||||||
|
|
||||||
|
The order is fixed:
|
||||||
|
|
||||||
|
1. Phase 0: freeze baseline
|
||||||
|
2. Phase 1: known-family timeout diagnostics
|
||||||
|
3. Phase 2: source-scale and scan-budget improvement
|
||||||
|
4. Phase 3: host-bridge route over-preference correction
|
||||||
|
5. Phase 4: structured fail-closed reporting
|
||||||
|
6. Phase 5: bootstrap target isolation
|
||||||
|
7. Phase 6: follow-up full sweep and coverage delta
|
||||||
|
|
||||||
|
Do not start Phase 3 before Phase 1 is completed. Known-family timeout ambiguity affects the interpretation of current coverage.
|
||||||
|
|
||||||
|
Do not start Phase 6 before Phases 2-5 have either completed or been explicitly deferred with reasons.
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This roadmap is complete when:
|
||||||
|
|
||||||
|
1. known-family timeouts are no longer mixed with generic timeout noise
|
||||||
|
2. host-bridge over-preference cases are adjudicated
|
||||||
|
3. readiness-before-report failures become structured fail-closed records
|
||||||
|
4. the bootstrap target case is isolated
|
||||||
|
5. a follow-up full sweep quantifies coverage delta
|
||||||
|
6. no new family is introduced as a shortcut around current blockers
|
||||||
|
|
||||||
|
## Out of Plan
|
||||||
|
|
||||||
|
1. new family implementation
|
||||||
|
2. `G4/G5` implementation
|
||||||
|
3. browser host runtime transport
|
||||||
|
4. login recovery
|
||||||
|
5. attachment/local document runtime
|
||||||
|
6. automatic execution board promotion
|
||||||
|
|
||||||
@@ -0,0 +1,140 @@
|
|||||||
|
# 102 Sweep Status Reconciliation Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: `docs/superpowers/specs/2026-04-19-102-sweep-status-reconciliation-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Reconcile the follow-up `102` sweep result with the final route-conflict decisions so the next roadmap uses a trustworthy status baseline.
|
||||||
|
|
||||||
|
This plan is a status reconciliation plan, not an implementation plan.
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not modify `src/generated_scene/analyzer.rs`
|
||||||
|
2. do not modify `src/generated_scene/generator.rs`
|
||||||
|
3. do not modify `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
4. do not promote any scene
|
||||||
|
5. do not add or modify family baselines
|
||||||
|
6. do not rerun the `102` sweep
|
||||||
|
7. do not implement fixes for fail-closed or timeout records
|
||||||
|
|
||||||
|
## Phase 0: Freeze Inputs
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Freeze the exact reconciliation inputs.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. read `full_sweep_improvement_followup_2026-04-19.json`
|
||||||
|
2. read `remaining_route_conflict_decisions_2026-04-19.json`
|
||||||
|
3. verify follow-up sweep scene count is `102`
|
||||||
|
4. verify route-decision conflict count is `4`
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. input validation summary
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. reconciliation does not proceed if follow-up scene count is not `102`
|
||||||
|
2. reconciliation does not proceed if route-decision count is not `4`
|
||||||
|
|
||||||
|
## Phase 1: Merge Route Decisions
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Apply route-conflict decisions as a reconciliation overlay without changing raw sweep status.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. match route decisions by `sceneId`
|
||||||
|
2. for each matching scene, keep `dryRunStatus = misclassified`
|
||||||
|
3. add `routeDecision = valid-host-bridge-workflow`
|
||||||
|
4. set `reconciledStatus = adjudicated-valid-host-bridge`
|
||||||
|
5. preserve decision reason and evidence summary
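
The overlay described in these tasks can be sketched as follows. The field names `sceneId`, `dryRunStatus`, `routeDecision`, and `reconciledStatus` are taken from this plan. The decision-record keys `reason` and `evidenceSummary`, and the function itself, are assumptions.

```python
def apply_route_decisions(scenes, decisions, expected_count=4):
    """Return new scene records with route decisions overlaid; inputs are not mutated."""
    if len(decisions) != expected_count:
        raise ValueError(f"expected {expected_count} route decisions, found {len(decisions)}")
    decision_by_id = {d["sceneId"]: d for d in decisions}
    unmatched = set(decision_by_id) - {s["sceneId"] for s in scenes}
    if unmatched:
        raise ValueError(f"route decisions without a follow-up scene: {sorted(unmatched)}")

    reconciled = []
    for scene in scenes:
        record = dict(scene)  # shallow copy keeps the raw sweep record untouched
        decision = decision_by_id.get(scene["sceneId"])
        if decision is not None:
            if scene["dryRunStatus"] != "misclassified":
                raise ValueError(f"{scene['sceneId']} is not a misclassified record")
            # dryRunStatus is deliberately left as "misclassified".
            record["routeDecision"] = "valid-host-bridge-workflow"
            record["reconciledStatus"] = "adjudicated-valid-host-bridge"
            record["decisionReason"] = decision.get("reason")
            record["evidenceSummary"] = decision.get("evidenceSummary")
        reconciled.append(record)
    return reconciled
```

Only the matched scenes gain new keys, which is what keeps this an overlay and not a broad status rewrite.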
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. route-decision overlay records
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `4` route decisions match a follow-up scene
|
||||||
|
2. all `4` are reconciled to `adjudicated-valid-host-bridge`
|
||||||
|
3. no broad status rewrite is performed
|
||||||
|
|
||||||
|
## Phase 2: Build Reconciled Status Counts
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Build the reconciled status summary for all `102` scenes.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. copy all follow-up scene records into a new reconciliation asset
|
||||||
|
2. assign `reconciledStatus` for every scene
|
||||||
|
3. count statuses:
|
||||||
|
- `auto-pass`
|
||||||
|
- `fail-closed-known`
|
||||||
|
- `adjudicated-valid-host-bridge`
|
||||||
|
- `source-unreadable`
|
||||||
|
- `missing-source`
|
||||||
|
- `unsupported-family`
|
||||||
|
- `misclassified-unresolved`
|
||||||
|
4. summarize fail-closed records by archetype and reason
|
||||||
|
5. preserve remaining timeout records as unresolved timeout inputs
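
Counting reconciled statuses and checking the totals might be done as below. The status names are the seven listed in task 3. The function name and the choice to raise on any violation are assumptions of this sketch.

```python
from collections import Counter

RECONCILED_STATUSES = {
    "auto-pass",
    "fail-closed-known",
    "adjudicated-valid-host-bridge",
    "source-unreadable",
    "missing-source",
    "unsupported-family",
    "misclassified-unresolved",
}


def reconciled_counts(records, expected_total=102):
    """Count reconciledStatus values and enforce the total and zero-unresolved checks."""
    counts = Counter(r["reconciledStatus"] for r in records)
    unknown = set(counts) - RECONCILED_STATUSES
    if unknown:
        raise ValueError(f"unexpected reconciled statuses: {sorted(unknown)}")
    if sum(counts.values()) != expected_total:
        raise ValueError(f"status counts sum to {sum(counts.values())}, not {expected_total}")
    if counts["misclassified-unresolved"] != 0:
        raise ValueError("unresolved misclassifications remain")
    return counts
```

Because every scene must carry exactly one `reconciledStatus`, a record missing that key raises `KeyError`, which surfaces an incomplete assignment in task 2 instead of hiding it in the totals.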
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/full_sweep_status_reconciliation_2026-04-19.json`
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. total scene count is `102`
|
||||||
|
2. reconciled status count total is `102`
|
||||||
|
3. unresolved misclassification count is `0`
|
||||||
|
4. timeout count remains `2`
|
||||||
|
|
||||||
|
## Phase 3: Publish Reconciliation Report
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Make the reconciled state readable and actionable.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. summarize raw follow-up counts
|
||||||
|
2. summarize reconciled counts
|
||||||
|
3. list `4` valid-host-bridge adjudications
|
||||||
|
4. list `2` remaining timeout inputs
|
||||||
|
5. summarize `48` fail-closed-known records as the next implementation-analysis candidate
|
||||||
|
6. state explicitly that the execution board was not changed
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `docs/superpowers/reports/2026-04-19-102-sweep-status-reconciliation-report.md`
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. report explains why raw `misclassified = 4` no longer means unresolved route bugs
|
||||||
|
2. report identifies the next likely roadmap input without starting it
|
||||||
|
3. report confirms no code or execution-board changes
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. reconciliation JSON exists
|
||||||
|
2. reconciliation report exists
|
||||||
|
3. all `4` route conflicts are represented as adjudicated valid host-bridge workflows
|
||||||
|
4. no unresolved misclassification remains
|
||||||
|
5. `2` timeouts and `48` fail-closed records remain visible as separate future inputs
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing the reconciliation JSON and report.
|
||||||
|
|
||||||
|
Do not start the next roadmap in this plan.
|
||||||
@@ -0,0 +1,44 @@
|
|||||||
|
# Bootstrap Target Normalization Roadmap Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Parent Sequence: `2026-04-19-final-2-residual-child-plan-sequence-plan.md`
|
||||||
|
> Fixed Scene: `sweep-091-scene`
|
||||||
|
> Status: Draft
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Run a bounded bootstrap target normalization slice for the single remaining `page_state_eval` residual.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
1. `sweep-091-scene`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `tests/scene_generator_test.rs`
|
||||||
|
4. `tests/fixtures/generated_scene/bootstrap_target_normalization_followup_2026-04-19.json`
|
||||||
|
5. `tests/fixtures/generated_scene/bootstrap_target_normalization_reconciliation_candidates_2026-04-19.json`
|
||||||
|
6. `docs/superpowers/reports/2026-04-19-bootstrap-target-normalization-roadmap-report.md`
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. Freeze the current `sweep-091-scene` generation report.
|
||||||
|
2. Identify whether the failure is a missing target URL, target-domain ambiguity, or policy-held navigation dependency.
|
||||||
|
3. Implement at most one bounded bootstrap target normalization slice if the target can be recovered from deterministic source evidence.
|
||||||
|
4. Rerun only `sweep-091-scene`.
|
||||||
|
5. Publish follow-up and reconciliation candidate assets.
|
||||||
|
|
||||||
|
## Expected Delta
|
||||||
|
|
||||||
|
Target delta is `+1 framework-auto-pass-candidate` if deterministic bootstrap target recovery is possible. Otherwise the delta is `0`, with a narrower named hold.
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the single-scene follow-up and reconciliation candidates are published. Do not update the official board under this plan.
|
||||||
@@ -0,0 +1,38 @@
|
|||||||
|
# Bootstrap Target Residual Isolation Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Parent Plan: `docs/superpowers/plans/2026-04-19-structured-fail-closed-residual-13-closure-plan.md`
|
||||||
|
> Parent Route: `Residual Route D`
|
||||||
|
> Parent Layer: `Layer D`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Isolate the remaining page-state/bootstrap-target residual without starting login recovery or runtime navigation implementation.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
1. `sweep-091-scene` / `用户停电频次分析监测`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. isolation JSON asset
|
||||||
|
2. isolation report
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. login/runtime implementation files
|
||||||
|
4. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. preserve the residual as an isolated bootstrap-target case
|
||||||
|
2. publish the isolation report
|
||||||
|
3. do not implement login recovery
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after isolation assets are published.
|
||||||
|
|
||||||
@@ -0,0 +1,55 @@
|
|||||||
|
# Boundary Fail-Closed Decision Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Parent Route: `Route 5: boundary-family fail-closed`
|
||||||
|
> Parent Layer: `Layer C + Layer D`
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-boundary-fail-closed-decision-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Publish a decision for the remaining boundary-family fail-closed buckets after mainline routes are complete or deferred.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
1. `local_doc_pipeline = 5`
|
||||||
|
2. `host_bridge_workflow = 1`
|
||||||
|
3. `page_state_eval/bootstrap_target = 1`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. boundary decision JSON assets
|
||||||
|
2. boundary decision report assets
|
||||||
|
3. optional next bounded boundary plan docs
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. freeze the Route 5 bucket state
|
||||||
|
2. inspect each boundary subgroup
|
||||||
|
3. decide defer/hold/open-slice
|
||||||
|
4. publish Route 5 decision report
|
||||||
|
|
||||||
|
## Expected Coverage Delta
|
||||||
|
|
||||||
|
Decision-only delta:
|
||||||
|
|
||||||
|
1. unresolved boundary ambiguity should go to zero
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
1. every Route 5 subgroup has a named decision
|
||||||
|
2. any follow-up bounded plan is explicit and optional
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the Route 5 decision report is published.
|
||||||
|
|
||||||
|
Do not begin boundary implementation under this plan.
|
||||||
|
|
||||||
@@ -0,0 +1,139 @@
# Boundary Family Real-Sample Entry Roadmap Plan

> Date: 2026-04-19
> Status: Draft
> Upstream Spec: [2026-04-19-boundary-family-real-sample-entry-roadmap-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-boundary-family-real-sample-entry-roadmap-design.md)

## Plan Intent

This roadmap determines the next bounded step after `G1-E / G2 / G3` have all closed as executed real-sample passes.

Its only purpose is:

`decide whether one boundary family may enter real-sample execution scope next`

## Scope Guardrails

1. do not reopen `G1-E / G2 / G3`
2. do not implement runtime-platform prerequisites under this roadmap
3. do not execute real samples for more than one boundary family
4. do not open `G4 / G5`
5. do not turn this work into a new family-asset expansion program

## Candidate Boundary Families

The only candidates under this roadmap are:

1. `G6`
2. `G7`
3. `G8`

## Workstreams

1. `WS1` Freeze the Post-Mainline Starting State
2. `WS2` Evaluate Boundary-Family Entry Readiness
3. `WS3` Select One Next Candidate or Hold All
4. `WS4` Publish the Next Bounded Execution Slice

## Phase 0: Freeze the Starting State

### Objective

Lock the roadmap start point so the decision does not drift back into old mainline work.

### Tasks

1. freeze `G1-E / G2 / G3` as closed executed passes
2. freeze `G6 / G7 / G8` as held boundary families
3. freeze `G4 / G5` as out of scope

### Deliverables

1. starting-state note
2. fixed candidate list

### Acceptance Criteria

1. no mainline or deferred family work is reopened under this roadmap

## Phase 1: Evaluate Boundary-Family Entry Readiness

### Objective

Compare `G6 / G7 / G8` against explicit entry criteria instead of intuition.

### Tasks

1. restate the current entry condition for each boundary family
2. compare the required runtime gap for each family
3. estimate which family needs the smallest new capability to enter real-sample scope

### Deliverables

1. boundary-family comparison matrix
2. smallest-entry-cost summary

### Acceptance Criteria

1. the next candidate family can be justified with explicit criteria
2. the rejected families have explicit hold reasons

## Phase 2: Select One Next Candidate or Hold All

### Objective

Reduce the next-step ambiguity to a single bounded decision.

### Tasks

1. select exactly one family as the next real-sample entry candidate
2. or explicitly conclude that all boundary families remain held
3. record why the non-selected families remain out of scope

### Deliverables

1. boundary-family entry decision
2. hold reasons for non-selected families

### Acceptance Criteria

1. no more than one next family is opened
2. the decision is bounded and defensible
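
The Phase 1 and Phase 2 rules can be read as a small decision function. The sketch below is illustrative only: the `FamilyAssessment` type, its fields, and the idea of a single numeric `capability_gap` score are assumptions, not part of the comparison matrix this roadmap will actually produce.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FamilyAssessment:
    """Hypothetical row of the boundary-family comparison matrix."""
    family: str          # "G6", "G7", or "G8"
    entry_ready: bool    # does the family meet its restated entry condition?
    capability_gap: int  # assumed cost score: smaller means less new capability
    hold_reason: str     # recorded whenever the family is not selected


def select_next_family(rows: list[FamilyAssessment]) -> dict:
    """Open at most one family; otherwise hold all, always with hold reasons."""
    ready = [r for r in rows if r.entry_ready]
    selected: Optional[FamilyAssessment] = (
        min(ready, key=lambda r: r.capability_gap) if ready else None
    )
    return {
        "selected": selected.family if selected else None,
        "held": {r.family: r.hold_reason for r in rows if r is not selected},
    }
```

Returning the hold reasons together with the selection keeps both Phase 2 deliverables in one record, so a decision without explicit hold reasons cannot be emitted by accident.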

## Phase 3: Publish the Next Bounded Execution Slice

### Objective

Turn the decision into the next actionable bounded plan.

### Tasks

1. if one family is selected, write a bounded `design + plan` for its minimum real-sample entry slice
2. if none is selected, write a bounded prerequisites plan instead
3. update the decision report layer

### Deliverables

1. next-family bounded `design`
2. next-family bounded `plan`
3. roadmap closure report

### Acceptance Criteria

1. the next step is ready to execute without reopening roadmap scope
2. only one bounded direction is emitted

## Completion Criteria

This roadmap is complete when:

1. the post-mainline next step is reduced to one bounded direction
2. `G6 / G7 / G8` no longer compete ambiguously for priority
3. a single follow-up `design + plan` exists for the selected direction

## Next Step

After this roadmap completes:

1. execute the selected family-entry slice if one family is admitted
2. otherwise execute the bounded prerequisites slice before any boundary family enters real-sample scope
@@ -0,0 +1,38 @@
# Boundary Residual Hold Decision Plan

> Date: 2026-04-19
> Status: Draft
> Parent Plan: `docs/superpowers/plans/2026-04-19-structured-fail-closed-residual-13-closure-plan.md`
> Parent Route: `Residual Route C`
> Parent Layer: `Layer D`

## Plan Intent

Decide whether the remaining `local_doc_pipeline` and `host_bridge_workflow` residual records should remain held or enter a future runtime roadmap.

## Fixed Input Bucket

1. five `local_doc_pipeline` residual records
2. one `host_bridge_workflow` residual record

## Allowed Files

1. decision JSON asset
2. decision report

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`

## Tasks

1. classify each boundary residual as `hold`, `defer`, or `runtime-roadmap-input`;
2. do not implement runtime support;
3. publish the decision report.
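
A minimal sketch of a consistency check for the decision JSON asset, assuming each record carries `sceneId`, `archetype`, and `decision` keys. Those key names are hypothetical; only the three decision values and the fixed bucket sizes come from this plan.

```python
from collections import Counter

ALLOWED_DECISIONS = {"hold", "defer", "runtime-roadmap-input"}
# Fixed input bucket from this plan: five local-doc records, one host-bridge record.
EXPECTED_BUCKET = {"local_doc_pipeline": 5, "host_bridge_workflow": 1}


def validate_decisions(records: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means the asset is consistent."""
    problems = []
    counts = dict(Counter(r.get("archetype") for r in records))
    if counts != EXPECTED_BUCKET:
        problems.append(f"unexpected bucket: {counts}")
    for r in records:
        if r.get("decision") not in ALLOWED_DECISIONS:
            problems.append(
                f"{r.get('sceneId')}: invalid decision {r.get('decision')!r}"
            )
    return problems
```

A check of this kind only confirms that every fixed record received exactly one of the three classifications; it says nothing about whether the classification is the right one.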

## Stop Statement

Stop after decision assets are published.
@@ -0,0 +1,123 @@
# Boundary Runtime Prerequisites Roadmap Plan

> Date: 2026-04-19
> Status: Draft
> Upstream Spec: [2026-04-19-boundary-runtime-prerequisites-roadmap-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-boundary-runtime-prerequisites-roadmap-design.md)

## Plan Intent

This roadmap determines the next bounded prerequisites slice after the post-`G7` boundary decision concludes that direct `G6` or `G8` execution should not start yet.

Its only purpose is:

`select one bounded prerequisite direction before the next boundary-family real-sample attempt`

## Scope Guardrails

1. do not execute `G6` or `G8`
2. do not reopen `G7`
3. do not reopen `G1-E / G2 / G3`
4. do not implement host-runtime or local-doc runtime under this roadmap
5. do not open `G4 / G5`

## Candidate Prerequisite Directions

The only candidates under this roadmap are:

1. `G6 host-bridge prerequisites`
2. `G8 local-doc prerequisites`

## Workstreams

1. `WS1` Freeze the Post-G7 Boundary Hold State
2. `WS2` Compare G6 and G8 Prerequisite Burden
3. `WS3` Select One Prerequisite Direction
4. `WS4` Publish the Next Bounded Prerequisites Slice

## Phase 0: Freeze the Starting State

### Objective

Lock the roadmap start point so no closed family work is reopened.

### Tasks

1. freeze `G7` as closed
2. freeze `G6` and `G8` as held pending prerequisites
3. freeze `G1-E / G2 / G3` as closed
4. freeze `G4 / G5` as out of scope

### Deliverables

1. starting-state note
2. fixed prerequisite candidate list

### Acceptance Criteria

1. no family execution begins under this roadmap

## Phase 1: Compare Prerequisite Burden

### Objective

Compare `G6` and `G8` at the prerequisite level instead of at the execution level.

### Tasks

1. restate the smallest blocked capability for `G6`
2. restate the smallest blocked capability for `G8`
3. compare which prerequisite can be isolated more cleanly

### Deliverables

1. prerequisite comparison matrix
2. smallest-prerequisite summary

### Acceptance Criteria

1. the selected prerequisite direction is justified explicitly

## Phase 2: Select One Prerequisite Direction

### Objective

Reduce the post-`G7` prerequisite ambiguity to one bounded decision.

### Tasks

1. select exactly one direction:
   - `G6 host-bridge prerequisites`
   - or `G8 local-doc prerequisites`
2. record why the other direction remains held

### Deliverables

1. prerequisite direction decision
2. hold reason for the non-selected direction

### Acceptance Criteria

1. only one next direction is opened
2. the decision is bounded and defensible

## Phase 3: Publish the Next Bounded Slice

### Objective

Turn the decision into the next executable bounded artifact.

### Tasks

1. write one bounded follow-up design and plan for the selected prerequisite direction
2. publish a roadmap closure report

### Deliverables

1. next bounded `design`
2. next bounded `plan`
3. roadmap closure report

### Acceptance Criteria

1. the next step is ready without extending this roadmap
2. only one bounded direction is emitted
@@ -0,0 +1,54 @@
# Final 2 Official Board Reconciliation Refresh Plan

> Date: 2026-04-19
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Sequence: `2026-04-19-final-2-residual-child-plan-sequence-plan.md`
> Status: Draft

## Plan Intent

Refresh official board framework fields after one or both final-2 residual roadmaps publish reconciliation candidates.

## Fixed Inputs

At least one of:

1. `tests/fixtures/generated_scene/bootstrap_target_normalization_reconciliation_candidates_2026-04-19.json`
2. `tests/fixtures/generated_scene/host_bridge_runtime_reconciliation_candidates_2026-04-19.json`

Also required:

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. `tests/fixtures/generated_scene/promotion_board_reconciliation_policy_2026-04-19.json`

## Allowed Files

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. `tests/fixtures/generated_scene/final_2_official_board_reconciliation_refresh_2026-04-19.json`
3. `docs/superpowers/reports/2026-04-19-final-2-official-board-reconciliation-refresh-report.md`

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`

## Tasks

1. Load candidate assets that exist.
2. Verify each candidate belongs to `sweep-085-scene` or `sweep-091-scene`.
3. Match board rows by `sceneId`.
4. Update only framework-layer fields.
5. Recompute board framework summary.
6. Publish reconciliation refresh JSON and report.

## Expected Delta

Delta depends on candidate assets:

1. one closed residual: `framework-auto-pass +1`, `framework-structured-fail-closed -1`
2. both closed residuals: `framework-auto-pass +2`, `framework-structured-fail-closed -2`
3. held residuals: no count delta, but narrower next action / hold reason
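
The tasks and the expected delta can be sketched together as one pure function over already-parsed JSON. This is a hedged illustration, not the project's implementation: `sceneId` and the two status labels come from this plan, while the field names `frameworkStatus`, `nextAction`, and `holdReason` are assumptions about the board schema.

```python
import copy

FINAL_2_SCENES = {"sweep-085-scene", "sweep-091-scene"}
# Assumed names for the framework-layer fields; the real board schema may differ.
FRAMEWORK_FIELDS = {"frameworkStatus", "nextAction", "holdReason"}
COUNTED = ("framework-auto-pass", "framework-structured-fail-closed")


def refresh_board(board_rows: list[dict], candidates: list[dict]):
    """Apply candidates to a copy of the board; return (rows, count delta)."""
    rows = copy.deepcopy(board_rows)
    by_scene = {row["sceneId"]: row for row in rows}
    delta = {label: 0 for label in COUNTED}
    for cand in candidates:
        scene_id = cand["sceneId"]
        if scene_id not in FINAL_2_SCENES:
            raise ValueError(f"candidate outside the final-2 bucket: {scene_id}")
        row = by_scene[scene_id]
        before = row.get("frameworkStatus")
        # Copy framework-layer fields only; every other board field is left alone.
        for field in FRAMEWORK_FIELDS.intersection(cand):
            row[field] = cand[field]
        after = row.get("frameworkStatus")
        if before != after:
            if before in delta:
                delta[before] -= 1
            if after in delta:
                delta[after] += 1
    return rows, delta
```

Working on a deep copy keeps the official board untouched until the refreshed rows and the delta have both been reviewed, which matches the plan's order of recomputing the summary before publishing.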

## Stop Statement

Stop after the final-2 board reconciliation refresh JSON and report are published.
@@ -0,0 +1,47 @@
# Final 2 Residual Child Plan Sequence Plan

> Date: 2026-04-19
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Layer: `Layer E / Route 5 + Route 6`
> Status: Draft

## Plan Intent

Create the remaining child-plan sequence for the last two framework structured fail-closed residuals. This plan only defines the sequence and child plan boundaries; it does not execute implementation.

## Fixed Inputs

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. `tests/fixtures/generated_scene/local_doc_official_board_reconciliation_refresh_2026-04-19.json`

## Fixed Residual Bucket

1. `sweep-085-scene`: `host_bridge_workflow`, `future-host-bridge-runtime-roadmap-input`
2. `sweep-091-scene`: `page_state_eval`, `future-bootstrap-target-normalization-roadmap-input`

## Child Plans

1. `2026-04-19-final-2-residual-roadmap-prioritization-plan.md`
2. `2026-04-19-bootstrap-target-normalization-roadmap-plan.md`
3. `2026-04-19-host-bridge-runtime-roadmap-plan.md`
4. `2026-04-19-final-2-official-board-reconciliation-refresh-plan.md`
5. `2026-04-19-102-framework-closure-rollup-plan.md`

## Scope Guardrails

1. Do not modify `analyzer.rs`.
2. Do not modify `generator.rs`.
3. Do not update the official board under this sequence-definition plan.
4. Do not run a full 102 sweep under this plan.
5. Do not reopen G1-E, G2, G3, or local-doc runtime work.
6. Do not continue the old G6 micro-plan chain.

## Completion Criteria

1. The final-2 residual child plan sequence exists.
2. Each child plan declares parent route, fixed input bucket, allowed files, forbidden files, expected delta, and stop statement.
3. The next executable child plan is the prioritization plan.
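
Completion criterion 2 lends itself to a mechanical check over each child plan's Markdown. The sketch below is one possible reading, with assumptions stated in the comments: the heading spellings are modeled on sibling plans in this commit, and a `Parent Sequence` line is treated as satisfying the parent declaration.

```python
import re

# Required declarations, one regular expression each. Heading variants such as
# "Fixed Inputs" versus "Fixed Input Bucket" are assumptions taken from sibling
# plans; accepting "Parent Sequence" in place of "Parent Route" is also assumed.
REQUIRED_DECLARATIONS = {
    "parent route": r"^> Parent (Route|Sequence):",
    "fixed input bucket": r"^## Fixed (Input Bucket|Inputs|Residual Bucket)\s*$",
    "allowed files": r"^## Allowed Files\s*$",
    "forbidden files": r"^## Forbidden Files\s*$",
    "expected delta": r"^## Expected (Coverage )?Delta\s*$",
    "stop statement": r"^## Stop Statement\s*$",
}


def missing_declarations(plan_markdown: str) -> list[str]:
    """Names of required declarations that the plan text does not contain."""
    return [
        name
        for name, pattern in REQUIRED_DECLARATIONS.items()
        if not re.search(pattern, plan_markdown, flags=re.MULTILINE)
    ]
```

An empty result means the plan declares all six items; it does not validate what each section actually says.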

## Stop Statement

Stop after the final-2 child plan sequence is created. Do not execute any child plan under this sequence-definition plan.
@@ -0,0 +1,43 @@
# Final 2 Residual Roadmap Prioritization Plan

> Date: 2026-04-19
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Sequence: `2026-04-19-final-2-residual-child-plan-sequence-plan.md`
> Status: Draft

## Plan Intent

Select the next residual roadmap from the final two structured fail-closed records.

## Fixed Input Bucket

1. `sweep-085-scene`: host-bridge runtime residual
2. `sweep-091-scene`: bootstrap target normalization residual

## Allowed Files

1. `tests/fixtures/generated_scene/final_2_residual_roadmap_prioritization_2026-04-19.json`
2. `docs/superpowers/reports/2026-04-19-final-2-residual-roadmap-prioritization-report.md`

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
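
The allowed and forbidden lists can be enforced before publishing by checking the set of changed paths. The sketch below assumes the caller already has that list (for example from `git diff --name-only`), and it treats any path outside the allowed list as a finding, which is an assumption: the plan does not say whether the allowed list is exhaustive.

```python
ALLOWED = {
    "tests/fixtures/generated_scene/final_2_residual_roadmap_prioritization_2026-04-19.json",
    "docs/superpowers/reports/2026-04-19-final-2-residual-roadmap-prioritization-report.md",
}
FORBIDDEN = {
    "src/generated_scene/analyzer.rs",
    "src/generated_scene/generator.rs",
    "tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json",
}


def check_changed_files(changed: list[str]) -> dict[str, list[str]]:
    """Split a change set into forbidden edits and edits outside the allowed list."""
    forbidden = sorted(p for p in changed if p in FORBIDDEN)
    unlisted = sorted(p for p in changed if p not in ALLOWED and p not in FORBIDDEN)
    return {"forbidden": forbidden, "unlisted": unlisted}
```

Reporting forbidden and unlisted paths separately lets a reviewer reject the first outright while treating the second as a scope question.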

## Tasks

1. Load the current official board.
2. Extract the two residuals.
3. Score bootstrap normalization vs host-bridge runtime.
4. Select exactly one first roadmap.
5. Publish decision JSON.
6. Publish decision report.

## Expected Delta

No coverage delta. This is a decision-only plan.

## Stop Statement

Stop after the prioritization asset and report are published. Do not start the selected roadmap under this plan.
@@ -0,0 +1,55 @@
# G1-E Remaining Fail-Closed Closure Plan

> Date: 2026-04-19
> Status: Draft
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Route: `Route 4: G1-E / single_request_enrichment`
> Parent Layer: `Layer C + Layer D`
> Upstream Design: `docs/superpowers/specs/2026-04-19-g1e-remaining-fail-closed-closure-design.md`

## Plan Intent

Implement one bounded correction slice for the remaining Route 4 `G1-E` fail-closed records.

## Fixed Input Bucket

`single_request_enrichment = 2`

## Allowed Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `src/generated_scene/ir.rs`
4. `tests/scene_generator_test.rs`
5. Route 4 local inventory and report assets

## Forbidden Files

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. Route 2 and Route 3 assets
3. Route 5+ assets

## Tasks

1. freeze the two Route 4 records
2. confirm the repeated missing contract
3. implement one bounded `G1-E` correction slice
4. rerun bounded validation
5. publish Route 4 delta

## Expected Coverage Delta

1. reduce the `G1-E` fail-closed bucket
2. preserve current `G1-E` real-sample pass and canonical stability

## Completion Criteria

1. Route 4 bucket has measured before/after status
2. Route 4 is closed or deferred

## Stop Statement

Stop after Route 4 delta is measured.

Do not begin Route 5 under this plan.
@@ -0,0 +1,185 @@
# G2 Real Sample Contract Correction Plan

> Date: 2026-04-19
> Status: Draft
> Upstream Spec: [2026-04-19-g2-real-sample-contract-correction-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g2-real-sample-contract-correction-design.md)
> Trigger Record: `rsv-g2-001`

## Plan Intent

This plan implements one bounded mainline correction slice:

`G2 real-sample contract correction`

Its purpose is to reduce the current real-sample `G2` mismatch from the broad bundle:

1. `bootstrap_mismatch`
2. `request_contract_missing`
3. `column_defs_missing`
4. `output correctness not closed`

into either:

1. a verified pass
2. or a smaller named contract mismatch
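
One way to make "smaller named contract mismatch" testable is to treat the bundle as a set of labels and compare a rerun against it. The sketch below assumes that "narrower" means a strict subset of the four labels above; the outcome names other than `executed-pass` are hypothetical.

```python
BROAD_BUNDLE = frozenset({
    "bootstrap_mismatch",
    "request_contract_missing",
    "column_defs_missing",
    "output correctness not closed",
})


def classify_rerun(remaining: set[str]) -> str:
    """Classify the labels left after rerunning the fixed real sample."""
    if not remaining:
        return "executed-pass"
    if remaining < BROAD_BUNDLE:   # strict subset: smaller and still named
        return "narrowed"
    if remaining == BROAD_BUNDLE:
        return "unchanged"
    return "changed-shape"         # unknown labels appeared; needs manual review
```

For example, `classify_rerun({"column_defs_missing"})` yields `"narrowed"`, while a rerun that introduces a label outside the bundle is deliberately not counted as progress.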

## Scope Guardrails

1. do not reopen completed `G2` family expansion work
2. do not add new `G2` fixtures or promote new `G2` candidates
3. do not reopen `G3`, `G1-E`, or boundary families
4. do not turn this work into login recovery or broader runtime-platform implementation
5. do not update validation assets until the real-sample outcome becomes narrower than the current broad mismatch bundle

## Fixed Verification Anchor

The only anchor under this plan is:

1. `台区线损大数据-月_周累计线损率统计分析`

Mapped real-sample record:

1. `rsv-g2-001`

## Workstreams

1. `WS1` Real-Sample Contract Differential
2. `WS2` Bootstrap and Request Contract Narrowing
3. `WS3` Column and Output Contract Narrowing
4. `WS4` Regression, Rerun, and Validation Closure

## Phase 0: Freeze the Correction Boundary

### Objective

Lock the scope to the fixed `G2` real sample and its remaining contract gaps.

### Tasks

1. freeze `rsv-g2-001` as the only real-sample correction target
2. freeze the current mismatch bundle from the validation layer
3. freeze `G2` family-expansion outputs as completed and out of scope

### Deliverables

1. correction-boundary note
2. fixed mismatch statement

### Acceptance Criteria

1. no new `G2` family-expansion task is opened
2. the correction target is explicitly limited to real-sample contract closure

## Phase 1: Build the Real-Sample Contract Differential

### Objective

Make the smallest remaining real-sample contract mismatch explicit before code changes.

### Tasks

1. compare the current real generated `SceneIr` against the intended `tq-lineloss-report` contract
2. isolate whether the dominant remaining gap is:
   - bootstrap target selection
   - per-mode request template completeness
   - output column semantics
   - output artifact correctness
3. write a minimum contract-gap summary

### Deliverables

1. contract differential note
2. minimum gap summary

### Acceptance Criteria

1. the smallest remaining `G2` mismatch is explicit
2. the next implementation target is narrower than the current broad mismatch bundle

## Phase 2: Narrow Bootstrap and Request Contract Gaps

### Objective

Correct only the bootstrap and request-side contract pieces that the real sample proves are still too coarse.

### Tasks

1. adjust `G2` bootstrap resolution only where the real sample proves it is still misaligned
2. adjust mode-specific request contract recovery only where the real sample proves it is still incomplete
3. preserve fail-closed behavior for unresolved `G2` variants

### Deliverables

1. bounded bootstrap correction
2. bounded request-contract correction

### Acceptance Criteria

1. the real sample no longer keeps the same broad bootstrap/request mismatch shape
2. unrelated `G2` family fixtures are not broadened or reclassified

## Phase 3: Narrow Column and Output Contract Gaps

### Objective

Reduce the remaining output-side mismatch to a verified or smaller state.

### Tasks

1. adjust `G2` column-definition recovery only where the real sample proves it is still incomplete
2. adjust output-contract verification only where the real sample proves the generated artifact is too coarse
3. keep readiness and fail-closed behavior intact for still-unresolved samples

### Deliverables

1. bounded column-contract correction
2. bounded output-contract correction

### Acceptance Criteria

1. the real-sample mismatch becomes narrower than the current broad bundle
2. `G2` does not regress into false positives for unresolved variants

## Phase 4: Regression, Rerun, and Validation Closure

### Objective

Use rerun and validation-layer updates to close the bounded `G2` correction loop.

### Tasks

1. add or update a regression test that names the corrected `G2` real-sample pattern
2. rerun the fixed real sample
3. record whether:
   - the sample becomes `executed-pass`
   - or the remaining mismatch is now smaller and named
4. update the validation-layer assets
5. write a formal closure report

### Deliverables

1. rerun output
2. updated validation assets
3. `G2` real-sample contract-correction closure report

### Acceptance Criteria

1. `rsv-g2-001` no longer remains unchanged as the same broad mismatch bundle
2. the narrowed outcome is covered by automated regression
3. validation assets record the narrower `G2` state

## Completion Criteria

This plan is complete when:

1. the fixed `G2` real sample no longer remains at the same broad mismatch bundle
2. the narrower result is covered by automated regression
3. validation assets are updated with the narrowed outcome
4. completed `G2` family-expansion work remains untouched

## Next Step

After this plan completes:

1. if `G2` becomes `executed-pass`, mainline real-sample pressure leaves both `G2` and `G3`
2. if `G2` still has a smaller named mismatch, move only to that narrower `G2` correction slice
@@ -0,0 +1,55 @@
# G2 Remaining Fail-Closed Closure Plan

> Date: 2026-04-19
> Status: Draft
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Route: `Route 3: G2 / multi_mode_request`
> Parent Layer: `Layer C + Layer D`
> Upstream Design: `docs/superpowers/specs/2026-04-19-g2-remaining-fail-closed-closure-design.md`

## Plan Intent

Implement one bounded correction slice for the remaining Route 3 `G2` fail-closed records.

## Fixed Input Bucket

`multi_mode_request = 4`

## Allowed Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `src/generated_scene/ir.rs`
4. `tests/scene_generator_test.rs`
5. Route 3 local inventory and report assets

## Forbidden Files

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. Route 2 assets
3. Route 4+ assets

## Tasks

1. freeze the four Route 3 records
2. confirm the repeated missing contract
3. implement one bounded `G2` correction slice
4. rerun bounded validation
5. publish Route 3 delta

## Expected Coverage Delta

1. reduce the `multi_mode_request` fail-closed bucket
2. protect current `G2` real-sample pass and canonical stability

## Completion Criteria

1. Route 3 bucket has measured before/after status
2. Route 3 is closed or explicitly deferred

## Stop Statement

Stop after Route 3 delta is measured.

Do not begin Route 4 under this plan.
@@ -0,0 +1,48 @@
# G2 Residual 2 Readiness Closure Plan

> Date: 2026-04-19
> Status: Draft
> Parent Plan: `docs/superpowers/plans/2026-04-19-structured-fail-closed-residual-13-closure-plan.md`
> Parent Route: `Residual Route B`
> Parent Layer: `Layer C`

## Plan Intent

Close the `2` remaining `G2 / multi_mode_request` structured fail-closed records by correcting bounded readiness or contract interpretation.

## Fixed Input Bucket

1. `sweep-018-scene` / `白银线损周报`
2. `sweep-071-scene` / `台区线损大数据-月_周累计线损率统计分析`

## Allowed Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `tests/scene_generator_test.rs`
4. route-local follow-up JSON/report assets

## Forbidden Files

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. G3/G6/G8 route code unless required to preserve regression tests
3. family baseline manifests

## Tasks

1. inspect the two fixed G2 residuals;
2. determine whether readiness labels `02` and `00` are report parsing artifacts or real contract gaps;
3. implement one bounded G2 correction if justified;
4. rerun only the two fixed scenes;
5. publish delta report.

## Expected Coverage Delta

Target: reduce the `2` G2 residual fail-closed records.

## Stop Statement

Stop after the two-scene route-local follow-up and report.

Do not continue into G1-E, G3, or boundary work.
@@ -0,0 +1,56 @@
# G3 Enrichment Request Closure Plan

> Date: 2026-04-19
> Status: Draft
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Route: `Route 2: G3 / paginated_enrichment`
> Parent Layer: `Layer C + Layer D`
> Upstream Design: `docs/superpowers/specs/2026-04-19-g3-enrichment-request-closure-design.md`

## Plan Intent

Implement the first bounded `G3` contract-recovery slice by recovering repeated enrichment-request and secondary-request evidence gaps inside the remaining `paginated_enrichment` fail-closed bucket.

## Fixed Input Bucket

`paginated_enrichment + g3_enrichment_contract + secondary_request`

## Allowed Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `src/generated_scene/ir.rs`
4. `tests/scene_generator_test.rs`
5. route-local follow-up JSON and report assets

## Forbidden Files

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. Route 3+ plan files
3. family promotion assets

## Tasks

1. freeze the targeted `G3` subgroup from the current follow-up asset
2. confirm the repeated enrichment-request missing pattern
3. implement one bounded contract-recovery slice
4. rerun only the bounded validation needed by this subgroup
5. publish subgroup delta and residual subgroup count

## Expected Coverage Delta

1. reduce the count of `paginated_enrichment` fail-closed records caused primarily by enrichment-request closure failure
2. do not reduce canonical or real-sample `G3` pass stability

## Completion Criteria

1. targeted subgroup has a measured before/after count
2. remaining unresolved Route 2 issues are explicitly handed to the next child plan
3. no route drift into `host_bridge_workflow`
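
The "measured before/after count" can be sketched as a filter over the follow-up records plus a subtraction. The record keys `archetype`, `status`, and `missing` below are hypothetical, and reading the `+` in the fixed input bucket as "all listed labels must be present" is an assumption.

```python
BUCKET = {
    "archetype": "paginated_enrichment",
    "missing": {"g3_enrichment_contract", "secondary_request"},
}


def in_subgroup(record: dict) -> bool:
    """True when a fail-closed record belongs to the targeted subgroup."""
    return (
        record.get("archetype") == BUCKET["archetype"]
        and record.get("status") == "fail-closed"
        and BUCKET["missing"] <= set(record.get("missing", []))
    )


def subgroup_delta(before: list[dict], after: list[dict]) -> dict:
    """Count subgroup members in both snapshots and report the change."""
    b = sum(in_subgroup(r) for r in before)
    a = sum(in_subgroup(r) for r in after)
    return {"before": b, "after": a, "delta": a - b, "residual": a}
```

Because both snapshots pass through the same predicate, a record that leaves the subgroup for any reason lowers the count, so the published delta should be read alongside the residual inventory handed to the next child plan.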

## Stop Statement

Stop after the targeted enrichment-request subgroup has been corrected or explicitly bounded as residual.

Do not continue into export-plan closure work under this plan.
@@ -0,0 +1,55 @@
|
|||||||
|
# G3 Export Plan Closure Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Parent Route: `Route 2: G3 / paginated_enrichment`
|
||||||
|
> Parent Layer: `Layer C + Layer D`
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-g3-export-plan-closure-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Implement the second bounded `G3` contract-recovery slice by recovering repeated export-plan evidence gaps inside the remaining `paginated_enrichment` fail-closed bucket.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
`paginated_enrichment + g3_export_plan + export_plan`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `src/generated_scene/ir.rs`
|
||||||
|
4. `tests/scene_generator_test.rs`
|
||||||
|
5. route-local follow-up JSON and report assets
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
2. Route 3+ plan files
|
||||||
|
3. promotion policy assets
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. freeze the targeted export-plan subgroup
|
||||||
|
2. confirm the repeated missing-evidence pattern for `export_plan` and `g3_export_plan`
|
||||||
|
3. implement one bounded export-plan recovery slice
|
||||||
|
4. rerun bounded validation only for this subgroup
|
||||||
|
5. publish delta and residual Route 2 inventory
|
||||||
|
|
||||||
|
## Expected Coverage Delta
|
||||||
|
|
||||||
|
1. reduce the count of `paginated_enrichment` records whose primary blocker is export-plan absence
|
||||||
|
2. preserve stable `G3` canonical and real-sample anchors
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
1. the export-plan subgroup count is lower, or its remaining records are more narrowly classified
|
||||||
|
2. residual Route 2 bucket is explicitly measured
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the export-plan subgroup has been rerun and measured.
|
||||||
|
|
||||||
|
Do not continue into Route 2 residual closure under this plan.
|
||||||
|
|
||||||
@@ -0,0 +1,171 @@
|
|||||||
|
# G3 Real Sample Archetype Correction Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g3-real-sample-archetype-correction-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g3-real-sample-archetype-correction-design.md)
|
||||||
|
> Trigger Report: [2026-04-19-g3-real-sample-execution-report.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/reports/2026-04-19-g3-real-sample-execution-report.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan implements the next bounded scope selected by the real-sample validation roadmap:
|
||||||
|
|
||||||
|
`mainline G3 real-sample archetype correction`
|
||||||
|
|
||||||
|
Its purpose is to correct the routing boundary that currently makes the real sample `95598工单明细表` collapse into `local_doc_pipeline`.
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. Do not reopen the completed `G3` repo-local family expansion program.
|
||||||
|
2. Do not broaden this work into `G8` runtime implementation.
|
||||||
|
3. Do not open `G4 / G5`.
|
||||||
|
4. Do not add new family-expansion fixtures unrelated to the real-sample mismatch.
|
||||||
|
5. Do not weaken fail-closed behavior in order to force a pass result.
|
||||||
|
6. Do not treat generic asset updates as progress unless they directly unblock the real-sample rerun.
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Real-Sample Evidence Differential
|
||||||
|
2. `WS2` G3-vs-G8 Routing Boundary Correction
|
||||||
|
3. `WS3` Regression and Fail-Closed Integrity
|
||||||
|
4. `WS4` Real-Sample Rerun and Closure
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Correction Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the scope to one mismatch: the `G3` real sample being misrouted into `G8`.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `95598工单明细表` as the only real-sample correction anchor
|
||||||
|
2. freeze the current observed mismatch:
|
||||||
|
- `archetype_mismatch`
|
||||||
|
- `evidence_not_closed`
|
||||||
|
3. freeze current `G8` behavior as a boundary-family constraint that must not regress
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. correction-boundary note
|
||||||
|
2. fixed anchor and mismatch statement
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no additional family or runtime scope is added under this plan
|
||||||
|
2. the correction target is explicitly `G3 vs G8` routing
|
||||||
|
|
||||||
|
## Phase 1: Build the Real-Sample Evidence Differential
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Understand why the real sample routes differently from the repo-local `G3` baseline.
|
||||||
|
|
||||||
|
### WS1 Tasks
|
||||||
|
|
||||||
|
1. compare repo-local `G3` canonical evidence against real-sample deterministic facts
|
||||||
|
2. isolate which evidence currently drives `local_doc_pipeline`
|
||||||
|
3. isolate which `G3` business-chain signals are present but losing in routing
|
||||||
|
4. write a differential summary that identifies the minimum routing fix
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. evidence differential note
|
||||||
|
2. real-sample routing-pressure summary
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the team can point to the specific evidence classes causing `G8` to win
|
||||||
|
2. the minimum routing correction is explicit before code changes start
|
||||||
|
|
||||||
|
## Phase 2: Correct the G3-vs-G8 Routing Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Change routing so recoverable `G3` business-chain evidence outranks `G8` local-pipeline evidence for this mismatch class.
|
||||||
|
|
||||||
|
### WS2 Tasks
|
||||||
|
|
||||||
|
1. tighten the `local_doc_pipeline` trigger threshold for mixed-evidence scenes
|
||||||
|
2. raise the priority of `G3` when:
|
||||||
|
- main request exists
|
||||||
|
- pagination contract is recoverable
|
||||||
|
- enrichment or detail chain exists
|
||||||
|
3. keep `G8` routing only when local pipeline evidence is still the dominant workflow backbone
|
||||||
|
4. preserve fail-closed behavior if the sample still does not satisfy the `G3` minimum contract after routing correction
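
The priority rule described in the WS2 tasks can be sketched as a small decision function. This is an illustration under stated assumptions: `SceneEvidence`, `Route`, and `route_scene` are hypothetical names, and reducing "dominant workflow backbone" to a single boolean is a simplification rather than the analyzer's actual logic.

```rust
/// Hypothetical evidence summary for one scene (field names are assumptions).
#[derive(Debug, Clone, Copy)]
pub struct SceneEvidence {
    pub has_main_request: bool,
    pub pagination_recoverable: bool,
    pub has_enrichment_or_detail_chain: bool,
    /// True only when local-pipeline evidence is the dominant workflow backbone.
    pub local_pipeline_dominant: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    /// G3 / `paginated_enrichment`
    PaginatedEnrichment,
    /// G8 / `local_doc_pipeline`
    LocalDocPipeline,
    /// Neither contract is satisfied; emit no runnable output.
    FailClosed,
}

pub fn route_scene(e: SceneEvidence) -> Route {
    // G8 keeps the scene only when the local pipeline is still dominant.
    if e.local_pipeline_dominant {
        return Route::LocalDocPipeline;
    }
    // G3 wins mixed-evidence scenes when its full business chain is recoverable.
    let g3_chain_recoverable =
        e.has_main_request && e.pagination_recoverable && e.has_enrichment_or_detail_chain;
    if g3_chain_recoverable {
        Route::PaginatedEnrichment
    } else {
        // Incomplete G3 evidence must not be forced into a pass.
        Route::FailClosed
    }
}
```

In this sketch, mere presence of local-pipeline evidence no longer selects `local_doc_pipeline`. Only dominance does, which is the tightening that task 1 asks for.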
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. analyzer and generator routing update
|
||||||
|
2. explicit `G3 vs G8` routing rule in code comments or tests where needed
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the real-sample mismatch no longer defaults to `local_doc_pipeline`
|
||||||
|
2. `G8` representative classification remains intact
|
||||||
|
3. incomplete `G3` scenes still fail closed without producing pseudo-runnable output
|
||||||
|
|
||||||
|
## Phase 3: Lock Regression and Fail-Closed Integrity
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Prove the correction does not trade one false positive for another.
|
||||||
|
|
||||||
|
### WS3 Tasks
|
||||||
|
|
||||||
|
1. add deterministic regression for the mixed `G3/G8` evidence pattern
|
||||||
|
2. add generator regression showing the corrected route stays inside `G3`
|
||||||
|
3. retain or strengthen `G8` regression so the boundary family does not collapse
|
||||||
|
4. verify that unresolved `G3` cases still fail closed for `G3` reasons
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. regression tests for `G3 vs G8`
|
||||||
|
2. updated validation fixtures or assertions as needed
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no regression causes `G8` to disappear as a boundary archetype
|
||||||
|
2. no regression reintroduces a false-positive runnable skill
|
||||||
|
3. test coverage explicitly names the corrected mismatch pattern
|
||||||
|
|
||||||
|
## Phase 4: Rerun the Real Sample and Close the Loop
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Use the actual real sample to confirm the correction outcome and record the next state.
|
||||||
|
|
||||||
|
### WS4 Tasks
|
||||||
|
|
||||||
|
1. rerun `sg_scene_generate` on `95598工单明细表`
|
||||||
|
2. record whether the sample now:
|
||||||
|
- resolves as `paginated_enrichment`
|
||||||
|
- or fails closed inside `G3`
|
||||||
|
3. update the real-sample validation record layer
|
||||||
|
4. write a formal correction closure report
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. rerun output
|
||||||
|
2. updated real-sample validation assets
|
||||||
|
3. `G3` archetype-correction closure report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the rerun no longer reports `local_doc_pipeline` as the controlling archetype
|
||||||
|
2. the validation layer records the corrected family outcome
|
||||||
|
3. the next scope recommendation can move from `G3 archetype correction` to the next remaining mainline gap
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. the `G3` real sample no longer collapses into `local_doc_pipeline`
|
||||||
|
2. the corrected route is covered by automated regression
|
||||||
|
3. real-sample validation assets are updated with the new outcome
|
||||||
|
4. `G8` remains a valid boundary-family archetype with no unintended regression
|
||||||
|
|
||||||
|
## Next Step
|
||||||
|
|
||||||
|
After this plan completes:
|
||||||
|
|
||||||
|
1. if `G3` real-sample routing is corrected and still shows a `G3` contract gap, move to `G3` real-sample contract correction
|
||||||
|
2. if `G3` stabilizes, return to the next mainline mismatch in priority order, which is `G2` real-sample contract correction
|
||||||
@@ -0,0 +1,173 @@
|
|||||||
|
# G3 Real Sample Output Contract Verification Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g3-real-sample-output-contract-verification-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g3-real-sample-output-contract-verification-design.md)
|
||||||
|
> Trigger Report: [2026-04-19-g3-real-sample-runtime-contract-correction-closure-report.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/reports/2026-04-19-g3-real-sample-runtime-contract-correction-closure-report.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan implements the next bounded mainline scope after `G3` runtime-scope correction:
|
||||||
|
|
||||||
|
`G3 real-sample output / contract verification`
|
||||||
|
|
||||||
|
Its purpose is to reduce the remaining real-sample mismatch from a generic verification gap to either:
|
||||||
|
|
||||||
|
1. a verified pass
|
||||||
|
2. or a smaller named output/contract mismatch
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not reopen the completed `G3` archetype-correction scope
|
||||||
|
2. do not reopen the completed `G3` runtime-scope correction scope
|
||||||
|
3. do not broaden this work into `G8` runtime implementation
|
||||||
|
4. do not reopen `G3` family expansion or add unrelated fixtures
|
||||||
|
5. do not open `G4 / G5`
|
||||||
|
6. do not update validation assets until output verification produces a narrower outcome
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Real-Sample Output Contract Differential
|
||||||
|
2. `WS2` G3 Output Contract Narrowing
|
||||||
|
3. `WS3` Regression and Verification Integrity
|
||||||
|
4. `WS4` Real-Sample Verification Rerun and Closure
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Verification Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the scope to one remaining mismatch: `output_contract_not_verified`.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `95598工单明细表` as the only verification anchor
|
||||||
|
2. freeze current remaining mismatch:
|
||||||
|
- `output_contract_not_verified`
|
||||||
|
3. freeze current `G3` routing and runtime-scope behavior as completed constraints that must not regress
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. verification-boundary note
|
||||||
|
2. fixed output-gap statement
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no additional family or runtime scope is added under this plan
|
||||||
|
2. the correction target is explicitly `G3 output / contract verification`
|
||||||
|
|
||||||
|
## Phase 1: Build the Real-Sample Output Contract Differential
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Understand exactly what part of the generated real-sample contract is still unverified.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. compare the real generated `SceneIr` against the intended `G3` business output contract
|
||||||
|
2. isolate which fields are structurally present but semantically too broad
|
||||||
|
3. isolate whether the dominant gap is:
|
||||||
|
- main request selection
|
||||||
|
- enrichment request partitioning
|
||||||
|
- join key correctness
|
||||||
|
- merge/dedupe correctness
|
||||||
|
- export contract correctness
|
||||||
|
4. write a minimum verification-gap summary before code changes begin
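
One way to turn the five candidate gaps in task 3 into a single named result is to check them in pipeline order and report the first failure. The sketch below assumes that ordering and uses hypothetical names (`ContractChecks`, `OutputGap`, `dominant_gap`). How each check is evaluated against the real `SceneIr` is left open.

```rust
/// The five candidate gaps listed in the plan, in assumed pipeline order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputGap {
    MainRequestSelection,
    EnrichmentRequestPartitioning,
    JoinKeyCorrectness,
    MergeDedupeCorrectness,
    ExportContractCorrectness,
}

/// Hypothetical per-stage verification results for one generated scene.
#[derive(Debug, Clone, Copy, Default)]
pub struct ContractChecks {
    pub main_request_ok: bool,
    pub enrichment_partitioning_ok: bool,
    pub join_key_ok: bool,
    pub merge_dedupe_ok: bool,
    pub export_contract_ok: bool,
}

/// Return the earliest failing stage, or `None` when every check passes.
pub fn dominant_gap(checks: &ContractChecks) -> Option<OutputGap> {
    let ordered = [
        (checks.main_request_ok, OutputGap::MainRequestSelection),
        (checks.enrichment_partitioning_ok, OutputGap::EnrichmentRequestPartitioning),
        (checks.join_key_ok, OutputGap::JoinKeyCorrectness),
        (checks.merge_dedupe_ok, OutputGap::MergeDedupeCorrectness),
        (checks.export_contract_ok, OutputGap::ExportContractCorrectness),
    ];
    ordered.iter().find(|(ok, _)| !*ok).map(|(_, gap)| *gap)
}
```

Reporting only the earliest failure keeps the result narrower than the generic `output_contract_not_verified` label, because later stages cannot be judged reliably until the earlier ones hold.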
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. output-contract differential note
|
||||||
|
2. minimum verification-gap summary
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the smallest remaining output mismatch is explicit
|
||||||
|
2. the next change target is narrower than the current generic verification label
|
||||||
|
|
||||||
|
## Phase 2: Narrow the G3 Output Contract Gap
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Reduce the real-sample mismatch from generic non-verified output to a specific verified contract state.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. adjust the minimum `G3` output-contract logic only where the real sample proves it is too coarse
|
||||||
|
2. keep routing and runtime-scope logic unchanged unless required by output verification
|
||||||
|
3. preserve fail-closed behavior for scenes whose output contract is still unresolved
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. bounded output-contract update
|
||||||
|
2. explicit verification rule in code or tests where needed
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the real-sample mismatch is narrower than `output_contract_not_verified`
|
||||||
|
2. no unrelated family is reclassified or broadened
|
||||||
|
3. the corrected result stays inside `G3`
|
||||||
|
|
||||||
|
## Phase 3: Lock Regression and Verification Integrity
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Prove the narrower contract logic does not create false positives.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. add or update regression that names the corrected real-sample verification pattern
|
||||||
|
2. retain mixed-boundary, `G8`, and canonical regressions
|
||||||
|
3. verify unresolved `G3` cases still fail closed when the output contract is genuinely incomplete
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. regression tests for `G3` output verification
|
||||||
|
2. updated assertions where needed
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no regression causes `G8` to disappear as a boundary archetype
|
||||||
|
2. no regression causes unrelated `single_request_table` or other families to drift
|
||||||
|
3. test coverage explicitly names the corrected output-verification pattern
|
||||||
|
|
||||||
|
## Phase 4: Rerun the Real Sample and Close the Loop
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Use the actual real sample to confirm the narrowed output-verification outcome and record the next state.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. rerun `sg_scene_generate` on `95598工单明细表`
|
||||||
|
2. record whether:
|
||||||
|
- the sample becomes `executed-pass`
|
||||||
|
- or the remaining mismatch is narrower than `output_contract_not_verified`
|
||||||
|
3. update the real-sample validation record layer
|
||||||
|
4. write a formal closure report
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. rerun output
|
||||||
|
2. updated real-sample validation assets
|
||||||
|
3. `G3` output-contract-verification closure report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the rerun replaces the generic `output_contract_not_verified` label with either a verified pass or a narrower named mismatch
|
||||||
|
2. the validation layer records a narrower family outcome
|
||||||
|
3. the next scope recommendation can move from `G3` to the next mainline gap when appropriate
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. the `G3` real sample no longer ends at the generic `output_contract_not_verified` label
|
||||||
|
2. the narrowed result is covered by automated regression
|
||||||
|
3. real-sample validation assets are updated with the new outcome
|
||||||
|
4. `G8` and prior `G3` routing/runtime corrections remain intact
|
||||||
|
|
||||||
|
## Next Step
|
||||||
|
|
||||||
|
After this plan completes:
|
||||||
|
|
||||||
|
1. if `G3` becomes `executed-pass`, return to the next mainline mismatch in priority order, which is `G2` real-sample contract correction
|
||||||
|
2. if `G3` still has a smaller output-specific mismatch, move only to that narrower `G3` verification slice
|
||||||
@@ -0,0 +1,166 @@
|
|||||||
|
# G3 Real Sample Runtime Contract Correction Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g3-real-sample-runtime-contract-correction-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g3-real-sample-runtime-contract-correction-design.md)
|
||||||
|
> Trigger Report: [2026-04-19-g3-real-sample-archetype-correction-closure-report.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/reports/2026-04-19-g3-real-sample-archetype-correction-closure-report.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan implements the next bounded scope selected after `G3` archetype correction:
|
||||||
|
|
||||||
|
`mainline G3 real-sample runtime / contract correction`
|
||||||
|
|
||||||
|
Its purpose is to narrow the remaining real-sample gap from a coarse runtime-scope failure to the smallest accurate contract state.
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not reopen the completed `G3` archetype-correction scope
|
||||||
|
2. do not broaden this work into `G8` runtime implementation
|
||||||
|
3. do not open `G4 / G5`
|
||||||
|
4. do not add new family-expansion fixtures unrelated to the real-sample mismatch
|
||||||
|
5. do not weaken fail-closed behavior for incomplete `G3` scenes
|
||||||
|
6. do not update validation assets until the rerun result changes
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Runtime-Scope Differential
|
||||||
|
2. `WS2` G3 Runtime-Scope Gate Narrowing
|
||||||
|
3. `WS3` Regression and Fail-Closed Integrity
|
||||||
|
4. `WS4` Real-Sample Rerun and Closure
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Correction Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the scope to one remaining mismatch: `G3` real-sample runtime scope compatibility.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `95598工单明细表` as the only correction anchor
|
||||||
|
2. freeze current remaining mismatch:
|
||||||
|
- `runtime_scope_gap`
|
||||||
|
- `output_contract_not_verified`
|
||||||
|
3. freeze current `G8` behavior as a boundary-family constraint that must not regress
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. correction-boundary note
|
||||||
|
2. fixed runtime-gap statement
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no additional family or runtime scope is added under this plan
|
||||||
|
2. the correction target is explicitly `G3 runtime scope`, not a broader runtime program
|
||||||
|
|
||||||
|
## Phase 1: Build the Runtime-Scope Differential
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Understand why the current gate still marks the real sample as runtime-incompatible.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. compare current `G3` runtime-scope gate logic against the corrected real-sample evidence
|
||||||
|
2. isolate which localhost evidence should remain subordinate
|
||||||
|
3. isolate what dominant-runtime pattern should still fail closed
|
||||||
|
4. write a minimum gate-narrowing summary before code changes begin
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. runtime-scope differential note
|
||||||
|
2. gate-narrowing summary
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the minimum change to `g3_runtime_scope_compatible` is explicit
|
||||||
|
2. the team can distinguish subordinate host-runtime evidence from dominant runtime takeover
|
||||||
|
|
||||||
|
## Phase 2: Narrow the G3 Runtime-Scope Gate
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Allow valid `G3` real samples with subordinate localhost evidence to stay runtime-compatible.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. narrow `g3_runtime_scope_compatible` so it considers business-chain dominance, not only localhost evidence count
|
||||||
|
2. preserve fail-closed behavior for scenes whose business chain is still not dominant
|
||||||
|
3. keep `G8` representative behavior intact
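
A minimal sketch of what a dominance-based gate could look like. The gate name `g3_runtime_scope_compatible` comes from this plan, but the signature, the `RuntimeEvidence` struct, and the definition of dominance as "closed business chain with more business requests than localhost requests" are assumptions made for illustration.

```rust
/// Hypothetical runtime evidence for one scene (field names are assumptions).
#[derive(Debug, Clone, Copy)]
pub struct RuntimeEvidence {
    /// Requests that belong to the G3 business chain.
    pub business_chain_requests: usize,
    /// Requests that target localhost / host-runtime endpoints.
    pub localhost_requests: usize,
    /// Whether the main request, pagination, and enrichment chain are all resolved.
    pub business_chain_closed: bool,
}

/// Narrowed gate: localhost evidence alone no longer disqualifies a scene.
pub fn g3_runtime_scope_compatible(e: &RuntimeEvidence) -> bool {
    // A coarse gate would reject any scene with `localhost_requests > 0`.
    // Here the scene passes when the business chain is closed and dominant.
    e.business_chain_closed && e.business_chain_requests > e.localhost_requests
}
```

Scenes whose business chain is open, or is outnumbered by localhost traffic, still return `false`, so the fail-closed path in task 2 is preserved.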
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. generator gate update
|
||||||
|
2. explicit regression rule for subordinate localhost evidence inside `G3`
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the corrected real sample no longer fails the runtime-scope gate for the old coarse reason
|
||||||
|
2. `G8` representative classification remains intact
|
||||||
|
3. incomplete `G3` scenes still fail closed for `G3` reasons
|
||||||
|
|
||||||
|
## Phase 3: Lock Regression and Fail-Closed Integrity
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Prove the narrowed gate does not create a pseudo-runnable class of scenes.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. add regression for real-sample-like `G3` with subordinate localhost evidence
|
||||||
|
2. retain `G8` regression and mixed-boundary regression
|
||||||
|
3. verify unresolved `G3` scenes still fail closed when business-chain dominance is absent
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. regression tests for `G3 runtime scope`
|
||||||
|
2. updated assertions where needed
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no regression causes `G8` to disappear as a boundary archetype
|
||||||
|
2. no regression causes unrelated `single_request_table` or other families to drift
|
||||||
|
3. test coverage explicitly names the corrected runtime-scope pattern
|
||||||
|
|
||||||
|
## Phase 4: Rerun the Real Sample and Close the Loop
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Use the actual real sample to confirm the narrowed runtime-scope outcome and record the next state.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. rerun `sg_scene_generate` on `95598工单明细表`
|
||||||
|
2. record whether:
|
||||||
|
- `g3_runtime_scope_compatible` now passes
|
||||||
|
- the remaining mismatch, if any, is narrower than a runtime-scope failure
|
||||||
|
3. update the real-sample validation record layer
|
||||||
|
4. write a formal closure report
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. rerun output
|
||||||
|
2. updated real-sample validation assets
|
||||||
|
3. `G3` runtime-contract-correction closure report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the rerun no longer fails for `g3_runtime_scope`
|
||||||
|
2. the validation layer records the narrowed family outcome
|
||||||
|
3. the next scope recommendation can move from `G3 runtime correction` to the next remaining mainline gap
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. the `G3` real sample no longer fails for the old coarse runtime-scope reason
|
||||||
|
2. the narrowed gate is covered by automated regression
|
||||||
|
3. real-sample validation assets are updated with the new outcome
|
||||||
|
4. `G8` remains a valid boundary-family archetype with no unintended regression
|
||||||
|
|
||||||
|
## Next Step
|
||||||
|
|
||||||
|
After this plan completes:
|
||||||
|
|
||||||
|
1. if `G3` still has a narrower output or data-verification gap, move to `G3` real-sample output or contract verification
|
||||||
|
2. if `G3` stabilizes, return to the next mainline mismatch in priority order, which is `G2` real-sample contract correction
|
||||||
@@ -0,0 +1,50 @@
|
|||||||
|
# G3 Residual 4 Workflow Evidence Closure Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Parent Plan: `docs/superpowers/plans/2026-04-19-structured-fail-closed-residual-13-closure-plan.md`
|
||||||
|
> Parent Route: `Residual Route A`
|
||||||
|
> Parent Layer: `Layer C`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Close the `4` remaining `G3 / paginated_enrichment` structured fail-closed scenes by recovering missing workflow evidence without relaxing gates.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
1. `sweep-007-scene` / `95598供电服务月报`
|
||||||
|
2. `sweep-039-scene` / `故障报修工单信息统计表`
|
||||||
|
3. `sweep-068-scene` / `输变电设备运行分析报告`
|
||||||
|
4. `sweep-084-scene` / `巡视计划完成情况自动检索`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `tests/scene_generator_test.rs`
|
||||||
|
4. route-local follow-up JSON/report assets
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
2. family baseline manifests
|
||||||
|
3. G6/G8 runtime implementation files
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. inspect the four fixed reports and source scenes;
|
||||||
|
2. identify the repeated missing G3 evidence subtype;
|
||||||
|
3. implement one bounded G3 recovery slice;
|
||||||
|
4. rerun only the four fixed scenes;
|
||||||
|
5. publish delta report.
|
||||||
|
|
||||||
|
## Expected Coverage Delta
|
||||||
|
|
||||||
|
Target: reduce the count of G3 residual fail-closed records below the current `4`.
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the four-scene route-local follow-up and report.
|
||||||
|
|
||||||
|
Do not continue into G2 or boundary residual work.
|
||||||
|
|
||||||
@@ -0,0 +1,54 @@
|
|||||||
|
# G3 Residual Contract Closure Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Parent Route: `Route 2: G3 / paginated_enrichment`
|
||||||
|
> Parent Layer: `Layer C + Layer D`
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-g3-residual-contract-closure-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Implement the final bounded Route 2 slice for any `G3` residual contract blockers left after enrichment-request and export-plan closure work.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
Residual `G3 / paginated_enrichment` bucket after the first two Route 2 child plans.
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `src/generated_scene/ir.rs`
|
||||||
|
4. `tests/scene_generator_test.rs`
|
||||||
|
5. route-local residual inventory and report assets
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
2. Route 3+ implementation assets
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. freeze post-Route-2 residual inventory
|
||||||
|
2. group residual blockers
|
||||||
|
3. implement at most one bounded residual correction slice
|
||||||
|
4. rerun bounded validation
|
||||||
|
5. declare Route 2 complete or deferred
|
||||||
|
|
||||||
|
## Expected Coverage Delta
|
||||||
|
|
||||||
|
1. shrink or explicitly name the final residual `G3` bucket
|
||||||
|
2. produce a clean handoff into Route 3
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
1. Route 2 is no longer open-ended
|
||||||
|
2. remaining residual `G3` records are explicitly categorized
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after Route 2 is explicitly closed or deferred.
|
||||||
|
|
||||||
|
Do not begin Route 3 work under this plan.
|
||||||
|
|
||||||
@@ -0,0 +1,72 @@
|
|||||||
|
# G6 Host-Bridge Callback Semantics Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g6-host-bridge-callback-semantics-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g6-host-bridge-callback-semantics-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan executes one bounded next slice:
|
||||||
|
|
||||||
|
`G6 host-bridge callback semantics`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not execute a `G6` real sample
|
||||||
|
2. do not implement host-runtime directly
|
||||||
|
3. do not open `G8`
|
||||||
|
4. do not reopen `G7`
|
||||||
|
5. do not open `G4 / G5`
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Freeze callback semantics scope
|
||||||
|
2. `WS2` Define completion-state semantics
|
||||||
|
3. `WS3` Publish one bounded callback-semantic result
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the plan to callback semantics only.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `G6` as the only target
|
||||||
|
2. freeze transport/runtime implementation and real execution as out of scope
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no broader host-runtime work begins under this plan
|
||||||
|
|
||||||
|
## Phase 1: Define Completion-State Semantics
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn callback completion into an explicit bounded semantic model.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. define `ok`
|
||||||
|
2. define `partial`
|
||||||
|
3. define `blocked`
|
||||||
|
4. define `error`
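
The four completion states could be pinned down as a closed enum so that any other label is rejected rather than silently accepted. This sketch fixes only the set of labels. The `CallbackState` type is a hypothetical name, and the meaning of each state is deliberately left for the plan to define.

```rust
use std::str::FromStr;

/// The four callback completion states named in this plan.
/// Their exact semantics are not defined here (assumption: labels only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallbackState {
    Ok,
    Partial,
    Blocked,
    Error,
}

impl FromStr for CallbackState {
    type Err = String;

    /// Parse a state label; unknown labels are rejected instead of defaulted.
    fn from_str(label: &str) -> Result<Self, Self::Err> {
        match label {
            "ok" => Ok(Self::Ok),
            "partial" => Ok(Self::Partial),
            "blocked" => Ok(Self::Blocked),
            "error" => Ok(Self::Error),
            other => Err(format!("unknown callback state: {other}")),
        }
    }
}
```

Rejecting unknown labels keeps the model bounded: a fifth state cannot appear without an explicit change to the enum.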
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. callback state logic is explicit and bounded
|
||||||
|
|
||||||
|
## Phase 2: Publish the Bounded Result
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the callback-semantic model into one bounded next artifact.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. publish the semantic result
|
||||||
|
2. if needed, publish the next bounded follow-up plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the next step remains narrower than direct host-runtime implementation
|
||||||
@@ -0,0 +1,72 @@
|
|||||||
|
# G6 Host-Bridge Callback State Verification Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g6-host-bridge-callback-state-verification-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g6-host-bridge-callback-state-verification-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan executes one bounded next slice:
|
||||||
|
|
||||||
|
`G6 host-bridge callback state verification`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not execute a `G6` real sample
|
||||||
|
2. do not implement host-runtime directly
|
||||||
|
3. do not open `G8`
|
||||||
|
4. do not reopen `G7`
|
||||||
|
5. do not open `G4 / G5`
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Freeze callback-state verification scope
|
||||||
|
2. `WS2` Define verification targets for `ok/partial/blocked/error`
|
||||||
|
3. `WS3` Publish one bounded verification result
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the plan to callback-state verification only.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `G6` as the only target
|
||||||
|
2. freeze implementation and real execution as out of scope
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no broader host-runtime work begins under this plan
|
||||||
|
|
||||||
|
## Phase 1: Define Verification Targets
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the explicit callback states into bounded verification targets.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. define verification target for `ok`
|
||||||
|
2. define verification target for `partial`
|
||||||
|
3. define verification target for `blocked`
|
||||||
|
4. define verification target for `error`
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. callback verification targets are explicit and bounded
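
The acceptance criterion above could be checked mechanically by confirming that every callback state has a verification target and that no target names a state outside the four. This is a minimal sketch under that assumption; the function name `check_targets` and the idea of identifying a target by its state label are hypothetical.

```rust
/// The four callback states named in the plan.
const CALLBACK_STATES: [&str; 4] = ["ok", "partial", "blocked", "error"];

/// Hypothetical coverage check over the states that currently have a
/// verification target. Returns `(missing, unknown)`:
/// - `missing`: plan states with no target yet
/// - `unknown`: targets naming a state outside the bounded set
pub fn check_targets<'a>(targets: &[&'a str]) -> (Vec<&'static str>, Vec<&'a str>) {
    let missing = CALLBACK_STATES
        .iter()
        .copied()
        .filter(|state| !targets.iter().any(|t| *t == *state))
        .collect();
    let unknown = targets
        .iter()
        .copied()
        .filter(|target| !CALLBACK_STATES.iter().any(|s| *s == *target))
        .collect();
    (missing, unknown)
}
```

Under this reading, the targets are explicit when `missing` is empty and bounded when `unknown` is empty.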
|
||||||
|
|
||||||
|
## Phase 2: Publish the Bounded Result
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the callback-state verification model into one bounded next artifact.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. publish the verification result
|
||||||
|
2. if needed, publish the next bounded follow-up plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the next step remains narrower than direct host-runtime implementation
|
||||||
@@ -0,0 +1,71 @@
|
|||||||
|
# G6 Host-Bridge Entry Gate Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g6-host-bridge-entry-gate-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g6-host-bridge-entry-gate-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan executes one bounded next slice:
|
||||||
|
|
||||||
|
`G6 host-bridge entry gate`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not execute a `G6` real sample
|
||||||
|
2. do not implement host-runtime directly
|
||||||
|
3. do not open `G8`
|
||||||
|
4. do not reopen `G7`
|
||||||
|
5. do not open `G4 / G5`
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Freeze entry-gate scope
|
||||||
|
2. `WS2` Define bounded gate conditions
|
||||||
|
3. `WS3` Publish one bounded gate result
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the plan to entry-gate modeling only.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `G6` as the only target
|
||||||
|
2. freeze implementation and real execution as out of scope
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no broader host-runtime work begins under this plan
|
||||||
|
|
||||||
|
## Phase 1: Define Gate Conditions
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the semantic readiness criteria into bounded gate conditions.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. define hard gate conditions
|
||||||
|
2. define soft/optional later conditions
|
||||||
|
3. define fail-close gate reasons
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. entry-gate conditions are explicit and bounded
|
||||||
|
|
||||||
|
## Phase 2: Publish the Bounded Result
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the gate model into one bounded next artifact.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. publish the gate result
|
||||||
|
2. if needed, publish the next bounded follow-up plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the next step remains narrower than direct host-runtime implementation
|
||||||
@@ -0,0 +1,70 @@
|
|||||||
|
# G6 Host-Bridge Entry Gate Verification Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g6-host-bridge-entry-gate-verification-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g6-host-bridge-entry-gate-verification-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan executes one bounded next slice:
|
||||||
|
|
||||||
|
`G6 host-bridge entry gate verification`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not execute a `G6` real sample
|
||||||
|
2. do not implement host-runtime directly
|
||||||
|
3. do not open `G8`
|
||||||
|
4. do not reopen `G7`
|
||||||
|
5. do not open `G4 / G5`
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Freeze gate-verification scope
|
||||||
|
2. `WS2` Define bounded verification targets for the hard gate
|
||||||
|
3. `WS3` Publish one bounded verification result
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the plan to gate verification only.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `G6` as the only target
|
||||||
|
2. freeze implementation and real execution as out of scope
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no broader host-runtime work begins under this plan
|
||||||
|
|
||||||
|
## Phase 1: Define Verification Targets
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the hard gate into bounded verification targets.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. define verification target for each hard gate condition
|
||||||
|
2. define verification target for each fail-close reason
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. gate verification targets are explicit and bounded
|
||||||
|
|
||||||
|
## Phase 2: Publish the Bounded Result
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the gate-verification model into one bounded next artifact.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. publish the verification result
|
||||||
|
2. if needed, publish the next bounded follow-up plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the next step remains narrower than direct host-runtime implementation
|
||||||
@@ -0,0 +1,71 @@
|
|||||||
|
# G6 Host-Bridge Entry Readiness Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g6-host-bridge-entry-readiness-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g6-host-bridge-entry-readiness-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan executes one bounded next slice:
|
||||||
|
|
||||||
|
`G6 host-bridge entry readiness`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not execute a `G6` real sample
|
||||||
|
2. do not implement host-runtime directly
|
||||||
|
3. do not open `G8`
|
||||||
|
4. do not reopen `G7`
|
||||||
|
5. do not open `G4 / G5`
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Freeze entry-readiness scope
|
||||||
|
2. `WS2` Define bounded readiness criteria
|
||||||
|
3. `WS3` Publish one bounded readiness result
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the plan to entry-readiness only.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `G6` as the only target
|
||||||
|
2. freeze implementation and real execution as out of scope
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no broader host-runtime work begins under this plan
|
||||||
|
|
||||||
|
## Phase 1: Define Readiness Criteria
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the explicit callback verification model into bounded entry-readiness criteria.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. define which semantics are required before `G6` entry can open
|
||||||
|
2. define which semantics remain optional
|
||||||
|
3. define the minimal readiness threshold
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. entry-readiness criteria are explicit and bounded
|
||||||
|
|
||||||
|
## Phase 2: Publish the Bounded Result
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the readiness model into one bounded next artifact.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. publish the readiness result
|
||||||
|
2. if needed, publish the next bounded follow-up plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the next step remains narrower than direct host-runtime implementation
|
||||||
@@ -0,0 +1,71 @@
|
|||||||
|
# G6 Host-Bridge Execution Semantics Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g6-host-bridge-execution-semantics-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g6-host-bridge-execution-semantics-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan executes one bounded next slice:
|
||||||
|
|
||||||
|
`G6 host-bridge execution semantics`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not execute a `G6` real sample
|
||||||
|
2. do not implement host-runtime directly
|
||||||
|
3. do not open `G8`
|
||||||
|
4. do not reopen `G7`
|
||||||
|
5. do not open `G4 / G5`
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Freeze the semantic boundary
|
||||||
|
2. `WS2` Separate bridge invocation from callback completion
|
||||||
|
3. `WS3` Publish one bounded semantic result
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the plan to semantic scoping only.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `G6` as the only target
|
||||||
|
2. freeze real execution and implementation as out of scope
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no host-runtime implementation begins under this plan
|
||||||
|
|
||||||
|
## Phase 1: Separate the Minimum Semantics
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the blocked capability into explicit bounded semantics.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. isolate bridge action invocation semantics
|
||||||
|
2. isolate callback completion semantics
|
||||||
|
3. keep both separate from broader host-runtime work
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the semantic model is explicit and bounded
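
The separation between bridge invocation and callback completion can be pictured as two independent traits. The sketch below is an assumption-heavy illustration: `RequestId`, `BridgeInvoker`, `CallbackSink`, and `InMemoryBridge` are hypothetical names, and nothing here reflects an actual host-runtime transport, which stays out of scope for this plan.

```rust
/// Hypothetical handle tying a callback back to the invocation that caused it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RequestId(pub u64);

/// Invocation side: asks the host bridge to start an action.
pub trait BridgeInvoker {
    fn invoke(&mut self, action: &str) -> RequestId;
}

/// Completion side: records the state that the callback reported.
pub trait CallbackSink {
    /// Returns `false` (fail closed) when no pending request matches `id`.
    fn complete(&mut self, id: RequestId, state: &str) -> bool;
}

/// In-memory stand-in used only to exercise the two traits.
#[derive(Debug, Default)]
pub struct InMemoryBridge {
    next_id: u64,
    pending: Vec<RequestId>,
    pub completed: Vec<(RequestId, String)>,
}

impl BridgeInvoker for InMemoryBridge {
    fn invoke(&mut self, _action: &str) -> RequestId {
        let id = RequestId(self.next_id);
        self.next_id += 1;
        self.pending.push(id);
        id
    }
}

impl CallbackSink for InMemoryBridge {
    fn complete(&mut self, id: RequestId, state: &str) -> bool {
        match self.pending.iter().position(|p| *p == id) {
            Some(index) => {
                // Each request completes at most once.
                self.pending.remove(index);
                self.completed.push((id, state.to_string()));
                true
            }
            None => false,
        }
    }
}
```

Keeping the two traits apart means a test can drive completion without any real invocation path, which fits the guardrail against implementing host-runtime directly.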
|
||||||
|
|
||||||
|
## Phase 2: Publish the Bounded Result
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the semantic model into one bounded next artifact.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. publish the semantic result
|
||||||
|
2. if needed, publish the next bounded follow-up plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the next step remains narrower than direct host-runtime implementation
|
||||||
@@ -0,0 +1,71 @@
|
|||||||
|
# G6 Host-Bridge Prerequisites Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g6-host-bridge-prerequisites-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g6-host-bridge-prerequisites-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan executes one bounded next slice:
|
||||||
|
|
||||||
|
`G6 host-bridge prerequisites`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not execute a `G6` real sample
|
||||||
|
2. do not implement host-runtime directly under this plan
|
||||||
|
3. do not reopen `G7`
|
||||||
|
4. do not open `G8`
|
||||||
|
5. do not open `G4 / G5`
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Freeze the G6 prerequisite boundary
|
||||||
|
2. `WS2` Isolate the minimum blocked host-bridge capability
|
||||||
|
3. `WS3` Publish one bounded prerequisite result
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the plan to `G6` prerequisite scoping only.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `G6` as the only target
|
||||||
|
2. freeze `G6` real-sample execution as out of scope
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no other boundary family is touched under this plan
|
||||||
|
|
||||||
|
## Phase 1: Isolate the Minimum Blocked Capability
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Reduce `G6` prerequisite pressure to the smallest explicit capability gap.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. restate the current `G6` hold condition
|
||||||
|
2. isolate the minimum host-bridge execution semantic still missing
|
||||||
|
3. keep that capability separate from broader runtime-platform work
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the blocked capability is explicit and bounded
|
||||||
|
|
||||||
|
## Phase 2: Publish the Bounded Result
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the isolated prerequisite into one bounded next artifact.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. publish the prerequisite result
|
||||||
|
2. if needed, publish the next bounded follow-up plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the next step is narrower than broad host-runtime implementation
|
||||||
@@ -0,0 +1,159 @@
|
|||||||
|
# G6 Real-Sample Entry Preparation And Bounded Execution Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g6-real-sample-entry-preparation-and-bounded-execution-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g6-real-sample-entry-preparation-and-bounded-execution-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan is the only surviving `G6` execution plan after the redesign.
|
||||||
|
|
||||||
|
Its purpose is:
|
||||||
|
|
||||||
|
`stop G6 planning recursion and move directly to one bounded implementation-plus-real-sample slice`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not open any new `G6` semantic sub-plan
|
||||||
|
2. do not reopen `G7`
|
||||||
|
3. do not open `G8`
|
||||||
|
4. do not open `G4 / G5`
|
||||||
|
5. do not broaden into host-runtime platform redesign
|
||||||
|
6. do not add more than one fixed `G6` real sample
|
||||||
|
|
||||||
|
## Preserved G6 Gate
|
||||||
|
|
||||||
|
The final frozen `G6` gate under this plan is:
|
||||||
|
|
||||||
|
### Hard Conditions
|
||||||
|
|
||||||
|
1. `host-bridge-action-invocation-defined`
|
||||||
|
2. `callback-request-completion-defined`
|
||||||
|
3. `callback-state-verification-targets-defined`
|
||||||
|
|
||||||
|
### Soft Later Conditions
|
||||||
|
|
||||||
|
1. `host-runtime-transport-implementation`
|
||||||
|
2. `real-sample-execution-proof`
|
||||||
|
|
||||||
|
### Fail-Close Reasons
|
||||||
|
|
||||||
|
1. `g6_bridge_invocation_semantics_missing`
|
||||||
|
2. `g6_callback_completion_semantics_missing`
|
||||||
|
3. `g6_callback_state_targets_missing`
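
The frozen gate above can be sketched as a small fail-close check. The condition and reason strings are taken from the lists above; pairing each hard condition with the fail-close reason in the same list position is an assumption of this sketch, as is the function name `evaluate_g6_gate`.

```rust
/// Hard conditions paired with the fail-close reason reported when unmet.
/// The pairing by list order is an assumption, not stated by the plan.
const HARD_GATE: [(&str, &str); 3] = [
    (
        "host-bridge-action-invocation-defined",
        "g6_bridge_invocation_semantics_missing",
    ),
    (
        "callback-request-completion-defined",
        "g6_callback_completion_semantics_missing",
    ),
    (
        "callback-state-verification-targets-defined",
        "g6_callback_state_targets_missing",
    ),
];

/// Returns `Ok(())` when every hard condition is satisfied; otherwise the
/// fail-close reasons in gate order. Soft later conditions are ignored,
/// so their presence or absence never changes the result.
pub fn evaluate_g6_gate(satisfied: &[&str]) -> Result<(), Vec<&'static str>> {
    let reasons: Vec<&'static str> = HARD_GATE
        .iter()
        .filter(|(condition, _)| !satisfied.iter().any(|s| s == condition))
        .map(|(_, reason)| *reason)
        .collect();
    if reasons.is_empty() {
        Ok(())
    } else {
        Err(reasons)
    }
}
```

Passing a soft condition such as `real-sample-execution-proof` has no effect on the outcome, which mirrors the split between hard and soft later conditions.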
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Freeze the Final G6 Entry Gate
|
||||||
|
2. `WS2` Implement the Minimum Host-Bridge Execution Seam
|
||||||
|
3. `WS3` Run the Fixed G6 Real Sample
|
||||||
|
4. `WS4` Write Back Validation And Close
|
||||||
|
|
||||||
|
## Phase 0: Freeze The Final Gate
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Stop semantic drift and declare the gate final for this execution slice.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. treat the hard `G6` gate as frozen
|
||||||
|
2. treat the fail-close reasons as frozen
|
||||||
|
3. explicitly forbid any further `G6` semantic micro-plan under this line
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. final frozen `G6` gate note
|
||||||
|
2. final fixed-sample statement
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no further `G6` semantic clarification plan is produced
|
||||||
|
|
||||||
|
## Phase 1: Implement The Minimum Execution Seam
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Add only the minimum implementation needed to let the fixed `G6` real sample enter one controlled execution attempt.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. implement the minimum host-bridge invocation seam required by the fixed sample
|
||||||
|
2. implement the minimum callback completion handling required by the fixed sample
|
||||||
|
3. keep the change narrower than generic host-runtime redesign
|
||||||
|
4. preserve fail-close behavior when the frozen hard conditions are not met
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. bounded `G6` code change
|
||||||
|
2. bounded regression tests
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. `G6` execution support is improved only at the seam required by the fixed sample
|
||||||
|
2. unrelated families are untouched
|
||||||
|
3. fail-close remains explicit
|
||||||
|
|
||||||
|
## Phase 2: Execute The Fixed Real Sample
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Use one real `G6` sample to prove whether the bounded implementation slice is enough.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. run the fixed `G6` real sample once
|
||||||
|
2. classify the result only as:
|
||||||
|
- `executed-pass`
|
||||||
|
- `named mismatch`
|
||||||
|
3. do not open a new semantic sub-plan regardless of result
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. real execution result
|
||||||
|
2. fixed-sample execution note
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the result is narrower than `not executed`
|
||||||
|
2. the result is not deferred into another semantic-planning loop
|
||||||
|
|
||||||
|
## Phase 3: Validation Closure
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Write the fixed result back and close the line.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. update validation-layer assets
|
||||||
|
2. if pass: close `G6`
|
||||||
|
3. if mismatch: write one implementation correction plan only
|
||||||
|
4. publish a closure report
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. validation asset update
|
||||||
|
2. closure report
|
||||||
|
3. optional implementation correction plan if mismatch occurs
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. `G6` ends in `executed-pass` or `named mismatch`
|
||||||
|
2. no new semantic micro-plan is emitted
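
The two permitted outcomes and the closure rule can be written down as a pair of enums. This is only a sketch: `G6Outcome`, `ClosureAction`, and `closure_action` are hypothetical names, and the mismatch name carried in the variant is whatever the run reports.

```rust
/// The only two classifications the plan allows for the fixed sample.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum G6Outcome {
    ExecutedPass,
    /// Carries the name of the mismatch that was observed.
    NamedMismatch(String),
}

/// What Phase 3 may do next. There is deliberately no variant for
/// "open another semantic plan".
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ClosureAction {
    CloseG6,
    WriteOneCorrectionPlan { mismatch: String },
}

/// Map an outcome to its single permitted follow-up.
pub fn closure_action(outcome: &G6Outcome) -> ClosureAction {
    match outcome {
        G6Outcome::ExecutedPass => ClosureAction::CloseG6,
        G6Outcome::NamedMismatch(name) => ClosureAction::WriteOneCorrectionPlan {
            mismatch: name.clone(),
        },
    }
}
```

Encoding the stop rule in the type means a new semantic micro-plan is not something the closure step can even express.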
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. one bounded implementation seam is landed
|
||||||
|
2. one fixed `G6` real sample is executed
|
||||||
|
3. the line closes with `executed-pass` or `named mismatch`
|
||||||
|
|
||||||
|
## Non-Negotiable Stop Rule
|
||||||
|
|
||||||
|
After this plan starts executing:
|
||||||
|
|
||||||
|
1. do not create another `G6` semantic plan
|
||||||
|
2. if the run fails, create only one implementation correction plan
|
||||||
|
3. if the run passes, close the `G6` line immediately
|
||||||
@@ -0,0 +1,92 @@
|
|||||||
|
# G7 Real-Sample Entry Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-g7-real-sample-entry-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-g7-real-sample-entry-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This plan executes one bounded next slice:
|
||||||
|
|
||||||
|
`G7 real-sample entry`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not reopen mainline families
|
||||||
|
2. do not execute `G6` or `G8`
|
||||||
|
3. do not add new `G7` family fixtures
|
||||||
|
4. do not implement new runtime-platform prerequisites under this plan
|
||||||
|
5. do not open `G4 / G5`
|
||||||
|
|
||||||
|
## Fixed Verification Anchor
|
||||||
|
|
||||||
|
The only target under this plan is:
|
||||||
|
|
||||||
|
1. `计量资产库存统计`
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Freeze the G7 real-sample boundary
|
||||||
|
2. `WS2` Build the real-sample contract differential
|
||||||
|
3. `WS3` Rerun the fixed real sample against the existing G7 runtime contract
|
||||||
|
4. `WS4` Update validation assets and close the loop
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Boundary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the plan to one `G7` representative sample.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `计量资产库存统计` as the only real-sample anchor
|
||||||
|
2. freeze the existing `G7` repo-local runtime contract as the starting baseline
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no other boundary family is touched under this plan
|
||||||
|
|
||||||
|
## Phase 1: Build the Differential
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Understand whether the existing `G7` runtime contract is already close enough for a real-sample rerun.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. compare the representative `G7` fixture contract to the chosen real sample
|
||||||
|
2. isolate the smallest remaining contract risk
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the rerun target is explicit and bounded
|
||||||
|
|
||||||
|
## Phase 2: Real-Sample Rerun
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Use the fixed real sample to test the current `G7` runtime contract.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. run `sg_scene_generate` on the fixed `G7` real sample
|
||||||
|
2. record whether the result is a pass or a smaller mismatch
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the outcome is narrower than `not yet executed`
|
||||||
|
|
||||||
|
## Phase 3: Validation Closure
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Write the result back into the validation layer and close the bounded slice.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. update validation assets if the outcome narrows
|
||||||
|
2. write a closure report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the next boundary-family ambiguity is reduced further without broadening roadmap scope
|
||||||
@@ -0,0 +1,44 @@
|
|||||||
|
# Host-Bridge Runtime Roadmap Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Parent Sequence: `2026-04-19-final-2-residual-child-plan-sequence-plan.md`
|
||||||
|
> Fixed Scene: `sweep-085-scene`
|
||||||
|
> Status: Draft
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Run a bounded host-bridge runtime slice for the single remaining `host_bridge_workflow` residual.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
1. `sweep-085-scene`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `tests/scene_generator_test.rs`
|
||||||
|
4. `tests/fixtures/generated_scene/host_bridge_runtime_followup_2026-04-19.json`
|
||||||
|
5. `tests/fixtures/generated_scene/host_bridge_runtime_reconciliation_candidates_2026-04-19.json`
|
||||||
|
6. `docs/superpowers/reports/2026-04-19-host-bridge-runtime-roadmap-report.md`
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
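
The allowed and forbidden lists above lend themselves to a simple scope check over the paths a change touches. The paths below are copied from the two lists; the names `FileVerdict`, `classify_path`, and `violations`, and the choice to treat unlisted paths as out of scope, are assumptions of this sketch.

```rust
/// Paths this plan may touch (from "Allowed Files").
const ALLOWED_FILES: [&str; 6] = [
    "src/generated_scene/analyzer.rs",
    "src/generated_scene/generator.rs",
    "tests/scene_generator_test.rs",
    "tests/fixtures/generated_scene/host_bridge_runtime_followup_2026-04-19.json",
    "tests/fixtures/generated_scene/host_bridge_runtime_reconciliation_candidates_2026-04-19.json",
    "docs/superpowers/reports/2026-04-19-host-bridge-runtime-roadmap-report.md",
];

/// Paths this plan must not touch (from "Forbidden Files").
const FORBIDDEN_FILES: [&str; 1] =
    ["tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json"];

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FileVerdict {
    Allowed,
    Forbidden,
    /// Not on either list; treated as out of scope in this sketch.
    Unlisted,
}

pub fn classify_path(path: &str) -> FileVerdict {
    if FORBIDDEN_FILES.iter().any(|f| *f == path) {
        FileVerdict::Forbidden
    } else if ALLOWED_FILES.iter().any(|a| *a == path) {
        FileVerdict::Allowed
    } else {
        FileVerdict::Unlisted
    }
}

/// Every changed path that is not explicitly allowed.
pub fn violations<'a>(changed: &[&'a str]) -> Vec<&'a str> {
    changed
        .iter()
        .copied()
        .filter(|path| classify_path(path) != FileVerdict::Allowed)
        .collect()
}
```

A check like this could run over the list of changed paths before the follow-up assets are published; an empty result would mean the slice stayed inside its file boundary.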
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. Freeze the current `sweep-085-scene` generation report.
|
||||||
|
2. Identify the exact host-bridge runtime missing piece.
|
||||||
|
3. Implement at most one bounded correction slice, and only if it can be expressed in the generated-scene contract or in fail-closed reporting.
|
||||||
|
4. Rerun only `sweep-085-scene`.
|
||||||
|
5. Publish follow-up and reconciliation candidate assets.
|
||||||
|
|
||||||
|
## Expected Delta
|
||||||
|
|
||||||
|
Target delta is `+1 framework-auto-pass-candidate` if the host-bridge contract can be closed without full runtime transport. Otherwise the delta is `0`, with a narrower named runtime hold.
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the single-scene follow-up and reconciliation candidates are published. Do not update the official board under this plan.
|
||||||
@@ -0,0 +1,50 @@
|
|||||||
|
# Local-Doc Official Board Reconciliation Refresh Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Parent Roadmap: `2026-04-19-local-doc-runtime-roadmap-plan.md`
|
||||||
|
> Status: Active
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Refresh the official execution board using the five local-doc framework auto-pass candidates produced by the local-doc runtime roadmap.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/local_doc_runtime_reconciliation_candidates_2026-04-19.json`
|
||||||
|
2. `tests/fixtures/generated_scene/promotion_board_reconciliation_policy_2026-04-19.json`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
2. `tests/fixtures/generated_scene/local_doc_official_board_reconciliation_refresh_2026-04-19.json`
|
||||||
|
3. `docs/superpowers/reports/2026-04-19-local-doc-official-board-reconciliation-refresh-report.md`
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. Load the official execution board.
|
||||||
|
2. Load the local-doc reconciliation candidates.
|
||||||
|
3. Verify the candidate asset contains exactly the five fixed local-doc scene ids.
|
||||||
|
4. Match board rows by `sceneId`.
|
||||||
|
5. Update only framework-layer fields for the five matched rows.
|
||||||
|
6. Recompute board framework summary counts.
|
||||||
|
7. Publish reconciliation refresh JSON.
|
||||||
|
8. Publish reconciliation refresh report.
|
||||||
|
|
||||||
|
## Acceptance Criteria
|
||||||
|
|
||||||
|
1. Board scene count remains `102`.
|
||||||
|
2. The five fixed local-doc scene ids have `currentFrameworkStatus = framework-auto-pass`.
|
||||||
|
3. Board framework counts are `framework-auto-pass = 100` and `framework-structured-fail-closed = 2`.
|
||||||
|
4. Host-bridge and bootstrap residuals remain structured fail-closed.
|
||||||
|
5. Analyzer and generator are not modified by this plan.
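
Tasks 3 to 6 can be sketched as a single refresh function. This is a minimal illustration, not the actual board tooling: the `BoardRow` shape is assumed (only `sceneId` and `currentFrameworkStatus` are named by the plan), JSON loading is omitted, and the five scene ids are assumed to be the ones fixed in the local-doc runtime roadmap plan.

```rust
use std::collections::BTreeSet;

/// Assumed to be the five ids fixed by the local-doc runtime roadmap plan.
const FIXED_LOCAL_DOC_IDS: [&str; 5] = [
    "sweep-033-scene",
    "sweep-034-scene",
    "sweep-042-scene",
    "sweep-051-scene",
    "sweep-074-scene",
];
const AUTO_PASS: &str = "framework-auto-pass";

/// Hypothetical in-memory view of one board row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BoardRow {
    pub scene_id: String,
    pub current_framework_status: String,
}

#[derive(Debug, PartialEq, Eq)]
pub struct FrameworkCounts {
    pub auto_pass: usize,
    pub other: usize,
}

/// Verify the candidates, update only the matched rows, recompute counts.
/// Nothing is mutated unless every check has passed.
pub fn refresh(board: &mut [BoardRow], candidates: &[&str]) -> Result<FrameworkCounts, String> {
    let expected: BTreeSet<&str> = FIXED_LOCAL_DOC_IDS.iter().copied().collect();
    let got: BTreeSet<&str> = candidates.iter().copied().collect();
    if candidates.len() != expected.len() || got != expected {
        return Err("candidates are not exactly the five fixed scene ids".to_string());
    }
    let matched = board
        .iter()
        .filter(|row| expected.contains(row.scene_id.as_str()))
        .count();
    if matched != expected.len() {
        return Err(format!("expected 5 matching board rows, found {matched}"));
    }
    for row in board.iter_mut() {
        if expected.contains(row.scene_id.as_str()) {
            row.current_framework_status = AUTO_PASS.to_string();
        }
    }
    let auto_pass = board
        .iter()
        .filter(|row| row.current_framework_status == AUTO_PASS)
        .count();
    Ok(FrameworkCounts { auto_pass, other: board.len() - auto_pass })
}
```

On the real 102-row board, the acceptance criteria would correspond to `auto_pass == 100` and `other == 2`; the sketch does not hard-code those numbers so that it can be exercised on a small board.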
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the local-doc official board reconciliation refresh JSON and report are published. Do not start host-bridge runtime or bootstrap normalization under this plan.
|
||||||
@@ -0,0 +1,54 @@
|
|||||||
|
# Local-Doc Runtime Roadmap Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Parent Decision: `2026-04-19-residual-runtime-roadmap-prioritization-plan.md`
|
||||||
|
> Parent Residual Bucket: `local_doc_pipeline`
|
||||||
|
> Status: Draft
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Plan the bounded closure path for the five `local_doc_pipeline` residuals selected by the residual runtime roadmap prioritization decision.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
Only these scenes are in scope:
|
||||||
|
|
||||||
|
1. `sweep-033-scene`
|
||||||
|
2. `sweep-034-scene`
|
||||||
|
3. `sweep-042-scene`
|
||||||
|
4. `sweep-051-scene`
|
||||||
|
5. `sweep-074-scene`
|
||||||
|
|
||||||
|
## Initial Phases
|
||||||
|
|
||||||
|
### Phase 0: Freeze Local-Doc Residual Baseline
|
||||||
|
|
||||||
|
Capture current generation reports and missing pieces for the five scenes.
|
||||||
|
|
||||||
|
### Phase 1: Local-Doc Evidence Inventory
|
||||||
|
|
||||||
|
Classify document source, attachment dependency, local service dependency, and output artifact expectation.
|
||||||
|
|
||||||
|
### Phase 2: Minimal Local-Doc Contract Design
|
||||||
|
|
||||||
|
Define the smallest contract that can distinguish runnable local-doc pipelines from policy-held local-doc pipelines.
|
||||||
|
|
||||||
|
### Phase 3: Bounded Implementation Slice
|
||||||
|
|
||||||
|
Implement only the contract recovery or fail-closed detail required by the five-scene bucket.
|
||||||
|
|
||||||
|
### Phase 4: Follow-Up Sweep And Reconciliation
|
||||||
|
|
||||||
|
Rerun only the five target scenes and publish candidates. Do not update the official board inside this phase.
|
||||||
|
|
||||||
|
## Forbidden Scope
|
||||||
|
|
||||||
|
1. host-bridge runtime roadmap;
|
||||||
|
2. bootstrap target normalization;
|
||||||
|
3. G4/G5;
|
||||||
|
4. full attachment runtime implementation unless explicitly required by the minimal contract;
|
||||||
|
5. official board update.
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the local-doc five-scene follow-up and reconciliation candidates are published. A later official board reconciliation plan must consume the result.
|
||||||
@@ -0,0 +1,50 @@
|
|||||||
|
# Official Board Reconciliation Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Parent Layer: `Layer E`
|
||||||
|
> Status: Active
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Update the official execution board from the final coverage status rollup.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/final_coverage_status_rollup_2026-04-19.json`
|
||||||
|
2. `tests/fixtures/generated_scene/promotion_board_reconciliation_policy_2026-04-19.json`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
2. `tests/fixtures/generated_scene/official_board_reconciliation_2026-04-19.json`
|
||||||
|
3. `docs/superpowers/reports/2026-04-19-official-board-reconciliation-report.md`
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. Load the official execution board.
|
||||||
|
2. Load the final coverage rollup.
|
||||||
|
3. Match scenes by `sceneId` where present, falling back to ordered index only if necessary.
|
||||||
|
4. Preserve frozen workbook fields.
|
||||||
|
5. Add final framework status fields to each board scene.
|
||||||
|
6. Update board summary with framework status counts.
|
||||||
|
7. Publish reconciliation JSON.
|
||||||
|
8. Publish reconciliation report.
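
Task 3 describes a match by `sceneId` with an ordered-index fallback. A minimal sketch of that rule follows, under two assumptions that the plan does not spell out: "where present" is read as "the rollup entry carries a `sceneId`", and an entry whose `sceneId` is present but not found on the board is left unmatched rather than falling back to its index. The function name `match_rows` is hypothetical.

```rust
/// For each rollup entry, return the index of the board row it maps to.
///
/// - `Some(id)`: match by `sceneId`; unmatched ids yield `None`.
/// - `None`: fall back to the entry's own position, if the board has it.
pub fn match_rows(board_ids: &[&str], rollup_ids: &[Option<&str>]) -> Vec<Option<usize>> {
    rollup_ids
        .iter()
        .enumerate()
        .map(|(index, entry)| match entry {
            Some(id) => board_ids.iter().position(|board_id| board_id == id),
            None => (index < board_ids.len()).then_some(index),
        })
        .collect()
}
```

Returning `None` for unmatched entries keeps the fallback narrow: the caller can report those entries instead of silently writing a status onto the wrong row.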
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
1. Board scene count remains `102`.
|
||||||
|
2. Framework status counts are `95` framework auto-pass and `7` structured fail-closed.
|
||||||
|
3. No source-unreadable, unsupported-family, missing-source, or unresolved status remains.
|
||||||
|
4. Analyzer and generator are not modified by this plan.
|
||||||
|
5. Reconciliation report is published.
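
Criteria 1 through 3 are simple count checks and could be verified mechanically. A hedged sketch, assuming each board scene exposes one status string; the labels `framework-auto-pass` and `structured-fail-closed` and the function name are assumptions, not names taken from the board file.

```python
from collections import Counter

# Statuses that criterion 3 says must no longer appear (labels assumed).
FORBIDDEN = {"source-unreadable", "unsupported-family", "missing-source", "unresolved"}


def check_completion(statuses, total=102, auto_pass=95, fail_closed=7):
    """Return the violated criteria; an empty list means the counts check out."""
    counts = Counter(statuses)
    problems = []
    if len(statuses) != total:
        problems.append(f"scene count is {len(statuses)}, expected {total}")
    if counts["framework-auto-pass"] != auto_pass:
        problems.append("unexpected framework auto-pass count")
    if counts["structured-fail-closed"] != fail_closed:
        problems.append("unexpected structured fail-closed count")
    leftovers = sorted(FORBIDDEN & set(counts))
    if leftovers:
        problems.append(f"forbidden statuses remain: {leftovers}")
    return problems
```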
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the official board reconciliation JSON and report are published. Do not start runtime-roadmap work under this plan.
|
||||||
@@ -0,0 +1,144 @@
|
|||||||
|
# Post-G7 Boundary Decision Roadmap Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: [2026-04-19-post-g7-boundary-decision-roadmap-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-post-g7-boundary-decision-roadmap-design.md)
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
This roadmap determines the next bounded step now that `G7` has closed as the first boundary-family real sample to be executed.
|
||||||
|
|
||||||
|
Its only purpose is:
|
||||||
|
|
||||||
|
`decide whether G6 or G8 may enter real-sample execution scope next, or whether both remain held pending prerequisites`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not reopen `G7`
|
||||||
|
2. do not reopen `G1-E / G2 / G3`
|
||||||
|
3. do not implement runtime-platform prerequisites under this roadmap
|
||||||
|
4. do not execute real samples for more than one remaining boundary family
|
||||||
|
5. do not open `G4 / G5`
|
||||||
|
|
||||||
|
## Candidate Directions
|
||||||
|
|
||||||
|
The only remaining directions under this roadmap are:
|
||||||
|
|
||||||
|
1. `G6`
|
||||||
|
2. `G8`
|
||||||
|
3. `prerequisites-only hold`
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Freeze the Post-G7 Starting State
|
||||||
|
2. `WS2` Compare G6 and G8 Entry Cost
|
||||||
|
3. `WS3` Select One Next Direction
|
||||||
|
4. `WS4` Publish the Next Bounded Slice
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Starting State
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Lock the roadmap start point so the decision cannot drift back into closed work.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze `G7` as closed executed-pass
|
||||||
|
2. freeze `G6` and `G8` as the only remaining boundary candidates
|
||||||
|
3. freeze `G1-E / G2 / G3` as closed
|
||||||
|
4. freeze `G4 / G5` as out of scope
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. starting-state note
|
||||||
|
2. fixed candidate list
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no closed family is reopened under this roadmap
|
||||||
|
|
||||||
|
## Phase 1: Compare the Remaining Boundary Candidates
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Compare `G6` and `G8` using explicit entry cost and prerequisite pressure.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. restate the current hold condition for `G6`
|
||||||
|
2. restate the current hold condition for `G8`
|
||||||
|
3. compare which one requires the smaller new capability to enter real-sample scope
|
||||||
|
4. compare whether either direction is still too expensive and should remain held
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `G6 vs G8` comparison matrix
|
||||||
|
2. smallest-next-step summary
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the preferred next direction is justified explicitly
|
||||||
|
2. the non-selected direction has an explicit hold reason
|
||||||
|
|
||||||
|
## Phase 2: Select One Next Direction
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Reduce the post-`G7` ambiguity to one bounded decision.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. select exactly one direction:
   - `G6`
   - `G8`
   - or `prerequisites-only hold`
|
||||||
|
2. record why the other directions remain out of scope
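
One way to keep the selection to exactly one direction is to make it a pure function of the Phase 1 comparison. The sketch below is illustrative only: the plan does not define a scoring scheme, so the numeric entry costs, the `max_cost` threshold, and the hold-reason strings are all assumptions.

```python
def select_direction(entry_cost, max_cost):
    """Pick the cheaper of G6/G8, or hold when both exceed the cost threshold."""
    candidates = {k: v for k, v in entry_cost.items() if k in ("G6", "G8")}
    affordable = {k: v for k, v in candidates.items() if v <= max_cost}
    if not affordable:
        selected = "prerequisites-only hold"
    else:
        # Sorting first makes ties resolve deterministically (G6 before G8).
        selected = min(sorted(affordable), key=affordable.get)
    hold_reasons = {
        k: (
            "entry cost above threshold"
            if v > max_cost
            else "higher entry cost than selected direction"
        )
        for k, v in candidates.items()
        if k != selected
    }
    return selected, hold_reasons
```

Returning the hold reasons alongside the selection keeps the second task (recording why the other directions stay out of scope) in the same step.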
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. post-`G7` boundary decision
|
||||||
|
2. hold reasons for non-selected directions
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. only one next direction is opened
|
||||||
|
2. the decision is bounded and defensible
|
||||||
|
|
||||||
|
## Phase 3: Publish the Next Bounded Slice
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the decision into the next executable bounded artifact.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. if `G6` is selected, write a bounded `G6 real-sample entry` design and plan
|
||||||
|
2. if `G8` is selected, write a bounded `G8 real-sample entry` design and plan
|
||||||
|
3. if `prerequisites-only hold` is selected, write a bounded prerequisites roadmap
|
||||||
|
4. publish a roadmap closure report
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. next bounded `design`
|
||||||
|
2. next bounded `plan`
|
||||||
|
3. roadmap closure report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. the next step is ready without extending this roadmap
|
||||||
|
2. only one bounded direction is emitted
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This roadmap is complete when:
|
||||||
|
|
||||||
|
1. the post-`G7` next step is reduced to one bounded direction
|
||||||
|
2. `G6` and `G8` no longer compete ambiguously
|
||||||
|
3. a single follow-up `design + plan` exists for the selected direction
|
||||||
|
|
||||||
|
## Next Step
|
||||||
|
|
||||||
|
After this roadmap completes:
|
||||||
|
|
||||||
|
1. execute the selected bounded slice
|
||||||
|
2. do not reopen this roadmap during execution
|
||||||
@@ -0,0 +1,60 @@
|
|||||||
|
# Promotion And Board Reconciliation Policy Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Parent Route: `Route 6: promotion and board reconciliation`
|
||||||
|
> Parent Layer: `Layer E`
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-promotion-and-board-reconciliation-policy-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Publish the promotion and reconciliation policy that governs how future stronger statuses may update official scene-state assets.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
Policy inputs only:
|
||||||
|
|
||||||
|
1. `auto-pass`
|
||||||
|
2. `fail-closed-known`
|
||||||
|
3. `adjudicated-valid-host-bridge`
|
||||||
|
4. hygiene-aware timeout interpretation
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. policy design and plan docs
|
||||||
|
2. policy JSON assets
|
||||||
|
3. policy reports
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. define promotion thresholds
|
||||||
|
2. define how timeout hygiene is represented
|
||||||
|
3. define how structured fail-closed progress is represented
|
||||||
|
4. define what evidence is sufficient for board reconciliation
|
||||||
|
5. publish policy assets
|
||||||
|
|
||||||
|
## Expected Coverage Delta
|
||||||
|
|
||||||
|
No direct scene-count delta is required.
|
||||||
|
|
||||||
|
The expected result is policy readiness for later rule-driven reconciliation.
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
1. promotion thresholds are explicit
|
||||||
|
2. timeout hygiene representation is explicit
|
||||||
|
3. board update rules are explicit
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the Route 6 policy is published.
|
||||||
|
|
||||||
|
Do not update the execution board under this plan.
|
||||||
|
|
||||||
@@ -0,0 +1,168 @@
|
|||||||
|
# Remaining Route Conflict Correction Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: `docs/superpowers/specs/2026-04-19-remaining-route-conflict-correction-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Adjudicate and, where evidence supports it, correct the remaining `4` route conflicts from the follow-up full sweep.
|
||||||
|
|
||||||
|
This is a bounded route-conflict plan, not a new full-sweep roadmap.
|
||||||
|
|
||||||
|
## Fixed Input
|
||||||
|
|
||||||
|
Use only the `4` `misclassified` records from:
|
||||||
|
|
||||||
|
`tests/fixtures/generated_scene/full_sweep_improvement_followup_2026-04-19.json`
|
||||||
|
|
||||||
|
The fixed scene set is:
|
||||||
|
|
||||||
|
1. `95598报修工单日管控`
|
||||||
|
2. `95598重要服务事项报备统计表`
|
||||||
|
3. `台区线损台区月度高负损预测`
|
||||||
|
4. `配网支撑月报(95598抢修统计报表)`
|
||||||
|
|
||||||
|
## Fixed Outputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/remaining_route_conflict_decisions_2026-04-19.json`
|
||||||
|
2. `docs/superpowers/reports/2026-04-19-remaining-route-conflict-correction-report.md`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not touch timeout handling
|
||||||
|
2. do not touch structured fail-closed reporting
|
||||||
|
3. do not add new families
|
||||||
|
4. do not update execution board
|
||||||
|
5. do not promote scenes
|
||||||
|
6. do not weaken current `G2/G3/G6` pass cases
|
||||||
|
7. do not force a scene into `G2/G3` if host bridge is the only complete path
|
||||||
|
|
||||||
|
## Phase 0: Freeze Conflict Set
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Freeze the `4` route conflicts as the only input.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. read `full_sweep_improvement_followup_2026-04-19.json`
|
||||||
|
2. select only `dryRunStatus = misclassified`
|
||||||
|
3. verify count is `4`
|
||||||
|
4. freeze expected group and inferred archetype for each record
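
The freeze step could look roughly like this. `dryRunStatus` and the value `misclassified` come from the plan; the other field names (`sceneName`, `expectedGroup`, `inferredArchetype`) and the flat list-of-records shape are assumptions about the follow-up JSON.

```python
def freeze_conflict_set(records, expected_count=4):
    """Keep only misclassified records; raise if the frozen count drifts."""
    conflicts = [r for r in records if r.get("dryRunStatus") == "misclassified"]
    if len(conflicts) != expected_count:
        raise ValueError(
            f"expected {expected_count} conflicts, found {len(conflicts)}"
        )
    # Freeze only the fields that later phases compare against.
    return [
        {
            "sceneName": r.get("sceneName"),
            "expectedGroup": r.get("expectedGroup"),
            "inferredArchetype": r.get("inferredArchetype"),
        }
        for r in conflicts
    ]
```

Raising on a count mismatch, rather than silently truncating, matches the acceptance criterion that exactly `4` records are in scope.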
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. frozen route conflict inventory
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. exactly `4` records are in scope
|
||||||
|
2. no extra scene is added
|
||||||
|
|
||||||
|
## Phase 1: Evidence Adjudication
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Decide whether each conflict should be corrected or retained as host bridge.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. inspect existing generation reports for the `4` records
|
||||||
|
2. compare business-chain evidence against host-bridge evidence
|
||||||
|
3. apply the route decision model:
   - `route-corrected-to-g3`
   - `route-corrected-to-g2`
   - `valid-host-bridge-workflow`
   - `board-expectation-stale`
   - `route-conflict-unresolved`
|
||||||
|
4. write preliminary decisions
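
Because the decision model is a closed set of five labels, a preliminary decision table can be checked before any code changes. A small sketch, assuming decisions are kept as a mapping from scene name to label; the function name and report shape are hypothetical.

```python
ROUTE_DECISIONS = {
    "route-corrected-to-g3",
    "route-corrected-to-g2",
    "valid-host-bridge-workflow",
    "board-expectation-stale",
    "route-conflict-unresolved",
}


def validate_decisions(decisions, scene_names):
    """Report scenes without a decision, unknown labels, and out-of-scope scenes."""
    missing = [name for name in scene_names if name not in decisions]
    unknown = {n: d for n, d in decisions.items() if d not in ROUTE_DECISIONS}
    extra = [name for name in decisions if name not in scene_names]
    return {"missing": missing, "unknown": unknown, "extra": extra}
```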
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. preliminary route conflict decision table
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `4` records have a preliminary decision
|
||||||
|
2. no code is changed before evidence is adjudicated
|
||||||
|
|
||||||
|
## Phase 2: Bounded Route Correction
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Apply only the route corrections justified by Phase 1.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. update analyzer routing precedence only if evidence supports correction
|
||||||
|
2. keep valid host-bridge cases unchanged
|
||||||
|
3. add targeted regression tests for corrected cases
|
||||||
|
4. preserve existing `G2/G3/G6` real-sample and canonical tests
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. bounded analyzer routing patch if needed
|
||||||
|
2. route conflict regression tests
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. corrected records no longer misclassify
|
||||||
|
2. valid host-bridge records remain host bridge
|
||||||
|
3. no broad routing rewrite is introduced
|
||||||
|
|
||||||
|
## Phase 3: Targeted Probe
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Verify only the fixed `4` records after correction.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. rerun generation for the same `4` scenes
|
||||||
|
2. record resulting archetype and readiness
|
||||||
|
3. classify each final decision
|
||||||
|
4. write final decision JSON
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `remaining_route_conflict_decisions_2026-04-19.json`
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `4` records have final probe results
|
||||||
|
2. no full `102` sweep is required by this plan
|
||||||
|
|
||||||
|
## Phase 4: Report and Stop
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Publish the route conflict report and stop.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. write the route conflict correction report
|
||||||
|
2. include final decisions for all `4` records
|
||||||
|
3. list verification commands
|
||||||
|
4. explicitly state that the execution board is not updated
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. route conflict correction report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `4` conflicts are adjudicated
|
||||||
|
2. tests pass
|
||||||
|
3. no execution board update is made
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. the fixed `4` route conflicts have final decisions
|
||||||
|
2. targeted probes have been run
|
||||||
|
3. relevant regressions pass
|
||||||
|
4. decision JSON and report are published
|
||||||
|
5. execution stops without opening another plan
|
||||||
|
|
||||||
@@ -0,0 +1,40 @@
|
|||||||
|
# Residual 13 Follow-Up Sweep And Reconciliation Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Parent Plan: `docs/superpowers/plans/2026-04-19-structured-fail-closed-residual-13-closure-plan.md`
|
||||||
|
> Parent Route: `Residual Route E`
|
||||||
|
> Parent Layer: `Layer E`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Measure the cumulative delta after residual Routes A through D complete.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
The fixed input bucket is the same `13` residual scenes from the parent residual closure plan.
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. residual follow-up JSON asset
|
||||||
|
2. residual reconciliation candidate JSON asset
|
||||||
|
3. residual follow-up report
|
||||||
|
4. residual reconciliation report
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. rerun the fixed `13` residual scenes
2. classify raw statuses
3. apply promotion policy
4. report remaining residual count
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after residual follow-up and reconciliation reports.
|
||||||
|
|
||||||
@@ -0,0 +1,49 @@
|
|||||||
|
# Residual Runtime Roadmap Prioritization Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Parent Layer: `Layer E`
|
||||||
|
> Status: Active
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Select the next roadmap from the three residual candidates (local-doc runtime, host-bridge runtime, and bootstrap target normalization) after official board reconciliation.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
2. `tests/fixtures/generated_scene/official_board_reconciliation_2026-04-19.json`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/residual_runtime_roadmap_prioritization_2026-04-19.json`
|
||||||
|
2. `docs/superpowers/reports/2026-04-19-residual-runtime-roadmap-prioritization-report.md`
|
||||||
|
3. selected next roadmap design
|
||||||
|
4. selected next roadmap plan
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
1. Load official board residual records.
|
||||||
|
2. Group residuals by next action.
|
||||||
|
3. Score local-doc runtime, host-bridge runtime, and bootstrap target normalization.
|
||||||
|
4. Select exactly one next roadmap.
|
||||||
|
5. Publish prioritization JSON.
|
||||||
|
6. Publish prioritization report.
|
||||||
|
7. Create design/plan for the selected roadmap only.
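
Tasks 2 through 4 amount to a group, score, and pick-one step. A hedged sketch follows; the `nextAction` and `sceneId` field names and the numeric scores are assumptions, since the plan does not specify how the three candidates are scored.

```python
from collections import defaultdict


def prioritize(residuals, scores):
    """Group residual scenes by next action and select one roadmap by score."""
    groups = defaultdict(list)
    for record in residuals:
        groups[record["nextAction"]].append(record["sceneId"])
    # Highest score wins; ties resolve alphabetically so the choice is stable.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return {
        "groups": dict(groups),
        "selected": ranked[0],
        "deferred": ranked[1:],
    }
```

The `deferred` list is where the per-roadmap deferral reasons required by the completion criteria would be attached.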
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
1. All `7` residual records are represented.
|
||||||
|
2. Exactly one selected roadmap exists.
|
||||||
|
3. Non-selected roadmaps are deferred with reasons.
|
||||||
|
4. No implementation file is modified.
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after prioritization assets and the selected next roadmap design/plan are published. Do not execute the selected roadmap under this plan.
|
||||||
@@ -0,0 +1,129 @@
|
|||||||
|
# Scene Skill 102 Final Materialization Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Parent Layer: final asset materialization before validation
|
||||||
|
> Status: Draft
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Generate and freeze a single canonical `102` skill package set for later static, mock, and production-like validation.
|
||||||
|
|
||||||
|
This plan answers whether all 102 scenes have materialized skill assets, not just framework auto-pass status.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
2. `tests/fixtures/generated_scene/scene_skill_102_framework_closure_rollup_2026-04-19.json`
|
||||||
|
3. scene source root: `D:/desk/智能体资料/全量业务场景/一平台场景`
|
||||||
|
|
||||||
|
## Output Root
|
||||||
|
|
||||||
|
`examples/scene_skill_102_final_materialization_2026-04-19`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `examples/scene_skill_102_final_materialization_2026-04-19/**`
|
||||||
|
2. `tests/fixtures/generated_scene/scene_skill_102_final_materialization_manifest_2026-04-19.json`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_skill_102_final_materialization_failures_2026-04-19.json`
|
||||||
|
4. `docs/superpowers/reports/2026-04-19-scene-skill-102-final-materialization-report.md`
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `src/generated_scene/ir.rs`
|
||||||
|
4. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
5. existing `examples/*` follow-up roots outside the output root
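
An allowed/forbidden file set like the one above can be enforced against a list of changed paths. This is a sketch under assumptions: the changed paths are taken to be repository-relative with forward slashes, and glob matching uses Python's `fnmatchcase`, where `*` also matches `/`.

```python
from fnmatch import fnmatchcase


def out_of_scope(changed_paths, allowed, forbidden):
    """Return (path, reason) pairs for changes the plan does not permit."""
    violations = []
    for path in changed_paths:
        if any(fnmatchcase(path, pattern) for pattern in forbidden):
            violations.append((path, "forbidden"))
        elif not any(fnmatchcase(path, pattern) for pattern in allowed):
            violations.append((path, "not allowed"))
    return violations
```

Forbidden patterns are checked first, so a path that matches both sets is still reported.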
|
||||||
|
|
||||||
|
## Phase 0: Freeze Materialization Boundary
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Confirm framework rollup is `102 / 102`.
|
||||||
|
2. Confirm materialization does not delete existing `examples/*`.
|
||||||
|
3. Confirm this plan does not perform static/mock/production validation.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Scope is materialization only.
|
||||||
|
2. Output root is isolated.
|
||||||
|
|
||||||
|
## Phase 1: Build Materialization Input Manifest
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Load official board or fallback source-list assets.
|
||||||
|
2. Produce exactly 102 materialization input rows.
|
||||||
|
3. Validate unique scene ids.
|
||||||
|
4. Resolve source directory for each scene.
|
||||||
|
5. Sanitize manifest-only string fields for control characters.
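
The row-count, uniqueness, and sanitization checks could be sketched as below. The keys `sceneId`, `sceneName`, and `sourceDir` mirror the placeholders in the generation command; treating "sanitize" as stripping ASCII control characters is an assumption, and the on-disk existence check for each source directory is left out.

```python
import re

# ASCII control characters, including tab and newline (assumed scope).
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def sanitize(value):
    """Strip control characters from string fields; leave other types alone."""
    return CONTROL_CHARS.sub("", value) if isinstance(value, str) else value


def validate_manifest(rows, expected_rows=102):
    """Check row count, duplicate ids, and empty source dirs; return cleaned rows."""
    ids = [row["sceneId"] for row in rows]
    duplicates = sorted({i for i in ids if ids.count(i) > 1})
    missing_source = [row["sceneId"] for row in rows if not row.get("sourceDir")]
    cleaned = [{key: sanitize(value) for key, value in row.items()} for row in rows]
    return {
        "rowCountOk": len(rows) == expected_rows,
        "duplicates": duplicates,
        "missingSource": missing_source,
        "rows": cleaned,
    }
```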
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Input manifest has 102 rows.
|
||||||
|
2. No missing source directory remains.
|
||||||
|
3. No duplicate scene id remains.
|
||||||
|
|
||||||
|
## Phase 2: Generate 102 Skill Packages
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
For each manifest row, run:
|
||||||
|
|
||||||
|
```powershell
cargo run --bin sg_scene_generate -- `
  --source-dir "<sourceDir>" `
  --scene-id "<sceneId>" `
  --scene-name "<sceneName>" `
  --scene-kind report_collection `
  --output-root "D:/data/ideaSpace/rust/sgClaw/claw-new/examples/scene_skill_102_final_materialization_2026-04-19"
```
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Every row is attempted.
|
||||||
|
2. No single scene failure stops the full batch.
|
||||||
|
3. stdout/stderr/result status are captured.
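
A batch driver that satisfies these criteria might look like the following. It is a sketch, not the project's tooling: the command-line flags are copied from the command above, while the row keys, the relative output root, and the result record shape are assumptions.

```python
import subprocess

OUTPUT_ROOT = "examples/scene_skill_102_final_materialization_2026-04-19"


def build_command(row, output_root=OUTPUT_ROOT):
    """Build the sg_scene_generate invocation for one manifest row."""
    return [
        "cargo", "run", "--bin", "sg_scene_generate", "--",
        "--source-dir", row["sourceDir"],
        "--scene-id", row["sceneId"],
        "--scene-name", row["sceneName"],
        "--scene-kind", "report_collection",
        "--output-root", output_root,
    ]


def run_batch(rows, runner=subprocess.run):
    """Attempt every row; record the outcome instead of stopping on failure."""
    results = []
    for row in rows:
        try:
            done = runner(build_command(row), capture_output=True, text=True)
            results.append({
                "sceneId": row["sceneId"],
                "exitCode": done.returncode,
                "stdout": done.stdout,
                "stderr": done.stderr,
            })
        except OSError as error:
            # A launch failure is recorded and the batch continues.
            results.append({
                "sceneId": row["sceneId"],
                "exitCode": None,
                "stdout": "",
                "stderr": str(error),
            })
    return results
```

A non-zero exit code is not raised as an exception here, so one failing scene never aborts the remaining rows.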
|
||||||
|
|
||||||
|
## Phase 3: Verify Materialized Package Presence
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
For each scene, check:
|
||||||
|
|
||||||
|
1. `SKILL.toml`
|
||||||
|
2. `SKILL.md`
|
||||||
|
3. `scene.toml`
|
||||||
|
4. `references/generation-report.json`
|
||||||
|
5. at least one script under `scripts/`
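
The presence check for one package directory could be written as below. The file names are the ones listed above; the function name and the `scripts/*` marker for a missing script are illustrative.

```python
from pathlib import Path

REQUIRED_FILES = [
    "SKILL.toml",
    "SKILL.md",
    "scene.toml",
    "references/generation-report.json",
]


def missing_package_files(package_dir):
    """Return the required entries that are absent from a skill package."""
    root = Path(package_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    scripts = root / "scripts"
    has_script = scripts.is_dir() and any(p.is_file() for p in scripts.iterdir())
    if not has_script:
        missing.append("scripts/*")
    return missing
```

An empty return value means the row can count as a success; anything else belongs in the failures asset.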
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. All successful rows have required files.
|
||||||
|
2. Failures are explicit in the failures asset.
|
||||||
|
|
||||||
|
## Phase 4: Publish Manifest And Report
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Publish final materialization manifest.
|
||||||
|
2. Publish final materialization failures.
|
||||||
|
3. Publish superpowers report.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Manifest row count is 102.
|
||||||
|
2. Report states generated count, failure count, readiness distribution, and next validation input.
|
||||||
|
3. The report explicitly states that old `examples/*` roots were not cleaned.
|
||||||
|
|
||||||
|
## Expected Delta
|
||||||
|
|
||||||
|
No framework coverage delta. Expected asset delta is:
|
||||||
|
|
||||||
|
1. `102` canonical final skill package rows
2. one stable manifest for later validation
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after final materialization manifest, failures asset, and report are published. Do not start static, mock, or production validation under this plan.
|
||||||
@@ -0,0 +1,132 @@
|
|||||||
|
# Scene Skill 102 Full Coverage Child Plan Sequence Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-scene-skill-102-full-coverage-child-plan-sequence-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Create the full bounded child-plan sequence for `Route 2` through `Route 6` under the `102` full-coverage parent framework.
|
||||||
|
|
||||||
|
This plan only creates the downstream plan tree. It does not implement any bucket directly.
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not modify `analyzer.rs`
|
||||||
|
2. do not modify `generator.rs`
|
||||||
|
3. do not modify `ir.rs`
|
||||||
|
4. do not update `scene_execution_board_2026-04-18.json`
|
||||||
|
5. do not rerun `102` sweep
|
||||||
|
6. do not open new families
|
||||||
|
7. do not collapse multiple buckets into one child implementation plan
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Route 2 child plans
|
||||||
|
2. `WS2` Route 3 child plans
|
||||||
|
3. `WS3` Route 4 child plans
|
||||||
|
4. `WS4` Route 5 child plans
|
||||||
|
5. `WS5` Route 6 child plans
|
||||||
|
|
||||||
|
## Phase 0: Freeze Sequence Inputs
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Freeze the parent baseline and route order before generating child plans.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze parent framework references
|
||||||
|
2. freeze current bucket sizes
|
||||||
|
3. freeze route order from Route 2 through Route 6
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. child-plan sequence design
|
||||||
|
2. child-plan sequence plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all later child plans can reference the same parent baseline
|
||||||
|
2. route order is explicit and cannot drift
|
||||||
|
|
||||||
|
## Phase 1: Route 2 Child Plans
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Create the first three bounded child plans under the largest remaining mainline bucket.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. create `G3 enrichment-request closure` design and plan
|
||||||
|
2. create `G3 export-plan closure` design and plan
|
||||||
|
3. create `G3 residual contract closure` design and plan
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. Route 2 child designs
|
||||||
|
2. Route 2 child plans
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. each Route 2 child plan owns a narrower fixed bucket
|
||||||
|
2. Route 2 plans declare allowed and forbidden file sets
|
||||||
|
3. Route 2 plans declare expected deltas separately
|
||||||
|
|
||||||
|
## Phase 2: Route 3 and Route 4 Child Plans
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Create the bounded plans for the smaller remaining mainline buckets.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. create `G2 remaining fail-closed closure` design and plan
|
||||||
|
2. create `G1-E remaining fail-closed closure` design and plan
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. Route 3 child design and plan
|
||||||
|
2. Route 4 child design and plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Route 3 and Route 4 remain downstream of Route 2
|
||||||
|
2. neither plan absorbs Route 2 issues
|
||||||
|
|
||||||
|
## Phase 3: Route 5 and Route 6 Child Plans
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Create the policy and decision plans that follow mainline contract-recovery work.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. create `boundary fail-closed decision` design and plan
|
||||||
|
2. create `promotion and board reconciliation policy` design and plan
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. Route 5 child design and plan
|
||||||
|
2. Route 6 child design and plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Route 5 is decision-first, not implementation-first
|
||||||
|
2. Route 6 is policy-only
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. Route 2 through Route 6 all have bounded child designs and plans
|
||||||
|
2. every child plan declares parent route, parent layer, input bucket, allowed files, forbidden files, expected delta, and stop statement
|
||||||
|
3. later work can proceed without inventing new unanchored micro-plans
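
Criterion 2 can be checked mechanically against a child plan's Markdown. A rough sketch, assuming declarations appear either as headings (`#`) or as metadata quote lines (`>`), as they do in the plans in this change set; the label list and substring matching are simplifications.

```python
REQUIRED_DECLARATIONS = [
    "Parent Route",
    "Parent Layer",
    "Fixed Input",
    "Allowed Files",
    "Forbidden Files",
    "Delta",
    "Stop Statement",
]


def missing_declarations(plan_markdown):
    """List the required declarations a child plan does not make."""
    # Only headings ("#") and metadata quotes (">") count as declarations.
    declared = " ".join(
        line.lower()
        for line in plan_markdown.splitlines()
        if line.lstrip().startswith(("#", ">"))
    )
    return [label for label in REQUIRED_DECLARATIONS if label.lower() not in declared]
```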
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the bounded child-plan sequence for Route 2 through Route 6 has been created.
|
||||||
|
|
||||||
|
Do not implement any route from this sequence under this plan.
|
||||||
|
|
||||||
@@ -0,0 +1,298 @@
|
|||||||
|
# Scene Skill 102 Full Coverage Framework Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-scene-skill-102-full-coverage-framework-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Turn the current sgClaw post-roadmap work into a single controlled framework for driving the `102` scene set toward full bounded coverage.
|
||||||
|
|
||||||
|
This plan is the parent roadmap for all later bounded plans. Future bounded plans must fit inside one of the routes defined here.
|
||||||
|
|
||||||
|
## Current Baseline
|
||||||
|
|
||||||
|
Current integrated baseline:
|
||||||
|
|
||||||
|
| Status | Count |
| --- | ---: |
| `auto-pass` | 48 |
| `fail-closed-known` | 47 |
| `adjudicated-valid-host-bridge` | 4 |
| raw `source-unreadable` | 3 |
| Total | 102 |
|
||||||
|
|
||||||
|
Timeout hygiene overlay:
|
||||||
|
|
||||||
|
| Hygiene interpretation | Count |
| --- | ---: |
| `timeout-as-pass-candidate` | 2 |
| `timeout-as-fail-closed-candidate` | 1 |
| `timeout-still-unreadable` | 0 |
| `timeout-rerun-error` | 0 |
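
The two tables can be cross-checked with simple arithmetic. The sketch assumes the hygiene overlay re-interprets the `3` raw `source-unreadable` records, which the matching totals suggest but the plan does not state outright.

```python
baseline = {
    "auto-pass": 48,
    "fail-closed-known": 47,
    "adjudicated-valid-host-bridge": 4,
    "source-unreadable": 3,
}
hygiene = {
    "timeout-as-pass-candidate": 2,
    "timeout-as-fail-closed-candidate": 1,
    "timeout-still-unreadable": 0,
    "timeout-rerun-error": 0,
}

total = sum(baseline.values())
overlay_total = sum(hygiene.values())
# Assumed invariant: each raw source-unreadable scene carries exactly one
# hygiene interpretation, so the overlay total equals that bucket.
overlay_consistent = overlay_total == baseline["source-unreadable"]
```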
|
||||||
|
|
||||||
|
## Overall Goal
|
||||||
|
|
||||||
|
The overall goal is:
|
||||||
|
|
||||||
|
`100% bounded framework coverage for the current 102 scene set`
|
||||||
|
|
||||||
|
This means:
|
||||||
|
|
||||||
|
1. every scene is covered by a supported framework path
|
||||||
|
2. every non-pass scene has a structured and named reason
|
||||||
|
3. no unresolved timeout, unsupported-family, or route-conflict bucket remains
|
||||||
|
|
||||||
|
It does not require `100% auto-pass`.
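
The three conditions above can be read as a predicate over per-scene records. A hedged sketch: the record keys (`frameworkPath`, `status`, `reason`) and the unresolved status labels are assumptions chosen for illustration.

```python
# Buckets that must be empty under condition 3 (labels assumed).
UNRESOLVED = {"timeout", "unsupported-family", "route-conflict"}


def is_bounded_coverage(scenes):
    """True when every scene meets the three bounded-coverage conditions."""
    for scene in scenes:
        if not scene.get("frameworkPath"):
            return False  # condition 1: no supported framework path
        if scene.get("status") != "auto-pass" and not scene.get("reason"):
            return False  # condition 2: non-pass scene without a named reason
        if scene.get("status") in UNRESOLVED:
            return False  # condition 3: unresolved bucket still populated
    return True
```

A set made up entirely of `fail-closed-known` scenes with named reasons still passes, which is the sense in which `100% auto-pass` is not required.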
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not start `G4/G5`
|
||||||
|
2. do not add new families unless this parent framework is updated first
|
||||||
|
3. do not treat diagnostics as promotions
|
||||||
|
4. do not update `scene_execution_board_2026-04-18.json` inside diagnostic or bounded recovery plans
|
||||||
|
5. do not mix timeout policy work with contract recovery work in the same bounded implementation plan
|
||||||
|
6. do not create semantics-only micro-plans that are not tied to one of the routes below
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Coverage and Reporting Integrity
|
||||||
|
2. `WS2` Mainline Contract Closure
|
||||||
|
3. `WS3` Boundary Bucket Handling
|
||||||
|
4. `WS4` Promotion and Board Reconciliation
|
||||||
|
|
||||||
|
## Phase 0: Freeze the Parent Framework
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Make this plan the single parent framework for the next improvement cycle.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze the current integrated baseline
|
||||||
|
2. freeze the five framework layers
|
||||||
|
3. freeze the route order
|
||||||
|
4. forbid out-of-framework micro-plan drift
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. parent framework design
|
||||||
|
2. parent framework plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. future bounded plans can be mapped to one framework layer
|
||||||
|
2. future bounded plans can be mapped to one route
|
||||||
|
|
||||||
|
## Phase 1: Close Reporting Integrity
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Finish the reporting-side work so the `102` scene set is measured correctly before further implementation.
|
||||||
|
|
||||||
|
### Route
|
||||||
|
|
||||||
|
`Route 1: Layer E hygiene integration`
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. preserve raw timeout counts
|
||||||
|
2. preserve hygiene-aware timeout interpretation
|
||||||
|
3. preserve route adjudication
|
||||||
|
4. preserve structured fail-closed buckets
|
||||||
|
5. produce reconciliation-friendly current-state reporting
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. timeout hygiene integration assets
|
||||||
|
2. reconciliation-friendly integrated reporting
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no unresolved timeout interpretation remains
|
||||||
|
2. no unresolved route conflict remains
|
||||||
|
|
||||||
|
## Phase 2: Mainline G3 Contract Closure
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Reduce the largest remaining fail-closed bucket in a controlled way.
|
||||||
|
|
||||||
|
### Route
|
||||||
|
|
||||||
|
`Route 2: G3 / paginated_enrichment`
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze the current `G3` fail-closed subgrouping
|
||||||
|
2. select the top repeated recoverable pattern
|
||||||
|
3. implement bounded contract recovery
|
||||||
|
4. rerun only the bounded validation needed by that slice
|
||||||
|
5. measure delta against the parent baseline
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. bounded G3 implementation plan(s)
|
||||||
|
2. bounded G3 implementation report(s)
|
||||||
|
3. updated coverage delta assets
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no scene-name hardcoding
|
||||||
|
2. no gate relaxation
|
||||||
|
3. canonical `G3` and real-sample `G3` remain stable
|
||||||
|
|
||||||
|
## Phase 3: Mainline G2 Closure
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Reduce the remaining `multi_mode_request` fail-closed bucket.
|
||||||
|
|
||||||
|
### Route
|
||||||
|
|
||||||
|
`Route 3: G2 / multi_mode_request`
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze the current `4` G2 fail-closed records
|
||||||
|
2. identify the common missing contract
|
||||||
|
3. implement one bounded G2 correction slice
|
||||||
|
4. rerun bounded validation
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. bounded G2 implementation plan(s)
|
||||||
|
2. bounded G2 implementation report(s)
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. real-sample `G2` pass remains stable
|
||||||
|
2. no route drift into host-bridge or other families
|
||||||
|
|
||||||
|
## Phase 4: Mainline G1-E Closure
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Reduce the remaining `single_request_enrichment` fail-closed bucket.
|
||||||
|
|
||||||
|
### Route
|
||||||
|
|
||||||
|
`Route 4: G1-E / single_request_enrichment`
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze the current `2` G1-E fail-closed records
|
||||||
|
2. identify the common missing contract
|
||||||
|
3. implement one bounded G1-E correction slice
|
||||||
|
4. rerun bounded validation
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. bounded G1-E implementation plan(s)
|
||||||
|
2. bounded G1-E implementation report(s)
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. real-sample `G1-E` pass remains stable
|
||||||
|
2. no route drift into host-bridge or page-state families
|
||||||
|
|
||||||
|
## Phase 5: Boundary Buckets After Mainline
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Touch boundary-family fail-closed buckets only after the mainline buckets have been reduced or explicitly deferred.
|
||||||
|
|
||||||
|
### Route
|
||||||
|
|
||||||
|
`Route 5: local_doc_pipeline and host_bridge_workflow remaining fail-closed`
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. inspect the `5` local-doc records
|
||||||
|
2. inspect the `1` host-bridge fail-closed record
|
||||||
|
3. decide whether to defer or open one bounded boundary correction slice
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. boundary bucket decision report
|
||||||
|
2. optional bounded boundary plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no boundary slice starts before mainline routes are resolved or deferred
|
||||||
|
|
||||||
|
## Phase 6: Promotion and Board Policy
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Define how stronger framework-resolved statuses may flow back into official scene status assets.
|
||||||
|
|
||||||
|
### Route
|
||||||
|
|
||||||
|
`Route 6: promotion and board reconciliation`
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. define promotion thresholds
|
||||||
|
2. define how hygiene-aware timeout results are represented
|
||||||
|
3. define how structured fail-closed progress is represented
|
||||||
|
4. define what can and cannot update the execution board
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. promotion policy design
|
||||||
|
2. execution-board reconciliation plan
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. diagnostics remain distinct from promotion
|
||||||
|
2. execution board updates become rule-driven instead of ad hoc
|
||||||
|
|
||||||
|
## Route Order
|
||||||
|
|
||||||
|
The route order is fixed:
|
||||||
|
|
||||||
|
1. finish reporting integrity
|
||||||
|
2. reduce `G3` fail-closed bucket
|
||||||
|
3. reduce `G2` fail-closed bucket
|
||||||
|
4. reduce `G1-E` fail-closed bucket
|
||||||
|
5. inspect boundary fail-closed buckets
|
||||||
|
6. define promotion and board reconciliation policy
|
||||||
|
|
||||||
|
No bounded plan may skip ahead in this order unless this parent plan is revised.
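The fixed route order can be sketched as a simple gate check. This is only an illustration: the `may_start` helper and the `closed` set are hypothetical names, and treating "explicitly deferred" as equivalent to "resolved" generalizes the wording of Phase 5 rather than quoting a rule stated for every route.

```python
# Route identifiers follow the "Route N" names used in the phases above.
ROUTE_ORDER = [
    "Route 1",  # reporting integrity
    "Route 2",  # G3 / paginated_enrichment
    "Route 3",  # G2 / multi_mode_request
    "Route 4",  # G1-E / single_request_enrichment
    "Route 5",  # boundary fail-closed buckets
    "Route 6",  # promotion and board reconciliation
]


def may_start(route: str, closed: set) -> bool:
    """Return True when every earlier route is in `closed`.

    `closed` is a hypothetical set of routes that are resolved or
    explicitly deferred (an assumption, see the lead-in).
    """
    index = ROUTE_ORDER.index(route)
    return all(earlier in closed for earlier in ROUTE_ORDER[:index])
```

A bounded plan for `Route 3` would then be rejected while `Route 2` is still open, unless the parent plan itself is revised.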
|
||||||
|
|
||||||
|
## Required Contents for Future Bounded Plans
|
||||||
|
|
||||||
|
Every future bounded plan must include:
|
||||||
|
|
||||||
|
1. parent route reference
|
||||||
|
2. parent framework layer
|
||||||
|
3. fixed input bucket
|
||||||
|
4. exact files allowed to change
|
||||||
|
5. files that must not change
|
||||||
|
6. expected coverage delta
|
||||||
|
7. stop statement
|
||||||
|
|
||||||
|
If one of these is missing, the bounded plan is not valid under this framework.
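The seven required contents lend themselves to a mechanical check. The sketch below assumes a bounded plan has been parsed into a dictionary; every key name is hypothetical, since the plan lists the required contents only in prose.

```python
# Hypothetical keys, one per required item in the list above.
REQUIRED_FIELDS = (
    "parent_route",
    "parent_layer",
    "fixed_input_bucket",
    "files_allowed_to_change",
    "files_forbidden_to_change",
    "expected_coverage_delta",
    "stop_statement",
)

_EMPTY = (None, "", [], {})


def missing_fields(plan: dict) -> list:
    """Return the required fields that are absent or empty in a bounded plan."""
    return [
        name
        for name in REQUIRED_FIELDS
        if name not in plan or plan[name] in _EMPTY
    ]


def is_valid_bounded_plan(plan: dict) -> bool:
    """A plan is valid under the framework only if nothing is missing."""
    return not missing_fields(plan)
```

An expected coverage delta of `0` is deliberately accepted as present, because a decision-only plan may legitimately change nothing.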
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This parent framework remains active until all of the following are true:
|
||||||
|
|
||||||
|
1. `unsupported-family = 0`
|
||||||
|
2. `missing-source = 0`
|
||||||
|
3. `misclassified-unresolved = 0`
|
||||||
|
4. `timeout-still-unreadable = 0`
|
||||||
|
5. every remaining non-pass scene is one of:
   - structured fail-closed
   - adjudicated valid host-bridge
   - policy-recognized timeout rerun hygiene result
|
||||||
|
6. board reconciliation policy exists
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
This is a parent framework plan.
|
||||||
|
|
||||||
|
Do not implement code directly from this plan.
|
||||||
|
|
||||||
|
All implementation must happen through later bounded plans that explicitly declare which route and which layer they belong to.
|
||||||
@@ -0,0 +1,263 @@
|
|||||||
|
# Structured Fail-Closed Improvement Roadmap Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Spec: `docs/superpowers/specs/2026-04-19-structured-fail-closed-improvement-roadmap-design.md`
|
||||||
|
> Upstream Reconciliation: `tests/fixtures/generated_scene/full_sweep_status_reconciliation_2026-04-19.json`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Coordinate the next improvement cycle for the `48` structured fail-closed records from the reconciled `102` sweep.
|
||||||
|
|
||||||
|
This is a roadmap-level plan. It intentionally starts with inventory and gap taxonomy before any implementation correction.
|
||||||
|
|
||||||
|
## Baseline
|
||||||
|
|
||||||
|
Current reconciled `102` status:
|
||||||
|
|
||||||
|
| Status | Count |
| --- | ---: |
| `auto-pass` | 48 |
| `fail-closed-known` | 48 |
| `adjudicated-valid-host-bridge` | 4 |
| `source-unreadable` | 2 |
|
||||||
|
|
||||||
|
Fail-closed distribution:
|
||||||
|
|
||||||
|
| Inferred archetype | Count |
| --- | ---: |
| `paginated_enrichment` | 35 |
| `local_doc_pipeline` | 5 |
| `multi_mode_request` | 4 |
| `single_request_enrichment` | 2 |
| `host_bridge_workflow` | 1 |
| `page_state_eval` | 1 |
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not add new scene families
|
||||||
|
2. do not start `G4/G5`
|
||||||
|
3. do not implement login recovery
|
||||||
|
4. do not implement full host runtime transport
|
||||||
|
5. do not implement local document attachment runtime
|
||||||
|
6. do not update `scene_execution_board_2026-04-18.json`
|
||||||
|
7. do not promote scenes directly from dry-run or follow-up results
|
||||||
|
8. do not reopen `adjudicated-valid-host-bridge` records
|
||||||
|
9. do not handle the `2` timeout records in this roadmap
|
||||||
|
10. do not loosen readiness gates to increase pass count
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Fail-Closed Inventory and Gap Taxonomy
|
||||||
|
2. `WS2` G3 Paginated Enrichment Recovery
|
||||||
|
3. `WS3` Small-Bucket Recovery
|
||||||
|
4. `WS4` Bootstrap Isolation
|
||||||
|
5. `WS5` Follow-Up Sweep and Reporting
|
||||||
|
|
||||||
|
## Phase 0: Freeze Structured Fail-Closed Baseline
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Freeze the `48` fail-closed records as the only implementation-analysis input.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. read `full_sweep_status_reconciliation_2026-04-19.json`
|
||||||
|
2. verify total scene count is `102`
|
||||||
|
3. verify `fail-closed-known = 48`
|
||||||
|
4. verify `adjudicated-valid-host-bridge = 4`
|
||||||
|
5. verify `source-unreadable = 2`
|
||||||
|
6. extract only records with `reconciledStatus = fail-closed-known`
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. frozen fail-closed input list
|
||||||
|
2. baseline validation summary
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. exactly `48` records enter this roadmap
|
||||||
|
2. route-adjudicated records are excluded
|
||||||
|
3. timeout records are excluded
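The Phase 0 checks could look roughly like the following. The sketch assumes the reconciliation JSON has already been loaded into a list of record dictionaries; only the `reconciledStatus` field and the counts come from this plan, and the `freeze_fail_closed` name is hypothetical.

```python
from collections import Counter

# Counts taken from the Baseline table of this plan.
EXPECTED_COUNTS = {
    "auto-pass": 48,
    "fail-closed-known": 48,
    "adjudicated-valid-host-bridge": 4,
    "source-unreadable": 2,
}


def freeze_fail_closed(records: list) -> list:
    """Verify the reconciled baseline and return only fail-closed records."""
    if len(records) != 102:
        raise ValueError(f"expected 102 scenes, found {len(records)}")
    counts = Counter(r["reconciledStatus"] for r in records)
    if dict(counts) != EXPECTED_COUNTS:
        raise ValueError(f"unexpected status distribution: {dict(counts)}")
    # Route-adjudicated and timeout records are excluded by this filter.
    return [r for r in records if r["reconciledStatus"] == "fail-closed-known"]
```

Raising on any mismatch keeps the roadmap from starting on a baseline that silently differs from the reconciled one.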
|
||||||
|
|
||||||
|
## Phase 1: Build Fail-Closed Inventory and Gap Taxonomy
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Split the `48` records into actionable missing-contract buckets.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. inspect each fail-closed record
|
||||||
|
2. assign exactly one primary missing-contract label:
   - `main_request_missing`
   - `pagination_plan_missing`
   - `enrichment_request_missing`
   - `join_key_missing`
   - `export_plan_missing`
   - `mode_matrix_missing`
   - `mode_request_contract_missing`
   - `single_request_enrichment_contract_missing`
   - `host_bridge_contract_missing`
   - `local_doc_contract_missing`
   - `bootstrap_target_unresolved`
   - `mixed_or_ambiguous_contract_gap`
|
||||||
|
3. attach secondary labels when useful
|
||||||
|
4. group by inferred archetype and primary label
|
||||||
|
5. identify top repeated recoverable patterns
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/structured_fail_closed_inventory_2026-04-19.json`
|
||||||
|
2. `docs/superpowers/reports/2026-04-19-structured-fail-closed-inventory-report.md`
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `48` records have exactly one primary label
|
||||||
|
2. the `35` `paginated_enrichment` records are explicitly split
|
||||||
|
3. no implementation is performed in this phase
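The labeling and grouping tasks of this phase can be sketched as below. The label set is copied from the task list; the record keys `primaryLabel` and `inferredArchetype` are hypothetical, because the inventory schema is not defined in this plan.

```python
from collections import Counter

# Primary missing-contract labels listed in the Phase 1 tasks.
PRIMARY_LABELS = frozenset({
    "main_request_missing",
    "pagination_plan_missing",
    "enrichment_request_missing",
    "join_key_missing",
    "export_plan_missing",
    "mode_matrix_missing",
    "mode_request_contract_missing",
    "single_request_enrichment_contract_missing",
    "host_bridge_contract_missing",
    "local_doc_contract_missing",
    "bootstrap_target_unresolved",
    "mixed_or_ambiguous_contract_gap",
})


def group_inventory(records: list) -> Counter:
    """Count records per (archetype, primary label), rejecting unknown labels."""
    grouped = Counter()
    for record in records:
        label = record["primaryLabel"]  # hypothetical key
        if label not in PRIMARY_LABELS:
            raise ValueError(f"unknown primary label: {label!r}")
        grouped[(record["inferredArchetype"], label)] += 1  # hypothetical key
    return grouped
```

Storing the primary label as a single string field means "exactly one primary label" holds by construction, and `grouped.most_common()` then lists the most frequently repeated patterns first.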
|
||||||
|
|
||||||
|
## Phase 2: G3 Paginated Enrichment Recovery Slice
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Improve the largest bucket only when Phase 1 identifies repeated recoverable G3 patterns.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. select only `paginated_enrichment` records from the inventory
|
||||||
|
2. prioritize repeated primary labels in this order:
   - `main_request_missing`
   - `pagination_plan_missing`
   - `enrichment_request_missing`
   - `join_key_missing`
   - `export_plan_missing`
|
||||||
|
3. define bounded recovery rules for the top repeated pattern
|
||||||
|
4. implement only traceable evidence recovery
|
||||||
|
5. add regression tests for the recovered pattern
|
||||||
|
6. preserve canonical `G3` and real-sample `G3` pass
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. G3 recovery implementation if evidence supports it
|
||||||
|
2. regression tests for the recovered pattern
|
||||||
|
3. G3 recovery report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. no scene-name hardcoding
|
||||||
|
2. no gate relaxation
|
||||||
|
3. recovered fields are traceable to source evidence
|
||||||
|
4. existing `G3` canonical and real-sample tests pass
|
||||||
|
|
||||||
|
## Phase 3: Small-Bucket Recovery Slice
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Handle smaller buckets only after the G3 slice is complete or explicitly deferred.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. inspect `local_doc_pipeline = 5`
|
||||||
|
2. inspect `multi_mode_request = 4`
|
||||||
|
3. inspect `single_request_enrichment = 2`
|
||||||
|
4. inspect `host_bridge_workflow = 1`
|
||||||
|
5. choose at most one bounded non-G3 recovery slice
|
||||||
|
6. preserve existing real-sample passes for `G1-E`, `G2`, `G6`, `G7`
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. small-bucket recovery decision report
|
||||||
|
2. optional bounded implementation and tests
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. only one small-bucket slice is implemented in this roadmap
|
||||||
|
2. no `G8` attachment/local document runtime is started
|
||||||
|
3. no full host runtime transport is started
|
||||||
|
|
||||||
|
## Phase 4: Bootstrap Target Isolation
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Keep the single `page_state_eval + bootstrap_target` record separate.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. identify the bootstrap target record
|
||||||
|
2. preserve it as a separate future input
|
||||||
|
3. do not implement login recovery
|
||||||
|
4. produce bootstrap isolation note
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. bootstrap isolation note
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. bootstrap target does not pollute G3 or small-bucket recovery
|
||||||
|
2. no login or bootstrap auto-recovery is implemented
|
||||||
|
|
||||||
|
## Phase 5: Follow-Up Sweep and Coverage Delta
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Measure the impact of bounded recovery work.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. rerun the fixed `102` scene sweep
|
||||||
|
2. produce a new follow-up result
|
||||||
|
3. compare against the reconciled baseline:
   - auto-pass delta
   - fail-closed-known delta
   - actionable coverage delta
   - timeout count
   - adjudicated host-bridge count
|
||||||
|
4. publish coverage delta report
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
|
||||||
|
2. `docs/superpowers/reports/2026-04-19-structured-fail-closed-improvement-coverage-delta-report.md`
|
||||||
|
3. `docs/superpowers/reports/2026-04-19-structured-fail-closed-improvement-roadmap-closure-report.md`
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. scene set remains exactly `102`
|
||||||
|
2. improvements are measured, not assumed
|
||||||
|
3. execution board remains unchanged
|
||||||
|
4. fail-closed count only drops when contracts close or become more specifically isolated
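The per-status comparison against the reconciled baseline can be sketched as follows. The `coverage_delta` helper is hypothetical, and "actionable coverage" is left out because this plan does not define how it is computed.

```python
def coverage_delta(baseline: dict, followup: dict) -> dict:
    """Return follow-up minus baseline for every status seen in either sweep.

    Both inputs map a status name to a scene count. The scene set must
    remain exactly 102 in both sweeps, per the acceptance criteria.
    """
    for name, counts in (("baseline", baseline), ("followup", followup)):
        total = sum(counts.values())
        if total != 102:
            raise ValueError(f"{name} covers {total} scenes, expected 102")
    statuses = sorted(set(baseline) | set(followup))
    return {s: followup.get(s, 0) - baseline.get(s, 0) for s in statuses}
```

Because both totals are pinned to `102`, the deltas always sum to zero, which gives a cheap sanity check on the delta report.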
|
||||||
|
|
||||||
|
## Milestone Order
|
||||||
|
|
||||||
|
The order is fixed:
|
||||||
|
|
||||||
|
1. Phase 0: freeze fail-closed baseline
|
||||||
|
2. Phase 1: build inventory and taxonomy
|
||||||
|
3. Phase 2: G3 recovery slice
|
||||||
|
4. Phase 3: small-bucket recovery slice
|
||||||
|
5. Phase 4: bootstrap target isolation
|
||||||
|
6. Phase 5: follow-up sweep and delta
|
||||||
|
|
||||||
|
Do not start implementation before Phase 1 is complete.
|
||||||
|
|
||||||
|
Do not start small-bucket recovery before the G3 slice is completed or explicitly deferred with reasons.
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This roadmap is complete when:
|
||||||
|
|
||||||
|
1. all `48` structured fail-closed records are inventoried and labeled
|
||||||
|
2. the `35` G3 records are split into actionable contract-gap groups
|
||||||
|
3. at least the highest-value repeated recoverable pattern is either implemented or explicitly deferred
|
||||||
|
4. small buckets are inspected and at most one bounded slice is implemented
|
||||||
|
5. the bootstrap target remains isolated
|
||||||
|
6. a follow-up sweep quantifies coverage delta
|
||||||
|
7. no new family is introduced
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the follow-up sweep, delta report, and closure report.
|
||||||
|
|
||||||
|
Do not automatically update the execution board or start another roadmap inside this plan.
|
||||||
@@ -0,0 +1,150 @@
|
|||||||
|
# Structured Fail-Closed Residual 13 Closure Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Parent Framework: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-structured-fail-closed-residual-13-closure-design.md`
|
||||||
|
> Fixed Input: `tests/fixtures/generated_scene/full_coverage_reconciliation_candidates_2026-04-19.json`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Turn the remaining `13` `framework-structured-fail-closed` scenes into a controlled residual closure sequence.
|
||||||
|
|
||||||
|
This plan is a coordinator plan. It does not directly implement code. Implementation must happen only in bounded child plans declared below.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
The fixed input bucket is the `13` scenes with:
|
||||||
|
|
||||||
|
`reconciliationCandidateStatus = framework-structured-fail-closed`
|
||||||
|
|
||||||
|
from:
|
||||||
|
|
||||||
|
`tests/fixtures/generated_scene/full_coverage_reconciliation_candidates_2026-04-19.json`
|
||||||
|
|
||||||
|
## Residual Routes
|
||||||
|
|
||||||
|
### Residual Route A: G3 Residual Closure
|
||||||
|
|
||||||
|
Fixed input:
|
||||||
|
|
||||||
|
1. `sweep-007-scene` / `95598供电服务月报`
|
||||||
|
2. `sweep-039-scene` / `故障报修工单信息统计表`
|
||||||
|
3. `sweep-068-scene` / `输变电设备运行分析报告`
|
||||||
|
4. `sweep-084-scene` / `巡视计划完成情况自动检索`
|
||||||
|
|
||||||
|
Expected child plan:
|
||||||
|
|
||||||
|
`2026-04-19-g3-residual-4-workflow-evidence-closure-plan.md`
|
||||||
|
|
||||||
|
Allowed implementation area:
|
||||||
|
|
||||||
|
1. G3 workflow evidence recovery.
|
||||||
|
2. G3 contract assembly.
|
||||||
|
3. bounded G3 route-local validation.
|
||||||
|
|
||||||
|
Forbidden:
|
||||||
|
|
||||||
|
1. G8 runtime.
|
||||||
|
2. G6 host bridge runtime.
|
||||||
|
3. new family creation.
|
||||||
|
|
||||||
|
### Residual Route B: G2 Residual Closure
|
||||||
|
|
||||||
|
Fixed input:
|
||||||
|
|
||||||
|
1. `sweep-018-scene` / `白银线损周报`
|
||||||
|
2. `sweep-071-scene` / `台区线损大数据-月_周累计线损率统计分析`
|
||||||
|
|
||||||
|
Expected child plan:
|
||||||
|
|
||||||
|
`2026-04-19-g2-residual-2-readiness-closure-plan.md`
|
||||||
|
|
||||||
|
Allowed implementation area:
|
||||||
|
|
||||||
|
1. G2 readiness interpretation.
|
||||||
|
2. G2 mode/request/response contract correction.
|
||||||
|
3. bounded G2 route-local validation.
|
||||||
|
|
||||||
|
Forbidden:
|
||||||
|
|
||||||
|
1. changing G2 real-sample pass semantics;
|
||||||
|
2. adding a new G2 variant family;
|
||||||
|
3. route drift into host bridge.
|
||||||
|
|
||||||
|
### Residual Route C: Boundary Residual Decision
|
||||||
|
|
||||||
|
Fixed input:
|
||||||
|
|
||||||
|
1. `sweep-033-scene` / `供电可靠率指标统计表`
|
||||||
|
2. `sweep-034-scene` / `供电可靠性数据质量自查报告月报`
|
||||||
|
3. `sweep-042-scene` / `国网金昌供电公司营商环境周例会报告`
|
||||||
|
4. `sweep-051-scene` / `嘉峪关可靠性分析报告`
|
||||||
|
5. `sweep-074-scene` / `同兴智能安全督查日报`
|
||||||
|
6. `sweep-085-scene` / `业扩报装管理制度`
|
||||||
|
|
||||||
|
Expected child plan:
|
||||||
|
|
||||||
|
`2026-04-19-boundary-residual-hold-decision-plan.md`
|
||||||
|
|
||||||
|
Allowed action:
|
||||||
|
|
||||||
|
1. decision-only hold/defer classification.
|
||||||
|
2. no implementation.
|
||||||
|
|
||||||
|
### Residual Route D: Bootstrap Residual Isolation
|
||||||
|
|
||||||
|
Fixed input:
|
||||||
|
|
||||||
|
1. `sweep-091-scene` / `用户停电频次分析监测`
|
||||||
|
|
||||||
|
Expected child plan:
|
||||||
|
|
||||||
|
`2026-04-19-bootstrap-target-residual-isolation-plan.md`
|
||||||
|
|
||||||
|
Allowed action:
|
||||||
|
|
||||||
|
1. bootstrap target isolation.
|
||||||
|
2. no login recovery implementation.
|
||||||
|
|
||||||
|
### Residual Route E: Residual Follow-Up Reconciliation
|
||||||
|
|
||||||
|
Expected child plan:
|
||||||
|
|
||||||
|
`2026-04-19-residual-13-followup-sweep-and-reconciliation-plan.md`
|
||||||
|
|
||||||
|
Allowed action:
|
||||||
|
|
||||||
|
1. route-local or fixed 13-scene follow-up sweep.
|
||||||
|
2. reconciliation candidate refresh.
|
||||||
|
3. no official board update.
|
||||||
|
|
||||||
|
## Phase Order
|
||||||
|
|
||||||
|
1. Run Residual Route A.
|
||||||
|
2. Run Residual Route B.
|
||||||
|
3. Run Residual Route C.
|
||||||
|
4. Run Residual Route D.
|
||||||
|
5. Run Residual Route E.
|
||||||
|
|
||||||
|
Do not skip to Route E before Routes A through D are complete.
|
||||||
|
|
||||||
|
## Deliverables
|
||||||
|
|
||||||
|
1. residual 13 design.
|
||||||
|
2. residual 13 coordinator plan.
|
||||||
|
3. child bounded plans for Routes A through E.
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
1. the 13 residual scenes are fully assigned to residual routes;
|
||||||
|
2. every residual route has an expected child plan name;
|
||||||
|
3. mainline residuals are separated from boundary/bootstrap residuals;
|
||||||
|
4. no implementation is performed directly by this coordinator plan.
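The first completion criterion (all 13 residual scenes assigned) can be checked mechanically. The scene ids below are the fixed inputs listed under Residual Routes A through D; the `check_assignment` helper is a hypothetical name.

```python
# Fixed inputs as listed under Residual Routes A through D.
RESIDUAL_ROUTES = {
    "A": ["sweep-007-scene", "sweep-039-scene", "sweep-068-scene", "sweep-084-scene"],
    "B": ["sweep-018-scene", "sweep-071-scene"],
    "C": [
        "sweep-033-scene", "sweep-034-scene", "sweep-042-scene",
        "sweep-051-scene", "sweep-074-scene", "sweep-085-scene",
    ],
    "D": ["sweep-091-scene"],
}


def check_assignment(routes: dict, expected_total: int = 13) -> int:
    """Verify every scene is assigned to exactly one route; return the total."""
    assigned = [scene for scenes in routes.values() for scene in scenes]
    duplicates = sorted({s for s in assigned if assigned.count(s) > 1})
    if duplicates:
        raise ValueError(f"scenes assigned to more than one route: {duplicates}")
    if len(assigned) != expected_total:
        raise ValueError(f"{len(assigned)} scenes assigned, expected {expected_total}")
    return len(assigned)
```

Route E carries no fixed scene list of its own, so it is not part of the assignment table.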
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing this coordinator plan and its child plan skeletons.
|
||||||
|
|
||||||
|
Do not modify implementation files under this coordinator plan.
|
||||||
|
|
||||||
@@ -0,0 +1,144 @@
|
|||||||
|
# Timeout Budget and Rerun Hygiene Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-timeout-budget-rerun-hygiene-design.md`
|
||||||
|
> Upstream Diagnostic: `docs/superpowers/reports/2026-04-19-timeout-regression-diagnostic-report.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Create a bounded timeout-budget and rerun-hygiene layer so budget-sensitive scenes are not collapsed into a single `source-unreadable` bucket.
|
||||||
|
|
||||||
|
This plan is classification and reporting only. It does not change analyzer or generator code.
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not modify `src/generated_scene/analyzer.rs`
|
||||||
|
2. do not modify `src/generated_scene/generator.rs`
|
||||||
|
3. do not update `scene_execution_board_2026-04-18.json`
|
||||||
|
4. do not promote scenes
|
||||||
|
5. do not rerun the full `102` sweep
|
||||||
|
6. do not treat rerun success as validated pass
|
||||||
|
7. do not start timeout implementation fixes
|
||||||
|
|
||||||
|
## Fixed Input
|
||||||
|
|
||||||
|
The fixed input is:
|
||||||
|
|
||||||
|
`tests/fixtures/generated_scene/timeout_regression_diagnostic_2026-04-19.json`
|
||||||
|
|
||||||
|
Only the three diagnosed timeout records enter this plan.
|
||||||
|
|
||||||
|
## Phase 0: Freeze Timeout Diagnostic Input
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Freeze the timeout diagnostic records before hygiene mapping.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. read the timeout diagnostic JSON
|
||||||
|
2. verify the total number of timeout records is `3`
3. verify the label set is:
   - `timeout-rerun-pass = 2`
   - `timeout-rerun-fail-closed = 1`
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. frozen timeout diagnostic baseline
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. exactly `3` records enter this hygiene plan
|
||||||
|
2. no non-timeout scene enters the plan
|
||||||
|
|
||||||
|
## Phase 1: Define Hygiene Mapping
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Map timeout diagnostic results to explicit rerun hygiene statuses.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. map `timeout-rerun-pass` to `rerun-resolved-pass`
|
||||||
|
2. map `timeout-rerun-fail-closed` to `rerun-resolved-fail-closed`
|
||||||
|
3. preserve any future timeout as `rerun-still-timeout`
|
||||||
|
4. preserve any future unexpected exit as `rerun-error`
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. explicit rerun hygiene mapping table
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. each timeout diagnostic label maps to one hygiene status
|
||||||
|
2. pass-like rerun and fail-closed rerun remain distinct
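The mapping in this phase is small enough to express as a lookup table. In the sketch below, the two explicit mappings come from the task list; the `timed_out` flag and the choice to send any other unmapped label to `rerun-error` are assumptions made for illustration.

```python
# Mapping stated in the Phase 1 tasks.
DIAGNOSTIC_TO_HYGIENE = {
    "timeout-rerun-pass": "rerun-resolved-pass",
    "timeout-rerun-fail-closed": "rerun-resolved-fail-closed",
}


def hygiene_status(diagnostic_label: str, timed_out: bool = False) -> str:
    """Map a diagnostic label to exactly one rerun hygiene status.

    `timed_out` is a hypothetical flag for a rerun that exceeded its budget.
    Treating any other unmapped label as `rerun-error` is an assumption.
    """
    if timed_out:
        return "rerun-still-timeout"
    return DIAGNOSTIC_TO_HYGIENE.get(diagnostic_label, "rerun-error")
```

Keeping the pass-like and fail-closed targets as separate dictionary values is what preserves the distinction required by the acceptance criteria.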
|
||||||
|
|
||||||
|
## Phase 2: Build Hygiene Output
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Publish a hygiene-layer view for the three timeout records.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. write `tests/fixtures/generated_scene/timeout_budget_rerun_hygiene_2026-04-19.json`
|
||||||
|
2. include:
   - original timeout status
   - diagnostic label
   - rerun hygiene status
   - elapsed seconds
   - report presence
   - readiness if present
3. summarize how many records are:
   - `rerun-resolved-pass`
   - `rerun-resolved-fail-closed`
   - `rerun-still-timeout`
   - `rerun-error`
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. timeout budget hygiene JSON
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all three timeout records appear in the hygiene JSON
|
||||||
|
2. each has exactly one hygiene status
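The summary step could be sketched like this. The record key `rerunHygieneStatus` and the `summarize` helper are hypothetical names; the four status values are the ones listed in the tasks above.

```python
from collections import Counter

HYGIENE_STATUSES = (
    "rerun-resolved-pass",
    "rerun-resolved-fail-closed",
    "rerun-still-timeout",
    "rerun-error",
)


def summarize(records: list) -> dict:
    """Count records per hygiene status, reporting zero for unused statuses."""
    counts = Counter(r["rerunHygieneStatus"] for r in records)  # hypothetical key
    unknown = sorted(set(counts) - set(HYGIENE_STATUSES))
    if unknown:
        raise ValueError(f"unknown hygiene status: {unknown}")
    return {status: counts.get(status, 0) for status in HYGIENE_STATUSES}
```

Emitting explicit zeros for `rerun-still-timeout` and `rerun-error` makes it visible in the JSON that those outcomes were considered and did not occur.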
|
||||||
|
|
||||||
|
## Phase 3: Publish Report
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Publish the bounded timeout hygiene report without changing scene status.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. write `docs/superpowers/reports/2026-04-19-timeout-budget-rerun-hygiene-report.md`
|
||||||
|
2. explain why `sweep-040-scene` should not be counted the same way as a hard unreadable source
|
||||||
|
3. explain why `sweep-015-scene` and `sweep-025-scene` are budget-sensitive pass candidates
|
||||||
|
4. state that this remains a hygiene layer, not a promotion layer
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. timeout budget and rerun hygiene report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. report exists
|
||||||
|
2. no execution board update is made
|
||||||
|
3. no implementation change is made
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. timeout diagnostic input is frozen
|
||||||
|
2. rerun hygiene mapping is defined
|
||||||
|
3. hygiene JSON is published
|
||||||
|
4. hygiene report is published
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing the timeout hygiene JSON and report.
|
||||||
|
|
||||||
|
Do not start timeout implementation or scene promotion inside this plan.
|
||||||
@@ -0,0 +1,178 @@
|
|||||||
|
# Timeout Regression Diagnostic Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-timeout-regression-diagnostic-design.md`
|
||||||
|
> Upstream Follow-up: `tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Run a bounded diagnostic for the three timeout records after the structured fail-closed improvement follow-up sweep.
|
||||||
|
|
||||||
|
This plan only diagnoses timeout behavior. It does not implement fixes.
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not modify `src/generated_scene/analyzer.rs`
|
||||||
|
2. do not modify `src/generated_scene/generator.rs`
|
||||||
|
3. do not update `scene_execution_board_2026-04-18.json`
|
||||||
|
4. do not promote scenes
|
||||||
|
5. do not add family baselines
|
||||||
|
6. do not handle the remaining structured fail-closed records
|
||||||
|
7. do not handle adjudicated host-bridge records
|
||||||
|
8. do not treat diagnostic rerun success as validated scene pass
|
||||||
|
|
||||||
|
## Fixed Input
|
||||||
|
|
||||||
|
The fixed input is:
|
||||||
|
|
||||||
|
`tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
|
||||||
|
|
||||||
|
Only records with `followupStatus = source-unreadable` and reason `generator timeout after 45s` enter this plan.
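The selection rule above can be sketched as a filter over the follow-up records. `followupStatus` and the reason string are taken from this plan; the `reason` key name and the `select_timeouts` helper are assumptions, since the follow-up JSON schema is not shown here.

```python
TIMEOUT_REASON = "generator timeout after 45s"


def select_timeouts(records: list) -> list:
    """Keep only source-unreadable records caused by the 45s generator timeout."""
    return [
        r
        for r in records
        if r.get("followupStatus") == "source-unreadable"
        and r.get("reason") == TIMEOUT_REASON  # hypothetical key name
    ]
```

Phase 0 would then verify that the filtered list contains exactly `3` records before any diagnostics run.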
|
||||||
|
|
||||||
|
Expected fixed set:
|
||||||
|
|
||||||
|
| Scene id | Scene | Type |
| --- | --- | --- |
| `sweep-015-scene` | `任务报表` | persistent timeout |
| `sweep-025-scene` | `力禾动环系统巡视记录` | persistent timeout |
| `sweep-040-scene` | `嘉峪关日报` | regression timeout |
|
||||||
|
|
||||||
|
## Phase 0: Freeze Timeout Inputs
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Freeze the exact timeout set before diagnostics.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. read the follow-up sweep JSON
|
||||||
|
2. filter `source-unreadable` timeout records
|
||||||
|
3. verify the count is exactly `3`
|
||||||
|
4. identify `sweep-040-scene` as the regression timeout
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. frozen timeout input list
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. exactly `3` timeout records enter diagnostics
|
||||||
|
2. no non-timeout record enters diagnostics
|
||||||
|
|
||||||
|
## Phase 1: Source Directory Diagnostics
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Determine whether timeout records are likely caused by source scale or source structure.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. inspect each source directory
|
||||||
|
2. count all files
|
||||||
|
3. count HTML files
|
||||||
|
4. count JavaScript files
|
||||||
|
5. compute total source bytes
|
||||||
|
6. record the largest files
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. per-scene source diagnostics in JSON
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `3` timeout records have source diagnostics
|
||||||
|
2. missing directories are reported explicitly
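The per-directory measurements in this phase map directly onto standard-library calls. The following is a minimal sketch; the output key names and the `diagnose_source` helper are hypothetical, and counting `.htm` alongside `.html` is an assumption.

```python
from pathlib import Path


def diagnose_source(directory: str, top_n: int = 3) -> dict:
    """Collect file counts, total bytes, and the largest files of a source tree."""
    root = Path(directory)
    if not root.is_dir():
        # Missing directories are reported explicitly rather than skipped.
        return {"directory": directory, "missing": True}
    files = [p for p in root.rglob("*") if p.is_file()]
    sizes = {p: p.stat().st_size for p in files}
    largest = sorted(files, key=lambda p: sizes[p], reverse=True)[:top_n]
    return {
        "directory": directory,
        "missing": False,
        "fileCount": len(files),
        "htmlCount": sum(p.suffix.lower() in {".html", ".htm"} for p in files),
        "jsCount": sum(p.suffix.lower() == ".js" for p in files),
        "totalBytes": sum(sizes.values()),
        "largestFiles": [(str(p.relative_to(root)), sizes[p]) for p in largest],
    }
```

The result is a plain dictionary, so it can be written into the per-scene diagnostics JSON without further conversion.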
|
||||||
|
|
||||||
|
## Phase 2: Bounded Diagnostic Rerun
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Check whether each timeout completes under a longer diagnostic budget.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. rerun each timeout scene with a diagnostic timeout budget
|
||||||
|
2. write output under `examples/timeout_regression_diagnostic_2026-04-19`
|
||||||
|
3. capture exit code
|
||||||
|
4. capture elapsed seconds
|
||||||
|
5. record whether a `generation-report.json` is produced
|
||||||
|
6. do not update any execution status based on the result
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. diagnostic rerun result per timeout scene
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. each timeout has exactly one diagnostic rerun result
|
||||||
|
2. rerun success is marked only as diagnostic evidence
|
||||||
|
3. rerun failure is categorized, not fixed
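A bounded rerun that captures the exit code, elapsed seconds, and report presence might look like the sketch below. The actual generator command line is not given in this plan, so `command` is supplied by the caller; the `diagnostic_rerun` helper and its output keys are hypothetical.

```python
import subprocess
import time
from pathlib import Path


def diagnostic_rerun(command: list, output_dir: str, budget_seconds: float) -> dict:
    """Run one scene under a diagnostic time budget and record what happened.

    The result is diagnostic evidence only; it must not update any status.
    """
    started = time.monotonic()
    try:
        completed = subprocess.run(
            command, timeout=budget_seconds, capture_output=True
        )
        exit_code, timed_out = completed.returncode, False
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        exit_code, timed_out = None, True
    return {
        "exitCode": exit_code,
        "timedOut": timed_out,
        "elapsedSeconds": round(time.monotonic() - started, 3),
        "reportProduced": (Path(output_dir) / "generation-report.json").is_file(),
    }
```

Using a monotonic clock keeps the elapsed time unaffected by wall-clock adjustments during long reruns.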
|
||||||
|
|
||||||
|
## Phase 3: Timeout Labeling
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Assign each timeout one final diagnostic label.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. assign one primary diagnostic label:
   - `timeout-rerun-pass`
   - `timeout-rerun-fail-closed`
   - `timeout-large-source`
   - `timeout-command-hang`
   - `timeout-nondeterministic`
   - `timeout-source-scan-heavy`
   - `timeout-unknown`
2. attach secondary labels when useful
3. distinguish the persistent timeouts from the regression timeout
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. labeled timeout diagnostic JSON
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all `3` records have exactly one primary diagnostic label
|
||||||
|
2. `sweep-040-scene` remains clearly identified as the regression timeout
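
These acceptance criteria can be checked mechanically. The sketch below assumes each record carries `sceneId`, a single `primaryLabel` string, and a `timeoutKind` field; those field names are illustrative assumptions, while the label values and `sweep-040-scene` come from this plan.

```python
PRIMARY_LABELS = frozenset({
    "timeout-rerun-pass",
    "timeout-rerun-fail-closed",
    "timeout-large-source",
    "timeout-command-hang",
    "timeout-nondeterministic",
    "timeout-source-scan-heavy",
    "timeout-unknown",
})
REGRESSION_SCENE_ID = "sweep-040-scene"


def check_timeout_labels(records, expected_count=3):
    """Return a list of problems; an empty list means the criteria hold."""
    problems = []
    if len(records) != expected_count:
        problems.append(f"expected {expected_count} records, found {len(records)}")
    for record in records:
        label = record.get("primaryLabel")
        # A single string field enforces "exactly one primary label" by construction.
        if label not in PRIMARY_LABELS:
            problems.append(f"{record.get('sceneId')}: invalid primary label {label!r}")
    flagged = [r.get("sceneId") for r in records if r.get("timeoutKind") == "regression"]
    if flagged != [REGRESSION_SCENE_ID]:
        problems.append(f"regression timeout must be only {REGRESSION_SCENE_ID}, got {flagged}")
    return problems
```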
|
||||||
|
|
||||||
|
## Phase 4: Diagnostic Report
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Publish diagnostic results without starting implementation.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. write `tests/fixtures/generated_scene/timeout_regression_diagnostic_2026-04-19.json`
|
||||||
|
2. write `docs/superpowers/reports/2026-04-19-timeout-regression-diagnostic-report.md`
|
||||||
|
3. summarize whether the next step should be timeout implementation, rerun hygiene, or no action
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/timeout_regression_diagnostic_2026-04-19.json`
|
||||||
|
2. `docs/superpowers/reports/2026-04-19-timeout-regression-diagnostic-report.md`
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. diagnostic output exists
|
||||||
|
2. report exists
|
||||||
|
3. no implementation changes are made
|
||||||
|
4. no execution board update is made
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. the three timeout records are frozen
|
||||||
|
2. each has source diagnostics
|
||||||
|
3. each has one diagnostic rerun result
|
||||||
|
4. each has one final diagnostic label
|
||||||
|
5. JSON and report are published
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing the timeout diagnostic JSON and report.
|
||||||
|
|
||||||
|
Do not start timeout implementation or status promotion inside this plan.
|
||||||
@@ -0,0 +1,140 @@
|
|||||||
|
# Timeout Rerun Hygiene Integration Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-19
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-timeout-rerun-hygiene-integration-design.md`
|
||||||
|
> Upstream Hygiene: `tests/fixtures/generated_scene/timeout_budget_rerun_hygiene_2026-04-19.json`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Integrate timeout rerun hygiene into sweep and reconciliation reporting.
|
||||||
|
|
||||||
|
This plan only changes the reporting layer. It does not change scene generation behavior.
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
1. do not modify `src/generated_scene/analyzer.rs`
|
||||||
|
2. do not modify `src/generated_scene/generator.rs`
|
||||||
|
3. do not update `scene_execution_board_2026-04-18.json`
|
||||||
|
4. do not promote scenes
|
||||||
|
5. do not rerun the `102` sweep
|
||||||
|
6. do not start timeout implementation fixes
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
|
||||||
|
2. `tests/fixtures/generated_scene/timeout_budget_rerun_hygiene_2026-04-19.json`
|
||||||
|
|
||||||
|
## Phase 0: Freeze Inputs
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Freeze the sweep follow-up and timeout hygiene inputs.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. verify follow-up sweep status counts
|
||||||
|
2. verify timeout hygiene summary:
|
||||||
|
- `rerun-resolved-pass = 2`
|
||||||
|
- `rerun-resolved-fail-closed = 1`
|
||||||
|
- `rerun-still-timeout = 0`
|
||||||
|
- `rerun-error = 0`
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. frozen integration input set
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. only the fixed follow-up and hygiene inputs are used
|
||||||
|
|
||||||
|
## Phase 1: Build Hygiene Overlay
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Attach timeout hygiene results onto raw timeout scenes.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. match timeout hygiene records to the follow-up sweep by `sceneId`
|
||||||
|
2. preserve raw `source-unreadable`
|
||||||
|
3. add:
|
||||||
|
- `hygieneStatus`
|
||||||
|
- `hygieneInterpretation`
|
||||||
|
4. map:
|
||||||
|
- `rerun-resolved-pass -> timeout-as-pass-candidate`
|
||||||
|
- `rerun-resolved-fail-closed -> timeout-as-fail-closed-candidate`
|
||||||
|
- `rerun-still-timeout -> timeout-still-unreadable`
|
||||||
|
- `rerun-error -> timeout-rerun-error`
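
The overlay step can be sketched as a join on `sceneId` plus the fixed mapping above. The raw `status` field name and the shape of the hygiene records are assumptions; the mapping values and the two added fields are taken from this plan.

```python
HYGIENE_INTERPRETATION = {
    "rerun-resolved-pass": "timeout-as-pass-candidate",
    "rerun-resolved-fail-closed": "timeout-as-fail-closed-candidate",
    "rerun-still-timeout": "timeout-still-unreadable",
    "rerun-error": "timeout-rerun-error",
}


def build_hygiene_overlay(sweep_records, hygiene_records):
    """Attach hygiene fields to matching sweep records without altering raw status."""
    hygiene_by_scene = {h["sceneId"]: h["hygieneStatus"] for h in hygiene_records}
    overlay = []
    for record in sweep_records:
        status = hygiene_by_scene.get(record["sceneId"])
        if status is None:
            continue  # not a timeout scene, so no overlay entry
        overlay.append({
            **record,  # raw fields, including the raw status, are copied unchanged
            "hygieneStatus": status,
            "hygieneInterpretation": HYGIENE_INTERPRETATION[status],
        })
    return overlay
```

Because the overlay copies each raw record and only adds fields, the raw `source-unreadable` status stays visible next to its interpretation.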
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. timeout hygiene overlay records
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. all three timeout scenes receive one overlay status
|
||||||
|
2. raw status is preserved
|
||||||
|
|
||||||
|
## Phase 2: Build Integrated Summary
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Publish a hygiene-aware timeout summary alongside the raw sweep summary.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. preserve raw follow-up status counts
|
||||||
|
2. add hygiene-aware timeout interpretation counts
|
||||||
|
3. summarize:
|
||||||
|
- `timeout-as-pass-candidate`
|
||||||
|
- `timeout-as-fail-closed-candidate`
|
||||||
|
- `timeout-still-unreadable`
|
||||||
|
- `timeout-rerun-error`
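
One way to publish both views side by side is sketched below. The output keys `rawStatusCounts` and `timeoutHygieneCounts` are assumed names; the four interpretation values are the ones listed above.

```python
from collections import Counter

INTERPRETATIONS = (
    "timeout-as-pass-candidate",
    "timeout-as-fail-closed-candidate",
    "timeout-still-unreadable",
    "timeout-rerun-error",
)


def integrated_summary(sweep_records, overlay_records):
    """Build the raw status counts and the hygiene-aware timeout counts together."""
    raw_counts = Counter(r["status"] for r in sweep_records)
    seen = Counter(o["hygieneInterpretation"] for o in overlay_records)
    return {
        # Raw counts are reported exactly as observed in the follow-up sweep.
        "rawStatusCounts": dict(raw_counts),
        # Every interpretation is listed, including zero counts, so no bucket is implicit.
        "timeoutHygieneCounts": {name: seen.get(name, 0) for name in INTERPRETATIONS},
    }
```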
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. integrated summary block
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. raw and hygiene-aware summaries both exist
|
||||||
|
2. timeout bucket is no longer lossy in the integrated output
|
||||||
|
|
||||||
|
## Phase 3: Publish Integrated Output
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Publish the bounded reconciliation-friendly hygiene integration output.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. write `tests/fixtures/generated_scene/timeout_rerun_hygiene_integration_2026-04-19.json`
|
||||||
|
2. write `docs/superpowers/reports/2026-04-19-timeout-rerun-hygiene-integration-report.md`
|
||||||
|
3. state that this is an interpretation/reporting layer only
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. timeout hygiene integration JSON
|
||||||
|
2. timeout hygiene integration report
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. both files exist
|
||||||
|
2. no execution board update is made
|
||||||
|
3. no implementation change is made
|
||||||
|
|
||||||
|
## Completion Criteria
|
||||||
|
|
||||||
|
This plan is complete when:
|
||||||
|
|
||||||
|
1. inputs are frozen
|
||||||
|
2. timeout hygiene overlay is attached
|
||||||
|
3. integrated raw and hygiene-aware summaries are published
|
||||||
|
4. JSON and report are written
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing the integration JSON and report.
|
||||||
|
|
||||||
|
Do not start implementation or board updates inside this plan.
|
||||||
@@ -0,0 +1,86 @@
|
|||||||
|
# Deterministic Keyword Scoring Refinement Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Design: `2026-04-20-deterministic-keyword-scoring-refinement-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Close the 9 deterministic dispatch ambiguity gaps by bounded manifest keyword refinement and dry-run verification.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_2026-04-20.json`
|
||||||
|
2. `examples/scene_skill_102_final_materialization_2026-04-19/scene_skill_102_index.json`
|
||||||
|
3. `examples/scene_skill_102_final_materialization_2026-04-19/skills/*/scene.toml`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `examples/scene_skill_102_final_materialization_2026-04-19/skills/*/scene.toml`
|
||||||
|
2. `tests/fixtures/generated_scene/deterministic_keyword_scoring_refinement_2026-04-20.json`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json`
|
||||||
|
4. `docs/superpowers/reports/2026-04-20-deterministic-keyword-scoring-refinement-report.md`
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/compat/scene_platform/dispatch.rs`
|
||||||
|
2. `src/compat/scene_platform/resolvers.rs`
|
||||||
|
3. `src/generated_scene/analyzer.rs`
|
||||||
|
4. `src/generated_scene/generator.rs`
|
||||||
|
5. generated `scripts/*`
|
||||||
|
6. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Phase 0: Freeze Gap Set
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Load readiness gaps from the parent readiness asset.
|
||||||
|
2. Confirm the fixed gap set is exactly 9 ambiguous dispatch entries.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. No additional gap categories are pulled into scope.
|
||||||
|
2. `sweep-012-scene` remains excluded.
|
||||||
|
|
||||||
|
## Phase 1: Refine Manifest Keywords
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. For each fixed gap, identify its direct collision partner.
|
||||||
|
2. Narrow include keywords to distinctive full phrases.
|
||||||
|
3. Remove broad standalone collision tokens where they create ties.
|
||||||
|
4. Add explicit exclude keywords only when a pair is mutually exclusive.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. The fixed 9 scenes retain non-empty include keywords.
|
||||||
|
2. No generated script is changed.
|
||||||
|
|
||||||
|
## Phase 2: Dispatch Dry-Run Verification
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Re-run dispatch dry-run for all 101 complete packages.
|
||||||
|
2. Verify that each of the 9 fixed gaps uniquely selects its expected scene when queried with a full-name sample.
|
||||||
|
3. Check that no previously-ready scene regresses into ambiguity or no-match.
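
To illustrate what a dry-run outcome means, the sketch below uses a deliberately simple scoring rule: one point per include keyword found in the query, with any matching exclude keyword disqualifying the scene. This rule is an assumption for illustration only; the real scoring lives in `src/compat/scene_platform/dispatch.rs`, which this plan does not show, and `exclude_keywords` is an assumed field name.

```python
def dispatch_dry_run(query, manifests):
    """Classify a query as unique, ambiguous, or no-match (assumed scoring rule)."""
    scores = {}
    for scene_id, manifest in manifests.items():
        if any(word in query for word in manifest.get("exclude_keywords", [])):
            continue  # an explicit exclude keyword removes the scene from contention
        score = sum(1 for word in manifest["include_keywords"] if word in query)
        if score > 0:
            scores[scene_id] = score
    if not scores:
        return {"outcome": "no-match", "candidates": []}
    best = max(scores.values())
    winners = sorted(scene for scene, value in scores.items() if value == best)
    outcome = "unique" if len(winners) == 1 else "ambiguous"
    return {"outcome": outcome, "candidates": winners}
```

Under this rule, two scenes that share only a broad standalone token tie on it, and narrowing each to a distinctive full phrase breaks the tie, which is the effect the Phase 1 refinement aims for.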
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. `dispatchReady = 101` or all residual gaps are explicitly justified.
|
||||||
|
2. `ambiguous = 0` unless escalated to a separate runtime scoring plan.
|
||||||
|
|
||||||
|
## Phase 3: Publish Report
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Publish refinement JSON.
|
||||||
|
2. Publish post-refinement readiness JSON.
|
||||||
|
3. Publish report.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Report states before/after ready and ambiguous counts.
|
||||||
|
2. Report states whether runtime scoring changes are needed.
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after refinement assets and report are published. Do not start browser execution, runtime dispatch implementation, or `sweep-012-scene` recovery under this plan.
|
||||||
@@ -0,0 +1,101 @@
|
|||||||
|
# Final Skill Human-Readable Index Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Parent Plan: `2026-04-19-scene-skill-102-final-materialization-plan.md`
|
||||||
|
> Design: `2026-04-20-final-skill-human-readable-index-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Add human-readable lookup and metadata to the final materialized skill set so reviewers can identify which `sweep-xxx-scene` skill maps to which business scene.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
2. `tests/fixtures/generated_scene/scene_skill_102_final_materialization_manifest_2026-04-19.json`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_skill_102_final_materialization_failures_2026-04-19.json`
|
||||||
|
4. `examples/scene_skill_102_final_materialization_2026-04-19`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `examples/scene_skill_102_final_materialization_2026-04-19/SCENE_INDEX.md`
|
||||||
|
2. `examples/scene_skill_102_final_materialization_2026-04-19/scene_skill_102_index.json`
|
||||||
|
3. `examples/scene_skill_102_final_materialization_2026-04-19/skills/*/SKILL.toml`
|
||||||
|
4. `examples/scene_skill_102_final_materialization_2026-04-19/skills/*/SKILL.md`
|
||||||
|
5. `docs/superpowers/reports/2026-04-20-final-skill-human-readable-index-report.md`
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `src/generated_scene/ir.rs`
|
||||||
|
4. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
5. generated `scripts/*`
|
||||||
|
6. existing materialization manifest and failures assets
|
||||||
|
|
||||||
|
## Phase 0: Freeze Metadata Boundary
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Confirm final materialization root exists.
|
||||||
|
2. Confirm official board has 102 scene mappings.
|
||||||
|
3. Confirm this plan does not repair failed packages.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Scope is metadata/index only.
|
||||||
|
2. Stable `sweep-xxx-scene` ids are preserved.
|
||||||
|
|
||||||
|
## Phase 1: Build Human-Readable Mapping
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Load scene id and scene name from official board.
|
||||||
|
2. Load materialization status from final materialization manifest and failures asset.
|
||||||
|
3. Produce 102 mapping rows.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Row count is 102.
|
||||||
|
2. `sweep-012-scene` is included and marked failed.
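
The mapping step could be sketched as follows. The board entry fields (`sceneId`, `sceneName`) and the idea of passing materialized and failed ids as sets are assumptions about the input assets, not their documented schema.

```python
def build_mapping_rows(board_scenes, materialized_ids, failed_ids):
    """Produce one row per board scene, marking failures explicitly."""
    rows = []
    for scene in board_scenes:
        scene_id = scene["sceneId"]
        if scene_id in failed_ids:
            status = "failed"
        elif scene_id in materialized_ids:
            status = "materialized"
        else:
            status = "unknown"  # surfaced rather than silently dropped
        rows.append({
            "sceneId": scene_id,  # stable sweep-xxx-scene id, never rewritten
            "sceneName": scene["sceneName"],
            "status": status,
        })
    return sorted(rows, key=lambda row: row["sceneId"])
```

Iterating over the board rather than over the materialized directory is what keeps a failed scene such as `sweep-012-scene` in the row count.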
|
||||||
|
|
||||||
|
## Phase 2: Publish Index Assets
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Write `SCENE_INDEX.md`.
|
||||||
|
2. Write `scene_skill_102_index.json`.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Index files are present.
|
||||||
|
2. Index files include scene id, scene name, archetype, readiness, status, and skill directory.
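
A Markdown index with the required columns might be rendered as in this sketch. The column keys are assumed names for the six fields listed above, and the table layout of `SCENE_INDEX.md` is an illustration rather than the published format.

```python
COLUMNS = ("sceneId", "sceneName", "archetype", "readiness", "status", "skillDir")


def render_scene_index(rows):
    """Render mapping rows as a Markdown table, sorted by scene id."""
    def cell(value):
        # Escape pipes so scene names cannot break the table layout.
        return str(value).replace("|", "\\|")

    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "| " + " | ".join("---" for _ in COLUMNS) + " |",
    ]
    for row in sorted(rows, key=lambda r: r["sceneId"]):
        lines.append("| " + " | ".join(cell(row.get(col, "")) for col in COLUMNS) + " |")
    return "\n".join(lines) + "\n"
```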
|
||||||
|
|
||||||
|
## Phase 3: Normalize Skill Metadata
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. For each complete package, update `SKILL.toml` readable fields while preserving `[skill].name`.
|
||||||
|
2. For each complete package, update `SKILL.md` readable summary.
|
||||||
|
3. Skip failed packages that lack required files.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Complete packages expose readable scene names.
|
||||||
|
2. Failed packages remain explicit failures.
|
||||||
|
3. Generated scripts are not modified.
|
||||||
|
|
||||||
|
## Phase 4: Publish Report
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Publish human-readable index report.
|
||||||
|
2. State materialized package count and skipped failed package count.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Report explains how to find scene-to-skill mapping.
|
||||||
|
2. Report states that no generation or recovery was performed.
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after index assets, metadata normalization, and report are published. Do not start static/mock validation or `sweep-012-scene` recovery under this plan.
|
||||||
@@ -0,0 +1,47 @@
|
|||||||
|
# Generated Scene Embedded Dictionary Extraction Hardening Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Parent route:
|
||||||
|
> - `embedded_dictionary_extraction_hardening`
|
||||||
|
> Parent ledger:
|
||||||
|
> - `tests/fixtures/generated_scene/generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Implement the first reusable slice for source-side dictionary/tree extraction.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
Use the bounded bucket:
|
||||||
|
|
||||||
|
1. scenes with declared `org` parameters
|
||||||
|
2. scenes with source-side dictionary evidence (`city.js`, `dict.js`, `enum.js`, tree/options files)
|
||||||
|
3. scenes whose current generated `org-dictionary.json` is absent or starter-sized
|
||||||
|
|
||||||
|
This first slice should center on the 10 parameterized scenes that most resemble `sweep-030-scene`.
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `src/generated_scene/ir.rs`
|
||||||
|
4. route-local generator tests
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. no edits to already materialized dictionaries under `examples/`
|
||||||
|
2. no runtime resolver implementation outside generation output needs
|
||||||
|
3. no board assets
|
||||||
|
4. no pseudo-production handoff assets
|
||||||
|
|
||||||
|
## Expected Coverage Delta
|
||||||
|
|
||||||
|
1. generated dictionaries move beyond starter subsets for the bucketed scenes
|
||||||
|
2. dictionary recovery becomes source-driven rather than hand-seeded
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the first reusable dictionary-extraction slice is implemented and route-local follow-up assets are published.
|
||||||
|
|
||||||
|
Do not attempt complete organization-tree closure for every scene inside this route plan.
|
||||||
@@ -0,0 +1,47 @@
|
|||||||
|
# Generated Scene Invocation Alias Generation Hardening Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Parent route:
|
||||||
|
> - `alias_generation_hardening`
|
||||||
|
> Parent ledger:
|
||||||
|
> - `tests/fixtures/generated_scene/generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Implement the first reusable slice for natural-language alias generation.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
Use the bounded bucket:
|
||||||
|
|
||||||
|
1. scenes with source-side alias evidence
|
||||||
|
2. scenes whose current generated deterministic manifests still expose only narrow keyword coverage
|
||||||
|
3. high-risk browser-script report scenes where operator wording is likely to diverge from canonical scene names
|
||||||
|
|
||||||
|
This first slice should prefer the densest high-risk alias bucket rather than the full 84-scene route at once.
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `src/generated_scene/ir.rs`
|
||||||
|
4. route-local generator tests
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. no runtime scoring changes in sgClaw dispatch
|
||||||
|
2. no service-console changes
|
||||||
|
3. no direct edits to final materialized `scene.toml`
|
||||||
|
4. no board assets
|
||||||
|
|
||||||
|
## Expected Coverage Delta
|
||||||
|
|
||||||
|
1. generated `include_keywords` become less brittle for the bucketed scenes
|
||||||
|
2. deterministic invocation becomes less dependent on exact canonical wording
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the first reusable alias-generation slice is implemented and route-local follow-up assets are published.
|
||||||
|
|
||||||
|
Do not attempt one-shot full alias closure for every scene inside this route plan.
|
||||||
@@ -0,0 +1,47 @@
|
|||||||
|
# Generated Scene Parameter Default Semantics Hardening Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Parent route:
|
||||||
|
> - `parameter_default_semantics_recovery_hardening`
|
||||||
|
> Parent ledger:
|
||||||
|
> - `tests/fixtures/generated_scene/generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Implement the first reusable slice for page-native default period/date/mode recovery.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
Use the bounded bucket:
|
||||||
|
|
||||||
|
1. scenes with explicit `period` parameters
|
||||||
|
2. scenes whose source evidence shows implicit month/week/date initialization
|
||||||
|
3. scenes whose current generated manifests do not encode a reusable default strategy
|
||||||
|
|
||||||
|
This first slice should center on the parameterized monthly/weekly scenes highlighted by the ledger.
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `src/generated_scene/ir.rs`
|
||||||
|
4. route-local generator tests
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. no runtime resolver patching outside generation metadata needs
|
||||||
|
2. no edits to generated skill bundle under `examples/`
|
||||||
|
3. no board assets
|
||||||
|
4. no pseudo-production assets
|
||||||
|
|
||||||
|
## Expected Coverage Delta
|
||||||
|
|
||||||
|
1. generated parameter metadata can preserve source-side default semantics for the bucketed scenes
|
||||||
|
2. callers are no longer forced to supply values that the source page itself normally supplies
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the first reusable default-semantics slice is implemented and route-local follow-up assets are published.
|
||||||
|
|
||||||
|
Do not expand to all possible date semantics inside this route plan.
|
||||||
@@ -0,0 +1,47 @@
|
|||||||
|
# Generated Scene Resolver Request Mapping Hardening Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Parent route:
|
||||||
|
> - `resolver_request_mapping_hardening`
|
||||||
|
> Parent ledger:
|
||||||
|
> - `tests/fixtures/generated_scene/generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Implement the first reusable mapping slice for request-field recovery.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
Use the bounded bucket:
|
||||||
|
|
||||||
|
1. scenes with explicit `org` and/or `period` params
|
||||||
|
2. scenes whose source evidence shows request-field tokens like `orgno`, `fdate`, `weekSfdate`, `weekEfdate`
|
||||||
|
3. scenes currently lacking explicit generated request-mapping metadata
|
||||||
|
|
||||||
|
This first slice is expected to center on the parameterized `multi_mode_request` family and adjacent structured-request scenes.
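
Detecting the request-field evidence named in this bucket can be sketched as a token scan over source text. The helper name and the choice of identifier-boundary matching are assumptions; the four tokens are the ones listed above.

```python
import re

REQUEST_FIELD_TOKENS = ("orgno", "fdate", "weekSfdate", "weekEfdate")


def find_request_fields(source_text):
    """Count whole-identifier occurrences of each known request-field token."""
    found = {}
    for token in REQUEST_FIELD_TOKENS:
        # Lookarounds stop "fdate" from matching inside "weekSfdate".
        pattern = re.compile(
            r"(?<![A-Za-z0-9_$])" + re.escape(token) + r"(?![A-Za-z0-9_$])"
        )
        count = len(pattern.findall(source_text))
        if count:
            found[token] = count
    return found
```

Matching whole identifiers matters here because `fdate` is a suffix of both week tokens, so a plain substring search would overcount it.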
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `src/generated_scene/ir.rs`
|
||||||
|
4. route-local generator tests
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. no edits to final materialized skill bundle
|
||||||
|
2. no execution-board assets
|
||||||
|
3. no runtime / browser callback host
|
||||||
|
4. no service console assets
|
||||||
|
|
||||||
|
## Expected Coverage Delta
|
||||||
|
|
||||||
|
1. introduce reusable request-field mapping metadata rather than scene-name patches
|
||||||
|
2. reduce `resolver_to_request_mapping_gap` in the highest-signal parameterized bucket
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the first reusable mapping slice is implemented and route-local follow-up assets are published.
|
||||||
|
|
||||||
|
Do not yet attempt full 102-scene closure inside this route plan.
|
||||||
@@ -0,0 +1,143 @@
|
|||||||
|
# Generated Scene Rule Hardening Route Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Parent roadmap:
|
||||||
|
> - `docs/superpowers/plans/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-plan.md`
|
||||||
|
> Parent design:
|
||||||
|
> - `docs/superpowers/specs/2026-04-20-generated-scene-rule-hardening-route-design.md`
|
||||||
|
> Upstream ledger:
|
||||||
|
> - `docs/superpowers/plans/2026-04-20-generated-scene-source-first-runtime-semantics-ledger-plan.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Convert the completed runtime-semantics ledger into a bounded hardening-route sequence.
|
||||||
|
|
||||||
|
This stage decides execution order and the next child implementation plans. It does not change code yet.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
|
||||||
|
2. `docs/superpowers/reports/2026-04-20-generated-scene-source-first-runtime-semantics-ledger-report.md`
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
Allowed:
|
||||||
|
|
||||||
|
1. cluster scenes by reusable route
|
||||||
|
2. freeze route order
|
||||||
|
3. define bounded child implementation plans
|
||||||
|
4. define rematerialization dependency
|
||||||
|
5. define validation refresh dependency
|
||||||
|
|
||||||
|
Forbidden:
|
||||||
|
|
||||||
|
1. no implementation changes in `src/`
|
||||||
|
2. no skill manifest changes
|
||||||
|
3. no rematerialization execution
|
||||||
|
4. no validation reruns
|
||||||
|
5. no inner-network execution
|
||||||
|
|
||||||
|
## Phase 0: Freeze Route Order
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the ledger into one fixed route order for downstream implementation.
|
||||||
|
|
||||||
|
### Ordered Routes
|
||||||
|
|
||||||
|
1. `resolver_request_mapping_hardening`
|
||||||
|
2. `runtime_url_classification_hardening`
|
||||||
|
3. `embedded_dictionary_extraction_hardening`
|
||||||
|
4. `parameter_default_semantics_recovery_hardening`
|
||||||
|
5. `alias_generation_hardening`
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. the order is explicit and no longer derived ad hoc during later implementation
|
||||||
|
|
||||||
|
## Phase 1: Build Route Clusters
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Cluster scenes from the ledger into reusable route buckets.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. count all scenes covered by each route
|
||||||
|
2. identify the densest scene families per route
|
||||||
|
3. identify route-local anchor scenes
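
The clustering tasks can be sketched as a count over ledger records. The route names are the five from Phase 0; the per-record `routes` and `family` fields and the output keys are assumptions about the ledger schema.

```python
from collections import Counter

ROUTE_ORDER = (
    "resolver_request_mapping_hardening",
    "runtime_url_classification_hardening",
    "embedded_dictionary_extraction_hardening",
    "parameter_default_semantics_recovery_hardening",
    "alias_generation_hardening",
)


def build_route_clusters(ledger_records, top_families=2):
    """Count scenes per route and list each route's densest scene families."""
    scene_counts = Counter()
    families = {route: Counter() for route in ROUTE_ORDER}
    for record in ledger_records:
        for route in record.get("routes", []):
            if route in families:
                scene_counts[route] += 1
                families[route][record.get("family", "unknown")] += 1
    # Clusters are emitted in the frozen route order, not by size.
    return [
        {
            "route": route,
            "sceneCount": scene_counts[route],
            "densestFamilies": [name for name, _ in families[route].most_common(top_families)],
        }
        for route in ROUTE_ORDER
    ]
```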
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. each route has a stable implementation bucket definition
|
||||||
|
|
||||||
|
## Phase 2: Define Bounded Child Implementation Plans
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Create one bounded implementation child plan for each top route.
|
||||||
|
|
||||||
|
### Required child plans
|
||||||
|
|
||||||
|
1. `2026-04-20-generated-scene-resolver-request-mapping-hardening-plan.md`
|
||||||
|
2. `2026-04-20-generated-scene-runtime-url-classification-hardening-plan.md`
|
||||||
|
3. `2026-04-20-generated-scene-embedded-dictionary-extraction-hardening-plan.md`
|
||||||
|
4. `2026-04-20-generated-scene-parameter-default-semantics-hardening-plan.md`
|
||||||
|
5. `2026-04-20-generated-scene-invocation-alias-generation-hardening-plan.md`
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. each child plan has a fixed scope and stop rule
|
||||||
|
2. no child plan exists solely to hardcode a fix for a single named scene
|
||||||
|
|
||||||
|
## Phase 3: Declare Rematerialization Dependency
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Make full 102-scene rematerialization a mandatory downstream stage after route execution.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. define `generated-scene-runtime-semantics-rematerialization-refresh-plan`
|
||||||
|
2. freeze it as required after implementation
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. no route may be considered complete without rematerialization
|
||||||
|
|
||||||
|
## Phase 4: Declare Validation Refresh Dependency
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Make validation refresh mandatory after rematerialization.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. define `generated-scene-runtime-semantics-validation-refresh-plan`
|
||||||
|
2. require refresh of:
|
||||||
|
- deterministic invocation readiness
|
||||||
|
- natural-language parameter readiness
|
||||||
|
- static validation
|
||||||
|
- direct mock execution
|
||||||
|
- pseudo-production handoff
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. no route may be considered fully closed until validation assets are refreshed
|
||||||
|
|
||||||
|
## Deliverables
|
||||||
|
|
||||||
|
1. route design / sequencing report
|
||||||
|
2. route cluster JSON
|
||||||
|
3. bounded child-plan list for the five routes
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after:
|
||||||
|
|
||||||
|
1. publishing the route design / sequencing assets
|
||||||
|
2. publishing the five child implementation plans
|
||||||
|
3. publishing rematerialization and validation-refresh dependency plans
|
||||||
|
|
||||||
|
Do not execute route implementation inside this plan.
|
||||||
@@ -0,0 +1,34 @@
|
|||||||
|
# Generated Scene Rule Hardening Route Sequence Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Parent design:
|
||||||
|
> - `docs/superpowers/specs/2026-04-20-generated-scene-rule-hardening-route-sequence-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Publish the bounded child-plan tree that follows the completed runtime-semantics ledger.
|
||||||
|
|
||||||
|
## Fixed Sequence
|
||||||
|
|
||||||
|
1. `generated-scene-resolver-request-mapping-hardening`
|
||||||
|
2. `generated-scene-runtime-url-classification-hardening`
|
||||||
|
3. `generated-scene-embedded-dictionary-extraction-hardening`
|
||||||
|
4. `generated-scene-parameter-default-semantics-hardening`
|
||||||
|
5. `generated-scene-invocation-alias-generation-hardening`
|
||||||
|
6. `generated-scene-runtime-semantics-rematerialization-refresh`
|
||||||
|
7. `generated-scene-runtime-semantics-validation-refresh`
|
||||||
|
|
||||||
|
## Deliverables
|
||||||
|
|
||||||
|
1. route cluster JSON
|
||||||
|
2. route sequence report
|
||||||
|
3. five bounded child implementation plans
|
||||||
|
4. one rematerialization refresh dependency plan
|
||||||
|
5. one validation refresh dependency plan
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing the child-plan tree.
|
||||||
|
|
||||||
|
Do not implement any route in this plan.
|
||||||
@@ -0,0 +1,122 @@
|
|||||||
|
# Generated Scene Runtime Semantics Gap Analysis Plan
|
||||||
|
|
||||||
|
> Status: Superseded by `docs/superpowers/plans/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-plan.md`
|
||||||
|
|
||||||
|
## Parent
|
||||||
|
|
||||||
|
- Parent design: [2026-04-20-generated-scene-runtime-semantics-gap-analysis-design.md](/D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-20-generated-scene-runtime-semantics-gap-analysis-design.md)
|
||||||
|
|
||||||
|
## Goal
|
||||||
|
|
||||||
|
Analyze the 102 final generated scene skills for runtime-semantics divergence, using `sweep-030-scene` as the anchor case and systematizing the five gap classes exposed during inner-network validation.
|
||||||
|
|
||||||
|
This plan is analysis-only.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
- `examples/scene_skill_102_final_materialization_2026-04-19/skills`
|
||||||
|
- `tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json`
|
||||||
|
- `tests/fixtures/generated_scene/scene_skill_102_natural_language_parameter_readiness_2026-04-20.json`
|
||||||
|
- `tests/fixtures/generated_scene/scene_skill_102_parameter_dictionary_template_normalization_2026-04-20.json`
|
||||||
|
- Anchor source:
|
||||||
|
- `D:/desk/智能体资料/全量业务场景/一平台场景/台区线损大数据-月_周累计线损率统计分析`
|
||||||
|
|
||||||
|
## Boundaries
|
||||||
|
|
||||||
|
Allowed:
|
||||||
|
|
||||||
|
- Read skill manifests, reports, references, and selected source-scene evidence
|
||||||
|
- Produce JSON inventory and report
|
||||||
|
|
||||||
|
Forbidden:
|
||||||
|
|
||||||
|
- No edits in `src/`
|
||||||
|
- No edits to generated skills
|
||||||
|
- No materialization reruns
|
||||||
|
- No execution board updates
|
||||||
|
- No pseudo-production execution
|
||||||
|
- No implementation patch for any scene
|
||||||
|
|
||||||
|
## Phase 0: Freeze Gap Taxonomy
|
||||||
|
|
||||||
|
Tasks:
|
||||||
|
|
||||||
|
1. Freeze the five runtime-semantics gap classes observed in the anchor case
|
||||||
|
2. Define high / medium / low risk buckets
|
||||||
|
3. Lock analysis outputs and stop rule
|
||||||
|
|
||||||
|
Acceptance:
|
||||||
|
|
||||||
|
1. The five gap classes are explicit and stable
|
||||||
|
2. The plan remains analysis-only
|
||||||
|
|
||||||
|
## Phase 1: Anchor-Case Evidence Extraction
|
||||||
|
|
||||||
|
Tasks:
|
||||||
|
|
||||||
|
1. Read `sweep-030-scene` generated assets:
|
||||||
|
- `scene.toml`
|
||||||
|
- `references/generation-report.json`
|
||||||
|
- `references/org-dictionary.json`
|
||||||
|
- generated script
|
||||||
|
2. Read source-scene evidence from the original `台区线损大数据-月_周累计线损率统计分析`
|
||||||
|
3. Record direct evidence for:
|
||||||
|
- alias gap
|
||||||
|
- dictionary recovery gap
|
||||||
|
- parameter default semantics gap
|
||||||
|
- resolver-to-request mapping gap
|
||||||
|
- runtime URL semantics gap
|
||||||
|
|
||||||
|
Acceptance:
|
||||||
|
|
||||||
|
1. `sweep-030-scene` has explicit evidence for each applicable gap class
|
||||||
|
|
||||||
|
## Phase 2: 102-Scene Inventory Scan
|
||||||
|
|
||||||
|
Tasks:
|
||||||
|
|
||||||
|
1. Scan all 102 final skills
|
||||||
|
2. Extract:
|
||||||
|
- deterministic keywords
|
||||||
|
- params presence
|
||||||
|
- dictionary reference presence
|
||||||
|
- bootstrap target presence
|
||||||
|
- generation-report URL evidence
|
||||||
|
3. Tag scenes with likely gap classes using bounded heuristics
|
||||||
|
|
||||||
|
Acceptance:
|
||||||
|
|
||||||
|
1. Every scene gets a runtime-semantics record
|
||||||
|
2. Every scene has `riskLevel` and `gaps`
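The acceptance rule above can be checked mechanically. The following is a minimal sketch, assuming each inventory record is a JSON object with `riskLevel` and `gaps` fields; the gap-class identifiers are hypothetical spellings of the five classes named in this plan, not identifiers confirmed by the repository.

```python
# Hypothetical identifiers for the five gap classes named in this plan.
GAP_CLASSES = {
    "alias_gap",
    "dictionary_recovery_gap",
    "parameter_default_semantics_gap",
    "resolver_to_request_mapping_gap",
    "runtime_url_semantics_gap",
}
RISK_LEVELS = {"high", "medium", "low"}


def validate_record(record: dict) -> list:
    """Return a list of problems found in one runtime-semantics record."""
    problems = []
    if record.get("riskLevel") not in RISK_LEVELS:
        problems.append("riskLevel must be high, medium, or low")
    gaps = record.get("gaps")
    if not isinstance(gaps, list):
        problems.append("gaps must be a list")
    else:
        # Any tag outside the frozen taxonomy is reported, not silently kept.
        unknown = sorted(set(gaps) - GAP_CLASSES)
        if unknown:
            problems.append(f"unknown gap classes: {unknown}")
    return problems
```

An empty `gaps` list is treated as valid here, so a scene with no detected gap still gets a record.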
|
||||||
|
|
||||||
|
## Phase 3: Family / Archetype Grouping
|
||||||
|
|
||||||
|
Tasks:
|
||||||
|
|
||||||
|
1. Group findings by archetype / family
|
||||||
|
2. Count gap incidence by bucket
|
||||||
|
3. Separate:
|
||||||
|
- generator-level fix candidates
|
||||||
|
- runtime-only residuals
|
||||||
|
|
||||||
|
Acceptance:
|
||||||
|
|
||||||
|
1. Summary counts exist per gap type and per archetype
|
||||||
|
2. The report distinguishes generator-level responsibilities from runtime-only responsibilities
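The summary counts required by this phase could be produced along these lines. This is a sketch only: the `archetype` and `gaps` field names are assumptions about the inventory record shape.

```python
from collections import Counter


def summarize(records):
    """Count gap incidence per gap class and per (archetype, gap class)."""
    per_gap = Counter()
    per_archetype = Counter()
    for record in records:
        # Records without an archetype are grouped under "unknown".
        archetype = record.get("archetype", "unknown")
        for gap in record.get("gaps", []):
            per_gap[gap] += 1
            per_archetype[(archetype, gap)] += 1
    return per_gap, per_archetype
```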
|
||||||
|
|
||||||
|
## Phase 4: Publish Analysis Assets
|
||||||
|
|
||||||
|
Deliverables:
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/generated_scene_runtime_semantics_gap_analysis_2026-04-20.json`
|
||||||
|
2. `docs/superpowers/reports/2026-04-20-generated-scene-runtime-semantics-gap-analysis-report.md`
|
||||||
|
|
||||||
|
Acceptance:
|
||||||
|
|
||||||
|
1. All 102 scenes are represented
|
||||||
|
2. `sweep-030-scene` is explicitly called out as anchor evidence
|
||||||
|
3. The report recommends next implementation routes, but does not execute them
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing the JSON inventory and report.
|
||||||
@@ -0,0 +1,34 @@
|
|||||||
|
# Generated Scene Runtime Semantics Rematerialization Refresh Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Dependency stage:
|
||||||
|
> - post route implementation
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Make full 102-scene rematerialization mandatory after runtime-semantics hardening routes land.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. completed route-local hardening reports
|
||||||
|
2. current canonical final skill root
|
||||||
|
3. current final materialization manifest/failure assets
|
||||||
|
|
||||||
|
## Required Outputs
|
||||||
|
|
||||||
|
1. refreshed final 102-skill materialization directory
|
||||||
|
2. refreshed materialization manifest
|
||||||
|
3. refreshed materialization failures asset
|
||||||
|
4. refreshed scene index / metadata layer
|
||||||
|
|
||||||
|
## Guardrails
|
||||||
|
|
||||||
|
1. no route may be considered complete without this refresh
|
||||||
|
2. rematerialization must use hardened generator rules, not manual skill edits
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing the rematerialization refresh plan.
|
||||||
|
|
||||||
|
Do not execute rematerialization inside this dependency plan.
|
||||||
@@ -0,0 +1,29 @@
|
|||||||
|
# Generated Scene Runtime Semantics Validation Refresh Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Dependency stage:
|
||||||
|
> - post rematerialization refresh
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Make validation refresh mandatory after runtime-semantics rematerialization.
|
||||||
|
|
||||||
|
## Required Refresh Layers
|
||||||
|
|
||||||
|
1. deterministic invocation readiness
|
||||||
|
2. natural-language parameter readiness
|
||||||
|
3. static validation
|
||||||
|
4. direct mock execution
|
||||||
|
5. pseudo-production handoff assets
|
||||||
|
|
||||||
|
## Guardrails
|
||||||
|
|
||||||
|
1. validation must consume the refreshed canonical 102-skill bundle
|
||||||
|
2. old validation assets may not be reused as proof of the hardened bundle
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing the validation refresh plan.
|
||||||
|
|
||||||
|
Do not execute validation refresh inside this dependency plan.
|
||||||
@@ -0,0 +1,47 @@
|
|||||||
|
# Generated Scene Runtime URL Classification Hardening Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Parent route:
|
||||||
|
> - `runtime_url_classification_hardening`
|
||||||
|
> Parent ledger:
|
||||||
|
> - `tests/fixtures/generated_scene/generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Implement the first reusable slice that separates runtime URL roles during generation.
|
||||||
|
|
||||||
|
## Fixed Input Bucket
|
||||||
|
|
||||||
|
Use the bounded bucket:
|
||||||
|
|
||||||
|
1. scenes with strong source evidence for multiple URL roles
|
||||||
|
2. scenes whose current generated manifest only exposes `target_url`
|
||||||
|
3. high-signal browser-script scenes where runtime context URL and module-route URL are likely to diverge
|
||||||
|
|
||||||
|
This first slice should focus on the highest-risk parameterized browser families before broader expansion.
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `src/generated_scene/ir.rs`
|
||||||
|
4. route-local generator tests
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. no callback-host/runtime implementation
|
||||||
|
2. no service-console changes
|
||||||
|
3. no direct edits to generated skills
|
||||||
|
4. no board or validation assets
|
||||||
|
|
||||||
|
## Expected Coverage Delta
|
||||||
|
|
||||||
|
1. generated metadata can distinguish app-entry/runtime-context/module-route roles
|
||||||
|
2. callers are no longer forced to guess `page_url` semantics for the bucketed scenes
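To illustrate the intended separation of URL roles, here is a small sketch. The class, field names, and the bucket test are hypothetical; the plan only states that app-entry, runtime-context, and module-route roles should become distinguishable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuntimeUrlRoles:
    """Hypothetical metadata shape separating the three URL roles."""
    app_entry_url: Optional[str] = None
    runtime_context_url: Optional[str] = None
    module_route_url: Optional[str] = None

    def distinct_urls(self) -> int:
        """Number of different non-empty URLs across the three roles."""
        urls = (self.app_entry_url, self.runtime_context_url, self.module_route_url)
        return len({u for u in urls if u})


def needs_classification(target_url: Optional[str], roles: RuntimeUrlRoles) -> bool:
    """True when the manifest exposes one target_url but evidence shows diverging roles."""
    return bool(target_url) and roles.distinct_urls() > 1
```

A scene whose three roles all resolve to the same URL would stay outside the bucket under this rule, which matches the plan's focus on scenes where the URLs are likely to diverge.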
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after the first reusable URL-classification slice is implemented and route-local follow-up assets are published.
|
||||||
|
|
||||||
|
Do not expand to every scene in this route plan.
|
||||||
@@ -0,0 +1,94 @@
|
|||||||
|
# Generated Scene Source Evidence Cross-Scan Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Parent roadmap:
|
||||||
|
> - `docs/superpowers/plans/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-plan.md`
|
||||||
|
> Parent design:
|
||||||
|
> - `docs/superpowers/specs/2026-04-20-generated-scene-source-evidence-cross-scan-design.md`
|
||||||
|
|
||||||
|
## Goal
|
||||||
|
|
||||||
|
Perform a bounded source-first cross-scan over the original 102 scene directories so the project can identify which scenes share the same runtime-semantics risk family as `sweep-030-scene`.
|
||||||
|
|
||||||
|
This plan is analysis-only.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_skill_102_final_materialization_manifest_2026-04-19.json`
|
||||||
|
2. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
3. source root:
|
||||||
|
- `D:/desk/智能体资料/全量业务场景/一平台场景`
|
||||||
|
|
||||||
|
## Boundaries
|
||||||
|
|
||||||
|
Allowed:
|
||||||
|
|
||||||
|
1. map the current 102 scenes to original source directories
|
||||||
|
2. scan bounded source evidence
|
||||||
|
3. publish JSON inventory and report
|
||||||
|
|
||||||
|
Forbidden:
|
||||||
|
|
||||||
|
1. no edits in `src/`
|
||||||
|
2. no edits to generated skills
|
||||||
|
3. no rematerialization
|
||||||
|
4. no validation reruns
|
||||||
|
5. no execution board updates
|
||||||
|
|
||||||
|
## Phase 0: Freeze Scene Mapping
|
||||||
|
|
||||||
|
Tasks:
|
||||||
|
|
||||||
|
1. derive the exact 102-scene source directory mapping
|
||||||
|
2. validate that each scene maps to one source directory or an explicit missing record
|
||||||
|
|
||||||
|
Acceptance:
|
||||||
|
|
||||||
|
1. all 102 scenes have a source mapping status
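A possible shape for the frozen mapping is sketched below. It assumes the source directory name equals the scene name, which this plan does not confirm; the `status` and `sourceDir` keys are illustrative.

```python
from pathlib import Path


def map_scenes_to_sources(scene_names, source_root):
    """Map scene id -> mapping record; scenes without a directory get an explicit missing record."""
    root = Path(source_root)
    mapping = {}
    for scene_id, scene_name in scene_names.items():
        candidate = root / scene_name
        if candidate.is_dir():
            mapping[scene_id] = {"status": "mapped", "sourceDir": str(candidate)}
        else:
            # Never drop a scene: record the miss so the count stays at 102.
            mapping[scene_id] = {"status": "missing", "sourceDir": None}
    return mapping
```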
|
||||||
|
|
||||||
|
## Phase 1: Run Bounded Source Evidence Scan
|
||||||
|
|
||||||
|
Tasks:
|
||||||
|
|
||||||
|
1. scan for alias evidence
|
||||||
|
2. scan for dictionary evidence
|
||||||
|
3. scan for default parameter evidence
|
||||||
|
4. scan for request mapping evidence
|
||||||
|
5. scan for runtime URL evidence
|
||||||
|
|
||||||
|
Acceptance:
|
||||||
|
|
||||||
|
1. each scene has evidence flags
|
||||||
|
2. representative evidence files are recorded where found
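The five scans above could share one bounded pass per scene. The sketch below is illustrative only: the regular expressions are placeholder heuristics, and the flag names and `max_files` bound are assumptions rather than the project's actual rules.

```python
import re

# Hypothetical, deliberately small patterns; real heuristics would need tuning.
EVIDENCE_PATTERNS = {
    "alias": re.compile(r"alias|别名"),
    "dictionary": re.compile(r"dict|enum|tree", re.IGNORECASE),
    "defaultParameter": re.compile(r"default", re.IGNORECASE),
    "requestMapping": re.compile(r"payload|params\s*[:=]", re.IGNORECASE),
    "runtimeUrl": re.compile(r"https?://"),
}


def scan_scene(files, max_files=50):
    """files: {relative_path: text}. Returns evidence flags and one representative file per family."""
    flags = {name: False for name in EVIDENCE_PATTERNS}
    evidence_files = {}
    # Sorting keeps the scan deterministic; the slice keeps it bounded.
    for path in sorted(files)[:max_files]:
        for name, pattern in EVIDENCE_PATTERNS.items():
            if not flags[name] and pattern.search(files[path]):
                flags[name] = True
                evidence_files[name] = path
    return {"flags": flags, "evidenceFiles": evidence_files}
```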
|
||||||
|
|
||||||
|
## Phase 2: Build Cross-Scan Ledger
|
||||||
|
|
||||||
|
Tasks:
|
||||||
|
|
||||||
|
1. write one record per scene
|
||||||
|
2. tag scenes with source-side risk hints
|
||||||
|
3. explicitly identify scenes that look similar to `sweep-030-scene`
|
||||||
|
|
||||||
|
Acceptance:
|
||||||
|
|
||||||
|
1. all 102 scenes appear in the ledger
|
||||||
|
2. the anchor case is clearly represented
|
||||||
|
|
||||||
|
## Phase 3: Publish Assets
|
||||||
|
|
||||||
|
Deliverables:
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/generated_scene_source_evidence_cross_scan_2026-04-20.json`
|
||||||
|
2. `docs/superpowers/reports/2026-04-20-generated-scene-source-evidence-cross-scan-report.md`
|
||||||
|
|
||||||
|
Acceptance:
|
||||||
|
|
||||||
|
1. the JSON can be used as the next input to the runtime-semantics ledger stage
|
||||||
|
2. the report summarizes the five evidence families across the 102-scene set
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing the JSON inventory and report.
|
||||||
|
|
||||||
|
Do not start rule-hardening or rematerialization in this plan.
|
||||||
@@ -0,0 +1,214 @@
|
|||||||
|
# Generated Scene Source-First Runtime Semantics Hardening Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Parent design: `docs/superpowers/specs/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Replace the weaker generated-skill-first analysis path with a stronger source-first roadmap:
|
||||||
|
|
||||||
|
1. scan all 102 original source scenes
|
||||||
|
2. detect scenes that can reproduce the same runtime-semantics defect classes exposed by `sweep-030-scene`
|
||||||
|
3. convert those findings into rule-level hardening routes
|
||||||
|
4. require full 102-scene rematerialization after rule changes
|
||||||
|
5. refresh the full validation stack after rematerialization
|
||||||
|
|
||||||
|
## Why This Plan Exists
|
||||||
|
|
||||||
|
The project goal is not merely to document gaps after they have already surfaced as failures in inner-network testing.
|
||||||
|
|
||||||
|
The goal is to prevent the same class of defect from reappearing across the remaining source scenes.
|
||||||
|
|
||||||
|
Therefore this plan is driven by original source-scene evidence, not generated skill artifacts alone.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. Original source root:
|
||||||
|
- `D:/desk/智能体资料/全量业务场景/一平台场景`
|
||||||
|
2. Current final generated skills:
|
||||||
|
- `examples/scene_skill_102_final_materialization_2026-04-19/skills`
|
||||||
|
3. Current 102-skill materialization manifest
|
||||||
|
4. Current invocation / parameter readiness assets
|
||||||
|
5. `sweep-030-scene` inner-network runtime findings
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
Allowed:
|
||||||
|
|
||||||
|
1. scan all 102 original source-scene directories
|
||||||
|
2. compare source evidence against current generated skills
|
||||||
|
3. produce risk ledgers, reports, and downstream bounded plans
|
||||||
|
|
||||||
|
Forbidden in this parent plan:
|
||||||
|
|
||||||
|
1. no implementation changes in `src/`
|
||||||
|
2. no skill manifest edits
|
||||||
|
3. no rematerialization execution yet
|
||||||
|
4. no validation reruns yet
|
||||||
|
5. no inner-network patching as a substitute for source-first analysis
|
||||||
|
|
||||||
|
## Workstreams
|
||||||
|
|
||||||
|
1. `WS1` Source Evidence Scan
|
||||||
|
2. `WS2` Runtime-Semantics Risk Ledger
|
||||||
|
3. `WS3` Rule Hardening Route Design
|
||||||
|
4. `WS4` Full Rematerialization and Validation Refresh Planning
|
||||||
|
|
||||||
|
## Phase 0: Freeze Parent Scope
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Make this the new parent roadmap for generated-scene runtime semantics hardening.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. freeze the five gap classes
|
||||||
|
2. freeze the source-first principle
|
||||||
|
3. freeze rematerialization as a required downstream step
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. future work must start from source-scene evidence
|
||||||
|
2. future fixes must be rule-level before scene-level
|
||||||
|
|
||||||
|
## Phase 1: Full 102 Source Cross-Scan
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Systematically scan the original 102 source scenes for high-signal evidence related to the five runtime-semantics gap classes.
|
||||||
|
|
||||||
|
### Required scan targets
|
||||||
|
|
||||||
|
1. dictionary / enum / tree files
|
||||||
|
2. default parameter logic
|
||||||
|
3. request payload field names
|
||||||
|
4. runtime URL candidates
|
||||||
|
5. operator-facing wording and alias sources
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. map each scene id to its original source directory
|
||||||
|
2. run a bounded evidence scan over all 102 source directories
|
||||||
|
3. tag source-side evidence flags per scene
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. source evidence scan JSON
|
||||||
|
2. source evidence scan report
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. all 102 scenes have source evidence flags
|
||||||
|
2. `sweep-030-scene` is validated as anchor evidence
|
||||||
|
|
||||||
|
## Phase 2: Build the Source-First Runtime Semantics Ledger
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Merge source-side evidence with generated-skill evidence into a full runtime-semantics risk ledger.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. compare source evidence with generated manifests and references
|
||||||
|
2. assign gap classes per scene
|
||||||
|
3. assign risk level per scene
|
||||||
|
4. distinguish:
|
||||||
|
- generator-level rule gap
|
||||||
|
- runtime-only residual
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. `generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
|
||||||
|
2. source-first runtime semantics report
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. all 102 scenes are represented
|
||||||
|
2. each scene has `gaps`, `riskLevel`, and `recommendedFixRoutes`
|
||||||
|
|
||||||
|
## Phase 3: Convert Ledger into Rule-Hardening Routes
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Turn the source-first ledger into bounded implementation routes that modify reusable generation rules rather than scene-specific patches.
|
||||||
|
|
||||||
|
### Candidate hardening routes
|
||||||
|
|
||||||
|
1. alias generation hardening
|
||||||
|
2. embedded dictionary extraction hardening
|
||||||
|
3. parameter default semantics recovery hardening
|
||||||
|
4. resolver-to-request mapping hardening
|
||||||
|
5. runtime URL classification hardening
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. count scenes affected by each route
|
||||||
|
2. prioritize routes by coverage gain and reuse
|
||||||
|
3. define bounded implementation slices for the top routes
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. child-plan sequence for runtime semantics hardening
|
||||||
|
2. bounded route plans for top reusable fixes
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. no route is scene-name hardcoded
|
||||||
|
2. route priority is based on 102-scene reuse, not anecdotal debugging order
|
||||||
|
|
||||||
|
## Phase 4: Require Full 102 Rematerialization
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Ensure that hardened rules are propagated into the final generated skill inventory.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. define full 102 rematerialization as mandatory after route implementation
|
||||||
|
2. define materialization outputs that must be refreshed
|
||||||
|
3. define how the canonical final skill bundle is replaced
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. full rematerialization refresh plan
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. no runtime-semantics hardening route may be considered complete without rematerialization
|
||||||
|
|
||||||
|
## Phase 5: Require Validation Refresh
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Refresh downstream validation after rematerialization so improved rules are measured end-to-end.
|
||||||
|
|
||||||
|
### Required refresh layers
|
||||||
|
|
||||||
|
1. deterministic invocation readiness
|
||||||
|
2. natural-language parameter readiness
|
||||||
|
3. static validation
|
||||||
|
4. direct mock execution
|
||||||
|
5. pseudo-production handoff refresh
|
||||||
|
|
||||||
|
### Deliverables
|
||||||
|
|
||||||
|
1. validation refresh plan
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. the new final 102-skill bundle is revalidated before more inner-network testing
|
||||||
|
|
||||||
|
## Immediate Next Output
|
||||||
|
|
||||||
|
This parent plan should immediately lead to a new bounded child plan:
|
||||||
|
|
||||||
|
- `2026-04-20-generated-scene-source-evidence-cross-scan-plan.md`
|
||||||
|
|
||||||
|
That child plan should perform the actual source cross-scan over the 102 original scenes.
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after publishing this parent plan and its design.
|
||||||
|
|
||||||
|
Do not execute the source cross-scan or implementation inside this plan.
|
||||||
@@ -0,0 +1,143 @@
|
|||||||
|
# Generated Scene Source-First Runtime Semantics Ledger Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Parent roadmap:
|
||||||
|
> - `docs/superpowers/plans/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-plan.md`
|
||||||
|
> Parent design:
|
||||||
|
> - `docs/superpowers/specs/2026-04-20-generated-scene-source-first-runtime-semantics-ledger-design.md`
|
||||||
|
> Upstream completed step:
|
||||||
|
> - `docs/superpowers/plans/2026-04-20-generated-scene-source-evidence-cross-scan-plan.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Build the full source-first runtime-semantics ledger for the current 102-scene set.
|
||||||
|
|
||||||
|
This stage exists to convert the completed source cross-scan into a reusable comparison ledger before any analyzer/generator hardening route is defined.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/generated_scene_source_evidence_cross_scan_2026-04-20.json`
|
||||||
|
2. `examples/scene_skill_102_final_materialization_2026-04-19/skills`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json`
|
||||||
|
4. `tests/fixtures/generated_scene/scene_skill_102_natural_language_parameter_readiness_2026-04-20.json`
|
||||||
|
5. `tests/fixtures/generated_scene/scene_skill_102_parameter_dictionary_template_normalization_2026-04-20.json`
|
||||||
|
6. `sweep-030-scene` inner-network findings already established in prior discussion and analysis assets
|
||||||
|
|
||||||
|
## Scope Guardrails
|
||||||
|
|
||||||
|
Allowed:
|
||||||
|
|
||||||
|
1. read source cross-scan outputs
|
||||||
|
2. read current generated skills and references
|
||||||
|
3. compare source evidence with generated evidence
|
||||||
|
4. assign gap classes, risk levels, and route hints
|
||||||
|
5. publish ledger JSON and report
|
||||||
|
|
||||||
|
Forbidden:
|
||||||
|
|
||||||
|
1. no implementation changes in `src/`
|
||||||
|
2. no manifest or script edits
|
||||||
|
3. no rematerialization
|
||||||
|
4. no validation reruns
|
||||||
|
5. no execution-board update
|
||||||
|
6. no inner-network testing
|
||||||
|
|
||||||
|
## Phase 0: Freeze Ledger Inputs
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Make the cross-scan asset and current generated-skill assets the only valid inputs for this ledger stage.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. verify the cross-scan JSON parses
|
||||||
|
2. verify all 102 scenes are represented
|
||||||
|
3. verify the current generated skill root is readable
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. the ledger stage starts from a stable 102-scene evidence base
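The first two checks of this phase can be expressed as a small gate. This sketch assumes the cross-scan JSON has a top-level `scenes` array whose entries carry a `sceneId`; the real asset layout may differ.

```python
import json

EXPECTED_SCENE_COUNT = 102


def check_cross_scan(text, expected=EXPECTED_SCENE_COUNT):
    """Return (ok, message) for a cross-scan JSON document passed as text."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"cross-scan JSON does not parse: {exc.msg}"
    if not isinstance(data, dict):
        return False, "cross-scan JSON must be an object"
    # Count distinct scene ids so duplicated records cannot mask a missing scene.
    scene_ids = {s.get("sceneId") for s in data.get("scenes", []) if isinstance(s, dict)}
    scene_ids.discard(None)
    if len(scene_ids) != expected:
        return False, f"expected {expected} scenes, found {len(scene_ids)}"
    return True, "ok"
```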
|
||||||
|
|
||||||
|
## Phase 1: Build Per-Scene Comparison Records
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
For each scene, merge source evidence with generated-skill evidence into one comparison record.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. load source evidence flags, evidence files, alias samples, request tokens, and runtime URL samples
|
||||||
|
2. read current scene-level generated manifests/references as needed
|
||||||
|
3. summarize generated-side evidence for:
|
||||||
|
- invocation aliases
|
||||||
|
- dictionaries
|
||||||
|
- parameter defaults
|
||||||
|
- request mapping
|
||||||
|
- runtime URL roles
|
||||||
|
4. write one comparison record per scene
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. all 102 scenes have both source-side and generated-side summaries
|
||||||
|
|
||||||
|
## Phase 2: Assign Gap Classes and Risk Levels
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Convert comparison records into a stable runtime-semantics risk ledger.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. assign `gaps` from the fixed five-class taxonomy
|
||||||
|
2. assign `riskLevel = high|medium|low`
|
||||||
|
3. assign:
|
||||||
|
- `generatorLevelGap`
|
||||||
|
- `runtimeOnlyResidual`
|
||||||
|
4. record `comparisonNotes`
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. every scene has `gaps`
|
||||||
|
2. every scene has `riskLevel`
|
||||||
|
3. every scene has `recommendedFixRoutes`
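One way to derive the three required fields from a set of gaps is sketched here. The thresholds (three or more gaps is `high`, one or two is `medium`) are an assumption for illustration, and apart from `runtime_url_classification_hardening` the route identifiers are hypothetical spellings of the routes listed in the parent roadmap.

```python
# Gap class -> hardening route. Identifiers are illustrative assumptions.
GAP_TO_ROUTE = {
    "alias_gap": "alias_generation_hardening",
    "dictionary_recovery_gap": "embedded_dictionary_extraction_hardening",
    "parameter_default_semantics_gap": "parameter_default_semantics_recovery_hardening",
    "resolver_to_request_mapping_gap": "resolver_to_request_mapping_hardening",
    "runtime_url_semantics_gap": "runtime_url_classification_hardening",
}


def build_ledger_entry(scene_id, gaps):
    """Build one ledger entry with gaps, riskLevel, and recommendedFixRoutes."""
    unique_gaps = sorted(set(gaps))
    if len(unique_gaps) >= 3:  # assumed threshold
        risk = "high"
    elif unique_gaps:
        risk = "medium"
    else:
        risk = "low"
    return {
        "sceneId": scene_id,
        "gaps": unique_gaps,
        "riskLevel": risk,
        "recommendedFixRoutes": [GAP_TO_ROUTE[g] for g in unique_gaps],
    }
```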
|
||||||
|
|
||||||
|
## Phase 3: Aggregate Route-Level Signals
|
||||||
|
|
||||||
|
### Objective
|
||||||
|
|
||||||
|
Produce route-level reuse signals from the scene ledger so the next stage can design bounded hardening routes.
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. count scenes carrying each gap class
|
||||||
|
2. count scenes marked `generatorLevelGap`
|
||||||
|
3. count scenes marked `runtimeOnlyResidual`
|
||||||
|
4. identify the highest-density reusable route clusters
|
||||||
|
|
||||||
|
### Acceptance
|
||||||
|
|
||||||
|
1. the ledger can drive downstream route prioritization without returning to anecdotal scene debugging
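Route-level reuse signals can be derived directly from the ledger entries. The sketch below assumes each entry carries a `recommendedFixRoutes` list and ranks routes by the number of scenes they would cover.

```python
from collections import Counter


def rank_routes(ledger):
    """Return (route, scene_count) pairs, highest coverage first, ties broken by name."""
    counts = Counter()
    for entry in ledger:
        # A scene counts once per route even if the route is listed twice.
        for route in set(entry.get("recommendedFixRoutes", [])):
            counts[route] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Ranking by scene count alone is the simplest reading of "highest-density"; a weighted score that also considers risk level would be an equally reasonable choice.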
|
||||||
|
|
||||||
|
## Deliverables
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
|
||||||
|
2. `docs/superpowers/reports/2026-04-20-generated-scene-source-first-runtime-semantics-ledger-report.md`
|
||||||
|
|
||||||
|
## Expected Coverage
|
||||||
|
|
||||||
|
The ledger should represent:
|
||||||
|
|
||||||
|
1. all 102 scenes
|
||||||
|
2. all five canonical gap classes
|
||||||
|
3. source-first route hints derived from the completed cross-scan
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after:
|
||||||
|
|
||||||
|
1. publishing the ledger JSON
|
||||||
|
2. publishing the ledger report
|
||||||
|
3. summarizing the highest-reuse hardening routes
|
||||||
|
|
||||||
|
Do not create implementation route plans inside this ledger plan.
|
||||||
@@ -0,0 +1,110 @@
|
|||||||
|
# Scene Skill 102 Deterministic Invocation Readiness Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Design: `2026-04-20-scene-skill-102-deterministic-invocation-readiness-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Make the materialized scene skills ready for sgClaw deterministic invocation using natural-language instructions ending with `。。。`.
|
||||||
|
|
||||||
|
This plan does not prove production execution. It only prepares and verifies registry/dispatch readiness.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `examples/scene_skill_102_final_materialization_2026-04-19`
|
||||||
|
2. `examples/scene_skill_102_final_materialization_2026-04-19/scene_skill_102_index.json`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. `examples/scene_skill_102_final_materialization_2026-04-19/skills/*/scene.toml`
|
||||||
|
2. `tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_2026-04-20.json`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_samples_2026-04-20.json`
|
||||||
|
4. `docs/superpowers/reports/2026-04-20-scene-skill-102-deterministic-invocation-readiness-report.md`
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/compat/scene_platform/dispatch.rs`
|
||||||
|
2. `src/compat/scene_platform/resolvers.rs`
|
||||||
|
3. `src/generated_scene/analyzer.rs`
|
||||||
|
4. `src/generated_scene/generator.rs`
|
||||||
|
5. generated `scripts/*`
|
||||||
|
6. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Phase 0: Freeze Invocation Readiness Boundary
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Confirm the final materialization root exists.
|
||||||
|
2. Confirm the human-readable index exists.
|
||||||
|
3. Confirm this plan excludes browser execution and runtime changes.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Scope is deterministic invocation readiness only.
|
||||||
|
2. `sweep-012-scene` remains outside complete-package normalization.
|
||||||
|
|
||||||
|
## Phase 1: Normalize Deterministic Manifest Metadata
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. For each complete package, set `[deterministic].suffix = "。。。"`.
|
||||||
|
2. Preserve scene id, skill, tool, bootstrap, params, artifact, and postprocess sections.
|
||||||
|
3. Generate include keywords from:
|
||||||
|
- full scene name;
|
||||||
|
- meaningful scene-name tokens;
|
||||||
|
- archetype/family hints when available.
|
||||||
|
4. Keep existing exclude keywords unchanged.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. All complete packages use suffix `。。。`.
|
||||||
|
2. Every complete package has non-empty include keywords.
|
||||||
|
3. Skill directories and scripts are unchanged.
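Keyword generation from a scene name might look like the sketch below. The separator set and the two-character minimum for a "meaningful" token are assumptions; the plan does not define how tokens are chosen.

```python
import re

SUFFIX = "。。。"  # deterministic suffix required by this plan


def include_keywords(scene_name, hints=()):
    """Build include keywords: full name first, then name tokens, then archetype/family hints."""
    # Assumed rule: split on common separators and drop one-character tokens.
    tokens = [t for t in re.split(r"[-_\s/]+", scene_name) if len(t) >= 2]
    keywords = []
    for keyword in [scene_name, *tokens, *hints]:
        if keyword and keyword not in keywords:
            keywords.append(keyword)
    return keywords
```

For the anchor scene name `台区线损大数据-月_周累计线损率统计分析`, this rule keeps the full name plus `台区线损大数据` and `周累计线损率统计分析`, and drops the single-character token `月`.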
|
||||||
|
|
||||||
|
## Phase 2: Build Invocation Samples
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
For each complete package, generate at least:
|
||||||
|
|
||||||
|
1. full-name sample: `<sceneName>。。。`
|
||||||
|
2. keyword sample: `<bestKeyword>。。。`
|
||||||
|
3. parameterized sample when params exist.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Sample asset contains all complete packages.
|
||||||
|
2. Failed package is listed as excluded.
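The three sample kinds can be generated with a helper like this one. It is a sketch: the sample record shape and the way a parameter phrase is joined to the scene name are assumptions.

```python
SUFFIX = "。。。"


def build_samples(scene_name, best_keyword, param_phrase=None):
    """Return invocation samples for one complete package."""
    samples = [
        {"kind": "full-name", "text": f"{scene_name}{SUFFIX}"},
        {"kind": "keyword", "text": f"{best_keyword}{SUFFIX}"},
    ]
    if param_phrase:
        # Only scenes that declare params get a parameterized sample.
        samples.append({"kind": "parameterized", "text": f"{scene_name} {param_phrase}{SUFFIX}"})
    return samples
```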
|
||||||
|
|
||||||
|
## Phase 3: Dispatch Dry-Run
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Run registry-backed dispatch checks without browser execution.
|
||||||
|
2. Verify full-name sample selects the expected scene.
|
||||||
|
3. Record ambiguous or unsupported dispatch results.
|
||||||
|
4. Record required-param prompts separately from dispatch misses.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Every complete package has a dispatch result.
|
||||||
|
2. Results distinguish selected, prompt, ambiguous, and no-match.
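The four result categories can be illustrated with a toy matcher. This is not sgClaw's dispatch logic: the substring matching rule and the registry entry shape (`include`, `exclude`, `requiredParams`) are assumptions made only to show how the categories separate.

```python
SUFFIX = "。。。"


def dry_run_dispatch(instruction, registry, provided_params=()):
    """Classify one instruction as selected, prompt, ambiguous, or no-match without executing it."""
    if not instruction.endswith(SUFFIX):
        return {"status": "no-match", "sceneId": None, "missingParams": []}
    text = instruction[: -len(SUFFIX)]
    candidates = [
        scene for scene in registry
        if any(k in text for k in scene["include"])
        and not any(k in text for k in scene.get("exclude", []))
    ]
    if not candidates:
        return {"status": "no-match", "sceneId": None, "missingParams": []}
    if len(candidates) > 1:
        return {"status": "ambiguous", "sceneId": None, "missingParams": []}
    scene = candidates[0]
    # A unique match with missing required params is a prompt, not a dispatch miss.
    missing = [p for p in scene.get("requiredParams", []) if p not in provided_params]
    status = "prompt" if missing else "selected"
    return {"status": status, "sceneId": scene["sceneId"], "missingParams": missing}
```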
|
||||||
|
|
||||||
|
## Phase 4: Publish Readiness Report
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Publish readiness JSON.
|
||||||
|
2. Publish invocation sample JSON.
|
||||||
|
3. Publish superpowers report.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. Report states deterministic-ready count.
|
||||||
|
2. Report states gap count and gap categories.
|
||||||
|
3. Report states whether runtime dispatch changes are needed.
|
||||||
|
|
||||||
|
## Stop Statement
|
||||||
|
|
||||||
|
Stop after readiness assets and report are published. Do not start browser execution, static validation, production validation, or runtime dispatch implementation under this plan.
|
||||||
@@ -0,0 +1,112 @@
|
|||||||
|
# Scene Skill 102 Full Direct Mock Execution Plan
|
||||||
|
|
||||||
|
> Date: 2026-04-20
|
||||||
|
> Status: Draft
|
||||||
|
> Upstream Design: `docs/superpowers/specs/2026-04-20-scene-skill-102-full-direct-mock-execution-design.md`
|
||||||
|
|
||||||
|
## Plan Intent
|
||||||
|
|
||||||
|
Run all `102` final materialized scene skill scripts through a local direct mock runtime.
|
||||||
|
|
||||||
|
This plan expands beyond representative harness execution, but remains fully mock-only and local.
|
||||||
|
|
||||||
|
## Fixed Inputs
|
||||||
|
|
||||||
|
1. `examples/scene_skill_102_final_materialization_2026-04-19/skills`
|
||||||
|
2. `examples/scene_skill_102_final_materialization_2026-04-19/scene_skill_102_index.json`
|
||||||
|
3. `tests/fixtures/generated_scene/scene_skill_102_static_validation_2026-04-20.json`
|
||||||
|
4. `tests/fixtures/generated_scene/scene_skill_102_mock_runtime_harness_results_2026-04-20.json`
|
||||||
|
|
||||||
|
## Planned Outputs
|
||||||
|
|
||||||
|
1. `tests/fixtures/generated_scene/scene_skill_102_full_direct_mock_execution_2026-04-20.json`
|
||||||
|
2. `docs/superpowers/reports/2026-04-20-scene-skill-102-full-direct-mock-execution-report.md`
|
||||||
|
|
||||||
|
## Allowed Files
|
||||||
|
|
||||||
|
1. new direct mock runner under `tests/`
|
||||||
|
2. `tests/fixtures/generated_scene/scene_skill_102_full_direct_mock_execution_2026-04-20.json`
|
||||||
|
3. `docs/superpowers/reports/2026-04-20-scene-skill-102-full-direct-mock-execution-report.md`
|
||||||
|
|
||||||
|
## Forbidden Files
|
||||||
|
|
||||||
|
1. `src/generated_scene/analyzer.rs`
|
||||||
|
2. `src/generated_scene/generator.rs`
|
||||||
|
3. `src/generated_scene/ir.rs`
|
||||||
|
4. `examples/scene_skill_102_final_materialization_2026-04-19/skills/**`
|
||||||
|
5. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||||
|
|
||||||
|
## Phase 0: Freeze Direct Mock Boundary
|
||||||
|
|
||||||
|
### Tasks
|
||||||
|
|
||||||
|
1. Confirm the representative mock harness run is complete.
|
||||||
|
2. Confirm this plan does not mutate generated skill packages.
|
||||||
|
3. Confirm this plan does not use real network, browser, or credentials.
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
1. direct mock starts from final materialized skills
|
||||||
|
2. generated skills remain unchanged
|
||||||
|
|
||||||
|
## Phase 1: Build Direct Mock Runner

### Tasks

1. load the `102` scene index
2. locate each generated script
3. reuse fake runtime dependencies by archetype
4. call `buildBrowserEntrypointResult`
5. capture artifact status, row count, failure reason, and mock request log

### Acceptance Criteria

1. every scene is attempted
2. no single scene failure aborts the full run
3. no real request is sent

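The runner loop described in Phase 1 could be sketched roughly as follows. This is only an illustration in Python under stated assumptions: `DirectMockResult`, `run_all`, and the `run_scene` callback are hypothetical names, and the callback stands in for whatever actually loads a generated script and calls `buildBrowserEntrypointResult` against fake dependencies.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class DirectMockResult:
    # Field names are illustrative; the plan only lists what must be captured.
    scene_id: str
    artifact_status: Optional[str] = None
    row_count: Optional[int] = None
    failure_reason: Optional[str] = None
    mock_request_log: list = field(default_factory=list)


def run_all(scene_ids, run_scene: Callable[[str, list], dict]):
    """Attempt every scene; a failure in one scene never aborts the run."""
    results = []
    for scene_id in scene_ids:
        request_log = []  # filled by the fake runtime, never by real I/O
        try:
            outcome = run_scene(scene_id, request_log)
            results.append(DirectMockResult(
                scene_id=scene_id,
                artifact_status=outcome.get("artifact_status"),
                row_count=outcome.get("row_count"),
                mock_request_log=request_log,
            ))
        except Exception as exc:  # isolate per-scene failures
            results.append(DirectMockResult(
                scene_id=scene_id,
                failure_reason=f"{type(exc).__name__}: {exc}",
                mock_request_log=request_log,
            ))
    return results
```

Catching the exception per scene is what keeps one failing script from hiding the results of the others, and it also gives every failure a named reason.
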
## Phase 2: Execute Direct Mock For 102

### Tasks

1. run the direct mock runner
2. write per-scene direct mock result
3. classify each scene as:
   - `direct-mock-pass`
   - `direct-mock-partial`
   - `direct-mock-fail`

### Acceptance Criteria

1. output record count is `102`
2. each failure has a named reason

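The Phase 2 acceptance criteria can be checked mechanically. The sketch below is a minimal Python illustration, assuming each record is a dict with hypothetical `scene_id`, `status`, and `failure_reason` keys; it also assumes that `direct-mock-partial` records need a reason, which the plan does not state explicitly.

```python
ALLOWED_STATUSES = {"direct-mock-pass", "direct-mock-partial", "direct-mock-fail"}
EXPECTED_SCENE_COUNT = 102


def check_results(records, expected_count=EXPECTED_SCENE_COUNT):
    """Return a list of problems; an empty list means the criteria hold."""
    problems = []
    if len(records) != expected_count:
        problems.append(f"expected {expected_count} records, got {len(records)}")
    for rec in records:
        status = rec.get("status")
        if status not in ALLOWED_STATUSES:
            problems.append(f"{rec.get('scene_id')}: unknown status {status!r}")
        elif status != "direct-mock-pass" and not rec.get("failure_reason"):
            # Assumption: anything short of a pass must carry a named reason.
            problems.append(f"{rec.get('scene_id')}: missing failure reason")
    return problems
```
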
## Phase 3: Publish Report

### Tasks

1. summarize direct mock pass/fail
2. summarize results by archetype
3. identify remaining mock-only blockers
4. recommend whether pseudo-production batch selection should start

### Acceptance Criteria

1. report does not claim production execution
2. report separates mock pass from production pass

## Completion Criteria

This plan is complete when:

1. all `102` scenes have direct mock results
2. JSON asset is published
3. report is published
4. generated skill packages remain unchanged

## Stop Statement

Stop after publishing direct mock execution results and report.

Do not start pseudo-production batch selection under this plan.

@@ -0,0 +1,279 @@
# Scene Skill 102 Mock Runtime Harness Implementation Plan

> Date: 2026-04-20
> Status: Draft
> Upstream Design: `docs/superpowers/specs/2026-04-20-scene-skill-102-mock-runtime-harness-implementation-design.md`
> Input Matrix: `tests/fixtures/generated_scene/scene_skill_102_mock_runtime_validation_matrix_2026-04-20.json`

## Plan Intent

Implement and execute bounded mock runtime harnesses for representative generated scene skills.

This plan validates generated script control flow under fake dependencies. It does not validate production access, real data correctness, or browser-integrated host behavior.

## Fixed Inputs

1. `examples/scene_skill_102_final_materialization_2026-04-19/skills`
2. `tests/fixtures/generated_scene/scene_skill_102_static_validation_2026-04-20.json`
3. `tests/fixtures/generated_scene/scene_skill_102_dispatch_dry_run_validation_2026-04-20.json`
4. `tests/fixtures/generated_scene/scene_skill_102_mock_runtime_validation_matrix_2026-04-20.json`
5. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_readiness_2026-04-20.json`

## Planned Outputs

1. `tests/fixtures/generated_scene/scene_skill_102_mock_runtime_harness_results_2026-04-20.json`
2. `docs/superpowers/reports/2026-04-20-scene-skill-102-mock-runtime-harness-report.md`

## Allowed Files

1. new mock harness files under `tests/` or `tests/fixtures/generated_scene/`
2. `tests/fixtures/generated_scene/scene_skill_102_mock_runtime_harness_results_2026-04-20.json`
3. `docs/superpowers/reports/2026-04-20-scene-skill-102-mock-runtime-harness-report.md`

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `src/generated_scene/ir.rs`
4. `examples/scene_skill_102_final_materialization_2026-04-19/skills/**`
5. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`

## Workstreams

1. `WS1` Mock harness foundation
2. `WS2` Mainline fetch archetype harnesses
3. `WS3` Small bucket harnesses
4. `WS4` Boundary/runtime harnesses
5. `WS5` Integrated result reporting

## Phase 0: Freeze Mock Runtime Boundary

### Objective

Freeze mock validation as a non-production, non-browser, non-network stage.

### Tasks

1. Confirm static validation is `102 / 102`.
2. Confirm deterministic dispatch dry-run is `102 / 102`.
3. Confirm this plan does not mutate generated skill packages.
4. Confirm this plan does not require production credentials or network access.

### Deliverables

1. baseline section in final mock runtime harness report

### Acceptance Criteria

1. no production environment is accessed
2. no generated skill is modified
3. no official board status is changed

## Phase 1: Mock Harness Foundation

### Objective

Create the shared fake runtime primitives used by all representative harnesses.

### Tasks

1. define fake `fetch`
2. define fake browser DOM surface
3. define fake artifact writer
4. define fake host bridge callback surface
5. define fake local-doc service surface
6. define common result schema:
   - `script-load-pass`
   - `mock-runtime-pass`
   - `mock-runtime-partial`
   - `mock-runtime-fail`

### Deliverables

1. shared mock harness implementation

### Acceptance Criteria

1. harness foundation does not call real network
2. harness foundation can run without browser or credentials
3. harness foundation can load a generated script from the final materialization root

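One way to pin down the common result schema from task 6 is an enumeration of the four statuses plus a small record type. The Python sketch below is illustrative only: `MockStatus`, `HarnessResult`, and the field names other than the four status strings are assumptions, not the schema actually used by the harness.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class MockStatus(str, Enum):
    # The four values come from the plan; the enum itself is hypothetical.
    SCRIPT_LOAD_PASS = "script-load-pass"
    MOCK_RUNTIME_PASS = "mock-runtime-pass"
    MOCK_RUNTIME_PARTIAL = "mock-runtime-partial"
    MOCK_RUNTIME_FAIL = "mock-runtime-fail"


@dataclass
class HarnessResult:
    scene_id: str
    archetype: str
    status: MockStatus
    failure_reason: Optional[str] = None

    def to_json(self) -> str:
        payload = asdict(self)
        payload["status"] = self.status.value  # store the plain string
        return json.dumps(payload, ensure_ascii=False)
```

Constructing `MockStatus("some-other-value")` raises `ValueError`, so an unexpected status is rejected when a record is built rather than when the report is read.
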
## Phase 2: Route 1 - Paginated Enrichment Harness

### Objective

Validate the largest archetype bucket first.

### Fixed Representatives

1. `sweep-001-scene`
2. `sweep-002-scene`
3. `sweep-003-scene`

### Tasks

1. load each representative script
2. provide fake primary page response
3. provide fake enrichment response
4. verify expected request order where observable
5. verify artifact metadata or structured result is produced

### Deliverables

1. paginated enrichment mock result records

### Acceptance Criteria

1. each representative receives a `mock-runtime-*` status
2. no real request is sent
3. failures include named failure reason

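The fake primary and enrichment responses, together with the request-order check, might look like the following Python sketch. It is a stand-in under assumptions: `FakeFetch`, `requests_in_order`, and the `/mock/...` URLs are hypothetical, and the real harness would expose an equivalent fake to the generated script in its own runtime.

```python
class FakeFetch:
    """Serve canned responses and record every requested URL in order."""

    def __init__(self, canned):
        self._canned = dict(canned)
        self.request_log = []

    def __call__(self, url, **options):
        self.request_log.append(url)
        if url not in self._canned:
            # An unexpected URL is a harness gap, never a real network call.
            raise LookupError(f"no canned response for {url}")
        return self._canned[url]


def requests_in_order(request_log, expected):
    """True if `expected` occurs in `request_log` as an ordered subsequence."""
    remaining = iter(request_log)
    # `in` on an iterator consumes it, which enforces the ordering.
    return all(want in remaining for want in expected)
```

Checking for an ordered subsequence rather than exact equality tolerates extra requests between the primary page call and the enrichment call, which matches the plan's "where observable" wording.
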
## Phase 3: Route 2 - G2 And G1-E Fetch Harnesses

### Objective

Validate fetch-based mainline small buckets.

### Fixed Representatives

`multi_mode_request`:

1. `sweep-020-scene`
2. `sweep-023-scene`
3. `sweep-030-scene`

`single_request_enrichment`:

1. `sweep-013-scene`
2. `sweep-016-scene`
3. `sweep-068-scene`

### Tasks

1. run representative scripts with fake fetch
2. verify mode/request paths for multi-mode scenes
3. verify enrichment path for single-request enrichment scenes
4. record pass/fail reason

### Deliverables

1. multi-mode request mock result records
2. single-request enrichment mock result records

### Acceptance Criteria

1. each representative receives a `mock-runtime-*` status
2. real-sample or production execution is not started

## Phase 4: Route 3 - Inventory And Page-State Harnesses

### Objective

Validate the small specialized buckets.

### Fixed Representatives

`multi_endpoint_inventory`:

1. `sweep-084-scene`
2. `sweep-085-scene`

`page_state_eval`:

1. `sweep-066-scene`
2. `sweep-094-scene`

### Tasks

1. run multi-endpoint representatives with fake endpoint responses
2. run page-state representatives with fake DOM state
3. record pass/fail reason

### Deliverables

1. inventory mock result records
2. page-state mock result records

### Acceptance Criteria

1. each representative receives a `mock-runtime-*` status
2. no host browser is required

## Phase 5: Route 4 - Local-Doc And Host-Bridge Harnesses

### Objective

Validate boundary runtime families with fake local-doc and fake host-bridge surfaces.

### Fixed Representatives

`local_doc_pipeline`:

1. `sweep-012-scene`
2. `sweep-017-scene`
3. `sweep-019-scene`

`host_bridge_workflow`:

1. `sweep-007-scene`
2. `sweep-009-scene`
3. `sweep-010-scene`

### Tasks

1. run local-doc representatives with fake local document query and export responses
2. run host-bridge representatives with fake action and callback completion responses
3. classify boundary failures as mock harness gaps or script contract gaps

### Deliverables

1. local-doc mock result records
2. host-bridge mock result records

### Acceptance Criteria

1. no real host bridge is invoked
2. no local document service is invoked
3. failures are explicitly categorized

## Phase 6: Integrated Mock Runtime Report

### Objective

Publish representative execution results and propagated matrix interpretation.

### Tasks

1. write `scene_skill_102_mock_runtime_harness_results_2026-04-20.json`
2. summarize representative pass/fail by archetype
3. summarize which non-representative scenes are covered only by representative inference
4. identify which archetypes still require direct mock expansion
5. recommend whether to proceed to pseudo-production batch planning

### Deliverables

1. `tests/fixtures/generated_scene/scene_skill_102_mock_runtime_harness_results_2026-04-20.json`
2. `docs/superpowers/reports/2026-04-20-scene-skill-102-mock-runtime-harness-report.md`

### Acceptance Criteria

1. report distinguishes representative execution from propagated coverage
2. report does not claim production execution
3. report does not update official board

## Completion Criteria

This plan is complete when:

1. every fixed representative has a mock runtime result record
2. integrated mock runtime results JSON is published
3. mock runtime report is published
4. generated skill packages remain unchanged
5. no real browser or production environment was used

## Stop Statement

Stop after publishing mock runtime harness results and report.

Do not start pseudo-production or real-environment validation under this plan.

@@ -0,0 +1,104 @@
# Scene Skill 102 Natural-Language Parameter Readiness Plan

> Date: 2026-04-20
> Design: `2026-04-20-scene-skill-102-natural-language-parameter-readiness-design.md`

## Plan Intent

Build a 102-scene natural-language invocation parameter readiness view before pseudo-production testing.

This plan identifies which skills should be invoked with query conditions such as organization and period, which skills currently support only deterministic scene-keyword selection, and which required-param skills have resolver gaps.

## Fixed Inputs

1. `examples/scene_skill_102_final_materialization_2026-04-19/skills`
2. `tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json`
3. `tests/fixtures/generated_scene/scene_skill_102_full_direct_mock_execution_2026-04-20.json`

## Allowed Outputs

1. `tests/fixtures/generated_scene/scene_skill_102_natural_language_parameter_readiness_2026-04-20.json`
2. `tests/fixtures/generated_scene/scene_skill_102_natural_language_invocation_samples_2026-04-20.json`
3. `docs/superpowers/reports/2026-04-20-scene-skill-102-natural-language-parameter-readiness-report.md`

## Forbidden Files

1. `src/compat/scene_platform/dispatch.rs`
2. `src/compat/scene_platform/resolvers.rs`
3. `src/generated_scene/analyzer.rs`
4. `src/generated_scene/generator.rs`
5. `examples/scene_skill_102_final_materialization_2026-04-19/skills/*`
6. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`

## Phase 0: Freeze Boundary

### Tasks

1. Confirm final skill count is `102`.
2. Confirm this plan is analysis-only.
3. Confirm no browser, network, host bridge, or production execution is performed.

### Acceptance Criteria

1. No generated skill files are modified.
2. No runtime source files are modified.

## Phase 1: Parameter Manifest Scan

### Tasks

1. Read each `SKILL.toml` for scene name and archetype.
2. Read each `scene.toml` for deterministic suffix, params, and resolver declarations.
3. Record required params and resolver types.
4. Check resolver resources such as dictionary files.

### Acceptance Criteria

1. Each of the `102` scenes has exactly one parameter scan record.
2. Required-param scenes are explicitly identified.

## Phase 2: Readiness Classification

### Tasks

1. Mark scenes with supported, populated resolver resources as `parameter-ready`.
2. Mark scenes with empty or missing resolver resources as `parameter-gap`.
3. Mark no-param scenes as `parameter-not-required`.
4. Mark no-param scenes with likely filter words as `parameter-implicit-risk`.

### Acceptance Criteria

1. Every scene has exactly one primary readiness class.
2. Resolver gaps list concrete file or config reasons.

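The four classification rules above reduce to a small decision function. The Python sketch below assumes three hypothetical inputs per scene (the list of required params, whether the resolver resources are populated, and whether the scene name contains likely filter words); how those inputs are derived from `SKILL.toml` and `scene.toml` is outside this sketch.

```python
def classify_readiness(required_params, resolver_populated, has_filter_words):
    """Return exactly one primary readiness class for a scene."""
    if not required_params:
        # No-param scenes: flag the ones whose wording hints at a hidden filter.
        if has_filter_words:
            return "parameter-implicit-risk"
        return "parameter-not-required"
    # Required-param scenes: ready only when resolver resources are populated.
    return "parameter-ready" if resolver_populated else "parameter-gap"
```

Because every branch returns a single string, the "exactly one primary readiness class" criterion holds by construction.
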
## Phase 3: Invocation Sample Generation

### Tasks

1. Generate minimal invocation samples for every scene.
2. Generate parameterized samples for scenes with required params.
3. Generate cautionary samples for implicit-risk scenes.
4. Make clear when organization or period wording is not currently parsed.

### Acceptance Criteria

1. Sample JSON covers all `102` scenes.
2. Parameterized samples are not generated as though resolver gaps were already resolved.

## Phase 4: Publish Report

### Tasks

1. Write readiness JSON.
2. Write invocation sample JSON.
3. Write superpowers report with counts, gaps, and next route.

### Acceptance Criteria

1. Report explains why `场景名。。。` is insufficient for parameterized scenes.
2. Report states whether pseudo-production batch input should be regenerated.
3. Stop after report; do not start implementation.

## Stop Statement

Stop after readiness assets and report are published. Do not edit runtime, generated skills, board assets, or pseudo-production execution records under this plan.
@@ -0,0 +1,124 @@
# Scene Skill 102 Parameter Dictionary And Invocation Template Normalization Plan

> Date: 2026-04-20
> Design: `2026-04-20-scene-skill-102-parameter-dictionary-template-normalization-design.md`

## Plan Intent

Close the natural-language parameter readiness gap for the fixed `10` required-param scene skills and refresh pseudo-production invocation templates.

## Fixed Input Bucket

The only input bucket is the `10` scenes marked `parameter-gap` in:

`tests/fixtures/generated_scene/scene_skill_102_natural_language_parameter_readiness_2026-04-20.json`

## Allowed Files

1. `examples/scene_skill_102_final_materialization_2026-04-19/skills/{fixed-10}/references/org-dictionary.json`
2. `tests/fixtures/generated_scene/scene_skill_102_natural_language_parameter_readiness_2026-04-20.json`
3. `tests/fixtures/generated_scene/scene_skill_102_natural_language_invocation_samples_2026-04-20.json`
4. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_execution_handoff_2026-04-20.json`
5. `tests/fixtures/generated_scene/scene_skill_102_parameter_dictionary_template_normalization_2026-04-20.json`
6. `docs/superpowers/reports/2026-04-20-scene-skill-102-parameter-dictionary-template-normalization-report.md`

## Forbidden Files

1. `src/compat/scene_platform/dispatch.rs`
2. `src/compat/scene_platform/resolvers.rs`
3. `src/generated_scene/analyzer.rs`
4. `src/generated_scene/generator.rs`
5. `examples/scene_skill_102_final_materialization_2026-04-19/skills/*/scripts/*`
6. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`

## Phase 0: Freeze Scope

### Tasks

1. Confirm the fixed `10` required-param scenes.
2. Confirm all current gaps are empty org dictionaries.
3. Confirm no runtime code changes are needed.

### Acceptance Criteria

1. No non-param scene is changed.
2. No browser or production execution is started.

## Phase 1: Populate Starter Organization Dictionaries

### Tasks

1. Write the pseudo-production starter dictionary into each fixed `10` skill.
2. Use the already-tested aliases:
   - `兰州公司`
   - `兰州供电公司`
   - `国网兰州供电公司`
   - `城关供电分公司`
   - `城关分公司`
   - `天水公司`
   - `天水供电公司`
   - `国网天水供电公司`
3. Mark dictionary provenance as starter, not full production.

### Acceptance Criteria

1. All fixed `10` dictionaries are non-empty arrays.
2. Each dictionary contains alias coverage for `兰州公司`.

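To make the dictionary requirements concrete, here is a Python sketch of a starter dictionary and the two acceptance checks. The entry layout (`canonical`, `aliases`, `provenance`), the grouping of the listed aliases under three canonical names, and the longest-alias matching rule are all assumptions for illustration; the actual `org-dictionary.json` format and resolver behavior are defined elsewhere and may differ.

```python
# Hypothetical layout: the plan only requires a non-empty array per skill.
STARTER_DICTIONARY = [
    {"canonical": "国网兰州供电公司",
     "aliases": ["兰州公司", "兰州供电公司", "国网兰州供电公司"],
     "provenance": "starter"},
    {"canonical": "城关供电分公司",
     "aliases": ["城关供电分公司", "城关分公司"],
     "provenance": "starter"},
    {"canonical": "国网天水供电公司",
     "aliases": ["天水公司", "天水供电公司", "国网天水供电公司"],
     "provenance": "starter"},
]


def dictionary_is_acceptable(dictionary):
    """Non-empty array that covers the alias `兰州公司`."""
    return (isinstance(dictionary, list) and len(dictionary) > 0
            and any("兰州公司" in entry.get("aliases", []) for entry in dictionary))


def resolve_org(text, dictionary):
    """Return the entry whose longest alias occurs in `text`, else None."""
    best, best_len = None, 0
    for entry in dictionary:
        for alias in entry["aliases"]:
            if alias in text and len(alias) > best_len:
                best, best_len = entry, len(alias)
    return best
```

Keeping `provenance` on every entry is one way to make the "starter, not full production" status visible to anyone who later reads the file.
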
## Phase 2: Refresh Parameter Readiness

### Tasks

1. Re-scan all `102` skills.
2. Recompute parameter readiness.
3. Verify the fixed `10` move to `parameter-ready`.
4. Keep implicit-risk classification for no-param scenes.

### Acceptance Criteria

1. `parameter-gap = 0`.
2. `parameter-ready = 10`.
3. `total scenes = 102`.

## Phase 3: Refresh Invocation Templates

### Tasks

1. Generate parameterized samples for the fixed `10`.
2. Ensure samples include concrete period, e.g. `月累计 2026-03`.
3. Ensure samples keep `。。。` suffix.

### Acceptance Criteria

1. All fixed `10` have parameterized sample input.
2. No-param scenes keep minimal invocation samples.

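A template builder for these samples could be as small as the Python sketch below. The word order and the single-space separator are assumptions made for illustration; only the `。。。` suffix, the organization aliases, and the `月累计 2026-03` period example come from the plan, and `build_invocation` is a hypothetical helper name.

```python
SUFFIX = "。。。"  # deterministic invocation suffix required by the plan


def build_invocation(scene_name, org=None, period=None):
    """Join the scene name with optional org and period, then add the suffix."""
    parts = [scene_name]
    if org:
        parts.append(org)
    if period:
        parts.append(period)
    return " ".join(parts) + SUFFIX


def is_parameterized(sample, scene_name):
    """True when the sample carries more than the bare scene name."""
    return sample.endswith(SUFFIX) and sample != scene_name + SUFFIX
```

A no-param scene calls `build_invocation(scene_name)` and keeps its minimal sample, while a required-param scene passes both `org` and `period`.
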
## Phase 4: Refresh Pseudo-Production Handoff

### Tasks

1. Update the selected pseudo-production handoff entries that are in the fixed `10`.
2. Replace bare scene-name inputs with parameterized inputs.
3. Preserve credential policy and evidence collection fields.

### Acceptance Criteria

1. Selected required-param scenes no longer use bare `场景名。。。` in handoff.
2. No credentials are written to the repository.

## Phase 5: Publish Normalization Report

### Tasks

1. Publish normalization JSON.
2. Publish superpowers report.
3. State remaining limits explicitly.

### Acceptance Criteria

1. Report states that dictionaries are starter dictionaries, not complete production unit trees.
2. Report states the next step: refreshing pseudo-production execution preparation.

## Stop Statement

Stop after dictionaries, readiness assets, invocation samples, handoff, and report are refreshed. Do not run browser, production, or runtime implementation work under this plan.
@@ -0,0 +1,111 @@
# Scene Skill 102 Pseudo-Production Batch Execution Plan

> Date: 2026-04-20
> Status: Draft
> Upstream Design: `docs/superpowers/specs/2026-04-20-scene-skill-102-pseudoprod-batch-execution-design.md`

## Plan Intent

Run the prepared 10-scene pseudo-production batch in an operator-provided environment and record structured results.

This plan is bounded to execution and evidence collection for the selected 10 scenes.

## Fixed Inputs

1. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_execution_handoff_2026-04-20.json`
2. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_evidence_checklist_2026-04-20.json`
3. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_execution_record_template_2026-04-20.json`
4. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_batch_selection_2026-04-20.json`

## Planned Outputs

1. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_batch_execution_results_2026-04-20.json`
2. `docs/superpowers/reports/2026-04-20-scene-skill-102-pseudoprod-batch-execution-report.md`

## Allowed Files

1. planned execution result JSON
2. planned execution report
3. redacted evidence summaries if explicitly generated

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `src/generated_scene/ir.rs`
4. `examples/scene_skill_102_final_materialization_2026-04-19/skills/**`
5. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
6. any credential, token, cookie, or secret file

## Phase 0: Confirm Environment Readiness

### Tasks

1. Confirm operator-provided browser/runtime environment exists.
2. Confirm network/session access is provided outside repository.
3. Confirm evidence output location.
4. Confirm redaction rules.

### Acceptance Criteria

1. No credentials are stored in repository.
2. Execution does not start unless environment readiness is confirmed externally.

## Phase 1: Execute Selected Scenes

### Tasks

For each selected scene:

1. use the deterministic invocation input ending with `。。。`
2. execute through sgClaw runtime or agreed quasi-production host
3. collect console log
4. collect network summary
5. capture screenshot if target page is required
6. capture exported artifact if produced
7. record final result state

### Acceptance Criteria

1. every selected scene has one execution record
2. every record has exactly one result state
3. failures use the allowed taxonomy

## Phase 2: Redact And Normalize Evidence

### Tasks

1. redact credentials, cookies, tokens, Authorization headers, and private data
2. normalize evidence paths
3. confirm each evidence checklist item is present or explicitly unavailable

### Acceptance Criteria

1. no secret material enters repository output
2. missing evidence has a reason

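A first-pass redaction step for text evidence might resemble the Python sketch below. It is deliberately narrow and only a starting point: the header names and the `token`/`password`/`secret`/`session_id` key names are assumptions, and pattern matching of this kind cannot guarantee that no secret survives, so a manual review of evidence is still needed.

```python
import re

# Whole-line headers such as "Authorization: Bearer ..." or "Cookie: a=b".
_HEADER = re.compile(
    r"(?im)^([ \t]*(?:authorization|cookie|set-cookie)[ \t]*:[ \t]*).*$"
)
# key=value pairs in URLs or logs; the key list is illustrative, not complete.
_KEY_VALUE = re.compile(r"(?i)\b(token|password|secret|session_id)=([^&\s;]+)")


def redact(text):
    """Replace sensitive header values and key=value secrets with a marker."""
    text = _HEADER.sub(r"\1[REDACTED]", text)
    return _KEY_VALUE.sub(r"\1=[REDACTED]", text)
```

Running `redact` over console logs and network summaries before they are written under the evidence path keeps the unredacted text out of the repository working tree.
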
## Phase 3: Publish Execution Results

### Tasks

1. write execution results JSON
2. write execution report
3. summarize pass/blocker/mismatch/runtime-error counts
4. list follow-up blockers

### Acceptance Criteria

1. selected scene count remains 10
2. report does not claim full production certification
3. official board is not updated under this plan

## Completion Criteria

This plan is complete when all 10 selected scenes have structured execution records and a redacted execution report is published.

## Stop Statement

Stop after publishing execution results and report.

Do not update official board status under this plan.

@@ -0,0 +1,114 @@
# Scene Skill 102 Pseudo-Production Batch Execution Preparation Plan

> Date: 2026-04-20
> Status: Draft
> Upstream Design: `docs/superpowers/specs/2026-04-20-scene-skill-102-pseudoprod-batch-execution-preparation-design.md`

## Plan Intent

Prepare the first pseudo-production batch for execution without executing it.

This plan creates handoff and evidence templates for the 10 selected scenes.

## Fixed Inputs

1. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_batch_selection_2026-04-20.json`
2. `examples/scene_skill_102_final_materialization_2026-04-19/scene_skill_102_index.json`
3. `tests/fixtures/generated_scene/scene_skill_102_full_direct_mock_execution_2026-04-20.json`

## Planned Outputs

1. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_execution_handoff_2026-04-20.json`
2. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_execution_record_template_2026-04-20.json`
3. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_evidence_checklist_2026-04-20.json`
4. `docs/superpowers/reports/2026-04-20-scene-skill-102-pseudoprod-batch-execution-preparation-report.md`

## Allowed Files

1. planned JSON assets
2. planned report

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `src/generated_scene/ir.rs`
4. `examples/scene_skill_102_final_materialization_2026-04-19/skills/**`
5. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`

## Phase 0: Freeze Preparation Boundary

### Tasks

1. Confirm selected batch is exactly 10 scenes.
2. Confirm all 10 are direct-mock-pass.
3. Confirm no browser/network execution happens in this plan.

### Acceptance Criteria

1. No production or quasi-production target is invoked.
2. No credentials are requested or stored.

## Phase 1: Build Environment Handoff

### Tasks

1. List required environment inputs for the operator.
2. Map required dependencies per selected scene.
3. Define credential handling rule: outside repository only.

### Acceptance Criteria

1. Every selected scene has environment prerequisites.
2. The handoff asset contains no credential values.

## Phase 2: Build Evidence Checklist

### Tasks

1. Define evidence required for each scene.
2. Define evidence file names.
3. Define redaction requirements for logs and screenshots.

### Acceptance Criteria

1. Every selected scene has an evidence checklist.
2. Every checklist includes final execution classification.

## Phase 3: Build Execution Record Template

### Tasks

1. Define common execution record fields.
2. Include per-scene placeholders for operator output.
3. Include allowed result states and failure taxonomy.

### Acceptance Criteria

1. The template can record pass, blocker, mismatch, and runtime error.
2. The template stores references to evidence files, not credentials.

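The record template can be pictured as a small data structure with a self-check. In the Python sketch below, the four result states follow the plan, but their exact spellings (for example `runtime-error`), the `ExecutionRecord` field names, and the rule that non-pass records need a failure category are assumptions; the failure taxonomy itself is not defined in this plan and is left as free text.

```python
from dataclasses import dataclass, field
from typing import Optional

RESULT_STATES = ("pass", "blocker", "mismatch", "runtime-error")


@dataclass
class ExecutionRecord:
    scene_id: str
    invocation_input: str
    result_state: Optional[str] = None      # placeholder until the operator fills it
    failure_category: Optional[str] = None  # taxonomy value, defined elsewhere
    evidence_files: dict = field(default_factory=dict)  # name -> file reference

    def problems(self):
        """Return a list of template violations; empty means the record is valid."""
        issues = []
        if self.result_state not in RESULT_STATES:
            issues.append(f"result_state must be one of {RESULT_STATES}")
        elif self.result_state != "pass" and not self.failure_category:
            issues.append("non-pass records need a failure_category")
        for name, ref in self.evidence_files.items():
            if not isinstance(ref, str) or not ref:
                issues.append(f"evidence {name!r} must be a file reference")
        return issues
```

Storing only file references in `evidence_files` keeps the record itself free of log contents, which is where credentials would otherwise leak in.
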
## Phase 4: Publish Preparation Report

### Tasks

1. Summarize selected batch.
2. Summarize environment handoff.
3. Summarize evidence package structure.
4. Identify next bounded execution plan.

### Acceptance Criteria

1. Report states this is preparation-only.
2. Report does not claim pseudo-production execution.

## Completion Criteria

This plan is complete when handoff, evidence checklist, record template, and report are published.

## Stop Statement

Stop after publishing preparation assets.

Do not run pseudo-production execution under this plan.

@@ -0,0 +1,127 @@
# Scene Skill 102 Pseudo-Production Batch Selection Plan

> Date: 2026-04-20
> Status: Draft
> Upstream Design: `docs/superpowers/specs/2026-04-20-scene-skill-102-pseudoprod-batch-selection-design.md`

## Plan Intent

Select the first pseudo-production validation batch from the 102 final materialized skills.

This plan is selection-only. It does not run pseudo-production execution.

## Fixed Inputs

1. `examples/scene_skill_102_final_materialization_2026-04-19/scene_skill_102_index.json`
2. `tests/fixtures/generated_scene/scene_skill_102_static_validation_2026-04-20.json`
3. `tests/fixtures/generated_scene/scene_skill_102_dispatch_dry_run_validation_2026-04-20.json`
4. `tests/fixtures/generated_scene/scene_skill_102_full_direct_mock_execution_2026-04-20.json`
5. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_readiness_2026-04-20.json`

## Planned Outputs

1. `tests/fixtures/generated_scene/scene_skill_102_pseudoprod_batch_selection_2026-04-20.json`
2. `docs/superpowers/reports/2026-04-20-scene-skill-102-pseudoprod-batch-selection-report.md`

## Allowed Files

1. the planned output JSON
2. the planned report

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `src/generated_scene/ir.rs`
4. `examples/scene_skill_102_final_materialization_2026-04-19/skills/**`
5. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
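
The allowed/forbidden boundary can be checked against the list of paths a change touches (for example, the output of `git diff --name-only`). This is a sketch under assumptions: the `boundary_violations` helper is hypothetical, and it treats any path that is not on the allowed list as a violation, which is stricter than the plan text spells out.

```python
from fnmatch import fnmatchcase

# Patterns copied from the "Forbidden Files" list above.
FORBIDDEN = [
    "src/generated_scene/analyzer.rs",
    "src/generated_scene/generator.rs",
    "src/generated_scene/ir.rs",
    "examples/scene_skill_102_final_materialization_2026-04-19/skills/**",
    "tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json",
]

# The two planned outputs are the only files this plan may write.
ALLOWED = [
    "tests/fixtures/generated_scene/scene_skill_102_pseudoprod_batch_selection_2026-04-20.json",
    "docs/superpowers/reports/2026-04-20-scene-skill-102-pseudoprod-batch-selection-report.md",
]


def boundary_violations(changed_paths):
    """Return changed paths that are forbidden or not on the allowed list."""
    violations = []
    for path in changed_paths:
        # fnmatchcase lets "*" match "/" as well, so "skills/**" covers nested files.
        forbidden = any(fnmatchcase(path, pattern) for pattern in FORBIDDEN)
        if forbidden or path not in ALLOWED:
            violations.append(path)
    return violations
```

An empty result means the change stays inside the selection-only boundary.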

## Phase 0: Freeze Selection Boundary

### Tasks

1. Confirm all 102 scenes are direct-mock-pass.
2. Confirm this plan does not execute browser automation or real network access.
3. Confirm this plan does not update official board status.

### Acceptance Criteria

1. Selection starts from a clean `102 / 102` local mock baseline.
2. Selection does not mutate generated skills or runtime code.

## Phase 1: Build Eligible Candidate Set

### Tasks

1. Read pseudo-production readiness records.
2. Keep only `pseudo-prod-ready` scenes.
3. Exclude `real-env-required` scenes from first batch.
4. Join with direct mock results and static/dispatch readiness.

### Acceptance Criteria

1. Every selected candidate is static validated.
2. Every selected candidate is dispatch ready.
3. Every selected candidate is direct-mock-pass.
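
The filter-and-join in this phase can be sketched as follows. The in-memory shapes are assumptions: the schema of the fixture JSON files is not shown here, so the sketch takes a plain mapping of scene id to readiness label plus three sets of scene ids, and leaves loading the fixtures to the caller.

```python
def eligible_candidates(readiness, static_ok, dispatch_ok, mock_pass):
    """Return sorted scene ids that satisfy every Phase 1 acceptance criterion.

    readiness:   dict of scene id -> readiness label (assumed shape)
    static_ok:   set of scene ids that passed static validation
    dispatch_ok: set of scene ids that are dispatch ready
    mock_pass:   set of scene ids that are direct-mock-pass
    """
    return sorted(
        scene_id
        for scene_id, label in readiness.items()
        # Keeping only this label also excludes "real-env-required" scenes.
        if label == "pseudo-prod-ready"
        and scene_id in static_ok
        and scene_id in dispatch_ok
        and scene_id in mock_pass
    )
```

Sorting the result keeps the candidate set stable across runs, which matters for the selection in the next phase.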

## Phase 2: Select Balanced First Batch

### Tasks

Select `10` scenes with archetype balance:

1. `paginated_enrichment`: 4
2. `multi_mode_request`: 2
3. `single_request_enrichment`: 2
4. `multi_endpoint_inventory`: 1
5. `page_state_eval`: 1

### Acceptance Criteria

1. The selected batch contains exactly `10` scenes.
2. The batch excludes host-bridge and local-doc runtime-dependent scenes.
3. Every selected scene has a deterministic invocation input.
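
A quota-based selection matching the archetype balance above might look like the sketch below. The quotas are taken from the task list. Choosing the lowest scene id within each archetype is an assumption (the plan does not state a tie-breaking rule), and the input is assumed to be already filtered to eligible scenes with host-bridge and local-doc dependent scenes removed.

```python
# Archetype quotas from the Phase 2 task list; they sum to 10.
QUOTAS = {
    "paginated_enrichment": 4,
    "multi_mode_request": 2,
    "single_request_enrichment": 2,
    "multi_endpoint_inventory": 1,
    "page_state_eval": 1,
}


def select_batch(candidates):
    """Pick scenes per archetype quota from (scene_id, archetype) pairs.

    Deterministic: candidates are visited in sorted order, so the same
    input always yields the same batch.
    """
    remaining = dict(QUOTAS)
    batch = []
    for scene_id, archetype in sorted(candidates):
        if remaining.get(archetype, 0) > 0:
            batch.append(scene_id)
            remaining[archetype] -= 1
    shortfall = {name: count for name, count in remaining.items() if count > 0}
    if shortfall:
        # Fail loudly rather than publish a batch with fewer than 10 scenes.
        raise ValueError(f"not enough eligible scenes: {shortfall}")
    return batch
```

Raising on a shortfall keeps the "exactly `10` scenes" criterion from being silently violated when an archetype has too few eligible candidates.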

## Phase 3: Define Evidence Checklist

### Tasks

For each selected scene, define required evidence:

1. deterministic invocation input
2. console log
3. network log or request summary
4. screenshot if browser target page is required
5. exported file if produced
6. generation report path
7. failure taxonomy slot

### Acceptance Criteria

1. Every selected scene has a complete checklist.
2. Checklist does not require production credentials to be stored in the repository.
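
The seven evidence items can be modeled as a per-scene checklist in which two items are conditional. This is an illustrative sketch: the key names and the `build_checklist` and `missing_items` helpers are assumptions, and each slot holds a reference (a file path or a short value), never a credential.

```python
# Always-required evidence, keyed by hypothetical slot names.
REQUIRED_EVIDENCE = [
    "invocation_input",
    "console_log",
    "network_log_or_request_summary",
    "generation_report_path",
    "failure_taxonomy",  # a pass would record an explicit value such as "none" (assumption)
]

# Conditional evidence: slot name -> flag that makes it required.
CONDITIONAL_EVIDENCE = {
    "screenshot": "needs_browser_page",
    "exported_file": "produces_export",
}


def build_checklist(scene_id, needs_browser_page=False, produces_export=False):
    """Create an empty checklist containing only the slots this scene needs."""
    flags = {"needs_browser_page": needs_browser_page, "produces_export": produces_export}
    names = list(REQUIRED_EVIDENCE)
    names += [name for name, flag in CONDITIONAL_EVIDENCE.items() if flags[flag]]
    return {"scene_id": scene_id, "items": {name: None for name in names}}


def missing_items(checklist):
    """Return slot names that have no evidence reference yet."""
    return [name for name, ref in checklist["items"].items() if not ref]
```

A checklist counts as complete when `missing_items` returns an empty list for every selected scene.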

## Phase 4: Publish Selection Report

### Tasks

1. Write selection JSON.
2. Write selection report.
3. Summarize selected and deferred scenes.

### Acceptance Criteria

1. The report states this is selection-only.
2. The report does not claim pseudo-production execution.
3. The report identifies the next bounded execution plan.

## Completion Criteria

This plan is complete when the first pseudo-production batch selection JSON and report are published.

## Stop Statement

Stop after publishing selection assets.

Do not execute pseudo-production validation under this plan.