# Skill Normalized Result And Dashboard Local Reader Design

Date: 2026-04-26

Status: Historical design archive; the in-repo `normalized_writer` design described here was later migrated out of `claw-new`

Historical note (2026-05-06): this document records the pre-extraction design that originally placed `normalized_writer` inside `claw-new`. The canonical implementation now lives in `D:\data\ideaSpace\rust\sgClaw\normalized_writer`.

Related baseline designs:

- `docs/superpowers/specs/2026-04-22-scheduled-monitoring-action-end-to-end-architecture-design.md`
- `docs/superpowers/specs/2026-04-23-archive-workorder-grid-push-monitor-design.md`
- `docs/superpowers/specs/2026-04-24-available-balance-below-zero-monitor-design.md`
- `docs/superpowers/specs/2026-04-26-scheduled-monitoring-generated-scene-adaptation-hardening-design.md`

Related skill/result sources:

- `D:/desk/sgclaw/sgclaw/results/archive-workorder-grid-push-monitor.run-record.json`
- `D:/desk/sgclaw/sgclaw/results/available-balance-below-zero-monitor.run-record.json`
- `D:/desk/sgclaw/sgclaw/results/command-center-fee-control-monitor.run-record.json`
- `D:/desk/sgclaw/sgclaw/results/sgcc-todo-crawler.run-record.json`

Target dashboard project:

- `D:/data/ideaSpace/rust/sgClaw/digital-employee`

## Plain-Language Goal

Turn raw scheduled-skill run results into a stable, dashboard-friendly result contract.

The system must support:

1. current four skills:
   - `archive-workorder-grid-push-monitor`
   - `available-balance-below-zero-monitor`
   - `command-center-fee-control-monitor`
   - `sgcc-todo-crawler`
2. future new skills without redesigning the dashboard data path
3. low-coupling consumption where dashboard does not parse runtime-private `run-record` structure
4. safe fallback when the latest run fails but the last successful business result should still remain visible

This design is not about changing business detection logic.
It is about adding a stable result contract and a stable local read path for dashboard consumption.

## Historical Architecture Constraints

The pre-extraction design required these boundaries:

1. `run-record.json` remains the raw runtime/audit artifact
2. dashboard must not directly consume `run-record.json`
3. normalized result files become the only supported dashboard-facing data contract
4. at the time, `claw-new` was responsible for producing normalized files
5. `digital-employee` is responsible for reading normalized files through a local API
6. dashboard is responsible only for rendering and polling the local API
7. first implementation uses HTTP polling, not websocket

## Why Direct Dashboard Parsing Is Rejected

The current raw result files are large, runtime-private, and still evolving.

If `digital-employee` directly parses:

1. `decisionPreview`
2. `auditPreview.detectReadDiagnostics`
3. skill-specific nested fields

then the dashboard becomes tightly coupled to runtime internals.

Every runtime refinement would then risk breaking UI consumption.

The dashboard must instead consume a stable, normalized contract that:

1. preserves business values
2. hides runtime-private structure
3. supports future extractor growth without UI rewrites

## Why Websocket Is Not The First Transport

The source data is not a user-interactive event flow.
It is scheduled task output that refreshes at minute-level cadence.

Therefore the first transport must be:

1. normalized file snapshot
2. local HTTP reader
3. dashboard polling

This keeps the system simpler and more stable:

1. no connection state
2. no reconnect logic
3. no push protocol versioning
4. no incremental event semantics

If true push is needed later, it can be added inside the local reader layer without changing the normalized file contract.

## End-State Architecture

The full data chain is:

```text
scheduled skill runtime
  -> *.run-record.json
  -> normalized result extractor
  -> normalized/*.json
  -> digital-employee local reader API
  -> dashboard polling
```

The architecture is intentionally split into four layers.

### Layer 1: Raw Result Layer

Location:

```text
D:\desk\sgclaw\sgclaw\results\*.run-record.json
```

Responsibility:

1. preserve raw runtime result
2. preserve diagnostics/audit structure
3. remain implementation-private to runtime and normalizer

This layer is allowed to be large and unstable.

### Layer 2: Normalized Result Layer

Location:

```text
D:\desk\sgclaw\sgclaw\results\normalized\
```

Responsibility:

1. expose stable dashboard-facing contract
2. split one file per skill
3. preserve both latest state and last-known-good state
4. provide index-level aggregation for dashboard first screen

This is the single consumption contract for downstream readers.

### Layer 3: Local Reader Layer

Location:

```text
D:\data\ideaSpace\rust\sgClaw\digital-employee\server\
```

Responsibility:

1. read normalized files from disk
2. expose stable local HTTP endpoints
3. validate skill ids and schema shape
4. hide file-path and filesystem concerns from the browser

This layer does not parse raw `run-record.json`.
It only reads normalized files.

### Layer 4: Dashboard Consumption Layer

Location:

```text
D:\data\ideaSpace\rust\sgClaw\digital-employee\src\
```

Responsibility:

1. poll local reader API
2. render cards, lists, trend widgets, and detail views
3. react to `status`, `metric`, `payload`, and `freshness`

This layer must not know anything about raw runtime-private nesting.

## Required Directory Layout

The normalized result tree must follow this layout:

```text
D:\desk\sgclaw\sgclaw\results\
  archive-workorder-grid-push-monitor.run-record.json
  available-balance-below-zero-monitor.run-record.json
  command-center-fee-control-monitor.run-record.json
  sgcc-todo-crawler.run-record.json
  normalized\
    index.json
    latest\
      archive-workorder-grid-push-monitor.json
      available-balance-below-zero-monitor.json
      command-center-fee-control-monitor.json
      sgcc-todo-crawler.json
    last-good\
      archive-workorder-grid-push-monitor.json
      available-balance-below-zero-monitor.json
      command-center-fee-control-monitor.json
      sgcc-todo-crawler.json
    history\
      archive-workorder-grid-push-monitor\
        2026\
          04\
            2026-04-25T19-46-06+08-00.json
      available-balance-below-zero-monitor\
      command-center-fee-control-monitor\
      sgcc-todo-crawler\
```

Required semantics:

1. `latest/` always represents most recent normalized extraction
2. `last-good/` updates only when normalized result is still business-usable:
   - `ok`
   - `empty`
   - optionally `soft_error` only if the extractor explicitly certifies business values are still trustworthy
3. `history/` stores immutable snapshots for audit and later trend construction
4. `index.json` summarizes current state of all known normalized skills

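The `history/` layout above can be sketched as a small path helper. This is a minimal illustration in Python, not the Rust writer itself. It assumes the snapshot file name is derived from `observedAt` with `:` replaced by `-` (which matches the example file name in the tree); the function name `history_relpath` is hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath


def history_relpath(skill_id: str, observed_at: str) -> PurePosixPath:
    """Return the snapshot path relative to normalized/ (hypothetical helper)."""
    # Parse only to obtain the year/month directory components.
    ts = datetime.fromisoformat(observed_at)  # e.g. 2026-04-25T19:46:06+08:00
    # Assumption: ':' is replaced by '-' because it is not valid in Windows file names.
    file_name = observed_at.replace(":", "-") + ".json"
    return PurePosixPath(
        "history", skill_id, f"{ts.year:04d}", f"{ts.month:02d}", file_name
    )
```

Because history snapshots are immutable, a writer built on this helper would refuse to overwrite an existing path rather than replace it.
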
## Normalized Contract

All normalized files must share one public envelope.

### Common Envelope

```json
{
  "schemaVersion": "1.0",
  "skillId": "available-balance-below-zero-monitor",
  "skillName": "可用电费小于零监测",
  "category": "monitor",
  "resultType": "count_snapshot",
  "observedAt": "2026-04-25T19:46:24+08:00",
  "generatedAt": "2026-04-25T19:46:26+08:00",
  "status": "ok",
  "freshness": {
    "staleAfterSeconds": 900,
    "isStale": false
  },
  "summary": "2026-04-25 19:46:24--可用电费小于零监测检测到【数量】:4265",
  "metric": {
    "label": "数量",
    "value": 4265,
    "unit": "items"
  },
  "payload": null,
  "diagnostics": {},
  "source": {
    "kind": "run_record",
    "runRecordPath": "D:\\desk\\sgclaw\\sgclaw\\results\\available-balance-below-zero-monitor.run-record.json",
    "runRecordMtime": "2026-04-25T19:46:26+08:00",
    "extractorVersion": "1.0"
  }
}
```

### Envelope Field Rules

#### `schemaVersion`

This is the public normalized schema version.
It is not the raw runtime version.

#### `skillId`

This must equal the stable skill directory / scene id.
It is the primary machine identity.

#### `skillName`

This must come from a normalizer registry, not from opportunistic raw JSON text extraction.
This avoids encoding drift and log-text coupling.

#### `category`

Current categories:

1. `monitor`
2. `crawler`

This is mainly for dashboard grouping and future filtering.

#### `resultType`

Current supported public result types:

1. `count_snapshot`
2. `detail_snapshot`

New skill onboarding should fit one of these two whenever possible.

#### `observedAt`

This is the business observation timestamp.
Priority:

1. stable run completion / business timestamp from raw result if available
2. raw `run-record.json` file mtime as fallback

#### `generatedAt`

This is the time the normalized file is written.

#### `status`

Allowed values:

1. `ok`
2. `empty`
3. `soft_error`
4. `error`

#### `freshness`

This communicates staleness independently of success/failure.

Required fields:

1. `staleAfterSeconds`
2. `isStale`

The first version may compute `isStale` at generation time or at local-reader response time.
Local-reader recomputation is preferred because it reflects current wall-clock state.

#### `summary`

This is a stable human-readable sentence assembled by the normalizer from structured fields.
It must never be parsed back by the dashboard.

#### `metric`

This is the primary business indicator for card rendering.
Every result type, including `detail_snapshot`, must expose `metric`.

#### `payload`

This is `null` for simple count snapshots.
This is a structured object for detail snapshots.

#### `diagnostics`

This is public-but-auxiliary.
It is for drill-down, debugging, and support tooling.
Dashboard must not rely on it for primary rendering.

#### `source`

This preserves audit linkage back to raw result source.

Required fields:

1. `kind`
2. `runRecordPath`
3. `runRecordMtime`
4. `extractorVersion`

## Result Types

### Type 1: `count_snapshot`

This type is used for:

1. `available-balance-below-zero-monitor`
2. `archive-workorder-grid-push-monitor`
3. `command-center-fee-control-monitor`

Characteristics:

1. the business primary output is one main count
2. the dashboard mostly needs a number card and a summary line
3. drill-down diagnostics may exist, but primary payload is count-focused

Example:

```json
{
  "schemaVersion": "1.0",
  "skillId": "archive-workorder-grid-push-monitor",
  "skillName": "归档工单配网推送监测",
  "category": "monitor",
  "resultType": "count_snapshot",
  "observedAt": "2026-04-25T19:46:06+08:00",
  "generatedAt": "2026-04-25T19:46:06+08:00",
  "status": "soft_error",
  "freshness": {
    "staleAfterSeconds": 900,
    "isStale": false
  },
  "summary": "2026-04-25 19:46:06--归档工单配网推送监测检测到【数量】:0",
  "metric": {
    "label": "数量",
    "value": 0,
    "unit": "items"
  },
  "payload": null,
  "diagnostics": {
    "pendingCount": 0,
    "queryStatus": "soft_error",
    "queryError": "XHR 0 for .../getWkorderAll"
  },
  "source": {
    "kind": "run_record",
    "runRecordPath": "D:\\desk\\sgclaw\\sgclaw\\results\\archive-workorder-grid-push-monitor.run-record.json",
    "runRecordMtime": "2026-04-25T19:46:06+08:00",
    "extractorVersion": "1.0"
  }
}
```

### Type 2: `detail_snapshot`

This type is used for:

1. `sgcc-todo-crawler`

Characteristics:

1. the business primary output is a collection of items
2. the dashboard still needs a count for summary cards
3. future downstream consumers may need full item detail

The public contract must therefore preserve both:

1. projected stable items for easy UI rendering
2. raw item collection for future richer extraction needs

Example:

```json
{
  "schemaVersion": "1.0",
  "skillId": "sgcc-todo-crawler",
  "skillName": "国网待办抓取",
  "category": "crawler",
  "resultType": "detail_snapshot",
  "observedAt": "2026-04-25T19:47:36+08:00",
  "generatedAt": "2026-04-25T19:47:36+08:00",
  "status": "ok",
  "freshness": {
    "staleAfterSeconds": 900,
    "isStale": false
  },
  "summary": "2026-04-25 19:47:36--国网待办抓取同步到【待办数量】:42",
  "metric": {
    "label": "待办数量",
    "value": 42,
    "unit": "items"
  },
  "payload": {
    "items": [
      {
        "id": "todo_8f9b1c",
        "index": 1,
        "datetime": "2026-04-24 08:25",
        "tag": "会议",
        "title": "...",
        "processNode": "...",
        "titleWithProcess": "...",
        "user": "...",
        "unread": true,
        "href": ""
      }
    ],
    "rawItems": [
      {
        "datetime": "...",
        "href": "",
        "index": 1,
        "processNode": "...",
        "tag": "...",
        "title": "...",
        "titleWithProcess": "...",
        "unread": true,
        "user": "..."
      }
    ],
    "aggregates": {
      "total": 42,
      "unread": 17,
      "read": 25
    }
  },
  "diagnostics": {
    "pendingCount": 42,
    "itemSchemaVersion": "1.0"
  },
  "source": {
    "kind": "run_record",
    "runRecordPath": "D:\\desk\\sgclaw\\sgclaw\\results\\sgcc-todo-crawler.run-record.json",
    "runRecordMtime": "2026-04-25T19:47:36+08:00",
    "extractorVersion": "1.0"
  }
}
```

## Status Semantics

Status must not collapse business emptiness and technical failure.

### `ok`

Use when:

1. raw source was readable
2. extractor succeeded
3. business result is trustworthy
4. any auxiliary diagnostics do not undermine the main business output

### `empty`

Use when:

1. the query path succeeded
2. the business result is truly empty
3. `metric.value == 0`
4. there is no technical indication that `0` is merely a failed read

`empty` is a business conclusion, not a technical fallback.

### `soft_error`

Use when:

1. a normalized result can still be produced
2. but some query stage, partial path, helper path, or side read degraded
3. and the extractor wants that degradation visible to the dashboard or operators

`soft_error` may still include a usable `metric`.

### `error`

Use when:

1. raw file missing
2. JSON parse failure
3. required business fields missing
4. extractor cannot certify a business result

In this case `latest/` updates, but `last-good/` must not be replaced.

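The four status rules can be condensed into one decision function. The sketch below is illustrative Python rather than the Rust extractor code. The three inputs (`raw_readable`, `metric_value`, `degraded`) are hypothetical simplifications of whatever each extractor actually inspects.

```python
from typing import Optional


def classify_status(raw_readable: bool, metric_value: Optional[int], degraded: bool) -> str:
    """Map extractor findings to a public status (simplified, hypothetical inputs)."""
    # Missing/unparseable raw file or no certifiable business value -> error.
    if not raw_readable or metric_value is None:
        return "error"
    # A value exists, but some query stage or side read degraded.
    if degraded:
        return "soft_error"
    # A clean read that found nothing is a business conclusion, not a failure.
    if metric_value == 0:
        return "empty"
    return "ok"
```

Checking `degraded` before the zero test is what keeps a failed read that happens to report `0` from being published as `empty`, as in the `archive-workorder-grid-push-monitor` example above.
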
## Skill Registry And Mapping Rules

The normalizer must maintain a registry that defines for each known skill:

1. `skillId`
2. `skillName`
3. `category`
4. `resultType`
5. summary template
6. extractor implementation binding
7. stale timeout default

This registry is the only place that should encode display name and primary result classification.

### Current Skill Registry

#### `available-balance-below-zero-monitor`

1. `skillName = "可用电费小于零监测"`
2. `category = "monitor"`
3. `resultType = "count_snapshot"`
4. `summary = "<time>--可用电费小于零监测检测到【数量】:<value>"`

Primary count source priority:

1. `auditPreview.detectReadDiagnostics.rawMergedCount`
2. `decisionPreview.summary.pending_count`
3. `decisionPreview.pendingList.length`

Primary diagnostics to preserve:

1. `slice01Count`
2. `slice02Count`
3. `slice03Count`
4. `queriedSlices`
5. `sliceErrors`
6. `requestTimeoutMs`
7. `readStepTraces`

Soft-error conditions:

1. `sliceErrors` non-empty
2. any `readStepTraces.status != "ok"`
3. missing required slice with fallback count still present

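The "primary count source priority" rule is a first-match lookup over dotted paths. A minimal Python sketch follows, using the three paths listed for `available-balance-below-zero-monitor`. The helper names `lookup` and `primary_count` are hypothetical, and treating a trailing `.length` as "length of the list" is an assumption about how the design intends that notation.

```python
from typing import Any, Optional

# Priority order copied from the registry entry above.
COUNT_SOURCES = [
    "auditPreview.detectReadDiagnostics.rawMergedCount",
    "decisionPreview.summary.pending_count",
    "decisionPreview.pendingList.length",
]


def lookup(record: dict, dotted: str) -> Optional[Any]:
    """Walk a dotted path; return None as soon as a segment is missing."""
    node: Any = record
    for key in dotted.split("."):
        if key == "length" and isinstance(node, list):
            return len(node)
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def primary_count(record: dict) -> Optional[int]:
    """Return the first integer found in priority order, or None."""
    for source in COUNT_SOURCES:
        value = lookup(record, source)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            return value
    return None
```

A `None` result corresponds to "required business fields missing", which the status rules map to `error`.
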
#### `archive-workorder-grid-push-monitor`

1. `skillName = "归档工单配网推送监测"`
2. `category = "monitor"`
3. `resultType = "count_snapshot"`
4. `summary = "<time>--归档工单配网推送监测检测到【数量】:<value>"`

Primary count source priority:

1. `auditPreview.detectReadDiagnostics.newItemCount`
2. `auditPreview.detectReadDiagnostics.filteredCount`
3. `decisionPreview.summary.pending_count`
4. `decisionPreview.pendingList.length`

Primary diagnostics to preserve:

1. `rawCount`
2. `filteredCount`
3. `dedupedCount`
4. `newItemCount`
5. `queryStatus`
6. `queryError`
7. `requestParam`
8. `readStepTraces`

Soft-error conditions:

1. `queryStatus != "ok"`
2. any `readStepTraces.status == "soft_error"`
3. extractor still able to produce a count but read quality degraded

#### `command-center-fee-control-monitor`

1. `skillName = "指挥中心费控异常监测"`
2. `category = "monitor"`
3. `resultType = "count_snapshot"`
4. `summary = "<time>--指挥中心费控异常监测检测到【数量】:<value>"`

Primary count source priority:

1. `auditPreview.detectReadDiagnostics.queryAbnorListCount`
2. `decisionPreview.summary.pending_count`
3. `decisionPreview.pendingList.length`

Primary diagnostics to preserve:

1. `queryAbnorListCount`
2. `queryHistoryEnergyChargeCount`
3. `getOrgTreeStatus`
4. `getMonitorLogStatus`
5. `getOtherIphonesStatus`
6. `readStepTraces`

Soft-error conditions:

1. primary abnormal-list query degraded
2. supporting reads degraded in a way that should remain operator-visible
3. business count remains present but read quality is partial

#### `sgcc-todo-crawler`

1. `skillName = "国网待办抓取"`
2. `category = "crawler"`
3. `resultType = "detail_snapshot"`
4. `summary = "<time>--国网待办抓取同步到【待办数量】:<value>"`

Primary count source priority:

1. `decisionPreview.pendingList.length`

Primary payload source:

1. `decisionPreview.pendingList`

Projected item fields:

1. `id`
2. `index`
3. `datetime`
4. `tag`
5. `title`
6. `processNode`
7. `titleWithProcess`
8. `user`
9. `unread`
10. `href`

The extractor must produce:

1. `payload.items`
2. `payload.rawItems`
3. `payload.aggregates`

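The three required payload parts for `sgcc-todo-crawler` can be sketched as follows. This is illustrative Python, not the Rust extractor. The document does not say how `id` is derived, so the hash-based rule below is purely an assumption, as are the names `project_item` and `build_payload`.

```python
import hashlib

# Projected fields copied from the list above ("id" is derived separately).
PROJECTED_FIELDS = ["index", "datetime", "tag", "title", "processNode",
                    "titleWithProcess", "user", "unread", "href"]


def project_item(raw: dict) -> dict:
    """Build one stable projected item from a raw pendingList entry."""
    # ASSUMPTION: id = short hash of fields expected to be stable across runs.
    basis = "|".join(str(raw.get(k, "")) for k in ("datetime", "title", "user"))
    item = {"id": "todo_" + hashlib.sha1(basis.encode("utf-8")).hexdigest()[:6]}
    for field in PROJECTED_FIELDS:
        item[field] = raw.get(field)  # missing raw fields become None
    return item


def build_payload(pending_list: list) -> dict:
    """Produce payload.items, payload.rawItems and payload.aggregates."""
    items = [project_item(raw) for raw in pending_list]
    unread = sum(1 for it in items if it["unread"] is True)
    return {
        "items": items,
        "rawItems": pending_list,  # passed through untouched
        "aggregates": {"total": len(items), "unread": unread,
                       "read": len(items) - unread},
    }
```

`metric.value` for this skill would then equal `aggregates.total`, which keeps the card count and the list length consistent.
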
## Why `sgcc-todo-crawler` Needs Both `items` And `rawItems`

The user requirement is not only to display a count.
Future dashboard or downstream logic may need to read the larger business JSON.

Directly exposing raw `run-record` is rejected.
Instead the normalized detail snapshot must preserve:

1. stable projected items for common UI use
2. raw source items for future richer access

This gives:

1. low coupling for normal UI
2. no future data loss
3. no need for the dashboard to parse runtime-private envelopes

## `index.json` Contract

The dashboard first screen must not fetch every skill individually.
It should start from one summary index.

Example:

```json
{
  "schemaVersion": "1.0",
  "generatedAt": "2026-04-25T19:48:00+08:00",
  "revision": "2026-04-25T19:48:00+08:00",
  "skills": [
    {
      "skillId": "available-balance-below-zero-monitor",
      "skillName": "可用电费小于零监测",
      "category": "monitor",
      "resultType": "count_snapshot",
      "status": "ok",
      "observedAt": "2026-04-25T19:46:24+08:00",
      "metric": {
        "label": "数量",
        "value": 4265,
        "unit": "items"
      },
      "summary": "2026-04-25 19:46:24--可用电费小于零监测检测到【数量】:4265",
      "currentFile": "latest/available-balance-below-zero-monitor.json",
      "lastGoodFile": "last-good/available-balance-below-zero-monitor.json"
    }
  ]
}
```

Required properties:

1. `schemaVersion`
2. `generatedAt`
3. `revision`
4. `skills[]`

The index must contain enough information for:

1. top-level dashboard cards
2. quick health/status summary
3. later detail-file lookup

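Building the index from per-skill envelopes amounts to copying a fixed subset of fields and adding the two relative file references. The Python sketch below is illustrative only. It assumes `revision` simply mirrors `generatedAt` (as in the example above) and that skills are sorted by `skillId`; neither is stated as a rule in this design.

```python
# Envelope fields that the example index entry carries over unchanged.
INDEX_FIELDS = ["skillId", "skillName", "category", "resultType",
                "status", "observedAt", "metric", "summary"]


def index_entry(envelope: dict) -> dict:
    """Reduce one normalized envelope to its index summary entry."""
    entry = {field: envelope[field] for field in INDEX_FIELDS}
    entry["currentFile"] = f"latest/{envelope['skillId']}.json"
    entry["lastGoodFile"] = f"last-good/{envelope['skillId']}.json"
    return entry


def build_index(envelopes: list, generated_at: str) -> dict:
    """Assemble index.json content from all latest envelopes."""
    ordered = sorted(envelopes, key=lambda e: e["skillId"])  # assumed ordering
    return {
        "schemaVersion": "1.0",
        "generatedAt": generated_at,
        "revision": generated_at,  # assumption: revision mirrors generatedAt
        "skills": [index_entry(e) for e in ordered],
    }
```

Note that `payload` and `diagnostics` are deliberately left out, so the first-screen request stays small even when a `detail_snapshot` carries many items.
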
## Historical `claw-new` Normalizer Design

In the pre-extraction design, the normalizer was intended to live as a first-class module rather than as ad hoc file-writing logic.

Migration note: `normalized_writer` was later extracted into the standalone repository `D:\data\ideaSpace\rust\sgClaw\normalized_writer`. The structure below is the original in-repo design and is no longer the active `claw-new` layout.

Recommended structure:

```text
src/result_normalization/
  mod.rs
  contract.rs
  registry.rs
  extractors/
    available_balance.rs
    archive_workorder.rs
    command_center_fee_control.rs
    sgcc_todo_crawler.rs
  writer.rs
  index_builder.rs
```

### `contract.rs`

Defines public normalized structures:

1. envelope
2. count snapshot payload model
3. detail snapshot payload model
4. index model

### `registry.rs`

Defines:

1. skill ids
2. display names
3. result types
4. stale defaults
5. extractor binding

### `extractors/*`

Each extractor:

1. reads one raw `run-record`
2. validates required fields
3. derives business result
4. derives public diagnostics
5. assigns `status`
6. produces normalized contract

Extractors must not write files directly.

### `writer.rs`

Responsible for:

1. writing `latest/`
2. conditionally updating `last-good/`
3. writing immutable `history/`
4. atomic replace semantics

### `index_builder.rs`

Reads generated normalized skill outputs and rebuilds `index.json`.

## `claw-new` CLI / Trigger Entry

The first implementation must support both:

1. on-demand local backfill after copying raw results
2. future automatic generation at runtime completion

Required CLI shape:

```text
sg_claw normalize-results --results-dir <path>
sg_claw normalize-results --results-dir <path> --skills skill1,skill2
```

Why this is required:

1. current workflow copies raw results from the inner network to local machine
2. the design cannot assume automatic end-to-end generation exists on day one
3. manual backfill keeps current operational workflow usable

## File Write And Atomicity Rules

The normalizer must never write directly into final targets without staging.

Required write protocol:

1. write `*.tmp`
2. fsync/flush if practical
3. rename over final target

Write order:

1. skill `latest/`
2. skill `last-good/` if eligible
3. skill `history/`
4. rebuild `index.json` last

This prevents the dashboard from reading partially written JSON, and writing `index.json` last means the index never references skill files that have not been written yet.

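The three-step write protocol can be sketched with standard-library calls. This is a minimal Python illustration (the historical design places this logic in the Rust `writer.rs`); the function name `write_json_atomic` is hypothetical.

```python
import json
import os


def write_json_atomic(target_path: str, document: dict) -> None:
    """Stage to <target>.tmp, flush to disk, then rename over the target."""
    directory = os.path.dirname(target_path) or "."
    os.makedirs(directory, exist_ok=True)
    tmp_path = target_path + ".tmp"
    # Step 1: write the staging file next to the target (same filesystem).
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(document, handle, ensure_ascii=False, indent=2)
        # Step 2: push buffered data to the OS, then to the disk.
        handle.flush()
        os.fsync(handle.fileno())
    # Step 3: os.replace overwrites an existing target in a single rename.
    os.replace(tmp_path, target_path)
```

Keeping the `.tmp` file in the same directory as the target matters: a rename across filesystems is not a single atomic operation.
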
## `last-good` Update Policy

The policy must be explicit.

Update `last-good` when:

1. status is `ok`
2. status is `empty`
3. status is `soft_error` only if the extractor explicitly marks the business value as trustworthy

Do not update `last-good` when:

1. status is `error`
2. extractor cannot certify the primary metric

This protects the dashboard against losing prior valid business state because of one bad latest run.

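The update policy reduces to a small predicate. A Python sketch is shown below; the `value_trusted` flag is a hypothetical stand-in for however the extractor "explicitly marks the business value as trustworthy".

```python
def should_update_last_good(status: str, value_trusted: bool = False) -> bool:
    """Decide whether last-good/ may be replaced by the new normalized result."""
    if status in ("ok", "empty"):
        return True
    if status == "soft_error":
        # Only when the extractor explicitly certifies the business value.
        return value_trusted
    # "error" (or any unknown status) never replaces last-good.
    return False
```

Defaulting `value_trusted` to `False` makes the conservative behavior the default: a degraded run keeps the previous last-good file unless the extractor opts in.
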
## `digital-employee` Local Reader Design

The local reader must be embedded inside `digital-employee`, not implemented as a separate external service.

Recommended structure:

```text
D:\data\ideaSpace\rust\sgClaw\digital-employee\
  server\
    index.js
    config.js
    routes\
      results.js
    services\
      fileRepository.js
      resultStore.js
      validators.js
```

Recommended technology:

1. Node
2. Express or a minimal native HTTP wrapper

It must:

1. bind only to `127.0.0.1`
2. expose only read-only APIs
3. validate skill ids
4. read only normalized result files
5. never expose arbitrary path traversal

The local reader must not:

1. parse raw `run-record.json`
2. expose raw result directory as static file root
3. implement business transformation logic

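The "validate skill ids" and "never expose arbitrary path traversal" rules can be enforced by refusing to build a path from anything that is not a known, well-formed id. The sketch below is Python for illustration only (the design recommends Node for the actual reader). The id pattern and the name `resolve_skill_file` are assumptions; the pattern is inferred from the four current skill ids.

```python
import re
from pathlib import Path

# Assumed id shape: lowercase alphanumeric words joined by single hyphens.
SKILL_ID_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
ALLOWED_BUCKETS = ("latest", "last-good")


def resolve_skill_file(normalized_dir: str, bucket: str, skill_id: str,
                       known_ids: set) -> Path:
    """Return the normalized file path, or raise ValueError for bad input."""
    if bucket not in ALLOWED_BUCKETS:
        raise ValueError(f"unsupported bucket: {bucket!r}")
    # fullmatch rejects separators, dots and '..' before any path is built.
    if SKILL_ID_PATTERN.fullmatch(skill_id) is None:
        raise ValueError(f"malformed skill id: {skill_id!r}")
    # Only ids listed in the registry/index are served.
    if skill_id not in known_ids:
        raise ValueError(f"unknown skill id: {skill_id!r}")
    return Path(normalized_dir) / bucket / f"{skill_id}.json"
```

In the reader, `known_ids` would come from the parsed `index.json`, so a newly onboarded skill becomes readable without code changes.
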
## Local Reader API Contract

Required endpoints:

### `GET /api/health`

Returns:

1. local reader status
2. configured results dir
3. configured normalized dir
4. current known revision

### `GET /api/results`

Returns:

1. parsed `index.json`

Used by:

1. dashboard first load
2. periodic summary polling

### `GET /api/results/:skillId`

Returns:

1. `current`
2. `lastGood`

This supports:

1. current health/status display
2. fallback display when latest is degraded

### `GET /api/results/:skillId/history?limit=30`

Returns:

1. most recent N history snapshots

This is for:

1. future trends
2. drill-down panels
3. audit-friendly exploration

### `GET /api/results/:skillId/items`

Valid only for `detail_snapshot`.

Returns:

1. projected `payload.items`

### `GET /api/results/:skillId/raw-items`

Valid only for `detail_snapshot`.

Returns:

1. `payload.rawItems`

This endpoint exists because future consumers may need the larger business JSON without depending on the raw `run-record` envelope.

## Polling Strategy

First version uses polling only.

Recommended behavior:

1. dashboard polls `GET /api/results` every 30 seconds
2. dashboard fetches per-skill detail lazily when the relevant card/detail view opens
3. no websocket
4. no browser filesystem access

This is sufficient because source data is minute-scale scheduled output.

## Dashboard Consumption Model

The dashboard must become index-driven, not hardcoded by private source structure.

Current static data files such as:

1. `src/data/work-reports.json`
2. `src/data/anomaly-logs.json`

are implementation placeholders and must not remain the live results source for these skills.

Recommended dashboard data structure:

```text
src/
  api/
    results.js
  store/
    modules/
      results.js
```

The UI should consume:

1. top-level card data from `/api/results`
2. detail view data from `/api/results/:skillId`
3. list/table panels from `/api/results/:skillId/items`

Dashboard rendering must branch only on:

1. `resultType`
2. `status`
3. `metric`
4. `freshness`

It must not branch on raw runtime field names.

## New Skill Onboarding Contract

Future new skills should reuse the same path.

To onboard a new skill:

1. add registry entry in `claw-new`
2. add extractor
3. emit `count_snapshot` or `detail_snapshot`
4. rebuild `index.json`

If the skill fits one of those two result types, then:

1. normalized writer does not need redesign
2. local reader does not need redesign
3. dashboard data access layer does not need redesign

Only if a truly new output shape appears should a new `resultType` be added.

## Current Workflow Compatibility

The design must support the current operational workflow:

1. run inside the inner network
2. copy raw `run-record.json` artifacts to local machine
3. locally normalize those copied files
4. serve normalized files through the `digital-employee` local reader

Therefore the first implementation must support this exact path:

```text
copy raw results
  -> run local normalize command
  -> generate normalized/
  -> local reader serves normalized/
  -> dashboard polls local reader
```

This avoids blocking the design on immediate upstream runtime automation.

## Error And Fallback Behavior

The dashboard must not go blank simply because the latest run degraded.

Therefore the local reader must expose:

1. `current`
2. `lastGood`

Dashboard may then display:

1. latest status badge from `current.status`
2. latest observed time from `current.observedAt`
3. fallback metric from `lastGood.metric` when `current.status == "error"`

This is the core resilience feature of the design.

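The fallback rule can be sketched as a pure function over the `current` / `lastGood` pair returned by `GET /api/results/:skillId`. This is illustrative Python, not dashboard code. The output keys (`statusBadge`, `metricIsFallback`) are hypothetical view-model names.

```python
from typing import Optional


def display_model(current: dict, last_good: Optional[dict]) -> dict:
    """Pick what a card shows: always the latest status, possibly an older metric."""
    # Fall back only on hard errors, and only if a last-good snapshot exists.
    use_fallback = current["status"] == "error" and last_good is not None
    source = last_good if use_fallback else current
    return {
        "statusBadge": current["status"],
        "observedAt": current["observedAt"],
        "metric": source.get("metric"),
        "metricIsFallback": use_fallback,
    }
```

Exposing `metricIsFallback` lets the card mark an older value visually, so operators are not misled into reading a last-good number as the latest observation.
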
## Staleness Rules

Staleness is independent of `status`.

A result may be:

1. `ok` but stale
2. `empty` but stale
3. `soft_error` and stale

This is why `freshness` exists as a separate field group.

Default stale threshold for first version:

1. 900 seconds

This default may be overridden per skill in the registry if needed later.

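Recomputing `isStale` at response time is a single timestamp comparison. The Python sketch below assumes staleness is measured from `observedAt` and uses a strict "older than" comparison; the design does not pin down either detail.

```python
from datetime import datetime, timezone


def is_stale(observed_at: str, stale_after_seconds: int, now: datetime) -> bool:
    """True when the observation is older than the allowed window."""
    observed = datetime.fromisoformat(observed_at)  # keeps the +08:00 offset
    # Both datetimes are timezone-aware, so the subtraction is offset-safe.
    age_seconds = (now - observed).total_seconds()
    return age_seconds > stale_after_seconds


# Usage: the caller supplies the current time, e.g. datetime.now(timezone.utc).
```

Passing `now` in as a parameter, instead of reading the clock inside the function, keeps the rule easy to test with fixed timestamps.
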
## Security Constraints

The local reader must obey these restrictions:

1. bind only to localhost
2. expose no write endpoints
3. expose no arbitrary filesystem browsing
4. expose only normalized result content
5. validate skill ids against registry/index

The normalizer must obey:

1. no destructive modification of raw `run-record` artifacts
2. no in-place partial overwrite without temp file + rename

## Configuration

Required configurable values:

1. `SGCLAW_RESULTS_DIR`
2. `SGCLAW_NORMALIZED_DIR`
3. `LOCAL_READER_HOST`
4. `LOCAL_READER_PORT`
5. `RESULT_STALE_AFTER_SECONDS`

Suggested defaults for local workflow:

1. `SGCLAW_RESULTS_DIR = D:\desk\sgclaw\sgclaw\results`
2. `SGCLAW_NORMALIZED_DIR = D:\desk\sgclaw\sgclaw\results\normalized`
3. `LOCAL_READER_HOST = 127.0.0.1`
4. `LOCAL_READER_PORT = 31337`
5. `RESULT_STALE_AFTER_SECONDS = 900`

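One way to load these values is from environment variables with the suggested defaults as fallback. The design only calls them "configurable values", so reading them from the environment is an assumption, as is the function name `load_config`. The sketch is Python for illustration; the actual reader configuration lives in `server/config.js`.

```python
import os
from typing import Mapping, Optional

# Suggested defaults copied from the list above.
DEFAULTS = {
    "SGCLAW_RESULTS_DIR": r"D:\desk\sgclaw\sgclaw\results",
    "SGCLAW_NORMALIZED_DIR": r"D:\desk\sgclaw\sgclaw\results\normalized",
    "LOCAL_READER_HOST": "127.0.0.1",
    "LOCAL_READER_PORT": "31337",
    "RESULT_STALE_AFTER_SECONDS": "900",
}


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Merge environment overrides over the defaults and coerce numeric values."""
    env = os.environ if env is None else env
    config = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    # int() raises ValueError on malformed input, failing fast at startup.
    config["LOCAL_READER_PORT"] = int(config["LOCAL_READER_PORT"])
    config["RESULT_STALE_AFTER_SECONDS"] = int(config["RESULT_STALE_AFTER_SECONDS"])
    return config
```
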
## Testing Scope

### `claw-new` Normalizer Tests

Required test categories:

1. fixture-based extractor tests for all four current skills
2. file-writing tests for `latest`, `last-good`, `history`
3. `index.json` build tests
4. malformed raw file tests
5. missing raw file tests
6. `soft_error` preservation tests
7. successful-empty tests (count is `0` and status is `empty`)

### `digital-employee` Local Reader Tests

Required test categories:

1. `/api/health`
2. `/api/results`
3. `/api/results/:skillId`
4. `/api/results/:skillId/history`
5. `/api/results/:skillId/items`
6. `/api/results/:skillId/raw-items`
7. unknown skill rejection
8. missing normalized file handling

### Dashboard Data-Layer Tests

Required test categories:

1. initial results load
2. polling refresh
3. `current.status == "error"` with `lastGood` fallback
4. `detail_snapshot` list rendering
5. stale marker rendering

## Out Of Scope For This Design

This design does not define:

1. final visual layout of the dashboard cards
2. websocket push transport
3. modification of original monitoring business logic
4. active side effects
5. inner-network deployment automation

Those may be handled by later plans.

## Final Design Summary

The final design deliberately creates one stable seam:

1. raw runtime output stays raw
2. `claw-new` produces normalized business results
3. `digital-employee` reads only normalized files through a local API
4. dashboard consumes only local HTTP contract

This is the minimum architecture that gives:

1. high cohesion
2. low coupling
3. current-workflow compatibility
4. support for the existing four skills
5. extensibility for future new skills