Files
claw/docs/superpowers/specs/2026-04-26-skill-normalized-result-and-dashboard-local-reader-design.md

1193 lines
28 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Skill Normalized Result And Dashboard Local Reader Design
Date: 2026-04-26
Status: Historical design archive; the in-repo `normalized_writer` design described here was later migrated out of `claw-new`
Historical note (2026-05-06): this document records the pre-extraction design that originally placed `normalized_writer` inside `claw-new`. The canonical implementation now lives in `D:\data\ideaSpace\rust\sgClaw\normalized_writer`.
Related baseline designs:
- `docs/superpowers/specs/2026-04-22-scheduled-monitoring-action-end-to-end-architecture-design.md`
- `docs/superpowers/specs/2026-04-23-archive-workorder-grid-push-monitor-design.md`
- `docs/superpowers/specs/2026-04-24-available-balance-below-zero-monitor-design.md`
- `docs/superpowers/specs/2026-04-26-scheduled-monitoring-generated-scene-adaptation-hardening-design.md`
Related skill/result sources:
- `D:/desk/sgclaw/sgclaw/results/archive-workorder-grid-push-monitor.run-record.json`
- `D:/desk/sgclaw/sgclaw/results/available-balance-below-zero-monitor.run-record.json`
- `D:/desk/sgclaw/sgclaw/results/command-center-fee-control-monitor.run-record.json`
- `D:/desk/sgclaw/sgclaw/results/sgcc-todo-crawler.run-record.json`
Target dashboard project:
- `D:/data/ideaSpace/rust/sgClaw/digital-employee`
## Plain-Language Goal
Turn raw scheduled-skill run results into a stable, dashboard-friendly result contract.
The system must support:
1. current four skills:
- `archive-workorder-grid-push-monitor`
- `available-balance-below-zero-monitor`
- `command-center-fee-control-monitor`
- `sgcc-todo-crawler`
2. future new skills without redesigning the dashboard data path
3. low-coupling consumption where dashboard does not parse runtime-private `run-record` structure
4. safe fallback when the latest run fails but the last successful business result should still remain visible
This design is not about changing business detection logic.
It is about adding a stable result contract and a stable local read path for dashboard consumption.
## Historical Architecture Constraints
The pre-extraction design required these boundaries:
1. `run-record.json` remains the raw runtime/audit artifact
2. dashboard must not directly consume `run-record.json`
3. normalized result files become the only supported dashboard-facing data contract
4. at the time, `claw-new` was responsible for producing normalized files
5. `digital-employee` is responsible for reading normalized files through a local API
6. dashboard is responsible only for rendering and polling the local API
7. first implementation uses HTTP polling, not websocket
## Why Direct Dashboard Parsing Is Rejected
The current raw result files are large, runtime-private, and still evolving.
If `digital-employee` directly parses:
1. `decisionPreview`
2. `auditPreview.detectReadDiagnostics`
3. skill-specific nested fields
then the dashboard becomes tightly coupled to runtime internals.
That would make every runtime refinement risky for UI consumption.
The dashboard must instead consume a stable, normalized contract that:
1. preserves business values
2. hides runtime-private structure
3. supports future extractor growth without UI rewrites
## Why Websocket Is Not The First Transport
The source data is not user-interactive event flow.
It is scheduled task output that refreshes at minute-level cadence.
Therefore the first transport must be:
1. normalized file snapshot
2. local HTTP reader
3. dashboard polling
This keeps the system simpler and more stable:
1. no connection state
2. no reconnect logic
3. no push protocol versioning
4. no incremental event semantics
If true push is needed later, it can be added inside the local reader layer without changing normalized file contract.
## End-State Architecture
The full data chain is:
```text
scheduled skill runtime
-> *.run-record.json
-> normalized result extractor
-> normalized/*.json
-> digital-employee local reader API
-> dashboard polling
```
The architecture is intentionally split into four layers.
### Layer 1: Raw Result Layer
Location:
```text
D:\desk\sgclaw\sgclaw\results\*.run-record.json
```
Responsibility:
1. preserve raw runtime result
2. preserve diagnostics/audit structure
3. remain implementation-private to runtime and normalizer
This layer is allowed to be large and unstable.
### Layer 2: Normalized Result Layer
Location:
```text
D:\desk\sgclaw\sgclaw\results\normalized\
```
Responsibility:
1. expose stable dashboard-facing contract
2. split one file per skill
3. preserve both latest state and last-known-good state
4. provide index-level aggregation for dashboard first screen
This is the single consumption contract for downstream readers.
### Layer 3: Local Reader Layer
Location:
```text
D:\data\ideaSpace\rust\sgClaw\digital-employee\server\
```
Responsibility:
1. read normalized files from disk
2. expose stable local HTTP endpoints
3. validate skill ids and schema shape
4. hide file-path and filesystem concerns from the browser
This layer does not parse raw `run-record.json`.
It only reads normalized files.
### Layer 4: Dashboard Consumption Layer
Location:
```text
D:\data\ideaSpace\rust\sgClaw\digital-employee\src\
```
Responsibility:
1. poll local reader API
2. render cards, lists, trend widgets, and detail views
3. react to `status`, `metric`, `payload`, and `freshness`
This layer must not know anything about raw runtime-private nesting.
## Required Directory Layout
The normalized result tree must follow this layout:
```text
D:\desk\sgclaw\sgclaw\results\
archive-workorder-grid-push-monitor.run-record.json
available-balance-below-zero-monitor.run-record.json
command-center-fee-control-monitor.run-record.json
sgcc-todo-crawler.run-record.json
normalized\
index.json
latest\
archive-workorder-grid-push-monitor.json
available-balance-below-zero-monitor.json
command-center-fee-control-monitor.json
sgcc-todo-crawler.json
last-good\
archive-workorder-grid-push-monitor.json
available-balance-below-zero-monitor.json
command-center-fee-control-monitor.json
sgcc-todo-crawler.json
history\
archive-workorder-grid-push-monitor\
2026\
04\
2026-04-25T19-46-06+08-00.json
available-balance-below-zero-monitor\
command-center-fee-control-monitor\
sgcc-todo-crawler\
```
Required semantics:
1. `latest/` always represents most recent normalized extraction
2. `last-good/` updates only when normalized result is still business-usable:
- `ok`
- `empty`
- optionally `soft_error` only if the extractor explicitly certifies business values are still trustworthy
3. `history/` stores immutable snapshots for audit and later trend construction
4. `index.json` summarizes current state of all known normalized skills
## Normalized Contract
All normalized files must share one public envelope.
### Common Envelope
```json
{
"schemaVersion": "1.0",
"skillId": "available-balance-below-zero-monitor",
"skillName": "可用电费小于零监测",
"category": "monitor",
"resultType": "count_snapshot",
"observedAt": "2026-04-25T19:46:24+08:00",
"generatedAt": "2026-04-25T19:46:26+08:00",
"status": "ok",
"freshness": {
"staleAfterSeconds": 900,
"isStale": false
},
"summary": "2026-04-25 19:46:24--可用电费小于零监测检测到【数量】4265",
"metric": {
"label": "数量",
"value": 4265,
"unit": "items"
},
"payload": null,
"diagnostics": {},
"source": {
"kind": "run_record",
"runRecordPath": "D:\\desk\\sgclaw\\sgclaw\\results\\available-balance-below-zero-monitor.run-record.json",
"runRecordMtime": "2026-04-25T19:46:26+08:00",
"extractorVersion": "1.0"
}
}
```
### Envelope Field Rules
#### `schemaVersion`
This is the public normalized schema version.
It is not the raw runtime version.
#### `skillId`
This must equal the stable skill directory / scene id.
It is the primary machine identity.
#### `skillName`
This must come from a normalizer registry, not from opportunistic raw JSON text extraction.
This avoids encoding drift and log-text coupling.
#### `category`
Current categories:
1. `monitor`
2. `crawler`
This is mainly for dashboard grouping and future filtering.
#### `resultType`
Current supported public result types:
1. `count_snapshot`
2. `detail_snapshot`
New skill onboarding should fit one of these two whenever possible.
#### `observedAt`
This is the business observation timestamp.
Priority:
1. stable run completion / business timestamp from raw result if available
2. raw `run-record.json` file mtime as fallback
#### `generatedAt`
This is the time normalized file is written.
#### `status`
Allowed values:
1. `ok`
2. `empty`
3. `soft_error`
4. `error`
#### `freshness`
This communicates staleness independent of success/failure.
Required fields:
1. `staleAfterSeconds`
2. `isStale`
The first version may compute `isStale` at generation time or local-reader response time.
Local-reader recomputation is preferred because it reflects current wall-clock state.
#### `summary`
This is a stable human-readable sentence assembled by the normalizer from structured fields.
It must never be parsed back by the dashboard.
#### `metric`
This is the primary business indicator for card rendering.
All result types must still expose `metric`.
#### `payload`
This is `null` for simple count snapshots.
This is a structured object for detail snapshots.
#### `diagnostics`
This is public-but-auxiliary.
It is for drill-down, debugging, and support tooling.
Dashboard must not rely on it for primary rendering.
#### `source`
This preserves audit linkage back to raw result source.
Required fields:
1. `kind`
2. `runRecordPath`
3. `runRecordMtime`
4. `extractorVersion`
## Result Types
### Type 1: `count_snapshot`
This type is used for:
1. `available-balance-below-zero-monitor`
2. `archive-workorder-grid-push-monitor`
3. `command-center-fee-control-monitor`
Characteristics:
1. the business primary output is one main count
2. the dashboard mostly needs a number card and a summary line
3. drill-down diagnostics may exist, but primary payload is count-focused
Example:
```json
{
"schemaVersion": "1.0",
"skillId": "archive-workorder-grid-push-monitor",
"skillName": "归档工单配网推送监测",
"category": "monitor",
"resultType": "count_snapshot",
"observedAt": "2026-04-25T19:46:06+08:00",
"generatedAt": "2026-04-25T19:46:06+08:00",
"status": "soft_error",
"freshness": {
"staleAfterSeconds": 900,
"isStale": false
},
"summary": "2026-04-25 19:46:06--归档工单配网推送监测检测到【数量】0",
"metric": {
"label": "数量",
"value": 0,
"unit": "items"
},
"payload": null,
"diagnostics": {
"pendingCount": 0,
"queryStatus": "soft_error",
"queryError": "XHR 0 for .../getWkorderAll"
},
"source": {
"kind": "run_record",
"runRecordPath": "D:\\desk\\sgclaw\\sgclaw\\results\\archive-workorder-grid-push-monitor.run-record.json",
"runRecordMtime": "2026-04-25T19:46:06+08:00",
"extractorVersion": "1.0"
}
}
```
### Type 2: `detail_snapshot`
This type is used for:
1. `sgcc-todo-crawler`
Characteristics:
1. the business primary output is a collection of items
2. the dashboard still needs a count for summary cards
3. future downstream consumers may need full item detail
The public contract must therefore preserve both:
1. projected stable items for easy UI rendering
2. raw item collection for future richer extraction needs
Example:
```json
{
"schemaVersion": "1.0",
"skillId": "sgcc-todo-crawler",
"skillName": "国网待办抓取",
"category": "crawler",
"resultType": "detail_snapshot",
"observedAt": "2026-04-25T19:47:36+08:00",
"generatedAt": "2026-04-25T19:47:36+08:00",
"status": "ok",
"freshness": {
"staleAfterSeconds": 900,
"isStale": false
},
"summary": "2026-04-25 19:47:36--国网待办抓取同步到【待办数量】42",
"metric": {
"label": "待办数量",
"value": 42,
"unit": "items"
},
"payload": {
"items": [
{
"id": "todo_8f9b1c",
"index": 1,
"datetime": "2026-04-24 08:25",
"tag": "会议",
"title": "...",
"processNode": "...",
"titleWithProcess": "...",
"user": "...",
"unread": true,
"href": ""
}
],
"rawItems": [
{
"datetime": "...",
"href": "",
"index": 1,
"processNode": "...",
"tag": "...",
"title": "...",
"titleWithProcess": "...",
"unread": true,
"user": "..."
}
],
"aggregates": {
"total": 42,
"unread": 17,
"read": 25
}
},
"diagnostics": {
"pendingCount": 42,
"itemSchemaVersion": "1.0"
},
"source": {
"kind": "run_record",
"runRecordPath": "D:\\desk\\sgclaw\\sgclaw\\results\\sgcc-todo-crawler.run-record.json",
"runRecordMtime": "2026-04-25T19:47:36+08:00",
"extractorVersion": "1.0"
}
}
```
## Status Semantics
Status must not collapse business emptiness and technical failure.
### `ok`
Use when:
1. raw source was readable
2. extractor succeeded
3. business result is trustworthy
4. any auxiliary diagnostics do not undermine the main business output
### `empty`
Use when:
1. the query path succeeded
2. the business result is truly empty
3. `metric.value == 0`
4. there is no technical indication that `0` is merely a failed read
`empty` is a business conclusion, not a technical fallback.
### `soft_error`
Use when:
1. a normalized result can still be produced
2. but some query stage, partial path, helper path, or side read degraded
3. and the extractor wants that degradation visible to the dashboard or operators
`soft_error` may still include a usable `metric`.
### `error`
Use when:
1. raw file missing
2. JSON parse failure
3. required business fields missing
4. extractor cannot certify a business result
In this case `latest/` updates, but `last-good/` must not be replaced.
## Skill Registry And Mapping Rules
The normalizer must maintain a registry that defines for each known skill:
1. `skillId`
2. `skillName`
3. `category`
4. `resultType`
5. summary template
6. extractor implementation binding
7. stale timeout default
This registry is the only place that should encode display name and primary result classification.
### Current Skill Registry
#### `available-balance-below-zero-monitor`
1. `skillName = "可用电费小于零监测"`
2. `category = "monitor"`
3. `resultType = "count_snapshot"`
4. `summary = "<time>--可用电费小于零监测检测到【数量】:<value>"`
Primary count source priority:
1. `auditPreview.detectReadDiagnostics.rawMergedCount`
2. `decisionPreview.summary.pending_count`
3. `decisionPreview.pendingList.length`
Primary diagnostics to preserve:
1. `slice01Count`
2. `slice02Count`
3. `slice03Count`
4. `queriedSlices`
5. `sliceErrors`
6. `requestTimeoutMs`
7. `readStepTraces`
Soft-error conditions:
1. `sliceErrors` non-empty
2. any `readStepTraces.status != "ok"`
3. missing required slice with fallback count still present
#### `archive-workorder-grid-push-monitor`
1. `skillName = "归档工单配网推送监测"`
2. `category = "monitor"`
3. `resultType = "count_snapshot"`
4. `summary = "<time>--归档工单配网推送监测检测到【数量】:<value>"`
Primary count source priority:
1. `auditPreview.detectReadDiagnostics.newItemCount`
2. `auditPreview.detectReadDiagnostics.filteredCount`
3. `decisionPreview.summary.pending_count`
4. `decisionPreview.pendingList.length`
Primary diagnostics to preserve:
1. `rawCount`
2. `filteredCount`
3. `dedupedCount`
4. `newItemCount`
5. `queryStatus`
6. `queryError`
7. `requestParam`
8. `readStepTraces`
Soft-error conditions:
1. `queryStatus != "ok"`
2. any `readStepTraces.status == "soft_error"`
3. extractor still able to produce a count but read quality degraded
#### `command-center-fee-control-monitor`
1. `skillName = "指挥中心费控异常监测"`
2. `category = "monitor"`
3. `resultType = "count_snapshot"`
4. `summary = "<time>--指挥中心费控异常监测检测到【数量】:<value>"`
Primary count source priority:
1. `auditPreview.detectReadDiagnostics.queryAbnorListCount`
2. `decisionPreview.summary.pending_count`
3. `decisionPreview.pendingList.length`
Primary diagnostics to preserve:
1. `queryAbnorListCount`
2. `queryHistoryEnergyChargeCount`
3. `getOrgTreeStatus`
4. `getMonitorLogStatus`
5. `getOtherIphonesStatus`
6. `readStepTraces`
Soft-error conditions:
1. primary abnormal-list query degraded
2. supporting reads degraded in a way that should remain operator-visible
3. business count remains present but read quality is partial
#### `sgcc-todo-crawler`
1. `skillName = "国网待办抓取"`
2. `category = "crawler"`
3. `resultType = "detail_snapshot"`
4. `summary = "<time>--国网待办抓取同步到【待办数量】:<value>"`
Primary count source priority:
1. `decisionPreview.pendingList.length`
Primary payload source:
1. `decisionPreview.pendingList`
Projected item fields:
1. `id`
2. `index`
3. `datetime`
4. `tag`
5. `title`
6. `processNode`
7. `titleWithProcess`
8. `user`
9. `unread`
10. `href`
The extractor must produce:
1. `payload.items`
2. `payload.rawItems`
3. `payload.aggregates`
## Why `sgcc-todo-crawler` Needs Both `items` And `rawItems`
The user requirement is not only to display a count.
Future dashboard or downstream logic may need to read the larger business JSON.
Directly exposing raw `run-record` is rejected.
Instead the normalized detail snapshot must preserve:
1. stable projected items for common UI use
2. raw source items for future richer access
This gives:
1. low coupling for normal UI
2. no future data loss
3. no need for the dashboard to parse runtime-private envelopes
## `index.json` Contract
The dashboard first screen must not fetch every skill individually.
It should start from one summary index.
Example:
```json
{
"schemaVersion": "1.0",
"generatedAt": "2026-04-25T19:48:00+08:00",
"revision": "2026-04-25T19:48:00+08:00",
"skills": [
{
"skillId": "available-balance-below-zero-monitor",
"skillName": "可用电费小于零监测",
"category": "monitor",
"resultType": "count_snapshot",
"status": "ok",
"observedAt": "2026-04-25T19:46:24+08:00",
"metric": {
"label": "数量",
"value": 4265,
"unit": "items"
},
"summary": "2026-04-25 19:46:24--可用电费小于零监测检测到【数量】4265",
"currentFile": "latest/available-balance-below-zero-monitor.json",
"lastGoodFile": "last-good/available-balance-below-zero-monitor.json"
}
]
}
```
Required properties:
1. `schemaVersion`
2. `generatedAt`
3. `revision`
4. `skills[]`
The index must contain enough information for:
1. top-level dashboard cards
2. quick health/status summary
3. later detail-file lookup
## Historical `claw-new` Normalizer Design
In the pre-extraction design, the normalizer was intended to live as a first-class module, not ad hoc file-writing logic.
Migration note: `normalized_writer` was later extracted into the standalone repository `D:\data\ideaSpace\rust\sgClaw\normalized_writer`. The structure below is the original in-repo design and is no longer the active `claw-new` layout.
Recommended structure:
```text
src/result_normalization/
mod.rs
contract.rs
registry.rs
extractors/
available_balance.rs
archive_workorder.rs
command_center_fee_control.rs
sgcc_todo_crawler.rs
writer.rs
index_builder.rs
```
### `contract.rs`
Defines public normalized structures:
1. envelope
2. count snapshot payload model
3. detail snapshot payload model
4. index model
### `registry.rs`
Defines:
1. skill ids
2. display names
3. result types
4. stale defaults
5. extractor binding
### `extractors/*`
Each extractor:
1. reads one raw `run-record`
2. validates required fields
3. derives business result
4. derives public diagnostics
5. assigns `status`
6. produces normalized contract
Extractors must not write files directly.
### `writer.rs`
Responsible for:
1. writing `latest/`
2. conditionally updating `last-good/`
3. writing immutable `history/`
4. atomic replace semantics
### `index_builder.rs`
Reads generated normalized skill outputs and rebuilds `index.json`.
## `claw-new` CLI / Trigger Entry
The first implementation must support both:
1. on-demand local backfill after copying raw results
2. future automatic generation at runtime completion
Required CLI shape:
```text
sg_claw normalize-results --results-dir <path>
sg_claw normalize-results --results-dir <path> --skills skill1,skill2
```
Why this is required:
1. current workflow copies raw results from the inner network to local machine
2. the design cannot assume automatic end-to-end generation exists on day one
3. manual backfill keeps current operational workflow usable
## File Write And Atomicity Rules
The normalizer must never write directly into final targets without staging.
Required write protocol:
1. write `*.tmp`
2. fsync/flush if practical
3. rename over final target
Write order:
1. skill `latest/`
2. skill `last-good/` if eligible
3. skill `history/`
4. rebuild `index.json` last
This prevents dashboard from reading partially-written JSON.
## `last-good` Update Policy
The policy must be explicit.
Update `last-good` when:
1. status is `ok`
2. status is `empty`
3. status is `soft_error` only if the extractor explicitly marks the business value as trustworthy
Do not update `last-good` when:
1. status is `error`
2. extractor cannot certify the primary metric
This protects dashboard against losing prior valid business state because of one bad latest run.
## `digital-employee` Local Reader Design
The local reader must be embedded inside `digital-employee`, not implemented as a separate external service.
Recommended structure:
```text
D:\data\ideaSpace\rust\sgClaw\digital-employee\
server\
index.js
config.js
routes\
results.js
services\
fileRepository.js
resultStore.js
validators.js
```
Recommended technology:
1. Node
2. Express or a minimal native HTTP wrapper
It must:
1. bind only to `127.0.0.1`
2. expose only read-only APIs
3. validate skill ids
4. read only normalized result files
5. never expose arbitrary path traversal
The local reader must not:
1. parse raw `run-record.json`
2. expose raw result directory as static file root
3. implement business transformation logic
## Local Reader API Contract
Required endpoints:
### `GET /api/health`
Returns:
1. local reader status
2. configured results dir
3. configured normalized dir
4. current known revision
### `GET /api/results`
Returns:
1. parsed `index.json`
Used by:
1. dashboard first load
2. periodic summary polling
### `GET /api/results/:skillId`
Returns:
1. `current`
2. `lastGood`
This supports:
1. current health/status display
2. fallback display when latest is degraded
### `GET /api/results/:skillId/history?limit=30`
Returns:
1. most recent N history snapshots
This is for:
1. future trends
2. drill-down panels
3. audit-friendly exploration
### `GET /api/results/:skillId/items`
Valid only for `detail_snapshot`.
Returns:
1. projected `payload.items`
### `GET /api/results/:skillId/raw-items`
Valid only for `detail_snapshot`.
Returns:
1. `payload.rawItems`
This endpoint exists because future consumers may need the larger business JSON without depending on raw run-record envelope.
## Polling Strategy
First version uses polling only.
Recommended behavior:
1. dashboard polls `GET /api/results` every 30 seconds
2. dashboard fetches per-skill detail lazily when the relevant card/detail view opens
3. no websocket
4. no browser filesystem access
This is sufficient because source data is minute-scale scheduled output.
## Dashboard Consumption Model
The dashboard must become index-driven, not hardcoded by private source structure.
Current static data files such as:
1. `src/data/work-reports.json`
2. `src/data/anomaly-logs.json`
are implementation placeholders and must not remain the live results source for these skills.
Recommended dashboard data structure:
```text
src/
api/
results.js
store/
modules/
results.js
```
The UI should consume:
1. top-level card data from `/api/results`
2. detail view data from `/api/results/:skillId`
3. list/table panels from `/api/results/:skillId/items`
Dashboard rendering must branch only on:
1. `resultType`
2. `status`
3. `metric`
4. `freshness`
It must not branch on raw runtime field names.
## New Skill Onboarding Contract
Future new skills should reuse the same path.
To onboard a new skill:
1. add registry entry in `claw-new`
2. add extractor
3. emit `count_snapshot` or `detail_snapshot`
4. rebuild `index.json`
If the skill fits one of those two result types, then:
1. normalized writer does not need redesign
2. local reader does not need redesign
3. dashboard data access layer does not need redesign
Only if a truly new output shape appears should a new `resultType` be added.
## Current Workflow Compatibility
The design must support the current operational workflow:
1. run inside the inner network
2. copy raw `run-record.json` artifacts to local machine
3. locally normalize those copied files
4. serve normalized files into `digital-employee`
Therefore the first implementation must support this exact path:
```text
copy raw results
-> run local normalize command
-> generate normalized/
-> local reader serves normalized/
-> dashboard polls local reader
```
This avoids blocking the design on immediate upstream runtime automation.
## Error And Fallback Behavior
The dashboard must not go blank simply because the latest run degraded.
Therefore the local reader must expose:
1. `current`
2. `lastGood`
Dashboard may then display:
1. latest status badge from `current.status`
2. latest observed time from `current.observedAt`
3. fallback metric from `lastGood.metric` when `current.status == "error"`
This is the core resilience feature of the design.
## Staleness Rules
Staleness is independent of `status`.
A result may be:
1. `ok` but stale
2. `empty` but stale
3. `soft_error` and stale
This is why `freshness` exists as a separate field group.
Default stale threshold for first version:
1. 900 seconds
This default may be overridden per skill in the registry if needed later.
## Security Constraints
The local reader must obey these restrictions:
1. bind only to localhost
2. expose no write endpoints
3. expose no arbitrary filesystem browsing
4. expose only normalized result content
5. validate skill ids against registry/index
The normalizer must obey:
1. no destructive modification of raw `run-record` artifacts
2. no in-place partial overwrite without temp file + rename
## Configuration
Required configurable values:
1. `SGCLAW_RESULTS_DIR`
2. `SGCLAW_NORMALIZED_DIR`
3. `LOCAL_READER_HOST`
4. `LOCAL_READER_PORT`
5. `RESULT_STALE_AFTER_SECONDS`
Suggested defaults for local workflow:
1. `SGCLAW_RESULTS_DIR = D:\desk\sgclaw\sgclaw\results`
2. `SGCLAW_NORMALIZED_DIR = D:\desk\sgclaw\sgclaw\results\normalized`
3. `LOCAL_READER_HOST = 127.0.0.1`
4. `LOCAL_READER_PORT = 31337`
5. `RESULT_STALE_AFTER_SECONDS = 900`
## Testing Scope
### `claw-new` Normalizer Tests
Required test categories:
1. fixture-based extractor tests for all four current skills
2. file-writing tests for `latest`, `last-good`, `history`
3. `index.json` build tests
4. malformed raw file tests
5. missing raw file tests
6. `soft_error` preservation tests
7. `0` count but successful-empty tests
### `digital-employee` Local Reader Tests
Required test categories:
1. `/api/health`
2. `/api/results`
3. `/api/results/:skillId`
4. `/api/results/:skillId/history`
5. `/api/results/:skillId/items`
6. `/api/results/:skillId/raw-items`
7. unknown skill rejection
8. missing normalized file handling
### Dashboard Data-Layer Tests
Required test categories:
1. initial results load
2. polling refresh
3. `current.status == "error"` with `lastGood` fallback
4. `detail_snapshot` list rendering
5. stale marker rendering
## Out Of Scope For This Design
This design does not define:
1. final visual layout of the dashboard cards
2. websocket push transport
3. modification of original monitoring business logic
4. active side effects
5. inner-network deployment automation
Those may be handled by later plans.
## Final Design Summary
The final design deliberately creates one stable seam:
1. raw runtime output stays raw
2. `claw-new` produces normalized business results
3. `digital-employee` reads only normalized files through a local API
4. dashboard consumes only local HTTP contract
This is the minimum architecture that gives:
1. high cohesion
2. low coupling
3. current-workflow compatibility
4. support for the existing four skills
5. extensibility for future new skills