feat: add initial skill authoring workspace
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
skills/zhihu-hotlist/references/data-quality.md
@@ -0,0 +1,46 @@
# Data Quality

This skill can return useful partial data, but it must not overclaim completeness.

## Main Quality Risks

- comment areas may not load for every hot item
- the DOM may expose only visible comments, not the full set
- generic selectors may match the wrong footer controls
- compact text counts can be parsed, but the parsed value still reflects a rounded display figure, not an exact count
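
The last risk can be made concrete with a minimal sketch. It assumes compact counts appear as a number followed by an optional `万` (10,000) or `亿` (100,000,000) suffix. The source does not list the formats it handles, so `parse_compact_count`, `ParsedCount`, and the suffix table are hypothetical names for illustration only.

```python
import re
from typing import NamedTuple, Optional

# Assumed suffix multipliers; the real collector may handle other formats.
_MULTIPLIERS = {"": 1, "万": 10_000, "亿": 100_000_000}

# A leading number (with optional thousands separators and decimals),
# then an optional compact suffix.
_COUNT_RE = re.compile(r"^\s*(\d[\d,]*(?:\.\d+)?)\s*([万亿]?)")


class ParsedCount(NamedTuple):
    value: int
    approximate: bool  # True when the page showed a rounded, suffixed figure


def parse_compact_count(text: str) -> Optional[ParsedCount]:
    """Parse a displayed count and record whether it is only approximate."""
    match = _COUNT_RE.match(text)
    if match is None:
        return None  # nothing that looks like a count
    number, suffix = match.groups()
    value = round(float(number.replace(",", "")) * _MULTIPLIERS[suffix])
    return ParsedCount(value=value, approximate=suffix != "")
```

Carrying the `approximate` flag alongside the value lets a report say that a figure is a display approximation instead of presenting it as exact.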

## Partial Success Rule

The source implementation tracks partial item failures during comment collection. If some detail pages fail but the run still returns a snapshot:

- report the run as partial
- include how many items were missing comment metrics
- keep the successful hotlist capture separate from comment-metric completeness
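
The rule above could be sketched roughly as follows. This is not the source implementation: `ItemResult`, `RunSummary`, `summarize_run`, and the status strings are hypothetical, and the sketch assumes a failed detail page leaves the item's comment count as `None`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ItemResult:
    """One hotlist entry; comment_count is None when its detail page failed."""
    title: str
    comment_count: Optional[int]


@dataclass
class RunSummary:
    hotlist_captured: bool        # the list itself was collected
    total_items: int
    missing_comment_metrics: int  # items without comment metrics

    @property
    def status(self) -> str:
        # Hotlist capture and comment-metric completeness are judged separately.
        if not self.hotlist_captured:
            return "failed"
        return "partial" if self.missing_comment_metrics else "complete"


def summarize_run(items: Sequence[ItemResult]) -> RunSummary:
    """Count items that lack comment metrics and derive the run status."""
    missing = sum(1 for item in items if item.comment_count is None)
    return RunSummary(
        hotlist_captured=len(items) > 0,
        total_items=len(items),
        missing_comment_metrics=missing,
    )
```

Keeping `hotlist_captured` and `missing_comment_metrics` as separate fields means a report can state that the list was captured while still disclosing how many items lack comment metrics.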

## Snapshot Caveats

The source store design keeps:

- `snapshot_id`
- capture timestamp
- page URL
- collector version
- item list
- collection stats

This is enough for reproducible reporting, but it does not guarantee that every metric field was fully captured.
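
As an illustration only, the record could be modeled as below. Only `snapshot_id` is a name taken from the source. The other field names, the item shape (plain dictionaries), and the helper `items_missing_field` are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Snapshot:
    snapshot_id: str
    captured_at: str        # capture timestamp; ISO 8601 format is assumed
    page_url: str
    collector_version: str
    items: List[Dict[str, Any]] = field(default_factory=list)
    stats: Dict[str, int] = field(default_factory=dict)


def items_missing_field(snapshot: Snapshot, metric: str) -> List[int]:
    """Return positions of items where `metric` is absent or None."""
    return [
        index
        for index, item in enumerate(snapshot.items)
        if item.get(metric) is None
    ]
```

A check like `items_missing_field(snapshot, "comment_count")` makes the caveat testable: the snapshot can be well-formed and reproducible while individual metric fields are still empty.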

## Recommended Caution Language

Use wording like:

- `热榜列表已采集,评论指标为部分完成。` ("The hotlist was captured; comment metrics are only partially complete.")
- `报告基于最新快照生成,部分条目缺少评论指标。` ("This report is based on the latest snapshot; some items are missing comment metrics.")
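- `数字来自页面可见指标,可能低于完整站内统计。` ("Figures come from metrics visible on the page and may be lower than the full on-site statistics.")

Avoid wording like:

- `全部评论指标已准确采集` ("All comment metrics were captured accurately")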
- `完整真实热度` ("Complete, true popularity")
- `无缺失` ("Nothing missing")
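
One possible way to apply both lists in code is sketched below. The phrases are copied from the lists above. The function names and the policy for choosing which notes to attach are assumptions, and the overclaim check is a naive substring match.

```python
from typing import List

# Recommended and discouraged phrases, copied from the lists above.
NOTE_PARTIAL = "热榜列表已采集,评论指标为部分完成。"
NOTE_MISSING = "报告基于最新快照生成,部分条目缺少评论指标。"
NOTE_VISIBLE_ONLY = "数字来自页面可见指标,可能低于完整站内统计。"
OVERCLAIMS = ("全部评论指标已准确采集", "完整真实热度", "无缺失")


def caution_notes(missing_comment_metrics: int) -> List[str]:
    """Pick caution lines for a report (the selection policy is an assumption)."""
    notes = [NOTE_VISIBLE_ONLY]  # always disclose that figures are page-visible only
    if missing_comment_metrics > 0:
        notes = [NOTE_PARTIAL, NOTE_MISSING] + notes
    return notes


def find_overclaims(report_text: str) -> List[str]:
    """Return any discouraged phrases that appear in the report text."""
    return [phrase for phrase in OVERCLAIMS if phrase in report_text]
```

Running `find_overclaims` over a draft report before it is returned gives a cheap guard against the wording this section asks to avoid.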