Data Quality
This skill can return useful partial data, but it must not overclaim completeness.
Main Quality Risks
- comment areas may not load for every hot item
- the DOM may expose only visible comments, not the full set
- generic selectors may match the wrong footer controls
- compact text counts can be parsed but still reflect display approximations
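The last risk can be made explicit in code. The following is a minimal sketch, not the source implementation: `parse_compact_count`, `ParsedCount`, and the unit table are hypothetical names, and the set of suffixes a given page actually displays is an assumption. It parses a compact count and keeps a flag recording that the value came from a rounded display form.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical unit table; which suffixes the page really uses is an assumption.
_UNITS = {"": 1, "k": 1_000, "w": 10_000, "万": 10_000, "亿": 100_000_000}

# A number, an optional unit suffix, and an optional trailing "+".
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([kKwW万亿]?)\+?\s*$")


class ParsedCount(NamedTuple):
    value: int
    approximate: bool  # True when the text was a rounded display form


def parse_compact_count(text: str) -> Optional[ParsedCount]:
    """Parse text such as '842', '3.5k', or '1.2万'; return None if unparseable."""
    match = _PATTERN.match(text.replace(",", ""))
    if match is None:
        return None
    number, unit = match.group(1), match.group(2).lower()
    value = round(float(number) * _UNITS[unit])
    # A unit suffix or a trailing "+" means the page showed a rounded figure.
    approximate = bool(unit) or text.strip().endswith("+")
    return ParsedCount(value, approximate)
```

Carrying the `approximate` flag alongside the value lets a report state that a figure is a display approximation instead of presenting it as an exact count.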
Partial Success Rule
The source implementation tracks partial item failures during comment collection. If some detail pages fail but the run still returns a snapshot:
- report the run as partial
- include how many items were missing comment metrics
- keep the successful hotlist capture separate from comment-metric completeness
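The rule above can be sketched as a small summary step. This is an illustration under stated assumptions: `HotItem`, `summarize_run`, and the result keys are hypothetical and are not taken from the source implementation. A `comment_count` of `None` is assumed to mean that the detail page failed or exposed no visible metric.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Union


@dataclass
class HotItem:
    title: str
    # None is assumed to mean the comment metric could not be collected.
    comment_count: Optional[int] = None


def summarize_run(items: Sequence[HotItem]) -> Dict[str, Union[str, int, bool]]:
    """Report hotlist capture and comment-metric completeness separately."""
    missing = sum(1 for item in items if item.comment_count is None)
    if not items:
        status = "failed"
    elif missing:
        status = "partial"
    else:
        status = "complete"
    return {
        "status": status,
        "hotlist_captured": len(items) > 0,
        "items_total": len(items),
        "items_missing_comment_metrics": missing,
    }
```

Keeping `hotlist_captured` apart from `items_missing_comment_metrics` means a run with a good hot list but failed detail pages is still reported as partial, not as a full success or a full failure.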
Snapshot Caveats
The source store design keeps:
- snapshot_id
- capture timestamp
- page URL
- collector version
- item list
- collection stats
This is enough for reproducible reporting, but it does not guarantee that every metric field was fully captured.
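To illustrate the caveat, here is a hedged sketch of a snapshot record with a per-field completeness check. Only `snapshot_id` appears in the source; the other attribute names, the `Snapshot` class, and `missing_by_field` are hypothetical stand-ins for the fields listed above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Sequence


@dataclass
class Snapshot:
    snapshot_id: str
    captured_at: str          # capture timestamp (assumed ISO 8601 string)
    page_url: str
    collector_version: str
    items: List[Dict[str, Any]] = field(default_factory=list)
    stats: Dict[str, int] = field(default_factory=dict)


def missing_by_field(snapshot: Snapshot, metric_fields: Sequence[str]) -> Dict[str, int]:
    """Count, for each metric field, how many items lack a value."""
    return {
        name: sum(1 for item in snapshot.items if item.get(name) is None)
        for name in metric_fields
    }
```

A snapshot can carry every top-level field and still report non-zero counts here, which is why a stored snapshot alone does not prove metric completeness.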
Recommended Caution Language
Use wording like:
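- 热榜列表已采集,评论指标为部分完成。 (The hot list was captured; comment metrics are only partially complete.)
- 报告基于最新快照生成,部分条目缺少评论指标。 (This report is based on the latest snapshot; some items are missing comment metrics.)
- 数字来自页面可见指标,可能低于完整站内统计。 (Figures come from metrics visible on the page and may be lower than the site's full statistics.)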
Avoid wording like:
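- 全部评论指标已准确采集 (All comment metrics were captured accurately.)
- 完整真实热度 (Complete, true popularity.)
- 无缺失 (Nothing is missing.)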