feat: add initial skill authoring workspace
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
68
skills/zhihu-hotlist/references/collection-flow.md
Normal file
@@ -0,0 +1,68 @@
# Collection Flow

This skill uses the preserved source flow in `assets/zhihu_hotlist_flow.source.json`.

## Source Model

The source implementation does four things:

1. ensure the browser is on the hotlist page
2. capture hotlist HTML
3. extract the top N items from the page
4. visit each item detail page and try to collect visible comment metrics

## Hotlist Page Detection

- Preferred page URL: `https://www.zhihu.com/hot`
- Domain: `www.zhihu.com`
- Guard text: `热榜` (Chinese for "hot list")

The source flow first probes the current page for the guard text and navigates to the preferred URL only if the text is missing.
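
The probe described above can be sketched as a small predicate. This is a minimal illustration, not the source flow itself: the function name `needs_navigation` is hypothetical, and combining the domain check with the guard-text check is an assumption about how the two values are used together.

```python
from urllib.parse import urlparse

HOTLIST_URL = "https://www.zhihu.com/hot"
HOTLIST_DOMAIN = "www.zhihu.com"
GUARD_TEXT = "热榜"


def needs_navigation(current_url: str, page_text: str) -> bool:
    """Return True when the browser should be sent to HOTLIST_URL.

    Assumption: the page counts as the hotlist only if it is on the
    expected domain and its visible text contains the guard text.
    """
    on_domain = urlparse(current_url).hostname == HOTLIST_DOMAIN
    has_guard = GUARD_TEXT in page_text
    return not (on_domain and has_guard)
```

A caller would navigate to `HOTLIST_URL` only when `needs_navigation(...)` returns `True`, which avoids a reload when the browser is already on the hotlist.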

## Hotlist Extraction

The source selectors look for:

- hotlist root
- hotlist item
- title link
- summary
- heat text

If the page HTML is empty or exposes no items, the collection should be treated as failed.
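
The extraction and its failure rule might look like the following sketch, which uses only the standard library. The class names (`hot-item`, `hot-title`, `hot-summary`, `hot-heat`), the `CollectionFailed` exception, and the output field names are all hypothetical; the real selectors live in the source flow JSON. The hotlist root check is omitted for brevity.

```python
from html.parser import HTMLParser
from typing import Optional

# Hypothetical class names standing in for the real source selectors.
ITEM_CLASS = "hot-item"
FIELD_CLASSES = {"hot-title": "title", "hot-summary": "summary", "hot-heat": "heat"}


class CollectionFailed(RuntimeError):
    """Raised when the hotlist HTML is empty or exposes no items."""


class HotlistParser(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.items: list = []
        self._field: Optional[str] = None      # field currently being captured
        self._field_tag: Optional[str] = None  # tag that opened that field

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if ITEM_CLASS in classes:
            self.items.append({"title": "", "url": "", "summary": "", "heat": ""})
            return
        if not self.items or self._field is not None:
            return
        for cls in classes:
            if cls in FIELD_CLASSES:
                self._field, self._field_tag = FIELD_CLASSES[cls], tag
                if self._field == "title":
                    # The title link also carries the detail page URL.
                    self.items[-1]["url"] = attr_map.get("href") or ""
                break

    def handle_endtag(self, tag):
        if tag == self._field_tag:
            self._field = self._field_tag = None

    def handle_data(self, data):
        if self._field is not None:
            self.items[-1][self._field] += data.strip()


def extract_hotlist(html: str, top_n: int) -> list:
    """Return the first top_n items, or raise CollectionFailed."""
    if not html.strip():
        raise CollectionFailed("hotlist HTML is empty")
    parser = HotlistParser()
    parser.feed(html)
    if not parser.items:
        raise CollectionFailed("no hotlist items found")
    return parser.items[:top_n]
```

Raising an exception rather than returning an empty list keeps a failed collection from being mistaken for a hotlist with zero entries.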

## Comment Metric Collection

For each hot item:

1. navigate to the item detail page
2. wait for page root
3. scroll toward comments
4. wait for comment list
5. scroll comment list into view
6. capture page HTML
7. parse visible metrics from comment items
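
Steps 1 to 6 above can be sketched against an abstract browser driver. The `Browser` protocol, its method names, the two selectors, and the scroll distance are all hypothetical placeholders; the source flow defines its own actions and selectors.

```python
from typing import Protocol


class Browser(Protocol):
    """Hypothetical driver interface; not part of the source flow."""

    def goto(self, url: str) -> None: ...
    def wait_for(self, selector: str) -> None: ...
    def scroll_by(self, pixels: int) -> None: ...
    def scroll_to(self, selector: str) -> None: ...
    def html(self) -> str: ...


# Placeholder values; the real ones are defined in the source flow JSON.
PAGE_ROOT = "#root"
COMMENT_LIST = ".comment-list"
SCROLL_STEP_PX = 2000


def capture_detail_html(browser: Browser, item_url: str) -> str:
    """Run steps 1-6 and return the HTML that step 7 would parse."""
    browser.goto(item_url)             # 1. navigate to the item detail page
    browser.wait_for(PAGE_ROOT)        # 2. wait for page root
    browser.scroll_by(SCROLL_STEP_PX)  # 3. scroll toward comments
    browser.wait_for(COMMENT_LIST)     # 4. wait for comment list
    browser.scroll_to(COMMENT_LIST)    # 5. scroll comment list into view
    return browser.html()              # 6. capture page HTML
```

Keeping the driver behind a protocol lets the step order be exercised with a fake browser, without a real page.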

## Parsed Metrics

The source collector tries to extract:

- reply count
- upvote count
- favorite count
- heart count

It also preserves unmatched numeric metrics as raw metric fields when possible.
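
One way to picture the split between named metrics and raw fields is the sketch below. The label keywords, the output field names, and the `raw` list are assumptions made for illustration; the source collector may match on different text or on element attributes.

```python
import re

# Hypothetical label keywords mapped to hypothetical field names.
KNOWN_LABELS = {
    "回复": "reply_count",
    "赞同": "upvote_count",
    "收藏": "favorite_count",
    "喜欢": "heart_count",
}
NUMBER = re.compile(r"\d+")


def classify_metrics(visible_texts: list) -> dict:
    """Map visible text fragments to named metrics, keeping the rest as raw."""
    metrics: dict = {"raw": []}
    for text in visible_texts:
        match = NUMBER.search(text)
        if match is None:
            continue  # no numeric metric in this fragment
        field = next(
            (name for label, name in KNOWN_LABELS.items() if label in text),
            None,
        )
        if field is None:
            # Unmatched numeric text is preserved verbatim as a raw metric.
            metrics["raw"].append(text.strip())
        else:
            metrics[field] = int(match.group())
    return metrics
```

Keeping unmatched fragments verbatim means a later parser can still recover them if new labels need to be recognized.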

## Count Parsing

The source parser recognizes compact counts such as:

- plain integers
- `万`
- `亿`
- `k`
- `m`

Use caution when summarizing parsed counts from compact display text: a value such as `1.2万` is already rounded for display, so the parsed number (12000) is approximate.
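
A compact-count parser covering the suffixes listed above could look like this sketch. The function name `parse_count` is hypothetical, and treating `k` and `m` case-insensitively, stripping thousands separators, and truncating fractional results are assumptions rather than documented behavior of the source parser.

```python
import re
from decimal import Decimal
from typing import Optional

MULTIPLIERS = {
    "": 1,
    "k": 1_000,
    "m": 1_000_000,
    "万": 10_000,        # ten thousand
    "亿": 100_000_000,   # hundred million
}
COUNT = re.compile(r"(\d+(?:\.\d+)?)\s*([kKmM万亿]?)")


def parse_count(text: str) -> Optional[int]:
    """Parse the first compact count in text, or return None if there is none."""
    match = COUNT.search(text.replace(",", ""))
    if match is None:
        return None
    number, suffix = match.groups()
    # Decimal avoids float error on values such as "1.2"; int() truncates.
    return int(Decimal(number) * MULTIPLIERS[suffix.lower()])
```

For example, `parse_count("1.2万")` returns `12000` and `parse_count("4.5k")` returns `4500`. Both results carry only the precision of the displayed text.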