KOL Content Screening

Screen and rank Chinese social media KOLs by matching keyword content within a time window using web search aggregation, reporting evidence and confidence.

Install

openclaw skills install kol-content-screening

KOL Content Screening (Web-Search Based)

Screen Chinese social media KOL lists for keyword-matching content within a time window, and output a ranked, evidence-backed report. Typical use is PR, marketing, and competitive-intelligence work where you receive a known account list, a set of keywords, and a time window, and need to know who has posted about the topic and who has not.

Hard Truths Up Front

Tell the user these before promising anything:

No reliable per-video interaction counts via web search. Per-video likes, comments, and saves on 抖音 (Douyin) and 小红书 (Xiaohongshu) are not stably indexed by general search engines. Mark them as "无公开数据" (no public data) rather than guessing. For real numbers the user must use 蝉妈妈 / 灰豚 / 新红 (for 抖音), 千瓜 / 新红 (for 小红书), or the platforms' open APIs.
"未发现" ≠ "没发过" ("not found" does not mean "never posted"). Web search has indexing gaps. Always frame negatives as "公开检索未发现证据" (no evidence found in public search) within the stated window of N months. Never claim that a creator definitely has not posted X.
Same nickname ≠ same person. Verify by handle/UID/homepage URL, not by name. The 抖音 handle (抖音号), the 小红书 user_id, and the 头条 (Toutiao) author UID are the only reliable identifiers.
Time window matters. State the explicit window (e.g. 2025-05-05 ~ 2026-05-05) in the report header. Content older than the window (e.g. >1 year) is marked separately and not mixed into the "active" set.

If the user asks for an accurate ranking by per-video interaction counts, stop and warn them that this requires paid data services. Get explicit acknowledgement before proceeding with web-only screening.

Core Workflow

1. Intake (clarify before running)

Always confirm 5 parameters before spawning sub-agents:

Parameter	Example	Notes
Platforms	抖音 + 小红书 + 头条	Each platform = independent sub-task
Account list	(CSV/table from user)	Need: handle/UID + nickname + fan count + homepage URL
Keywords	比亚迪 / BYD / 王传福 / DM-i / 仰望	Include EN + CN + product lines + key person names
Time window	Last 12 months (`YYYY-MM-DD ~ YYYY-MM-DD`)	Compute exact dates; don't pass "近一年" ("the past year") verbatim
Sort dimension	Match tier, then fan count descending / interaction count / keyword hit count	Without an interaction-data source, default to fan count descending within each match tier
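
The "compute exact dates" note for the time window can be sketched as follows. This is a minimal illustration, assuming the window is counted in calendar months back from a chosen end date; the function names `months_back_window` and `format_window` are hypothetical and not part of the skill.

```python
import calendar
from datetime import date


def months_back_window(end: date, months: int = 12) -> tuple[date, date]:
    """Return (start, end), where start is `months` calendar months before end."""
    # Count months from year 0 so the subtraction can cross year boundaries.
    total = end.year * 12 + (end.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day for shorter months (e.g. Feb 29 -> Feb 28).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(end.day, last_day)), end


def format_window(start: date, end: date) -> str:
    """Render the window in the YYYY-MM-DD ~ YYYY-MM-DD form used in the report."""
    return f"{start.isoformat()} ~ {end.isoformat()}"


start, end = months_back_window(date(2026, 5, 5))
print(format_window(start, end))  # 2025-05-05 ~ 2026-05-05
```

Passing the formatted string to sub-agents, rather than a relative phrase, keeps every group working against the same dates.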

Common intake mistake: the user pastes a Windows-clipboard HTML fragment (Version:1.0 StartHTML:...). Those leading header fields are the raw clipboard envelope, not data. The actual table follows them, so parse the account list directly from the rest of the paste.
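
One way to handle such a paste is sketched below, using only the standard library. It assumes the paste follows the Windows HTML clipboard layout (header fields such as Version, StartHTML, and EndHTML, with the payload optionally wrapped in `<!--StartFragment-->` / `<!--EndFragment-->` comments) and that the accounts sit in an HTML table. The helper names are hypothetical.

```python
import re
from html.parser import HTMLParser

# Header fields of the clipboard envelope, one per line.
_HEADER = re.compile(
    r"^(Version|StartHTML|EndHTML|StartFragment|EndFragment|"
    r"StartSelection|EndSelection|SourceURL):.*$",
    re.MULTILINE,
)
_START, _END = "<!--StartFragment-->", "<!--EndFragment-->"


def strip_clipboard_envelope(paste: str) -> str:
    """Return only the HTML payload of a clipboard paste."""
    start, end = paste.find(_START), paste.find(_END)
    if start != -1 and end != -1 and start < end:
        return paste[start + len(_START):end].strip()
    # No fragment markers: fall back to dropping the header lines.
    return _HEADER.sub("", paste).strip()


class _TableRows(HTMLParser):
    """Collect the text of each table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows, self._row, self._cell = [], None, None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_account_rows(paste: str) -> list[list[str]]:
    parser = _TableRows()
    parser.feed(strip_clipboard_envelope(paste))
    parser.close()
    return parser.rows
```

The first returned row is usually the header row; mapping its columns to handle/UID, nickname, fan count, and homepage URL is left to the caller.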

2. Group & Parallelize

For >15 accounts on one platform, split into groups of 8–10 and spawn parallel sub-agents. Empirically: 36 抖音 accounts → 4 groups of ~9, 24 小红书 accounts → 3 groups of 8.

Per platform → split into N groups → 1 sub-agent per group → run in parallel → each writes its own file → main session aggregates and ranks
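
The grouping rule can be sketched as a balanced split. This is an illustration only: the threshold of 15 and the cap of 10 come from the text above, while the balancing strategy and the name `split_into_groups` are assumptions. For some counts (e.g. 21) the resulting groups fall slightly below 8.

```python
import math


def split_into_groups(accounts: list, threshold: int = 15, max_size: int = 10) -> list[list]:
    """Split one platform's account list into near-equal groups."""
    n = len(accounts)
    if n == 0:
        return []
    if n <= threshold:
        return [list(accounts)]  # small lists stay in a single group
    group_count = math.ceil(n / max_size)
    base, extra = divmod(n, group_count)
    groups, index = [], 0
    for g in range(group_count):
        size = base + (1 if g < extra else 0)  # spread the remainder evenly
        groups.append(accounts[index:index + size])
        index += size
    return groups


print([len(g) for g in split_into_groups(list(range(36)))])  # [9, 9, 9, 9]
print([len(g) for g in split_into_groups(list(range(24)))])  # [8, 8, 8]
```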

Each sub-agent writes ONE file. File naming convention:

{platform-prefix}-{keyword-slug}-research-group{N}.md

where platform-prefix is douyin / xhs / tt / sph (视频号) / bilibili.
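
A small helper for this naming convention might look like the following. The function name `group_filename` is hypothetical, and how the keyword slug is derived (for example, "byd" for 比亚迪) is assumed to be decided by the caller.

```python
KNOWN_PREFIXES = {"douyin", "xhs", "tt", "sph", "bilibili"}


def group_filename(platform_prefix: str, keyword_slug: str, group_no: int) -> str:
    """Build {platform-prefix}-{keyword-slug}-research-group{N}.md."""
    if platform_prefix not in KNOWN_PREFIXES:
        raise ValueError(f"unknown platform prefix: {platform_prefix!r}")
    if group_no < 1:
        raise ValueError("group numbers start at 1")
    return f"{platform_prefix}-{keyword_slug}-research-group{group_no}.md"


print(group_filename("douyin", "byd", 1))  # douyin-byd-research-group1.md
```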

Sub-agent prompt template: see references/subagent-prompt-template.md.

3. Per-account search procedure

Each sub-agent, for each account, runs at least two queries on the chosen web search tool (xiaosu-search or equivalent):

Q1: "{nickname}" {handle} {keyword}
Q2: "{nickname}" {keyword} site:{platform-domain}
Q3 (if Q1+Q2 weak): {nickname} {keyword} {YYYY}   # last 12 months explicit

Where {platform-domain} is douyin.com / xiaohongshu.com / toutiao.com / etc.
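
The three query templates can be filled in mechanically. The sketch below only builds the query strings; it does not call any search tool, and `build_queries` is a hypothetical name. Q3 is emitted only when the caller passes a year, mirroring the "if Q1+Q2 weak" condition.

```python
from typing import Optional


def build_queries(nickname: str, handle: str, keyword: str,
                  platform_domain: str, year: Optional[int] = None) -> list[str]:
    """Return Q1 and Q2, plus Q3 when a year is supplied."""
    queries = [
        f'"{nickname}" {handle} {keyword}',                  # Q1
        f'"{nickname}" {keyword} site:{platform_domain}',    # Q2
    ]
    if year is not None:
        queries.append(f"{nickname} {keyword} {year}")       # Q3, fallback
    return queries


for query in build_queries("NickA", "handle123", "BYD", "douyin.com", year=2026):
    print(query)
```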

For each hit, the sub-agent records:

Title (or first line of post)
Date (verify within window — outside-window hits noted separately)
URL (must point to the creator's own post, not third-party reposts/quotes)
Stance (正向 positive / 中性 neutral / 负向 negative / 仅提及 mention only), which affects PR usability
Confidence (高 high / 中 medium / 低 low), based on evidence strength
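
The fields above map naturally onto a small record type. The following is a sketch under assumptions: the class name `Hit`, its field names, and the validation are illustrative, and the actual on-disk layout is whatever references/output-schema.md specifies.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

STANCES = ("正向", "中性", "负向", "仅提及")
CONFIDENCES = ("高", "中", "低")


@dataclass
class Hit:
    title: str               # title, or first line of the post
    posted: Optional[date]   # None when the date could not be verified
    url: str                 # must point to the creator's own post
    stance: str
    confidence: str

    def __post_init__(self):
        if self.stance not in STANCES:
            raise ValueError(f"invalid stance: {self.stance!r}")
        if self.confidence not in CONFIDENCES:
            raise ValueError(f"invalid confidence: {self.confidence!r}")

    def in_window(self, start: date, end: date) -> bool:
        """True only for hits with a verified date inside the window."""
        return self.posted is not None and start <= self.posted <= end
```

Hits for which `in_window` returns False are kept and noted separately rather than dropped.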

For each account, the sub-agent must explicitly check for ID collision: search for the nickname alone and see whether the top hits belong to this person's handle. If a collision is detected (e.g. "南希Nancy", a nickname used by multiple people), flag it.
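
A rough sketch of the collision check follows. It assumes the sub-agent has already extracted a handle or UID from each top hit of the nickname-only search (with None where extraction failed); the function name and return shape are hypothetical.

```python
from typing import Optional


def check_id_collision(expected_handle: str,
                       top_hit_handles: list[Optional[str]]) -> dict:
    """Compare the handles behind the top hits with the expected handle."""
    others = sorted({h for h in top_hit_handles if h and h != expected_handle})
    return {
        "collision": bool(others),                        # another account shares the nickname
        "expected_seen": expected_handle in top_hit_handles,
        "other_handles": others,
    }


print(check_id_collision("nancy_01", ["nancy_01", "nancy_88", None]))
```

A result with `collision` set should be flagged in the group file so that evidence is not attributed to the wrong person.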

See references/platform-search-tips.md for platform-specific quirks (site filters, profile URL formats, common false positives).

4. Aggregate & Rank

Main session reads all group files and merges into one ranked table. Default ranking:

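Tier 1 🟢: clear evidence within the past year (with URL, date, and content summary)
Tier 2 🟡: only old content (outside the window) / indirect mentions / weak evidence
Tier 3 🔴: nothing found in public search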

Within each tier, sort by fan count descending by default. If the user asked for interaction-based ranking but the data is unavailable, state this explicitly in the report, fall back to fan count, and add a caveat.
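
The default ranking can be sketched as a two-key sort. The dictionary keys (`in_window_hits`, `weak_or_old_hits`, `fans`) and function names are assumptions for illustration, and the sample accounts are made up.

```python
def assign_tier(in_window_hits: int, weak_or_old_hits: int) -> int:
    """Tier 1: clear in-window evidence; Tier 2: old or weak only; Tier 3: nothing found."""
    if in_window_hits > 0:
        return 1
    if weak_or_old_hits > 0:
        return 2
    return 3


def rank_accounts(accounts: list[dict]) -> list[dict]:
    """Sort by tier ascending, then by fan count descending within each tier."""
    return sorted(
        accounts,
        key=lambda a: (assign_tier(a["in_window_hits"], a["weak_or_old_hits"]), -a["fans"]),
    )


sample = [
    {"nickname": "A", "fans": 50_000, "in_window_hits": 0, "weak_or_old_hits": 0},
    {"nickname": "B", "fans": 10_000, "in_window_hits": 2, "weak_or_old_hits": 0},
    {"nickname": "C", "fans": 90_000, "in_window_hits": 0, "weak_or_old_hits": 1},
    {"nickname": "D", "fans": 30_000, "in_window_hits": 1, "weak_or_old_hits": 3},
]
print([a["nickname"] for a in rank_accounts(sample)])  # ['D', 'B', 'C', 'A']
```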

Final report structure: see references/output-schema.md.

5. Deliver

Output to <workdir>/<keyword-slug>-kol-screening-{YYYYMMDD}.md (markdown table) plus the per-platform group files. If the user wants a 飞书 (Feishu) Sheet, build the markdown first, then offer to push it via lark-cli sheets (a separate skill).
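
The report path can be assembled as below. This is a minimal sketch: `report_path` is a hypothetical helper, and the run date is assumed to be supplied by the caller.

```python
from datetime import date
from pathlib import Path


def report_path(workdir: str, keyword_slug: str, run_date: date) -> Path:
    """Build <workdir>/<keyword-slug>-kol-screening-{YYYYMMDD}.md."""
    return Path(workdir) / f"{keyword_slug}-kol-screening-{run_date:%Y%m%d}.md"


print(report_path("out", "byd", date(2026, 5, 5)).name)  # byd-kol-screening-20260505.md
```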

Failure Modes Seen In The Wild

Document these in the report so the user can interpret correctly:

Handle drift: the user pastes "楠姐财经科技头条" but the real, similar accounts are "楠姐聊财经" / "楠姐科技说" / "楠姐谈股论今". Report all candidates, flag the uncertainty, and ask the user to confirm.
Cross-platform leak: a 视频号 creator's content shows up only via 新浪/百度 reposts. That is still valid evidence that the post exists, but mark the source as "via 新浪 (转载)", i.e. a repost.
Brand homonyms: "比亚迪" matches food brand 拿铁/拿铁酱 etc.; "宁德时代" rarely collides, but "宁德" alone matches geography. Use the full brand name plus a disambiguator keyword (王传福, 刀片电池, or a vehicle model name for BYD; 麒麟电池, 神行, 凝聚态 for CATL).
Stale fan counts: user-supplied fan numbers are snapshots. Don't recompute them; record them as given, with the date if the user provided one.
Profile-not-found: sometimes the homepage URL in the user's list returns a 404. Report it as "主页失效" (homepage unavailable); do NOT skip the account silently.
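
For the brand-homonym case, one possible post-filter on hit text is sketched below. Applying the disambiguators after the search rather than inside the query is an assumption made here, as are the `DISAMBIGUATORS` table and the three labels. A hit that names the brand without any disambiguator is sent to manual review instead of being rejected.

```python
DISAMBIGUATORS = {
    "比亚迪": ("王传福", "刀片电池"),
    "宁德时代": ("麒麟电池", "神行", "凝聚态"),
}


def classify_brand_hit(text: str, brand: str) -> str:
    """Return 'confirmed', 'review', or 'no_match' for one hit's text."""
    if brand not in text:
        return "no_match"   # e.g. "宁德" alone does not contain "宁德时代"
    if any(term in text for term in DISAMBIGUATORS.get(brand, ())):
        return "confirmed"  # full brand name plus a disambiguator
    return "review"         # brand name only: needs a human look


print(classify_brand_hit("王传福谈比亚迪刀片电池", "比亚迪"))  # confirmed
```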

Honest Reporting Discipline

Every report MUST include a methodology disclaimer block at the top:

Data source (web search, which provider)
What CAN'T be obtained (per-video interaction counts, follower-only content, etc.)
Time window (explicit dates)
Confidence framing ("未发现 ≠ 没发过": "not found" does not mean "never posted")

Template in references/output-schema.md.
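
As a rough illustration of the four required items, the block could be rendered like this. The layout and the function name `methodology_block` are hypothetical; the authoritative template is the one in references/output-schema.md.

```python
def methodology_block(provider: str, unavailable: list[str],
                      window_start: str, window_end: str) -> str:
    """Render the four disclaimer items as plain text lines."""
    lines = [
        "Methodology",
        f"Data source: web search ({provider})",
        "Not obtainable: " + "; ".join(unavailable),
        f"Time window: {window_start} ~ {window_end}",
        "Confidence framing: 未发现 ≠ 没发过",
    ]
    return "\n".join(lines)


print(methodology_block(
    "xiaosu-search",
    ["per-video interaction counts", "follower-only content"],
    "2025-05-05",
    "2026-05-05",
))
```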

Quick Reference

Sub-agent prompt template → references/subagent-prompt-template.md
Platform-specific search tips → references/platform-search-tips.md
Output schema + methodology block → references/output-schema.md
Decision table: when to refuse / when to upgrade to paid data → references/escalation.md