KOL Content Screening

Screen and rank Chinese social media KOLs by matching keyword content within a time window using web search aggregation, reporting evidence and confidence.

Install

openclaw skills install kol-content-screening

KOL Content Screening (Web-Search Based)

Screen Chinese social media KOL lists for keyword-matching content within a time window. Output a ranked, evidence-backed report. Used heavily in PR / marketing / competitive intel work where you receive a known account list, keywords, and a time window ("已知账号清单 + 关键词 + 时间窗") and need to know who posted and who did not ("谁发过、谁没发").

Hard Truths Up Front

Tell the user these before promising anything:

  1. No reliable per-video interaction counts via web search. Per-video like / comment / favorite counts (点赞 / 评论 / 收藏) on 抖音 and 小红书 are not stably indexed by general search engines. Mark them as "无公开数据" (no public data) rather than guessing. For real numbers the user must use 蝉妈妈 / 灰豚 / 新红 (抖音), 千瓜 / 新红 (小红书), or platform open APIs.
  2. "未发现" (not found) ≠ "没发过" (never posted). Web search has indexing gaps. Always frame negatives as "公开检索未发现证据 (within N months)", i.e. no evidence found in public search within the stated window. Never claim a creator definitely hasn't posted X.
  3. Same nickname ≠ same person. Verify by handle/UID/homepage URL, not by name. 抖音号 / 小红书 user_id / 头条 author UID are the only reliable identifiers.
  4. Time window matters. State the explicit window (e.g. 2025-05-05 ~ 2026-05-05) in the report header. Content older than the window is marked separately, not mixed into the "active" set.

If the user asks for a ranking by accurate per-video interaction counts (互动量), stop and warn that this needs paid data services. Get explicit acknowledgement before proceeding with web-only screening.

Core Workflow

1. Intake (clarify before running)

Always confirm 5 parameters before spawning sub-agents:

| Parameter | Example | Notes |
|---|---|---|
| Platforms | 抖音 + 小红书 + 头条 | Each platform = independent sub-task |
| Account list | (CSV/table from user) | Need: handle/UID + nickname + fan count + homepage URL |
| Keywords | 比亚迪 / BYD / 王传福 / DM-i / 仰望 | Include EN + CN + product lines + key person names |
| Time window | Last 12 months (YYYY-MM-DD ~ YYYY-MM-DD) | Compute exact dates; don't pass "近一年" ("the past year") verbatim |
| Sort dimension | Match tier, then fan count descending / interaction count / keyword hit count | Without an interaction-data source, default to fan count descending within each match tier |
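The "compute exact dates" note for the time window can be sketched as follows. This is a minimal illustration, not part of the skill itself: `explicit_window` and `format_window` are hypothetical helper names, and the day-clamping rule for short months is an assumption.

```python
from datetime import date


def explicit_window(end: date, months: int = 12) -> tuple[date, date]:
    """Return (start, end) covering the `months` months before `end`."""
    # Step back whole months, carrying into the year as needed.
    total = end.year * 12 + (end.month - 1) - months
    year, month = divmod(total, 12)
    month += 1
    # Assumption: clamp the day when the target month is shorter
    # (e.g. 31 March minus one month becomes 28/29 February).
    day = end.day
    while True:
        try:
            return date(year, month, day), end
        except ValueError:
            day -= 1


def format_window(start: date, end: date) -> str:
    """Render the window in the YYYY-MM-DD ~ YYYY-MM-DD form used in reports."""
    return f"{start.isoformat()} ~ {end.isoformat()}"


start, end = explicit_window(date(2026, 5, 5), months=12)
print(format_window(start, end))  # 2025-05-05 ~ 2026-05-05
```

Passing the formatted string to sub-agents, rather than a relative phrase, keeps every group working against the same dates.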

Common intake mistake: the user pastes a Windows-clipboard HTML fragment (Version:1.0 StartHTML:...). That header is the raw clipboard envelope; the actual table data is below it. Parse the account list directly from the rest of the paste.
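One way to drop that envelope before parsing is sketched below. It assumes the header consists of leading `Key:value` tokens using the field names of the Windows HTML clipboard format; `strip_clipboard_envelope` is a hypothetical helper, and the sketch ignores the byte offsets the header carries.

```python
import re

# Leading header tokens of the Windows HTML clipboard format. They may be
# separated by newlines or, after a chat paste, by plain spaces.
_ENVELOPE = re.compile(
    r"\A\s*(?:(?:Version|StartHTML|EndHTML|StartFragment|EndFragment|"
    r"StartSelection|EndSelection|SourceURL):\S*\s+)+"
)


def strip_clipboard_envelope(paste: str) -> str:
    """Remove the clipboard header, leaving the HTML/table payload untouched."""
    return _ENVELOPE.sub("", paste, count=1)


raw = (
    "Version:1.0\r\nStartHTML:0000000105\r\nEndHTML:0000000300\r\n"
    "StartFragment:0000000141\r\nEndFragment:0000000264\r\n"
    "<table><tr><td>nickname</td><td>handle</td></tr></table>"
)
print(strip_clipboard_envelope(raw))
```

A paste without the envelope passes through unchanged, so the helper is safe to apply to every intake.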

2. Group & Parallelize

For >15 accounts on one platform, split into groups of 8–10 and spawn parallel sub-agents. Empirically: 36 抖音 accounts → 4 groups of ~9, 24 小红书 accounts → 3 groups of 8.

Per platform → split into N groups → 1 sub-agent per group → run in parallel → each writes its own file → main session merges and ranks
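A minimal sketch of the grouping step, assuming a balanced split with a maximum group size of 10 and no split at or below 15 accounts. `split_into_groups` is a hypothetical helper; it reproduces the two empirical splits above but cannot always respect the lower bound of 8 (21 accounts, for example, yield three groups of 7).

```python
import math


def split_into_groups(accounts: list, max_size: int = 10, threshold: int = 15) -> list[list]:
    """Split one platform's accounts into near-equal groups for sub-agents."""
    n = len(accounts)
    if n == 0:
        return []
    if n <= threshold:
        # Small lists are handled by a single sub-agent.
        return [list(accounts)]
    n_groups = math.ceil(n / max_size)
    base, extra = divmod(n, n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        # The first `extra` groups take one additional account.
        size = base + (1 if i < extra else 0)
        groups.append(list(accounts[start:start + size]))
        start += size
    return groups


print([len(g) for g in split_into_groups(list(range(36)))])  # [9, 9, 9, 9]
print([len(g) for g in split_into_groups(list(range(24)))])  # [8, 8, 8]
```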

Each sub-agent writes ONE file. File naming convention:

{platform-prefix}-{keyword-slug}-research-group{N}.md

where platform-prefix is douyin / xhs / tt / sph (视频号) / bilibili.
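The naming convention can be applied mechanically, as in this short sketch. The mapping from platform names to prefixes is partly an assumption: only sph = 视频号 is stated above, while the other pairings (including tt for 头条) are inferred. `group_filename` is a hypothetical helper.

```python
# Assumed platform-name-to-prefix mapping; only sph = 视频号 is given explicitly.
PLATFORM_PREFIX = {
    "抖音": "douyin",
    "小红书": "xhs",
    "头条": "tt",
    "视频号": "sph",
    "bilibili": "bilibili",
}


def group_filename(platform: str, keyword_slug: str, group: int) -> str:
    """Build the per-group research file name for one sub-agent."""
    prefix = PLATFORM_PREFIX[platform]
    return f"{prefix}-{keyword_slug}-research-group{group}.md"


print(group_filename("抖音", "byd", 2))  # douyin-byd-research-group2.md
```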

Sub-agent prompt template: see references/subagent-prompt-template.md.

3. Per-account search procedure

Each sub-agent, for each account, runs at least two queries on the chosen web search tool (xiaosu-search or equivalent):

Q1: "{nickname}" {handle} {keyword}
Q2: "{nickname}" {keyword} site:{platform-domain}
Q3 (if Q1+Q2 weak): {nickname} {keyword} {YYYY}   # last 12 months explicit

Where {platform-domain} is douyin.com / xiaohongshu.com / toutiao.com / etc.
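The three query templates can be filled in programmatically, as in this sketch. `build_queries` is a hypothetical helper and the handle in the example is a made-up placeholder; whether Q1 and Q2 were "weak" is left to the caller, who opts into Q3 by passing a year.

```python
from typing import Optional


def build_queries(
    nickname: str,
    handle: str,
    keyword: str,
    platform_domain: str,
    year: Optional[int] = None,
) -> list[str]:
    """Return Q1 and Q2, plus Q3 when a fallback year is supplied."""
    queries = [
        f'"{nickname}" {handle} {keyword}',                  # Q1
        f'"{nickname}" {keyword} site:{platform_domain}',   # Q2
    ]
    if year is not None:
        # Q3: looser fallback without quotes, pinned to an explicit year.
        queries.append(f"{nickname} {keyword} {year}")
    return queries


for q in build_queries("楠姐聊财经", "example_handle", "比亚迪", "douyin.com", year=2025):
    print(q)
```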

For each hit, the sub-agent records:

  • Title (or first line of post)
  • Date (verify it falls within the window; note outside-window hits separately)
  • URL (must point to the creator's own post, not third-party reposts/quotes)
  • Stance (正向 positive / 中性 neutral / 负向 negative / 仅提及 mention only), which affects PR usability
  • Confidence (高 high / 中 medium / 低 low), based on evidence strength
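The per-hit record above maps naturally onto a small data structure. The sketch below is illustrative only: the `Hit` class, its field names, and `partition_hits` are assumptions, not the schema in references/output-schema.md.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Hit:
    title: str         # title, or first line of the post
    posted_on: date    # publication date of the post
    url: str           # must point to the creator's own post
    stance: str        # 正向 / 中性 / 负向 / 仅提及
    confidence: str    # 高 / 中 / 低

    def in_window(self, start: date, end: date) -> bool:
        """True when the post date falls inside the inclusive window."""
        return start <= self.posted_on <= end


def partition_hits(hits: list[Hit], start: date, end: date) -> tuple[list[Hit], list[Hit]]:
    """Separate in-window hits from outside-window hits, which are reported apart."""
    inside = [h for h in hits if h.in_window(start, end)]
    outside = [h for h in hits if not h.in_window(start, end)]
    return inside, outside
```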

For each account, the sub-agent must explicitly check for ID collision: search for the nickname alone and check whether the top hits belong to this person's handle. If a collision is detected (e.g. "南希Nancy", which matches multiple people), flag it.
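The collision check reduces to comparing the expected handle with the handles seen in the nickname-only results. A minimal sketch, assuming the sub-agent has already extracted handles from the top hits; `check_id_collision` and its return shape are hypothetical.

```python
def check_id_collision(expected_handle: str, top_hit_handles: list[str]) -> dict:
    """Compare handles from a nickname-only search against the expected one."""
    seen = {h.strip() for h in top_hit_handles if h and h.strip()}
    expected = expected_handle.strip()
    others = sorted(seen - {expected})
    return {
        "expected_found": expected in seen,  # the listed account appears in top hits
        "other_handles": others,             # different accounts sharing the nickname
        "collision": bool(others),           # flag in the report when True
    }


result = check_id_collision("nancy_01", ["nancy_01", "nancy_cooks", "nancy_01"])
print(result["collision"], result["other_handles"])  # True ['nancy_cooks']
```

The handles in the example are made up; real checks should use the 抖音号, 小红书 user_id, or 头条 author UID from the account list.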

See references/platform-search-tips.md for platform-specific quirks (site filters, profile URL formats, common false positives).

4. Aggregate & Rank

Main session reads all group files and merges into one ranked table. Default ranking:

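Tier 1 🟢: clear evidence within the time window (with URL, date, content summary)
Tier 2 🟡: only old content (outside the window) / indirect mentions / weak evidence
Tier 3 🔴: nothing found in public search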

Within each tier, sort by fan count descending by default. If the user asked for interaction-based ranking but the data is unavailable, state this explicitly in the report, fall back to fan count, and add a caveat.
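The default ranking is a two-key sort: tier first, then fan count descending. The sketch below assumes each account row is a dict with `tier` and `fans` keys; `assign_tier` is a deliberately simplified, hypothetical rule that folds indirect mentions and weak evidence into a single count.

```python
def assign_tier(in_window_hits: int, weak_or_old_hits: int) -> int:
    """Simplified tiering: 1 = in-window evidence, 2 = weak/old only, 3 = nothing found."""
    if in_window_hits > 0:
        return 1
    if weak_or_old_hits > 0:
        return 2
    return 3


def rank_accounts(rows: list[dict]) -> list[dict]:
    """Sort by tier ascending, then by fan count descending within each tier."""
    return sorted(rows, key=lambda r: (r["tier"], -r["fans"]))


rows = [
    {"nickname": "A", "fans": 120_000, "tier": assign_tier(0, 1)},
    {"nickname": "B", "fans": 80_000, "tier": assign_tier(2, 0)},
    {"nickname": "C", "fans": 500_000, "tier": assign_tier(0, 0)},
    {"nickname": "D", "fans": 300_000, "tier": assign_tier(1, 0)},
]
print([r["nickname"] for r in rank_accounts(rows)])  # ['D', 'B', 'A', 'C']
```

Note that account C, despite the largest fan count, ranks last because no evidence was found for it.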

Final report structure: see references/output-schema.md.

5. Deliver

Output to <workdir>/<keyword-slug>-kol-screening-{YYYYMMDD}.md (markdown table) plus per-platform group files. If user wants 飞书 Sheet, build the markdown first, then offer to push via lark-cli sheets (separate skill).

Failure Modes Seen In The Wild

Document these in the report so the user can interpret correctly:

  • Handle drift: the user pastes "楠姐财经科技头条" but the real similar accounts are "楠姐聊财经" / "楠姐科技说" / "楠姐谈股论今". Report all candidates, flag the uncertainty, and ask the user to confirm.
  • Cross-platform leak: a 视频号 creator's content shows up only via 新浪/百度 reposts. That is still valid evidence that the post exists, but mark the source as "via 新浪 (转载)", i.e. reposted via 新浪.
  • Brand homonyms: "比亚迪" matches food brand 拿铁/拿铁酱 etc.; "宁德时代" rarely collides, but "宁德" alone matches geography. Use the full brand name plus a disambiguating keyword (王传福, 刀片电池, or a vehicle model name for BYD; 麒麟电池, 神行, 凝聚态 for CATL).
  • Stale fan counts: user-supplied fan numbers are snapshots. Don't recompute them; record them as given, with the date if the user provided one.
  • Profile-not-found: sometimes the homepage URL in the user's list returns a 404. Report it as "主页失效" (homepage unavailable); do NOT skip the account silently.

Honest Reporting Discipline

Every report MUST include a methodology disclaimer block at the top:

  • Data source (web search, which provider)
  • What CAN'T be obtained (per-video interaction counts, follower-only content, etc.)
  • Time window (explicit dates)
  • Confidence framing ("未发现 ≠ 没发过": not found does not mean never posted)

Template in references/output-schema.md.
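As an illustration of how the four required items could be assembled, here is a small sketch. The layout and the name `methodology_block` are assumptions; the authoritative template remains the one in references/output-schema.md.

```python
def methodology_block(provider: str, start: str, end: str, unavailable: list[str]) -> str:
    """Render the four required disclaimer items as plain-text lines."""
    lines = [
        "Methodology",
        f"- Data source: web search ({provider})",
        f"- Not obtainable: {'; '.join(unavailable)}",
        f"- Time window: {start} ~ {end}",
        "- Confidence framing: 未发现 ≠ 没发过 (not found does not mean never posted)",
    ]
    return "\n".join(lines)


print(
    methodology_block(
        provider="xiaosu-search",
        start="2025-05-05",
        end="2026-05-05",
        unavailable=["per-video interaction counts", "follower-only content"],
    )
)
```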

Quick Reference

  • Sub-agent prompt template → references/subagent-prompt-template.md
  • Platform-specific search tips → references/platform-search-tips.md
  • Output schema + methodology block → references/output-schema.md
  • Decision table: when to refuse / when to upgrade to paid data → references/escalation.md