Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

X Profile Deep Dive

Deep-analyze X/Twitter user profiles — fetch tweets, followings, and followers via tweety-ns, then generate a comprehensive Chinese-language dossier with the...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 20 · 0 current installs · 0 all-time installs
MIT-0
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Suspicious
medium confidence
Purpose & Capability
The name/description claim deep analysis of X/Twitter profiles and the code + SKILL.md actually implement that (uses tweety-ns, fetches tweets/followings/followers, collects Articles and external blogs). The requested capabilities are coherent with the stated purpose. However, the registry metadata does not declare the config path / cookie requirements that the skill explicitly needs (see environment_proportionality).
Instruction Scope
SKILL.md explicitly instructs the agent to (a) read a cookies JSON from <WORKSPACE>/config/twitter_cookies.json, (b) use tweety-ns to fetch data, and (c) perform browser navigation & incremental scrolling to harvest X Articles and external blog pages. These actions are within the scope of profile scraping but do require the agent to access local cookie files and control a browser (DOM snapshots, scrolling). The instructions are prescriptive (not vague) but grant the skill access to sensitive session data and browsing actions.
Install Mechanism
No install spec in registry (instruction-only), which is lower risk. The README and SKILL.md ask to pip install tweety-ns — a normal dependency for this task. There are included Python scripts (x_profile_analyzer.py, tweety_utils.py) that will be executed; no remote downloads or obscure install URLs are used.
!
Credentials
The skill requires a Twitter cookies JSON and a tweety session directory (sensitive login/session material) and the code reads TWITTER_COOKIES_PATH and TWITTER_SESSION_DIR env vars (with defaults). However, the registry metadata lists no required env vars or config paths. That omission is a mismatch: the skill will access local cookie/session files (high-sensitivity secrets) but the metadata does not declare this, so users might not expect to supply session cookies. Requesting a user's live X/Twitter cookies is proportionate for scraping authenticated content but it's sensitive — the user should understand risks before providing them.
Persistence & Privilege
The skill is not marked always:true and does not request system-wide changes in the files provided. It writes output to /tmp and into a collections directory under the workspace per SKILL.md; that is normal for a data-collection/reporting skill. The skill does create/expect a tweety session directory, but it does not modify other skills or global agent configuration.
What to consider before installing
This skill does what it says (deeply scrape and analyze X/Twitter profiles), but it requires your account session cookies and a tweety session directory — sensitive data that could allow the skill to act as you on X. Before installing: (1) review the included Python scripts yourself (they are present), (2) do not supply your primary/personal X cookies; prefer a throwaway account/session or API-based credentials if possible, (3) verify the origin/trustworthiness of tweety-ns and consider installing it in an isolated environment, (4) note the registry metadata does NOT declare the cookie/session requirement — treat that omission as a red flag, and (5) run the skill in a sandboxed environment (isolated workspace/container) if you must provide cookies. If you are uncomfortable providing cookies, do not install or only use the manual browser-based collection workflow described in the SKILL.md.

Like a lobster shell, security has layers — review code before you run it.

Current versionv1.0.0
Download zip
latestvk97b6rsb0qtckq3nmneef21keh831mrr

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

X Profile Deep Dive

对 X/Twitter 博主进行深度画像分析:数据采集 → LLM 动态分类 → 摘要卡片 + 分类全集输出。

前置条件

  1. tweety-ns 已安装(pip3 install tweety-ns
  2. Twitter cookies 存在于 <WORKSPACE>/config/twitter_cookies.json
  3. tweety session 目录 存在于 <WORKSPACE>/config/tw_session/

首次运行前检查这三个条件。如果 cookies 缺失,提示用户通过 CDP 从 openclaw browser 提取。

完整流程

Phase 1: 数据采集

运行脚本采集原始数据:

python3 scripts/x_profile_analyzer.py \
  --handle {handle} \
  --tweet-pages 8 \
  --cookies <WORKSPACE>/config/twitter_cookies.json \
  --output /tmp/x-profile-raw-{handle}.json

参数说明:

  • --tweet-pages 8:默认 8 页(约 160 条推文),可根据需要调整
  • --following-pages 1:关注列表采样 70 人(通常够用)
  • --follower-pages 1:粉丝采样 70 人(可能因 elevated auth 失败,非关键)

脚本退出码:1=cookies missing, 2=login failed, 3=user not found

Phase 1.5: Articles 专项采集 + 外链探索

tweety-ns 的 tweet pages 可能采不全 X Articles(长文),且博主可能在外部平台有独占内容。此步骤必须执行。

1.5.1 X Articles 完整采集

用浏览器打开 x.com/{handle}/articles,完整滚动收集所有 Article:

  1. browser navigatex.com/{handle}/articles
  2. 逐段滚动(每次 scrollBy(0, 1500)),每段做 snapshot 记录当前可见的 article 标题+日期+URL
  3. 累计去重:X 使用虚拟滚动(virtual scrolling),会回收不在视口的 DOM 节点。绝不能用最后一次 snapshot 的 article 数量当总数。必须在脑中/笔记中累计所有已看到的唯一 article
  4. 持续滚动直到不再出现新 article(连续 2 次滚动无新增 = 到底)
  5. 记录总数,与 Phase 1 采到的 Articles 数量对比,补全遗漏

⚠️ 虚拟滚动陷阱(2026-03-16 教训):

  • X 的 timeline 页面使用虚拟滚动,视口外的 article 元素会被回收
  • document.querySelectorAll('article').length 只返回当前 DOM 中的数量,不是总数
  • 必须逐段滚动 + 手动累计去重,这是唯一可靠的计数方式

1.5.2 外链探索

检查博主 bio 中的外部链接(博客/Substack/Medium/Newsletter/GitHub):

  1. 从 Phase 1 的 profile 数据或 README 中提取 bio 链接
  2. 如果有博客/Newsletter → 用 browser 打开,找到长文列表页
  3. 对比博客长文列表 vs X Articles 列表,识别博客独占内容(博客有但 X 没有的)
  4. 对博客独占文章快速评估:深度原创 or 短新闻/聚合帖?主题相关度?
  5. 在 README.md 中增加「📎 外部平台」章节,记录博客 URL + 独占内容数量 + 推荐收藏清单

Phase 2: 数据分析

读取输出的 JSON,用 Python 提取关键统计信息(参考 data-analysis.md)。

Phase 3: LLM 动态分类

核心步骤——不使用预设分类,而是根据推文内容动态生成分类。

  1. 扫描所有推文全文,识别 3-6 个主题分类
  2. 每个分类需要:名称(中文)、一句话描述、包含哪些推文
  3. 分类原则:
    • 按内容主题而非形式分(不要"长文"/"短文"这种分法,除非某人确实有独特的长文系列)
    • X Articles(长文)优先处理:如果博主有 X Articles,先从 Phase 1.5 的完整 Articles 列表中提取,确保每篇 Article 全文都被收录到对应主题分类中。Articles 是博主最有深度的内容,不能遗漏
    • 如果某人有明显的长篇深度文章(>2000字),单独分一类「深度长文」
    • 每个分类至少 3 条推文,否则合并到相近分类
    • 一条推文只归入一个最匹配的分类
  4. 对每条推文的分类排序:按 likes 降序

Phase 4: 生成输出文件

输出结构为一个目录,包含摘要卡片 + 分类全集:

collections/x-profiles/@{handle}/
├── README.md          ← 摘要卡片 + 目录导航表
├── {category-1}.md    ← 第一个主题分类(推文全文)
├── {category-2}.md    ← 第二个主题分类
├── ...
└── network.md         ← 社交网络分析

README.md 格式

参考 readme-template.md。包含:

  • 📂 内容全集导航表(分类名 + 数量 + 一句话说明)
  • 一句话概括
  • 基本信息表
  • 内容风格
  • 高互动推文 Top 10(一行摘要 + 链接)
  • 核心话题 Top 5
  • 互动圈
  • 关注列表分析(兴趣图谱)
  • 与我们的关联度(⭐评分)
  • 趋势判断

分类文件格式

每个分类文件:

# {分类名}

> {分类描述}

共 N 条推文

---

## [X,XXX❤️ X,XXX🔁 X,XXX,XXX👁] YYYY-MM-DD

[原文链接](url)

{推文全文,原样保留,不做任何压缩或摘要}

**附带链接**: {如果有}

---

关键原则:推文全文原样保留,不做压缩。

network.md 格式

参考 network-template.md。包含:

  • 关注列表分类表格(Handle / 名称 / 粉丝 / 简介)
  • 网络特征分析

Phase 5: 汇报结果

完成后向用户汇报:

  • 输出目录位置和文件结构
  • 几个关键发现/亮点
  • 数据完整性校验
    • X Articles 总数(Phase 1.5 实际计数) vs 分类文件中收录的 Articles 数量 → 必须一致
    • 如果有博客等外部平台,报告独占内容数量和推荐收藏清单
    • 采样覆盖范围说明(如"最近 N 条推文,约 X 天;Articles 全量 M 篇")

注意事项

  • 调用频率:tweety-ns 保守使用,单次分析 < 5 个 API 调用
  • Cookie 过期:如果登录失败(exit code 2),提示用户刷新 cookies
  • t.co 链接:国内网络无法解析 t.co 短链接,推文全文已在数据中,不需要追踪外链
  • Python 版本:系统 Python 3.14 可能有 SSL 兼容问题,确保 tweety-ns 对应的 httpx 版本兼容
  • followers API:部分用户的 followers 数据会返回 "elevated authorization" 错误,这是非关键数据,跳过即可

Files

9 total
Select a file
Select a file to preview.

Comments

Loading comments…