Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Data Intelligence

v1.0.0

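Comprehensive data intelligence platform: integrates Apify cloud scrapers, PinchTab browser automation, content analysis, and data workflows. Supports web scraping across 55+ platforms, lead generation, e-commerce intelligence, competitor analysis, and trend research, plus browser automation testing and data extraction.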


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for huamu668/data-intelligence.

Install the skill "Data Intelligence" (huamu668/data-intelligence) from ClawHub.
Skill page: https://clawhub.ai/huamu668/data-intelligence
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Canonical install target

openclaw skills install huamu668/data-intelligence

ClawHub CLI

npx clawhub@latest install data-intelligence
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The name/description (Apify + PinchTab + content analysis) align with the included scripts and templates: the shell scripts call Apify actors and the docs reference PinchTab. Functionality requested by the SKILL.md matches the declared purpose.
Instruction Scope (flagged)
Runtime instructions and included scripts instruct the agent/user to read a .env file and export APIFY_TOKEN, run mcpc against mcp.apify.com, invoke pinchtab commands, run npm global installs and curl|bash installers, and write output to local files. These actions are within the skill's stated purpose but involve network calls, local file creation, and executing downloaded installers — all of which expand the runtime scope and require explicit user review.
Install Mechanism (flagged)
There is no formal install spec in the registry (instruction-only), but the README and SKILL.md recommend `npm install -g @apify/mcpc` and executing `curl -fsSL https://pinchtab.com/install.sh | bash`. Download-and-execute from an external domain and global npm installs are higher-risk operations and should be verified before running.
Credentials (flagged)
Registry metadata lists no required environment variables, yet the SKILL.md and both scripts explicitly require an APIFY_TOKEN (read from .env or env). This mismatch is concerning: a credential is necessary for operation but not declared. No unrelated secrets are requested, but the omission reduces transparency about credential use.
Persistence & Privilege
The skill does not request always:true, does not claim to modify other skills, and only writes data to local files/directories it creates. Autonomous invocation is allowed (platform default) but not combined with other excessive privileges.
What to consider before installing
This skill appears to do what it says (Apify scraping + PinchTab automation + analysis templates), but there are practical risks you should address before installing:
- Credentials: The scripts and SKILL.md require an APIFY_TOKEN (read from .env or the environment), but the registry metadata does not declare this. Treat the token as sensitive: prefer a least-privilege or ephemeral token and do not reuse a high-privilege key.
- Installer commands: The README suggests running `curl https://pinchtab.com/install.sh | bash` and global npm installs. Download-and-execute flows should be inspected manually: review the install script contents on pinchtab.com, verify the domain is legitimate, and consider installing via a package manager or by inspecting the code first.
- Operational hygiene: Review the shell scripts (they are simple) and any referenced but missing artifacts (e.g., analyze-competitor.js is referenced but not included). Run the tools in an isolated environment (container or VM) if you plan to test. Ensure your scraping use complies with target platforms' terms of service and applicable laws.

If you want to proceed safely, ask the publisher to update the registry metadata to declare APIFY_TOKEN as a required credential, provide provenance for PinchTab (release URL or checksum), and include any missing helper scripts referenced by the README.

Like a lobster shell, security has layers — review code before you run it.

Tags: analysis, apify, data, latest, pinchtab, scraping
317 downloads
1 star
1 version
Updated 16h ago
v1.0.0
MIT-0

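Data Intelligence Platform

A comprehensive data intelligence solution that integrates cloud scraping, browser automation, and content analysis into a complete data collection and analysis workflow.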

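System Architecture

┌──────────────────────────────────────────────────────────────────────────────────┐
│                            Data Intelligence Platform                            │
├──────────────────────┬────────────────────────────────┬──────────────────────────┤
│ Cloud scraping layer │ Browser automation layer       │ Content analysis layer   │
├──────────────────────┼────────────────────────────────┼──────────────────────────┤
│ • Apify Actors       │ • PinchTab                     │ • Content factory        │
│ • 55+ platforms      │ • Multi-instance orchestration │ • Trend analysis         │
│ • Serverless         │ • Token-efficient extraction   │ • Competitor monitoring  │
│ • Elastic scaling    │ • Automated testing            │ • Data visualization     │
└──────────────────────┴────────────────────────────────┴──────────────────────────┘
           │                            │                             │
           └────────────────────────────┼─────────────────────────────┘
                                        ↓
                          ┌───────────────────────────┐
                          │   Data workflow engine    │
                          │ • Data collection         │
                          │ • Clean and transform     │
                          │ • Analysis and insights   │
                          │ • Report generation       │
                          └───────────────────────────┘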

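1. Cloud Scraping Layer (Apify)

1.1 Supported platforms (55+)

Social media (45 Actors)

| Platform | Actors | Main uses |
|----------|--------|-----------|
| Instagram | 12 | Profiles, posts, comments, hashtags, Reels |
| Facebook | 14 | Pages, posts, comments, ads, groups, events |
| TikTok | 14 | Videos, comments, users, hashtags, trends, live streams |
| YouTube | 5 | Videos, channels, comments, Shorts |

Business and local (10 Actors)

| Platform | Actors | Main uses |
|----------|--------|-----------|
| Google Maps | 4 | Business info, reviews, email extraction |
| Booking.com | 2 | Hotel data, reviews |
| TripAdvisor | 1 | Review analysis |
| Google Search | 1 | Search results |
| Google Trends | 1 | Trend data |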

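1.2 Core Actor cheat sheet

Lead generation

| Need | Actor ID | Output |
|------|----------|--------|
| Local businesses | compass/crawler-google-places | Name, address, phone, rating |
| Email extraction | poidata/google-maps-email-extractor | Email list |
| Contact info | vdrmota/contact-info-scraper | Email, phone, social media |
| Instagram users | apify/instagram-profile-scraper | Profile, follower count |
| TikTok creators | clockworks/tiktok-profile-scraper | Creator info |

Content analysis

| Need | Actor ID | Output |
|------|----------|--------|
| Instagram posts | apify/instagram-post-scraper | Content, likes, comment count |
| TikTok videos | clockworks/tiktok-scraper | Videos, play count, share count |
| YouTube videos | streamers/youtube-scraper | Title, views, likes |
| Facebook pages | apify/facebook-pages-scraper | Page info, posts |

Competitor monitoring

| Need | Actor ID | Output |
|------|----------|--------|
| Google Maps reviews | compass/Google-Maps-Reviews-Scraper | Reviews, ratings, sentiment |
| Booking reviews | voyager/booking-reviews-scraper | Guest reviews |
| TripAdvisor | maxcopell/tripadvisor-reviews | Travel reviews |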

1.3 Apify workflow

Prerequisites:

# 1. Install dependencies
npm install -g @apify/mcpc

# 2. Configure the token
echo "APIFY_TOKEN=your_token_here" > .env

# 3. Verify
export $(grep APIFY_TOKEN .env | xargs) && mcpc --version
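
The `export $(grep APIFY_TOKEN .env | xargs)` idiom also matches commented-out lines that mention APIFY_TOKEN and splits values containing spaces. A minimal sketch of a stricter loader follows, assuming the .env file holds plain KEY=value lines; the function name `load_apify_token` is hypothetical and not part of the skill.

```shell
# Hypothetical helper (not part of the skill): read APIFY_TOKEN from a .env
# file without evaluating the file. Assumes one KEY=value pair per line.
load_apify_token() {
  env_file="${1:-.env}"
  if [ ! -f "$env_file" ]; then
    echo "missing $env_file" >&2
    return 1
  fi
  # Use the last line that starts with APIFY_TOKEN= and drop the key.
  token=$(grep -E '^APIFY_TOKEN=' "$env_file" | tail -n 1 | cut -d= -f2-)
  # Strip optional surrounding double quotes.
  token=${token#\"}
  token=${token%\"}
  if [ -z "$token" ]; then
    echo "APIFY_TOKEN is not set in $env_file" >&2
    return 1
  fi
  APIFY_TOKEN=$token
  export APIFY_TOKEN
}
```

Calling `load_apify_token && mcpc --version` would then stand in for the one-liner, and the function fails loudly when the token is absent instead of exporting nothing.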

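Standard workflow:

## Data collection checklist

- [ ] Step 1: Define the goal - what data is needed, and from which platform?
- [ ] Step 2: Choose an Actor - pick one from the platform cheat sheet
- [ ] Step 3: Fetch the schema - understand the input parameters
- [ ] Step 4: Configure parameters - set search keywords, result limits, and so on
- [ ] Step 5: Run the collection - execute the Actor
- [ ] Step 6: Clean the data - handle missing values and convert formats
- [ ] Step 7: Analyze - generate the report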

Commands:

# Quick preview (prints results only, does not save a file)
export $(grep APIFY_TOKEN .env | xargs) && mcpc --json mcp.apify.com \
  --header "Authorization: Bearer $APIFY_TOKEN" \
  tools-call run-actor \
  actor:="compass/crawler-google-places" \
  input:='{"searchStrings": ["coffee shop"], "location": "New York"}'

# Export to CSV
export $(grep APIFY_TOKEN .env | xargs) && mcpc --json mcp.apify.com \
  --header "Authorization: Bearer $APIFY_TOKEN" \
  tools-call run-actor \
  actor:="compass/crawler-google-places" \
  input:='{"searchStrings": ["coffee shop"], "maxCrawledPlaces": 50}' \
  | jq -r '.content[0].text' > results.csv

# Export to JSON
export $(grep APIFY_TOKEN .env | xargs) && mcpc --json mcp.apify.com \
  --header "Authorization: Bearer $APIFY_TOKEN" \
  tools-call run-actor \
  actor:="apify/instagram-profile-scraper" \
  input:='{"usernames": ["example_user"]}' \
  | jq '.content[0].text | fromjson' > results.json
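
The three commands above repeat the same `mcpc` invocation and differ only in the Actor ID and input. One way to cut the repetition is a small wrapper; this is a sketch that reuses exactly the flags shown above, and the function name `run_actor` is hypothetical.

```shell
# Hypothetical wrapper around the mcpc invocation used in the commands above.
# Usage: run_actor ACTOR_ID 'JSON_INPUT'
run_actor() {
  if [ -z "${APIFY_TOKEN:-}" ]; then
    echo "APIFY_TOKEN is not set" >&2
    return 1
  fi
  if [ "$#" -ne 2 ]; then
    echo "usage: run_actor ACTOR_ID JSON_INPUT" >&2
    return 2
  fi
  # Same arguments as the documented commands; only actor and input vary.
  mcpc --json mcp.apify.com \
    --header "Authorization: Bearer $APIFY_TOKEN" \
    tools-call run-actor \
    actor:="$1" \
    input:="$2"
}
```

With this in place, the JSON export becomes `run_actor "apify/instagram-profile-scraper" '{"usernames": ["example_user"]}' | jq '.content[0].text | fromjson' > results.json`. The wrapper refuses to run when the token is missing rather than sending an empty Authorization header.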

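2. Browser Automation Layer (PinchTab)

2.1 How it complements Apify

| Scenario | Apify | PinchTab |
|----------|-------|----------|
| Large-scale data collection | ✅ Cloud Actors, high concurrency | ❌ Runs locally, limited resources |
| Login/authentication required | ⚠️ Requires cookies | ✅ Preserves login state |
| Real-time interaction testing | ❌ Not suitable | ✅ Click, type, verify |
| Visual regression testing | ❌ Not supported | ✅ Screenshot comparison |
| Token-sensitive scenarios | ❌ High cost | ✅ Text extraction saves tokens |
| Dynamic content rendering | ✅ Cloud rendering | ✅ Local rendering |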

2.2 Hybrid workflow example

Scenario: monitor a competitor's website and analyze its social media

# Step 1: Use PinchTab to visit the competitor's website and extract key information
pinchtab nav https://competitor.com
sleep 3
pinchtab text > competitor-content.txt

# Step 2: Extract social media links from the site content
grep -oE '(instagram|facebook|tiktok)\.com/[^" ]+' competitor-content.txt > social-links.txt

# Step 3: Use Apify to analyze the competitor's social media
export $(grep APIFY_TOKEN .env | xargs) && mcpc --json mcp.apify.com \
  --header "Authorization: Bearer $APIFY_TOKEN" \
  tools-call run-actor \
  actor:="apify/instagram-profile-scraper" \
  input:='{"usernames": ["competitor_ig"]}' \
  > competitor-ig-data.json

# Step 4: Analyze the data
node analyze-competitor.js competitor-ig-data.json
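
Step 2 writes raw links to social-links.txt, while Step 3 hard-codes the username `competitor_ig`. A sketch of one way to bridge the two is shown below. It assumes each matching line has the form `instagram.com/<handle>` produced by the grep in Step 2, and it skips a few common non-profile paths; the function name `instagram_handles` and the skip list are illustrative assumptions.

```shell
# Hypothetical helper: print the unique Instagram handles found in a links
# file, one per line. Assumes lines look like instagram.com/<handle>[/...].
instagram_handles() {
  grep -E '^instagram\.com/' "$1" \
    | cut -d/ -f2 \
    | sed -e 's/[?#].*$//' \
    | grep -Ev '^(p|reel|reels|explore|stories|tv)?$' \
    | LC_ALL=C sort -u
}
```

Running `instagram_handles social-links.txt` would list candidate usernames to pass to the profile scraper. Review the list by hand before scraping, since a site can link to accounts it does not own.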

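2.3 Golden combinations for data collection

| Data type | Apify Actor | PinchTab complement |
|-----------|-------------|---------------------|
| Business info | Google Maps Actor | Verify details on the official website |
| Product info | E-commerce Actor | Real-time price monitoring |
| User reviews | Platform review Actor | Sentiment analysis visualization |
| Social media | Instagram/TikTok Actor | Content trend monitoring |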

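3. Content Analysis Layer

3.1 Content workflow after data collection

Apify collects data
    ↓
Data cleaning (Python/pandas)
    ↓
Content analysis (content factory skill)
    ↓
Generate report / publish content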

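3.2 Data analysis template

Competitor analysis report template:

---
title: Competitor Analysis Report - {{competitor_name}}
date: {{date}}
tags: [competitor, analysis]
---

# {{competitor_name}} Competitor Analysis

## Data sources
- Website: {{website_url}}
- Instagram: {{ig_followers}} followers
- Data collected on: {{date}}

## Key metrics

| Metric | Value | Trend |
|------|------|------|
| Website traffic | {{traffic}} | {{trend}} |
| Social followers | {{followers}} | {{trend}} |
| Posting frequency | {{frequency}} | {{trend}} |
| Average engagement rate | {{engagement}} | {{trend}} |

## Content strategy

### High-performing content types
1. {{top_content_1}}
2. {{top_content_2}}
3. {{top_content_3}}

### Posting schedule
- Best time: {{best_time}}
- Posting frequency: {{post_frequency}}

## Differentiation recommendations
- [ ] Recommendation 1
- [ ] Recommendation 2
- [ ] Recommendation 3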
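
The `{{placeholder}}` fields in the template above can be filled with a small substitution pass. The following is a minimal sketch, assuming values are passed as `key=value` arguments and contain no newlines; the function name `render_template` is hypothetical, and the skill itself does not say how the template is meant to be rendered.

```shell
# Hypothetical helper: replace {{key}} placeholders in a template file.
# Usage: render_template TEMPLATE key=value [key=value ...]
render_template() {
  tpl_file=$1
  shift
  tpl_content=$(cat "$tpl_file") || return 1
  for tpl_pair in "$@"; do
    tpl_key=${tpl_pair%%=*}
    tpl_value=${tpl_pair#*=}
    # Escape characters that are special in a sed replacement: \ & and the | delimiter.
    tpl_escaped=$(printf '%s' "$tpl_value" | sed -e 's/[\\&|]/\\&/g')
    tpl_content=$(printf '%s\n' "$tpl_content" | sed -e "s|{{${tpl_key}}}|${tpl_escaped}|g")
  done
  printf '%s\n' "$tpl_content"
}
```

For example, `render_template report-template.md competitor_name="Brand A" date=2024-01-01 > report.md` (file names assumed) fills two fields and leaves the remaining placeholders untouched, so missing values stay visible in the output.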

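4. Case Studies

Case 1: Local business lead mining

Goal: collect information and contact details for every coffee shop in a city

#!/bin/bash
# coffee-shop-leads.sh

CITY="Los Angeles"
OUTPUT_FILE="coffee-shops-$(date +%Y%m%d).csv"

# Step 1: Collect Google Maps data with Apify and save the unwrapped JSON
# (same .content[0].text unwrapping as the JSON export in section 1.3)
export $(grep APIFY_TOKEN .env | xargs) && mcpc --json mcp.apify.com \
  --header "Authorization: Bearer $APIFY_TOKEN" \
  tools-call run-actor \
  actor:="compass/crawler-google-places" \
  input:="{\"searchStrings\": [\"coffee shop\"], \"location\": \"$CITY\", \"maxCrawledPlaces\": 100}" \
  | jq '.content[0].text | fromjson' > raw-data.json

# Step 2: For entries that have a website, verify the details with PinchTab
# (optional: visit the official site to extract more contact information)

# Step 3: Clean the data
jq -r '.[] | [.title, .address, .phone, .website] | @csv' raw-data.json > "$OUTPUT_FILE"

echo "Found $(wc -l < "$OUTPUT_FILE") coffee shops; data saved to $OUTPUT_FILE"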
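
Case 1 interpolates `$CITY` into the JSON input by hand, which produces invalid JSON if the value ever contains a double quote or a backslash. A minimal sketch of escaping the value first is shown below. It covers only those two characters (not control characters or newlines), and both function names are hypothetical. Where `jq` is available, `jq -n --arg` is the more robust way to build the same input.

```shell
# Hypothetical helper: escape backslashes and double quotes for use inside a
# JSON string. Control characters and newlines are NOT handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build the Case 1 input for a given city and result limit.
# Usage: places_input CITY MAX_PLACES
places_input() {
  printf '{"searchStrings": ["coffee shop"], "location": "%s", "maxCrawledPlaces": %d}\n' \
    "$(json_escape "$1")" "$2"
}
```

The script could then pass `input:="$(places_input "$CITY" 100)"` instead of the hand-escaped string.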

Case 2: Competitor social media monitoring

Goal: monitor the Instagram performance of 3 competitors

#!/bin/bash
# competitor-monitoring.sh

COMPETITORS=("brand_a" "brand_b" "brand_c")
DATE=$(date +%Y%m%d)

# Load the token once and make sure the output directory exists
export $(grep APIFY_TOKEN .env | xargs)
mkdir -p data

for competitor in "${COMPETITORS[@]}"; do
  echo "Analyzing $competitor..."

  # Collect data with Apify and unwrap the JSON payload (as in section 1.3)
  mcpc --json mcp.apify.com \
    --header "Authorization: Bearer $APIFY_TOKEN" \
    tools-call run-actor \
    actor:="apify/instagram-profile-scraper" \
    input:="{\"usernames\": [\"$competitor\"]}" \
    | jq '.content[0].text | fromjson' > "data/${competitor}-${DATE}.json"

  # Extract key metrics
  followers=$(jq -r '.[0].followersCount' "data/${competitor}-${DATE}.json")
  posts=$(jq -r '.[0].postsCount' "data/${competitor}-${DATE}.json")

  echo "$competitor: $followers followers, $posts posts"
done

# Generate the comparison report
node generate-report.js "$DATE"
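
Because each run of the script writes a dated snapshot per competitor, two follower counts from different days can be compared to fill a trend column. The sketch below prints the absolute and percentage change. The function name `follower_growth` is hypothetical, and it assumes both arguments are plain integers, such as the `followersCount` values extracted above.

```shell
# Hypothetical helper: print the absolute and percentage change between two counts.
# Usage: follower_growth PREVIOUS CURRENT
follower_growth() {
  awk -v prev="$1" -v curr="$2" 'BEGIN {
    # A missing or zero baseline has no meaningful percentage change.
    if (prev + 0 <= 0) { print "n/a"; exit }
    printf "%+d (%+.1f%%)\n", curr - prev, (curr - prev) * 100 / prev
  }'
}
```

For instance, `follower_growth 1000 1100` prints `+100 (+10.0%)`, and a non-numeric or zero baseline prints `n/a` instead of dividing by zero.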

Case 3: Trend research + content creation

Goal: spot TikTok trends and quickly create related content

#!/bin/bash
# trend-to-content.sh

# Step 1: Collect TikTok trends with Apify and unwrap the JSON payload (as in section 1.3)
export $(grep APIFY_TOKEN .env | xargs) && mcpc --json mcp.apify.com \
  --header "Authorization: Bearer $APIFY_TOKEN" \
  tools-call run-actor \
  actor:="clockworks/tiktok-trends-scraper" \
  input:='{"resultsLimit": 20}' \
  | jq '.content[0].text | fromjson' > trends.json

# Step 2: Extract the top hashtag
TOP_HASHTAG=$(jq -r '.[0].hashtag.name' trends.json)
TOP_VIEWS=$(jq -r '.[0].stats.playCount' trends.json)

echo "Top trend: #$TOP_HASHTAG ($TOP_VIEWS plays)"

# Step 3: Use the content factory skill to write an article
# /content-create "How to ride the $TOP_HASHTAG trend to gain followers"

5. Installation and Configuration

5.1 Install dependencies

# 1. Apify MCP CLI
npm install -g @apify/mcpc

# 2. PinchTab browser automation
curl -fsSL https://pinchtab.com/install.sh | bash

# 3. Configure environment variables
cat > .env << EOF
APIFY_TOKEN=your_apify_token_here
PINCHTAB_PORT=9867
EOF
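
The security scan above recommends reviewing the PinchTab installer before running it. A sketch of a download, review, verify, then run flow follows. `EXPECTED_SHA256` is a placeholder for a checksum obtained from the publisher through a separate channel. Whether PinchTab publishes one is not confirmed by this page, and the function name `verify_sha256` is hypothetical.

```shell
# Hypothetical helper: succeed only if FILE has the expected SHA-256 digest.
# Usage: verify_sha256 FILE EXPECTED_HEX_DIGEST
verify_sha256() {
  vs_file=$1
  vs_expected=$2
  if command -v sha256sum >/dev/null 2>&1; then
    vs_actual=$(sha256sum "$vs_file" | cut -d' ' -f1)
  else
    # Fallback for systems that ship shasum instead of sha256sum.
    vs_actual=$(shasum -a 256 "$vs_file" | cut -d' ' -f1)
  fi
  [ "$vs_actual" = "$vs_expected" ]
}

# Intended use (not executed here): download, read, verify, then run.
#   curl -fsSL https://pinchtab.com/install.sh -o pinchtab-install.sh
#   less pinchtab-install.sh
#   verify_sha256 pinchtab-install.sh "$EXPECTED_SHA256" && bash pinchtab-install.sh
```

Saving the script to a file first means the code that was reviewed is the code that runs, which a direct `curl | bash` pipe cannot guarantee.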

5.2 Verify the installation

# Verify Apify
export $(grep APIFY_TOKEN .env | xargs) && mcpc --version

# Verify PinchTab
pinchtab --version

# Test the connection
pinchtab health
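
The scripts in this document also rely on `jq` and `node`, which the checks above do not cover. A short preflight sketch that reports every missing command at once is shown below; the function name `require_cmds` is hypothetical.

```shell
# Hypothetical preflight: report every required command that is missing from PATH.
# Usage: require_cmds CMD [CMD ...]
require_cmds() {
  rc_missing=0
  for rc_cmd in "$@"; do
    if ! command -v "$rc_cmd" >/dev/null 2>&1; then
      echo "missing command: $rc_cmd" >&2
      rc_missing=1
    fi
  done
  return "$rc_missing"
}
```

Placing `require_cmds mcpc pinchtab jq node || exit 1` at the top of a script makes it stop before any Actor run is started or billed.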

5.3 Claude Code integration

Add the following to .claude/settings.json:

{
  "env": {
    "APIFY_TOKEN": "${APIFY_TOKEN}",
    "CLAUDE_PLUGIN_ROOT": "${workspaceFolder}"
  },
  "skills": [
    "data-intelligence"
  ]
}

6. Command Reference

6.1 Common Apify commands

# Search for Actors
mcpc tools-call search-actors keywords:="instagram" limit:=10

# Get Actor details
mcpc tools-call fetch-actor-details actor:="apify/instagram-profile-scraper"

# Run an Actor
mcpc tools-call run-actor actor:="ACTOR_ID" input:='{}'

# Check run status
mcpc tools-call get-run runId:="RUN_ID"

6.2 Common PinchTab commands

# Start the server
pinchtab

# Create an instance
pinchtab instances create --mode=headless

# Navigate
pinchtab nav https://example.com

# Extract text
pinchtab text

# Perform actions
pinchtab click e5
pinchtab fill e3 "text"

6.3 Combined commands

# Data collection + analysis in one pipeline
export $(grep APIFY_TOKEN .env | xargs) && \
mcpc --json mcp.apify.com \
  --header "Authorization: Bearer $APIFY_TOKEN" \
  tools-call run-actor \
  actor:="compass/crawler-google-places" \
  input:='{"searchStrings": ["keyword"]}' | \
jq -r '.content[0].text' | \
python analyze.py

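7. Best Practices

7.1 Cost control

| Tool | Cost model | Best for |
|------|------------|----------|
| Apify | Pay per result | Large-scale data collection |
| PinchTab | Free (local) | Small batches, real-time testing |
| Combined | Hybrid use | Large scale + real-time verification |

7.2 Data quality

  • Validate a sample: before large-scale collection, check data quality on a small sample
  • Cross-validate: collect the same data with multiple Actors and compare the results
  • Timeliness: check when the data was last updated and avoid using stale data

7.3 Compliance

  • Follow each platform's terms of service
  • Respect robots.txt
  • Do not collect personal private data
  • Keep request rates reasonable to avoid putting load on target sites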

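8. Troubleshooting

| Problem | Cause | Solution |
|---------|-------|----------|
| APIFY_TOKEN not found | Environment variable not set | export APIFY_TOKEN=xxx |
| mcpc not found | CLI not installed | npm install -g @apify/mcpc |
| Actor not found | Wrong Actor ID | Check the spelling or search for available Actors |
| Rate limit | Requests too fast | Add delays or reduce concurrency |
| PinchTab timeout | Page loads slowly | Increase the sleep time |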
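
For the rate-limit and timeout rows, a fixed `sleep` can be replaced by retrying with a growing delay. The following is a minimal sketch of exponential backoff; the function name `retry` and the attempt and delay values are illustrative assumptions, not settings documented by Apify or PinchTab.

```shell
# Hypothetical helper: retry a command, doubling the delay after each failure.
# Usage: retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
retry() {
  rt_max=$1
  rt_delay=$2
  shift 2
  rt_attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$rt_attempt" -ge "$rt_max" ]; then
      echo "giving up after $rt_attempt attempts: $*" >&2
      return 1
    fi
    sleep "$rt_delay"
    rt_delay=$((rt_delay * 2))
    rt_attempt=$((rt_attempt + 1))
  done
}
```

For example, `retry 4 2 pinchtab nav https://example.com` tries up to four times, waiting 2, 4, and 8 seconds between attempts. This only helps if the command exits with a non-zero status on failure, which should be confirmed for each tool.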

9. References


Let data drive decisions, and let intelligence improve efficiency.
