Amazon Scraper

v1.0.0

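Scrapes Amazon product data by keyword search, extracts price, rating, and review count, and generates structured Excel and Markdown reports with brand and bestseller analysis.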

by 十三先生 (@zuokun300)
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The skill's name/description (Amazon scraping, Excel/Markdown reports) aligns with the code and SKILL.md which call Apify and process Amazon results. However the registry metadata declares no required environment variables while the code and docs clearly require an Apify API key (APIFY_API_KEY). That mismatch reduces trustworthiness of the metadata.
Instruction Scope
SKILL.md and USAGE.md give concrete instructions limited to cloning, installing dependencies, setting APIFY_API_KEY, and running the script (including cron). Instructions do not ask for unrelated system files or credentials. They do however suggest embedding an API token into the script as an alternative, which is insecure practice.
Install Mechanism
There is no formal install spec (instruction-only), which is low-risk. The docs reference a GitHub repo URL and recommend pip installing openpyxl and requests — expected for this functionality. Registry 'Source: unknown' / 'Homepage: none' is inconsistent with the clone instructions and should be verified.
Credentials
The code and documentation require an external credential (APIFY_API_KEY) to call Apify's API, which is appropriate for using Apify. However the registry metadata does not list any required env vars or primary credential, creating an incoherence (the skill asks for a secret but doesn't declare it). The docs also advise optionally hard-coding the token into the script — a security risk.
Persistence & Privilege
The skill is not marked always:true and does not request persistent or system-wide privileges. It writes output files to a local output directory and doesn't modify other skills or global config.
What to consider before installing

  • The skill legitimately needs an Apify API key to run; SKILL.md and the Python file expect APIFY_API_KEY, but the registry metadata does not declare this. Confirm you are comfortable providing that key. Do not paste your API key into source files; prefer environment variables.
  • Review the repository the docs point to (https://github.com/zuokun300/amazon-scraper) before cloning to ensure the code matches the packaged files. The registry shows Source: unknown and no homepage, so verify the origin.
  • The included script appears to be executable and calls Apify (https://api.apify.com). This is expected for scraping, but means the skill will send your requests and the scraped data to Apify-managed endpoints; ensure that aligns with your privacy/compliance requirements and Apify account usage/costs.
  • The Python code shows at least one bug/incompleteness (the generate_report function returns unique_products[:max_products] while max_products is undefined in that scope, and the main function appears truncated). Expect runtime errors; test in an isolated environment before running on production data. Consider fixing or validating the code or asking the author for a complete release.
  • If you plan scheduled runs (cron), run them from an isolated machine or container and monitor network/API calls and account billing.

If the publisher updates the registry metadata to explicitly declare APIFY_API_KEY, provides a verified source/homepage, and publishes a complete, tested script (no undefined variables), my confidence the package is internally coherent would increase.


MIT-0

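Amazon Data Scraper Skill

Skill Description

Scrapes Amazon product data and generates structured reports (Markdown + Excel). Supports product search, data extraction, brand analysis, and bestseller identification.

Use cases:

  • Product selection research for e-commerce
  • Competitor price monitoring
  • Bestseller analysis
  • Market trend research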

Installation

Method 1: Install from ClawHub (recommended)

# Visit the ClawHub page
https://clawhub.ai/zuokun300/amazon-scraper

# Or install with the CLI
clawhub install zuokun300/amazon-scraper

Method 2: Manual installation

# 1. Clone the skill into the workspace
git clone https://github.com/zuokun300/amazon-scraper.git ~/.openclaw/workspace/skills/amazon-scraper

# 2. Install dependencies
pip3 install openpyxl requests --break-system-packages

# 3. Configure the Apify API key
export APIFY_API_KEY="apify_api_xxxxx"
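
Since the key is supplied through the environment, a script can fail early with a readable message when it is missing. The following is a minimal sketch, not the skill's actual code; the helper name `load_apify_key` is hypothetical.

```python
import os


def load_apify_key(env=os.environ):
    """Return the Apify API key from APIFY_API_KEY, or raise a clear error.

    `env` defaults to the process environment; passing a plain dict makes
    the function easy to test. The function name is hypothetical.
    """
    key = env.get("APIFY_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "APIFY_API_KEY is not set; export it before running the scraper"
        )
    return key
```

Reading the key at startup keeps the secret out of the source file and turns a missing key into an immediate error instead of a failed API call later.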

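Usage

Basic usage

Tell OpenClaw:

Scrape "women fashion shoes" on Amazon and generate an Excel report

Advanced usage

Scrape "[keyword]" on Amazon. Requirements:
- Fetch the top 50 products
- Extract price, rating, and review count
- Generate Excel and Markdown reports
- Analyze the brand distribution

Custom parameters

# Edit these parameters in the script
KEYWORDS = ["women fashion shoes", "men sneakers"]
MAX_PRODUCTS = 50
OUTPUT_DIR = "/path/to/output"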

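Output Files

1. Excel report (amazon-data.xlsx)

Contains two sheets:

Sheet 1: Product data

| Column | Description |
| --- | --- |
| Rank | Position in the search results |
| Product name | Full title |
| Brand | Automatically detected brand |
| Type | Product category |
| ASIN | Amazon standard ID |
| Price | Current price (requires deep scraping) |
| Image link | Clickable link to the main image |
| Amazon link | Clickable link to the product page |

Sheet 2: Summary statistics

  • Brand distribution
  • Price range analysis
  • Bestseller identification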
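
A price range analysis like the one listed for the summary sheet could be computed by bucketing prices and counting each bucket. This is only a sketch: the bucket edges, the `price` field name, and both function names are assumptions, and products without a price (as in basic mode) are counted as "unknown".

```python
from collections import Counter

# Hypothetical bucket edges in USD; adjust to the category being analyzed.
PRICE_BUCKETS = [(0, 25), (25, 50), (50, 100)]


def price_bucket(price):
    """Map a numeric price to a bucket label such as "$25-50"."""
    if price is None or price < 0:
        return "unknown"
    for low, high in PRICE_BUCKETS:
        if low <= price < high:
            return f"${low}-{high}"
    # Anything at or above the last upper edge goes into an open-ended bucket.
    return f"${PRICE_BUCKETS[-1][1]}+"


def price_distribution(products):
    """Count products per price bucket; `price` is an assumed field name."""
    return Counter(price_bucket(p.get("price")) for p in products)
```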

2. Markdown report (amazon-report.md)

Contains:

  • Data overview
  • Bestseller list (table)
  • Product detail links
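
To illustrate how a bestseller table with product links might be rendered, here is a small sketch. The field names (`title`, `brand`, `asin`), the function name, and the use of the amazon.com `/dp/<ASIN>` URL form are assumptions; the skill's own generate_report() may work differently.

```python
def markdown_table(products):
    """Render product dicts as a Markdown table with linked titles.

    Assumes each dict has "title", "brand", and "asin" keys (hypothetical
    field names) and that products are listed on amazon.com.
    """
    lines = [
        "| Rank | Product | Brand | ASIN |",
        "| --- | --- | --- | --- |",
    ]
    for rank, p in enumerate(products, start=1):
        # Escape pipes so a title cannot break the table row.
        title = p["title"].replace("|", "\\|")
        link = f"[{title}](https://www.amazon.com/dp/{p['asin']})"
        lines.append(f"| {rank} | {link} | {p['brand']} | {p['asin']} |")
    return "\n".join(lines)
```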

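Configuration

Apify API Key

How to get one:

  1. Sign up at https://console.apify.com/signup
  2. After logging in, go to https://console.apify.com/account/integrations
  3. Copy the API key

How to configure it:

# Option 1: environment variable
export APIFY_API_KEY="apify_api_xxxxx"

# Option 2: edit the script (not recommended: the token ends up in source code)
APIFY_TOKEN = "apify_api_xxxxx"

Free tier: $5 (roughly 500-1000 product scrapes)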
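
For orientation, the sketch below shows roughly how a request to Apify's synchronous "run actor and return dataset items" endpoint can be assembled with the token in an Authorization header. It only builds the request and does not send it. The actor ID and the input fields (`keyword`, `maxItems`) are placeholders, since each actor defines its own input schema, and the sketch uses the standard library's urllib rather than requests to stay dependency-free.

```python
import json
import urllib.request

APIFY_BASE = "https://api.apify.com/v2"


def build_run_request(token, actor_id, keyword, max_items):
    """Build (but do not send) a POST request that runs an Apify actor.

    `actor_id` is a placeholder such as "username~actor-name"; the payload
    keys are hypothetical and must match the chosen actor's input schema.
    """
    url = f"{APIFY_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    payload = {"keyword": keyword, "maxItems": max_items}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; sending the token in a header rather than in the URL keeps it out of logs that record request URLs.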


Dependencies

openpyxl>=3.0.0
requests>=2.28.0

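Sample Output

Bestseller table

| Rank | Product | Brand | Type | ASIN |
| --- | --- | --- | --- | --- |
| 1 | adidas Women's VL Court 3.0 | Adidas | Sneaker | B0C2JY169J |
| 2 | Adokoo Women's Fashion Sneakers | Adokoo | Sneaker | B0CH9FJY8V |
| 3 | Adidas Women's Vl Court 3.0 | Adidas | Sneaker | B0F1XH7M8F |

Brand distribution

| Brand | Products | Share |
| --- | --- | --- |
| Adidas | 7 | 35% |
| ODOLY | 6 | 30% |
| LUCKY STEP | 5 | 25% |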
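
A brand distribution of this shape can be derived with a simple counter. The sketch below assumes each product is a dict with a `brand` key (a hypothetical field name) and reports the share as a whole-number percentage of all products.

```python
from collections import Counter


def brand_distribution(products):
    """Return (brand, count, share_percent) rows, most common brand first.

    Products with a missing or empty brand are grouped under "Unknown".
    """
    counts = Counter(p.get("brand") or "Unknown" for p in products)
    total = sum(counts.values())
    return [
        (brand, n, round(100 * n / total))
        for brand, n in counts.most_common()
    ]
```

Because the share is computed against the total number of products, the rows sum to 100% only when every brand is listed.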

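Advanced Features

1. Deep scraping (price / rating / sales)

# Enable deep scraping mode
python3 amazon_scraper.py --deep

Additional data:

  • Current price
  • User rating (1-5 stars)
  • Total review count
  • Best Sellers Rank (BSR)
  • High-resolution product images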
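
A flag like --deep is typically wired up with argparse. The sketch below shows one plausible way to do that; it is an illustration only and not taken from amazon_scraper.py, and the `parse_args` helper is hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; `argv=None` falls back to sys.argv."""
    parser = argparse.ArgumentParser(description="Amazon scraper (sketch)")
    parser.add_argument(
        "--deep",
        action="store_true",
        help="also visit product detail pages for price, rating, reviews, BSR",
    )
    return parser.parse_args(argv)
```

With `store_true`, `args.deep` is False by default, so basic mode stays the cheaper default and detail-page requests happen only when explicitly requested.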

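2. Scheduled monitoring

# Cron entry that runs every day at 03:00
0 3 * * * python3 /path/to/amazon_scraper.py

Use cases:

  • Price change monitoring
  • New listing alerts
  • Competitor tracking

3. Batch scraping with multiple keywords

KEYWORDS = [
    "women fashion shoes",
    "men sneakers",
    "kids boots",
    "running shoes"
]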
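
When several keywords are scraped in one run, the same product can show up under more than one keyword. A minimal sketch of merging the per-keyword results, assuming each product dict carries an `asin` key (a hypothetical field name), might look like this. The limit is passed in explicitly as a parameter so the function does not rely on a variable defined elsewhere.

```python
def merge_results(results_by_keyword, max_products):
    """Merge per-keyword product lists, dropping repeated ASINs.

    `results_by_keyword` maps a keyword to a list of product dicts. The
    first occurrence of each ASIN wins, and the merged list is capped at
    `max_products` entries.
    """
    seen = set()
    unique_products = []
    for keyword, products in results_by_keyword.items():
        for product in products:
            asin = product.get("asin")
            if not asin or asin in seen:
                continue  # skip malformed rows and duplicates
            seen.add(asin)
            # Record which keyword first surfaced the product.
            unique_products.append({**product, "keyword": keyword})
    return unique_products[:max_products]
```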

Notes

1. API cost

  • Apify free credit: $5/month
  • Cost per scrape: about $0.002 per product
  • Recommendation: keep the number of scraped products under control
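
As a rough budgeting aid, the figures above can be turned into a small calculation. The sketch assumes the approximate $0.002 per product and $5 monthly credit quoted here; real costs depend on the Apify actor and plan, and the function names are hypothetical.

```python
from decimal import Decimal

COST_PER_PRODUCT = Decimal("0.002")  # approximate per-product figure quoted above
FREE_CREDIT = Decimal("5")           # monthly free credit quoted above


def estimated_cost(num_keywords, max_products):
    """Estimated cost in USD of one run over all keywords."""
    return COST_PER_PRODUCT * num_keywords * max_products


def runs_within_credit(num_keywords, max_products):
    """Number of whole runs that fit in the free credit."""
    cost = estimated_cost(num_keywords, max_products)
    if cost <= 0:
        raise ValueError("num_keywords and max_products must be positive")
    return int(FREE_CREDIT // cost)
```

Under these assumptions, four keywords at 50 products each cost about $0.40 per run, which is relevant when pairing the scraper with a daily cron schedule.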

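2. Anti-scraping measures

  • Amazon enforces strict anti-scraping measures; a dedicated service such as Apify is recommended
  • Do not scrape the same keyword at high frequency
  • Comply with Amazon's terms of service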
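
One way to avoid hitting the same keyword too often is a small per-keyword throttle. This is a sketch with a hypothetical class name and an arbitrary interval; it keeps its state in memory only, so separate cron runs would need to persist the timestamps somewhere.

```python
import time


class KeywordThrottle:
    """Allow each keyword to be scraped at most once per `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock      # injectable for testing
        self.last_run = {}      # keyword -> time of the last allowed scrape

    def allow(self, keyword):
        """Return True and record the time if the keyword may be scraped now."""
        now = self.clock()
        last = self.last_run.get(keyword)
        if last is not None and now - last < self.min_interval:
            return False
        self.last_run[keyword] = now
        return True
```

A caller would check `throttle.allow(keyword)` before each scrape and skip or defer the keyword when it returns False.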

3. Data accuracy

  • Prices can change in real time
  • Sales figures are estimates (derived from review counts)
  • Refresh the data regularly

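FAQ

Q: Why do some products show the price as "待爬取" ("pending scrape")?

A: Basic mode only collects data from the search results page; prices require visiting each product detail page. Use the --deep flag to enable deep scraping.

Q: What should I do if the Apify run fails?

A: Check:

  1. Whether the API key is correct
  2. Whether the network connection is working
  3. Whether the Apify account has remaining credit

Q: How do I customize the output format?

A: Edit the generate_excel() and generate_report() functions to adjust column names and styles.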


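Changelog

v1.0.0 (2026-03-03)

  • ✅ Basic scraping
  • ✅ Excel + Markdown report generation
  • ✅ Brand distribution statistics
  • ✅ Clickable links

TODO

  • Deep scraping (price / rating / sales)
  • Image download
  • Multi-language support
  • Scheduled monitoring alerts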

Author and License


Related Skills

  • web-scraping-router - scraper tool routing skill
  • apify-scraper - industrial-grade Apify scraper template
  • price-monitor - price monitoring sentinel

Last updated: 2026-03-03
