Install
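openclaw skills install amazon-review-analysis
Amazon competitor review deep-dive analysis: collects reviews from the product page, runs NLP analysis of keywords, pain points, and usage scenarios, and generates an interactive Chart.js HTML report plus an Excel data file.
Analyzes Amazon competitor reviews along 7 dimensions and outputs an interactive Chart.js HTML report plus an Excel data file.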
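When the user makes a request in the following format, execute it directly according to this template:
Please professionally analyze all reviews of competitor products on Amazon US, with ASINs {ASIN1} and {ASIN2}, organized along the following dimensions:
1. Top 5 positive-review keywords and what users are satisfied with
2. Top 5 negative-review pain points (categorized as quality, shipping, size, color difference, function, packaging, etc.)
3. The 5 most frequent complaints
4. Unmet user needs and hidden must-haves
5. Differentiated product-selection improvements and pitfalls to avoid
6. Customers' main usage scenarios (e.g. bedroom, basement)
7. Present everything in standard data-analysis format: intuitive word-frequency and sentiment proportions, trend visualization, and a line chart showing how the negative-review rate changes. Deliver a visual analysis report (HTML) with charts and detailed analysis, plus a separate Excel file containing the research data and analysis.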
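The Amazon review page (product-reviews/{asin}) requires login and cannot be accessed directly.
Correct approach: extract the reviews from within the product page.
# Navigate to the product page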
browser_navigate(url=f"https://www.amazon.com/dp/{asin}")
// browser_console: must be wrapped in an IIFE to avoid variable conflicts
(function() {
  const results = [];
  document.querySelectorAll('[data-hook="review"]').forEach(el => {
    // Title: inside the h5 element
    let title = '';
    const titleEl = el.querySelector('h5[data-hook="reviewTitle"]');
    if (titleEl) title = titleEl.textContent.trim();
    // Body: text inside the reviewText container, with template text stripped
    let body = '';
    const bodyContainer = el.querySelector('[data-hook="reviewText"]');
    if (bodyContainer) {
      body = bodyContainer.textContent
        .replace(/Brief content visible.*?content\./g, '')
        .replace(/Full content visible.*?content\./g, '')
        .replace(/Read moreRead less/g, '')
        .trim();
    }
    // Rating: extracted from the icon alt text
    let rating = 0;
    const ratingSpan = el.querySelector('[data-hook="review-star-rating"] .a-icon-alt');
    if (ratingSpan) {
      const m = ratingSpan.textContent.match(/(\d+(\.\d+)?)/);
      if (m) rating = parseFloat(m[1]);
    }
    // Date
    let date = '';
    const dateEl = el.querySelector('[data-hook="review-date"]');
    if (dateEl) date = dateEl.textContent.trim();
    // Verified purchase
    const verified = !!el.querySelector('[data-hook="avp-badge"]');
    // Helpful vote count
    let helpful = '';
    const helpEl = el.querySelector('[data-hook="helpful-vote-statement"]');
    if (helpEl) helpful = helpEl.textContent.trim();
    results.push({rating, title, body, date, verified, helpful});
  });
  // Rating distribution: extracted from the histogram links.
  // Each link's text can contain every star label and percentage, so take the
  // last star label and the last percentage as the values for that link.
  const histLinks = document.querySelectorAll('a[href*="acr_dp_hist"]');
  const hist = {};
  histLinks.forEach(a => {
    const text = a.textContent;
    const stars = text.match(/\d(?=\s*star)/g);
    const pcts = text.match(/\d+(?=%)/g);
    if (stars && pcts) {
      hist[stars[stars.length - 1] + '_star'] = parseInt(pcts[pcts.length - 1], 10);
    }
  });
  // Total number of ratings
  let totalReviews = '';
  const totalEl = document.querySelector('#acrCustomerReviewText');
  if (totalEl) totalReviews = totalEl.textContent.trim();
  // Overall rating
  let overallRating = '';
  const ratingEl = document.querySelector('#acrPopover .a-icon-alt');
  if (ratingEl) overallRating = ratingEl.textContent.trim();
  return JSON.stringify({totalReviews, overallRating, hist, count: results.length, reviews: results});
})()
// Product info: title, price, and feature bullets
(function() {
  let title = document.querySelector('#productTitle')?.textContent.trim() || '';
  let price = document.querySelector('.a-price .a-offscreen')?.textContent.trim() || '';
  let features = '';
  const featEl = document.querySelector('#featurebullets_feature_div');
  if (featEl) features = featEl.textContent.trim().substring(0, 500);
  return JSON.stringify({title, price, features});
})()
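The histogram links are the most reliable way to extract the rating distribution. The link text on the page looks like this:
5 star4 star3 star2 star1 star5 star76%12%3%4%5%76%
Parsing: each link's text contains all of the percentages; the last % value is the share for that link's star level.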
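The parsing rule above can be sketched in Python, for instance when post-processing link text captured from the page. This is a minimal sketch, not the skill's own code: the function name `parse_hist_link` is hypothetical, and it assumes every link's text has the shape of the sample above, with the link's own star label and percentage appearing last.

```python
import re

def parse_hist_link(text):
    """Return (star, percent) for one histogram link, or None if unparseable.

    Assumption: the link text lists every star label and percentage first,
    and the values that belong to this link come last, as in the sample.
    """
    stars = re.findall(r'(\d)\s*star', text)
    pcts = re.findall(r'(\d+)%', text)
    if not stars or not pcts:
        return None
    return int(stars[-1]), int(pcts[-1])

sample = "5 star4 star3 star2 star1 star5 star76%12%3%4%5%76%"
print(parse_hist_link(sample))
```

Taking the last match also works when a link contains only its own label and percentage, so the same helper covers both layouts.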
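The product page usually shows only 8-11 reviews. If more data is needed, use web_search to find third-party review summaries:
# Search for third-party review roundups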
web_search(f"amazon {asin} reviews summary analysis")
web_search(f"site:reddit.com amazon {product_name} review")
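Save the extracted reviews as JSON:
import os, json
os.makedirs('/tmp/amazon_reviews', exist_ok=True)
# Write as UTF-8 because ensure_ascii=False keeps non-ASCII characters as-is
with open('/tmp/amazon_reviews/review_data.json', 'w', encoding='utf-8') as f:
    json.dump(all_data, f, indent=2, ensure_ascii=False)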
import re
from collections import Counter
stop_words = set(['the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for',
'of', 'with', 'by', 'from', 'is', 'was', 'are', 'were', 'be', 'been', 'being',
'have', 'has', 'had', 'do', 'does', 'did', 'will', 'would', 'could', 'should',
'may', 'might', 'can', 'this', 'that', 'these', 'those', 'i', 'you', 'he', 'she',
'it', 'we', 'they', 'what', 'which', 'who', 'when', 'where', 'why', 'how',
'all', 'each', 'every', 'both', 'few', 'more', 'most', 'other', 'some', 'such',
'no', 'not', 'only', 'own', 'same', 'so', 'than', 'too', 'very', 'just',
'about', 'above', 'after', 'again', 'also', 'am', 'any', 'as', 'because',
'before', 'between', 'into', 'like', 'me', 'my', 'nor', 'off', 'once',
'out', 'over', 'then', 'there', 'through', 'under', 'until', 'up', 'while',
'product', 'item', 'bought', 'purchased', 'one', 'get', 'got', 'really',
'much', 'well', 'still', 'even', 'make', 'made', 'use', 'used', 'using',
'thing', 'things', 'way', 'want', 'needed', 'need', 'come', 'came',
'work', 'working', 'don', 'didn', 'doesn', 'isn', 'wasn', 'weren', 'won',
'wouldn', 'couldn', 'shouldn', 'haven', 'hasn', 'hadn'])
def extract_keywords(reviews, min_rating=None, max_rating=None):
    """Extract keywords from reviews within the given rating range."""
    text = []
    for r in reviews:
        if min_rating and r['rating'] < min_rating:
            continue
        if max_rating and r['rating'] > max_rating:
            continue
        text.append(r.get('title', '').lower())
        text.append(r.get('body', '').lower())
    combined = ' '.join(text)
    # Words of 3+ letters only; stop words are dropped afterwards
    words = re.findall(r'\b[a-z]{3,}\b', combined)
    filtered = [w for w in words if w not in stop_words]
    return Counter(filtered).most_common(20)

# Positive keywords (4-5 stars)
positive_kw = extract_keywords(reviews, min_rating=4)
# Negative keywords (1-2 stars)
negative_kw = extract_keywords(reviews, max_rating=2)
pain_categories = {
    'Quality': ['quality', 'cheap', 'flimsy', 'broke', 'broken', 'defective', 'poor',
                'damage', 'ruin', 'ruined', 'fell apart', 'durable', 'fragile', 'crack',
                'cracked', 'rip', 'ripped', 'tear', 'torn', 'peel', 'peeled'],
    'Shipping': ['shipping', 'delivery', 'arrived', 'late', 'delayed', 'package',
                 'packaging', 'box', 'damaged in transit', 'missing', 'lost'],
    'Size': ['size', 'inch', 'fit', 'small', 'big', 'thin', 'thick', 'dimension',
             'measurement', 'too small', 'too large', 'oversized', 'undersized',
             'doesn\'t fit', 'tight', 'loose'],
    'Color mismatch': ['color', 'colour', 'different color', 'not as pictured', 'shade',
                       'lighter', 'darker', 'faded', 'mismatch', 'looks different'],
    'Function': ['function', 'feature', 'doesn\'t work', 'not work', 'fail', 'failed',
                 'malfunction', 'defective', 'useless', 'disappoint', 'advertised'],
    'Packaging': ['packaging', 'package', 'box', 'wrapper', 'unpacked', 'unboxing',
                  'no instructions', 'missing parts', 'incomplete'],
    'Adhesion/Installation': ['stick', 'sticky', 'adhesive', 'glue', 'tape', 'peel', 'fell',
                              'falling', 'detach', 'nonstick', 'residue', 'install'],
    'Odor': ['smell', 'odor', 'stink', 'musty', 'mildew', 'chemical', 'smelly'],
    'Noise': ['sound', 'noise', 'echo', 'acoustic', 'absorb', 'dampening', 'quiet', 'loud'],
    'Price': ['price', 'expensive', 'cost', 'value', 'cheap', 'pricey', 'money', 'overpriced'],
    'Durability': ['last', 'lasting', 'durable', 'wear', 'fade', 'worn', 'degrade']
}
def classify_pain(reviews, max_rating=2):
    """Classify negative reviews into pain-point categories."""
    pain_counts = {cat: 0 for cat in pain_categories}
    pain_examples = {cat: [] for cat in pain_categories}
    for r in reviews:
        if r['rating'] > max_rating:
            continue
        text = (r.get('title', '') + ' ' + r.get('body', '')).lower()
        for cat, keywords in pain_categories.items():
            for kw in keywords:
                if kw in text:
                    # Count each review at most once per category
                    pain_counts[cat] += 1
                    if len(pain_examples[cat]) < 3:
                        pain_examples[cat].append(r.get('body', '')[:200])
                    break
    return pain_counts, pain_examples
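Because the classifier above matches keywords as plain substrings, short keywords can fire inside unrelated words (for example "rip" inside "strip"). One possible refinement, which is not part of the original skill, is to match on word boundaries instead. The helper name `contains_keyword` below is hypothetical.

```python
import re

def contains_keyword(text, keyword):
    """True if keyword occurs in text as a whole word or phrase (case-insensitive)."""
    pattern = r'\b' + re.escape(keyword) + r'\b'
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Substring-only occurrence: "rip" sits inside "strip"
print(contains_keyword("the adhesive strip held fine", "rip"))
# Whole-word occurrence
print(contains_keyword("the cover has a rip in it", "rip"))
# Multi-word phrases work the same way
print(contains_keyword("it doesn't fit my wall", "doesn't fit"))
```

The trade-off is that whole-word matching no longer catches inflected forms such as "cracks" from the keyword "crack", so the keyword lists would need those variants spelled out.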
scenario_keywords = {
    'Bedroom': ['bedroom', 'bed room', 'master bedroom', 'guest room'],
    'Living room': ['living room', 'family room', 'den'],
    'Recording studio/Workspace': ['studio', 'music room', 'recording', 'home studio'],
    'Office': ['office', 'home office', 'work', 'workspace'],
    'Bathroom': ['bathroom', 'water closet', 'bath'],
    'Kitchen': ['kitchen', 'cabinet', 'refrigerator'],
    'Basement': ['basement', 'downstairs', 'garage'],
    'Soundproof wall': ['accent wall', 'feature wall', 'wall', 'ceiling'],
    'Noise reduction': ['neighbor', 'noise', 'soundproofing', 'hvac', 'closet', 'quiet']
}
def extract_scenarios(reviews):
    """Extract usage scenarios mentioned in the reviews."""
    scenario_counts = {s: 0 for s in scenario_keywords}
    scenario_examples = {s: [] for s in scenario_keywords}
    for r in reviews:
        text = (r.get('title', '') + ' ' + r.get('body', '')).lower()
        for scene, keywords in scenario_keywords.items():
            for kw in keywords:
                if kw in text:
                    # Count each review at most once per scenario
                    scenario_counts[scene] += 1
                    if len(scenario_examples[scene]) < 2:
                        scenario_examples[scene].append(r.get('body', '')[:150])
                    break
    return scenario_counts, scenario_examples
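import re
from collections import defaultdict

MONTH_ABBRS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
MONTH_NUM = {abbr: i for i, abbr in enumerate(MONTH_ABBRS)}

def review_trend(reviews):
    """Compute the negative-review rate per month."""
    monthly = defaultdict(lambda: {'total': 0, 'negative': 0})
    for r in reviews:
        date_str = r.get('date', '')
        # Parse dates such as "Reviewed in the United States on April 27, 2026"
        m = re.search(r'on\s+(\w+)\s+\d+,\s*(\d{4})', date_str)
        if m:
            month_key = f"{m.group(2)}-{m.group(1)[:3]}"
            monthly[month_key]['total'] += 1
            if r['rating'] <= 2:
                monthly[month_key]['negative'] += 1
    # Sort chronologically; a plain string sort would put "Apr" before "Jan"
    months = sorted(monthly, key=lambda k: (k[:4], MONTH_NUM.get(k[5:], 12)))
    rates = []
    for month in months:
        total = monthly[month]['total']
        neg = monthly[month]['negative']
        rates.append(round(neg / total * 100, 1) if total > 0 else 0)
    return months, rates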
unmet_need_patterns = [
    (r'(wish|hope|should|could|would be (nice|great|better)|need|lack|missing|want|expect)',
     'User expectation'),
    (r'(only|just) (complain|issue|problem)', 'Limitation complaint'),
    (r'(compare|compared) (to|with)', 'Competitor comparison'),
    (r'(return|refund|replace)', 'Return/replacement request'),
]

def extract_unmet_needs(reviews):
    """Mine the reviews for unmet needs."""
    needs = []
    for r in reviews:
        text = (r.get('title', '') + ' ' + r.get('body', '')).lower()
        for pattern, category in unmet_need_patterns:
            if re.search(pattern, text):
                # Record only the first matching category for each review
                needs.append({
                    'rating': r['rating'],
                    'category': category,
                    'text': r.get('body', '')[:300]
                })
                break
    return needs
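Generate the interactive HTML report using the Chart.js CDN.
The user's preferred color scheme:
- #0f3460 (dark blue)
- #e94560 (red)
- #38a169 / #48bb78 / #68d391 (greens)
- #f6ad55 (orange)
- #e53e3e / #fc8181 (reds)
- #3182ce / #4299e1 / #63b3ed
HTML report structure (in order):
CSS card style: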
.card { background:white; border-radius:12px; padding:24px; margin-bottom:20px; box-shadow:0 2px 12px rgba(0,0,0,0.06); }
.card h2 { color:#0f3460; border-left:4px solid #e94560; padding-left:12px; margin-bottom:16px; }
.insight { background:#fffbeb; border-left:4px solid #f6ad55; padding:16px; border-radius:0 8px 8px 0; margin:12px 0; }
.insight.green { background:#f0fff4; border-color:#48bb78; }
.insight.red { background:#fff5f5; border-color:#fc8181; }
.insight.blue { background:#ebf8ff; border-color:#4299e1; }
.metric-card { display:inline-block; background:#f7fafc; border-radius:8px; padding:16px 24px; margin:8px; text-align:center; }
.metric-value { font-size:28px; font-weight:bold; color:#0f3460; }
.metric-label { font-size:12px; color:#718096; text-transform:uppercase; }
Negative-review rate trend line chart (required):
new Chart(document.getElementById('trendChart'), {
  type: 'line',
  data: {
    labels: months, // ['2025-Jan', '2025-Feb', ...]
    datasets: [
      {
        label: 'Product A negative-review rate',
        data: ratesA,
        borderColor: '#0f3460',
        backgroundColor: 'rgba(15,52,96,0.1)',
        fill: true,
        tension: 0.3,
        pointRadius: 4
      },
      {
        label: 'Product B negative-review rate',
        data: ratesB,
        borderColor: '#e94560',
        backgroundColor: 'rgba(233,69,96,0.1)',
        fill: true,
        tension: 0.3,
        pointRadius: 4
      }
    ]
  },
  options: {
    responsive: true,
    plugins: { title: { display: true, text: 'Monthly negative-review rate trend' } },
    scales: { y: { beginAtZero: true, title: { display: true, text: 'Negative-review rate (%)' } } }
  }
});
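The chart above plots both products against one shared `labels` array, while `review_trend` returns a separate month list for each product. One way to line the two series up, sketched here under the assumption that month keys look like '2025-Jan', is to merge the keys and fill missing months with `None`. That value serializes to `null`, which Chart.js line charts generally draw as a gap. The names `align_trends` and `month_sort_key` are hypothetical.

```python
import json

MONTH_ORDER = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

def month_sort_key(key):
    """Turn 'YYYY-Mon' into (year, month_index) so keys sort chronologically."""
    year, mon = key.split('-')
    return int(year), MONTH_ORDER.index(mon)

def align_trends(months_a, rates_a, months_b, rates_b):
    """Merge two (months, rates) series onto one chronological label axis."""
    map_a = dict(zip(months_a, rates_a))
    map_b = dict(zip(months_b, rates_b))
    labels = sorted(set(map_a) | set(map_b), key=month_sort_key)
    # Months missing from one product become None (null in the JSON output)
    return (labels,
            [map_a.get(m) for m in labels],
            [map_b.get(m) for m in labels])

labels, rates_a, rates_b = align_trends(
    ['2025-Jan', '2025-Mar'], [10.0, 25.0],
    ['2025-Feb', '2025-Mar'], [0.0, 50.0])
print(json.dumps({'labels': labels, 'ratesA': rates_a, 'ratesB': rates_b}))
```

The printed JSON can be embedded in the report's script block to supply `months`, `ratesA`, and `ratesB`.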
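Generated with openpyxl, 7 worksheets:
| Worksheet | Contents | Color coding |
|---|---|---|
| Raw review data | All raw review data | Green = 4-5★, yellow = 3★, red = 1-2★ |
| Rating distribution | Rating summary statistics | None |
| Positive keyword analysis | Top 10 keywords + frequency + dimension | Green background |
| Negative pain-point analysis | Pain-point category + severity + recommendation | Red = high/very high, yellow = medium |
| Differentiation improvement suggestions | Dimension + problem + innovation direction + priority | Red = P0, yellow = P1, blue = P2 |
| Usage scenario analysis | Scenario + mention count + product development insight | None |
| Competitor comparison summary | Multi-dimensional side-by-side comparison | None |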
Header style:
from openpyxl.styles import Font, PatternFill

header_font = Font(bold=True, color='FFFFFF', size=11)
header_fill = PatternFill(start_color='0F3460', end_color='0F3460', fill_type='solid')
red_fill = PatternFill(start_color='FFF5F5', end_color='FFF5F5', fill_type='solid')
green_fill = PatternFill(start_color='F0FFF4', end_color='F0FFF4', fill_type='solid')
yellow_fill = PatternFill(start_color='FFFFF0', end_color='FFFFF0', fill_type='solid')
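The color coding listed for the raw review sheet (green for 4-5 stars, yellow for 3, red for 1-2) can be expressed as a small lookup. This sketch only picks the hex color; applying it with openpyxl would mean wrapping the result in a `PatternFill` like the fills defined above. The function name `fill_color_for_rating` is hypothetical, and leaving unparsed ratings (0) uncolored is an assumption.

```python
def fill_color_for_rating(rating):
    """Map a star rating to the background hex color for its row, or None."""
    if rating >= 4:
        return 'F0FFF4'  # green: 4-5 stars
    if rating >= 3:
        return 'FFFFF0'  # yellow: 3 stars
    if rating >= 1:
        return 'FFF5F5'  # red: 1-2 stars
    return None          # rating missing (0): leave the row uncolored

for stars in (5, 3, 1, 0):
    print(stars, fill_color_for_rating(stars))
```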
Auto column width:
from openpyxl.utils import get_column_letter

def auto_width(ws, min_width=10, max_width=50):
    # Size each column to its longest cell value, clamped to [min_width, max_width]
    for col in range(1, ws.max_column + 1):
        max_len = min_width
        for row in range(1, ws.max_row + 1):
            val = ws.cell(row=row, column=col).value
            if val:
                max_len = max(max_len, min(len(str(val)), max_width))
        ws.column_dimensions[get_column_letter(col)].width = max_len + 2
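import os, shutil
src_dir = '/tmp/amazon_review_analysis'
dst_dir = os.path.expanduser('~/.hermes/workspace/amazon_review_analysis')
os.makedirs(dst_dir, exist_ok=True)
# Copy every generated file from /tmp/amazon_review_analysis/ into the workspace
for name in os.listdir(src_dir):
    shutil.copy2(os.path.join(src_dir, name), dst_dir)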
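The HTML report must be uploaded to Dinzee to get a public link, so the interactive charts can be opened directly in a browser.
# Upload with curl (recommended, most stable)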
curl -sL -X POST "https://report.dinzee.ai/report/upload" \
-H "Authorization: 538f4ad962266f5bc62dabda825e43021820988fdeeca85caaa1aca20e49a0eb" \
-F "file=@/tmp/amazon_review_analysis/review-analysis-ASIN1-ASIN2.html;filename=review-analysis-ASIN1-ASIN2.html"
Returns: {"url": "https://report.dinzee.ai/review-analysis-ASIN1-ASIN2.html"}
Naming convention: review-analysis-{ASIN1}-{ASIN2}.html or {type}-{keyword}-{date}.html
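⚠️ Files must be sent to Feishu as MEDIA attachments (verified 2026-05-20):
- The send_message tool is unreliable in a Feishu context (it errors when there is no home channel).
- Write MEDIA: paths in the reply text; the system uploads them automatically as Feishu attachments.
MEDIA:/tmp/amazon_review_analysis/review-analysis.html
MEDIA:/tmp/amazon_review_analysis/review-analysis.xlsx
Output the public link as a bare plain-text URL (not in Markdown link format):
https://report.dinzee.ai/review-analysis-ASIN1-ASIN2.html
⚠️ File type restriction: Dinzee only supports .html, .csv, .json, .xlsx, .xls, .png, .jpg. Excel (.xlsx) files can be uploaded directly.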
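The naming convention and the file type restriction can be checked together before uploading. The following is a minimal sketch rather than part of the skill: `report_filename` and `SUPPORTED_EXTS` are hypothetical names, the sample ASINs are placeholders, and the ASIN check assumes the usual 10-character uppercase alphanumeric format.

```python
import re

# Extensions listed as supported by Dinzee in the note above
SUPPORTED_EXTS = {'.html', '.csv', '.json', '.xlsx', '.xls', '.png', '.jpg'}

def report_filename(asin1, asin2, ext='.html'):
    """Build a review-analysis-{ASIN1}-{ASIN2} file name with a supported extension."""
    for asin in (asin1, asin2):
        # Assumption: an ASIN is 10 uppercase alphanumeric characters
        if not re.fullmatch(r'[A-Z0-9]{10}', asin):
            raise ValueError(f'unexpected ASIN format: {asin!r}')
    if ext.lower() not in SUPPORTED_EXTS:
        raise ValueError(f'unsupported file type: {ext}')
    return f'review-analysis-{asin1}-{asin2}{ext}'

# Placeholder ASINs, for illustration only
print(report_filename('B0EXAMPLE1', 'B0EXAMPLE2'))
print(report_filename('B0EXAMPLE1', 'B0EXAMPLE2', ext='.xlsx'))
```

Failing early on an unsupported extension avoids a rejected upload after the report has already been generated.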
# 1. Generate the HTML report → written by Python to /tmp/amazon_review_analysis/
# 2. Generate the Excel file → written by openpyxl to /tmp/amazon_review_analysis/
# 3. Upload the HTML to Dinzee
curl -sL -X POST "https://report.dinzee.ai/report/upload" \
-H "Authorization: 538f4ad962266f5bc62dabda825e43021820988fdeeca85caaa1aca20e49a0eb" \
-F "file=@/tmp/amazon_review_analysis/review-analysis-{ASIN1}-{ASIN2}.html;filename=review-analysis-{ASIN1}-{ASIN2}.html"
# 4. Upload the Excel file to Dinzee
curl -sL -X POST "https://report.dinzee.ai/report/upload" \
-H "Authorization: 538f4ad962266f5bc62dabda825e43021820988fdeeca85caaa1aca20e49a0eb" \
-F "file=@/tmp/amazon_review_analysis/review-analysis-{ASIN1}-{ASIN2}.xlsx;filename=review-analysis-{ASIN1}-{ASIN2}.xlsx"
# 5. Copy to the workspace as a backup
cp /tmp/amazon_review_analysis/* ~/.hermes/workspace/amazon_review_analysis/
# 6. In the final reply, send the Feishu attachments via MEDIA: paths, plus the bare link
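In the final output to the user, summarize the conclusions for all 7 dimensions in plain, everyday Chinese, then attach the HTML and Excel files.
Format:
📊 **Competitor Review Deep-Dive Analysis Report**
**ASIN: {A} vs {B}**
### 1️⃣ Top 5 positive-review keywords
- Keyword 1 (appears X times): users feel...
- ...
### 2️⃣ Top 5 negative-review pain points
- Pain point 1 (X mentions, severity: high), typical issue: ...
### 3️⃣ The 5 most frequent complaints
1. Problem 1: X related reviews
2. ...
### 4️⃣ Unmet needs / hidden must-haves
- Need 1: users expect... but the current product...
### 5️⃣ Differentiation improvements & pitfalls to avoid
✅ Keep: ...
❌ Avoid: ...
💡 Innovation direction: ...
### 6️⃣ Customers' main usage scenarios
- Scenario 1 (X mentions), as users describe it: ...
### 7️⃣ Visual report
HTML report: [link]
Excel data: [link]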
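- The product-reviews/{asin} URL redirects to the login page. Reviews must be extracted from within the product page (which shows roughly the 10 most recent reviews).
- Wrap browser_console scripts in an IIFE, (function(){...})(), to avoid variable conflicts.
- Reviews are wrapped in the a-cardui-deck component; extract the text from [data-hook="reviewText"] and strip the template text (Brief content visible... / Read moreRead less).
- Take the rating distribution from the a[href*="acr_dp_hist"] link text, not from histogramTable (which may be empty).
- openpyxl must be installed: pip install openpyxl.
- Use the regex on\s+(\w+)\s+\d+,\s*(\d{4}) to extract the month from the review date.
- The send_message tool is unreliable in a Feishu context (it reports a "no home channel" error). MEDIA paths must be written in the reply text; the system uploads them automatically as Feishu attachments.
- Name uploaded reports in the review-analysis-{ASIN1}-{ASIN2}.html format. Excel (.xlsx) files can be uploaded directly.
- set/pack/first/second/third should be added to the stop-word list to filter out pack-quantity noise (such as "18 Pack" or "6-Piece").

- references/amazon-dom-selectors.md: cheat sheet of Amazon review DOM selectors, anti-bot workarounds, and the rating-distribution extraction logic
- references/excel-generation-pattern.md: standard openpyxl Excel generation pattern, with the 7-worksheet structure and color coding
- references/nlp-analysis-constants.md: NLP analysis constants: stop-word list, 11 pain-point categories, scenario keywords, satisfaction mapping
- templates/chartjs-review-report.html: interactive Chart.js HTML report template, with 7 chart types and the CSS card styles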