Install
openclaw skills install xhs-search-summarizer-seckho
This skill searches Xiaohongshu (小红书) for a given keyword, extracts high-quality multi-modal content from the top N posts (text, images, and user comments), and actively assists you in generating a deeply integrated, analytical final report for the user. Because Xiaohongshu uses aggressive anti-scraping mechanisms, direct HTTP requests or naive scraping often result in 404s or blocks. This skill avoids them by driving playwright-cli in a headed browser window, which simulates a real user.
It operates in two distinct phases:
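1. Extraction: the wrapper script searches Xiaohongshu, collects the posts, and writes the raw data to a markdown file ([keyword]_raw_data.md).
2. Synthesis: you read the generated [keyword]_raw_data.md file and write the final report.
Prerequisites:
- playwright-cli (must be available on the PATH)
- python3 (required to download images and stitch the raw data markdown)
- requests Python package (pip install requests), used by parse.py to download images
Execute the wrapper script in scripts/run.sh. It accepts the following arguments: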
/bin/bash <skill_dir>/scripts/run.sh "YOUR KEYWORD" <MAX_POSTS> <OUTPUT_DIRECTORY>
- "YOUR KEYWORD": the search term to look up on Xiaohongshu.
- <MAX_POSTS>: (optional, default = 10) the number of top posts to scan.
- <OUTPUT_DIRECTORY>: (optional, default = ./) the directory where the raw data and images will be saved.
Example execution:
/bin/bash ~/.claude/skills/xiaohongshu-search-summarizer/scripts/run.sh "openclaw使用场景" 10 "./xhs_report_openclaw_scenarios"
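If you prefer to launch the script from Python, the prerequisites and argument defaults described above can be sketched as follows. This is a minimal illustration, not part of the skill itself: the helper names missing_prerequisites and build_run_command are hypothetical, and the check for requests only inspects the interpreter running this snippet, which may differ from the python3 that run.sh uses.

```python
import importlib.util
import shutil
from pathlib import Path


def missing_prerequisites():
    """Return the names of documented prerequisites that cannot be found."""
    # Command-line tools must be discoverable on the PATH.
    missing = [tool for tool in ("playwright-cli", "python3")
               if shutil.which(tool) is None]
    # Assumption: checking the current interpreter is a good enough proxy
    # for the python3 that run.sh will invoke.
    if importlib.util.find_spec("requests") is None:
        missing.append("requests")
    return missing


def build_run_command(skill_dir, keyword, max_posts=10, output_dir="./"):
    """Build the argument list for scripts/run.sh with the documented defaults."""
    if not keyword.strip():
        raise ValueError("keyword must not be empty")
    if max_posts < 1:
        raise ValueError("max_posts must be at least 1")
    script = Path(skill_dir).expanduser() / "scripts" / "run.sh"
    # Passing a list (rather than a shell string) avoids quoting problems
    # with keywords that contain spaces or non-ASCII characters.
    return ["/bin/bash", str(script), keyword, str(max_posts), str(output_dir)]
```

The resulting list can be handed to subprocess.run(command, check=True) once missing_prerequisites() returns an empty list.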
Once the bash script finishes successfully, navigate to the OUTPUT_DIRECTORY and use your file reading capabilities to ingest the generated [keyword]_raw_data.md file.
Inside this file, you will find descriptions, comments, and file paths pointing to post_X_img_Y.webp or post_X_img_Y.jpg.
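To see which images belong to which post before opening them, the file names can be pulled out of the raw data with a regular expression. The sketch below assumes that X and Y in post_X_img_Y are integers and makes no assumption about the rest of the raw data layout; the function name images_by_post is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: X (post number) and Y (image number) are plain integers.
IMAGE_NAME = re.compile(r"post_(\d+)_img_(\d+)\.(?:webp|jpg)")


def images_by_post(raw_markdown):
    """Map each post number to its image file names, ordered by image number."""
    found = defaultdict(set)
    for match in IMAGE_NAME.finditer(raw_markdown):
        post, image = int(match.group(1)), int(match.group(2))
        # A set drops file names that are mentioned more than once.
        found[post].add((image, match.group(0)))
    return {post: [name for _, name in sorted(images)]
            for post, images in sorted(found.items())}
```

The function returns bare file names only; join them with the OUTPUT_DIRECTORY (or whatever relative path the raw data file uses) before reading the images.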
This is the most critical step. Do not just return the raw markdown file to the user. Instead, write a polished, comprehensive markdown report that reorganizes the information logically while retaining a high level of detail.
Follow these strict compilation rules:
- Open the .webp or .jpg image files found in the raw data directory to interpret their contents.
- Save the report in the same <OUTPUT_DIRECTORY> as the raw data (e.g., <OUTPUT_DIRECTORY>/[keyword]_synthesis.md), and give the user the path to it.
If you encounter 404 Not Found or "element not visible" errors during the browser invocation:
- Go to the playwright-cli browser window and perform any necessary authentication manually, then try the script again.
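The authenticate-then-retry advice above could be wrapped in a small helper like the one below. It is only a sketch: it assumes that run.sh exits with a non-zero status when the browser step fails, which the skill does not state, and the name run_with_manual_retry and its parameters are hypothetical.

```python
import subprocess


def run_with_manual_retry(command, max_attempts=2,
                          run=subprocess.run, wait_for_user=input):
    """Run command; after a failure, pause for manual login and try again.

    Returns True as soon as one attempt exits with status 0, else False.
    """
    for attempt in range(1, max_attempts + 1):
        # Assumption: a non-zero exit status means the browser step failed.
        result = run(command)
        if result.returncode == 0:
            return True
        if attempt < max_attempts:
            wait_for_user("Authenticate in the playwright-cli browser window, "
                          "then press Enter to retry: ")
    return False
```

The run and wait_for_user parameters exist so that the helper can be exercised without launching a browser or blocking on keyboard input.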