Clean Skill
Review
Audited by ClawScan on May 10, 2026.
Overview
This skill is not clearly malicious, but it needs review because it uses logged-in scraping with persistent browser sessions and anti-scraping evasion, and its default code can present mock data as validated restaurant recommendations.
Only install this skill if you are comfortable with logged-in scraping of Dianping and Xiaohongshu and with the associated account, terms-of-service, and blocking risks. Before running, verify whether you are in mock mode or real scraping mode, use isolated accounts where possible, inspect where browser sessions are stored, and pin dependencies.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
A user could receive restaurant scores that look validated across platforms but are actually based on mock data.
The default Dianping fetcher returns synthetic data, while SKILL.md presents the workflow as automatically fetching and validating real restaurant data.
# Simulated data for demonstration
# In production, this would scrape actual Dianping pages
restaurants = self._fetch_mock_data(location, cuisine)
Clearly label mock/demo mode in user-facing output and make real-data mode explicit, opt-in, and documented.
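One way to follow this recommendation is to tag every result with its data source and make real scraping strictly opt-in. A minimal sketch, assuming a hypothetical `RestaurantFetcher` class modeled on the snippet above (the class name, `real_mode` flag, and `data_source` field are illustrative, not the skill's actual API):

```python
# Hypothetical sketch: label mock output and make real mode explicit.
class RestaurantFetcher:
    def __init__(self, real_mode: bool = False):
        # Real scraping must be explicitly enabled by the user.
        self.real_mode = real_mode

    def fetch(self, location: str, cuisine: str) -> dict:
        if not self.real_mode:
            restaurants = self._fetch_mock_data(location, cuisine)
            # Label the output so mock data is never mistaken for
            # validated cross-platform results.
            return {"data_source": "mock", "restaurants": restaurants}
        raise NotImplementedError("real scraping requires explicit setup")

    def _fetch_mock_data(self, location, cuisine):
        # Synthetic demo data, mirroring the default behavior noted above.
        return [{"name": "Demo Restaurant", "location": location,
                 "cuisine": cuisine, "score": 4.2}]

result = RestaurantFetcher().fetch("Shanghai", "Sichuan")
```

With this shape, any downstream report can check `data_source` before presenting scores as validated.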
Use of the skill could violate platform terms, trigger account or IP blocking, or create legal/compliance risk for the user.
The skill instructs the agent/user to work around anti-scraping controls for third-party platforms rather than using authorized APIs.
Anti-scraping: Use residential proxies, rotate user agents
Use official APIs or manual research where possible, avoid proxy-based evasion, and require explicit user confirmation before any real scraping.
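A simple way to require explicit confirmation is a gate that refuses real scraping unless the user has opted in. This is a sketch under assumptions: the `ALLOW_REAL_SCRAPING` environment variable and the `run` helper are hypothetical, not part of the skill:

```python
import os

def confirm_real_scraping() -> bool:
    # ALLOW_REAL_SCRAPING is an assumed opt-in flag, not the skill's own.
    return os.environ.get("ALLOW_REAL_SCRAPING") == "1"

def run(fetch_real, fetch_mock):
    if confirm_real_scraping():
        return fetch_real()
    # Without explicit opt-in, fall back to clearly labeled mock data
    # rather than silently scraping third-party platforms.
    return {"data_source": "mock", "restaurants": fetch_mock()}
```

The same pattern works with an interactive prompt; the point is that real scraping never happens by default.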
The agent may use saved account sessions/cookies to access a user's platform account, creating account and privacy risks if invoked unintentionally.
The real scraper uses a persistent logged-in browser session for Xiaohongshu, but the metadata declares no primary credential and the instructions do not clearly bound session handling.
await self.session_manager.refresh_session_if_needed("xiaohongshu")
...
launch_persistent_context(
    user_data_dir=str(self.session_manager.xhs_session_dir),
    headless=True,
)
Require explicit setup and per-use approval for logged-in scraping, document where sessions are stored, provide cleanup/logout instructions, and prefer isolated test accounts.
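Cleanup instructions can be as simple as deleting the persistent profile directory so a saved login cannot be reused by a later, unintended invocation. A sketch under assumptions: the session path below is illustrative and should be replaced with wherever the skill's `xhs_session_dir` actually points:

```python
import shutil
from pathlib import Path

# Assumed storage location; inspect the skill to find the real one.
SESSION_ROOT = Path.home() / ".skill_sessions"
XHS_SESSION_DIR = SESSION_ROOT / "xiaohongshu"

def wipe_session(session_dir: Path) -> bool:
    """Delete the saved browser profile (cookies, local storage) so the
    logged-in session cannot be reused. Returns True if anything was removed."""
    if session_dir.exists():
        shutil.rmtree(session_dir)
        return True
    return False
```

Running this after each use, combined with an isolated test account, bounds the blast radius if the skill is invoked unintentionally.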
Dependency versions may change over time, which can affect behavior or introduce supply-chain risk.
The dependency installation is purpose-aligned, but the packages are unpinned and are not represented by a reviewed install spec.
pip install requests beautifulsoup4 pandas numpy thefuzz selenium lxml
Install in a virtual environment, pin dependency versions, and review the requirements file before running the scripts.
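Beyond pinning versions in a requirements file, a startup check can refuse to run when installed versions drift from the reviewed pins. A minimal sketch; the version numbers below are illustrative placeholders, not reviewed pins:

```python
from importlib import metadata

# Illustrative pins only; substitute the versions you actually reviewed.
PINS = {"requests": "2.31.0", "beautifulsoup4": "4.12.3"}

def check_pins(pins: dict) -> list[str]:
    """Return a list of mismatch descriptions; empty means all pins match."""
    mismatches = []
    for pkg, wanted in pins.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            mismatches.append(f"{pkg}: not installed")
            continue
        if installed != wanted:
            mismatches.append(f"{pkg}: {installed} != {wanted}")
    return mismatches
```

Calling `check_pins(PINS)` before the scripts run turns silent dependency drift into an explicit, actionable error.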
