Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Zoomin Docs Portal Scraper Tool

v1.0.2

Scrape documentation content from Zoomin Software portals using Playwright browser automation to handle dynamic content loading. Use when standard web fetchi...

by Justin Paul (@recklessop)
Security Scan
VirusTotal
Suspicious
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The skill's stated purpose is scraping Zoomin-powered docs using Playwright, which matches the inclusion of Playwright-based scraper code. However, the code is heavily tailored to Zerto (default filenames and directories reference zerto_hyperv, sanitization removes 'help_zerto_com' prefixes, and regex cleans up 'From the Zerto User Interface'), indicating the package was repurposed from a Zerto-specific scraper. The example CLI in SKILL.md (named parameters) does not match run_scraper.sh (positional args). These are coherence issues (likely sloppy repackaging) but do not by themselves indicate extra malicious capability.
Instruction Scope
SKILL.md instructs the user to manually install Playwright and to run the provided wrapper script which activates a virtualenv and runs the scraper. The runtime instructions only perform web navigation of user-supplied URLs, extract page content, and write text files to the specified output directory. The scripts do not attempt to read unrelated local files, access extra environment variables, or transmit scraped content to external endpoints other than the pages being scraped. Note: the script will visit arbitrary URLs provided by the user — only supply trusted/allowed targets and be mindful of legal/robots constraints.
Install Mechanism
There is no automated install spec; SKILL.md asks the user to run `pip install playwright` and `playwright install chromium`. That manual step downloads Playwright and browser binaries from upstream, which is expected for Playwright but does involve fetching executables over the network. The skill itself does not perform automatic remote installs or fetch arbitrary remote code.
Credentials
The skill requires no environment variables or credentials and the scripts do not access secrets. The only runtime requirement is a Python virtual environment path to activate. No disproportionate credential requests were found.
Persistence & Privilege
The skill is not always-enabled and does not alter other skills or global agent settings. It writes only to the output directory you provide and prints results to stdout; it does not persist credentials or attempt to install itself persistently.
What to consider before installing
This skill appears to be a straightforward Playwright-based scraper and contains no requests for credentials or hidden network exfiltration. However, before installing or running it:

  • Note the mismatch: SKILL.md claims 'Zoomin' but the code contains many Zerto-specific assumptions (default output names, URL sanitization, and content cleanup). If you expect a generic Zoomin scraper, test on a small set of URLs first.
  • You must run `pip install playwright` and `playwright install chromium` yourself; these commands download browser binaries. Prefer doing this in an isolated virtualenv or sandbox.
  • The wrapper example in SKILL.md uses named parameters, but the script expects positional args; call run_scraper.sh with: ./run_scraper.sh <urls_file> <output_dir> <venv_path>.
  • The scraper will visit arbitrary URLs you supply and write text files to disk. Only provide URLs you are permitted to scrape (observe robots.txt and terms of service) and run the script in a directory where writing files is acceptable.
  • If you want to be extra cautious, review or modify the code to remove or adapt the Zerto-specific patterns, and run the scraper on a controlled test list before bulk use.

Given the repackaging inconsistencies and guidance mismatches, treat this skill as safe-but-suspicious until you've validated it in your environment.

Like a lobster shell, security has layers — review code before you run it.

latest: vk97a4qdmaq3bmasef7mxh6q5cs81ak84
694 downloads
0 stars
3 versions
Updated 10h ago
v1.0.2
MIT-0

Zoomin Scraper Skill

This skill provides a mechanism to robustly scrape content from documentation portals powered by Zoomin Software. It leverages Playwright to launch a headless Chromium browser, execute JavaScript, wait for dynamic content to load, and then extract the rendered text from the main article body.

Usage

To use this skill, you need to provide a file containing a list of URLs, one URL per line. The skill will then process each URL, saving the extracted content to a specified output directory.
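A minimal sketch of how such a URL list might be read, skipping blank lines and `#` comments (the helper name and the comment convention are assumptions, not taken from the skill's code):

```python
from pathlib import Path

def read_url_list(path: str) -> list[str]:
    """Return non-empty, non-comment lines from a URLs file.

    Hypothetical helper; the skill's actual parsing may differ.
    """
    urls = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls
```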

Prerequisites (Manual Setup)

This skill relies on Playwright. Before using this skill for the first time on a new system, you must manually install Playwright and its browser binaries by running the following commands in your terminal:

pip install playwright
playwright install chromium

These commands should be executed within the virtual environment you intend to use for this skill.

Running the Scraper

To run the scraper, you will invoke the run_scraper.sh script, which is located within this skill's scripts/ directory. This wrapper script will activate your specified Python virtual environment before executing the main Python Playwright script.

Parameters for run_scraper.sh (positional, in this order):

  • urls_file: The path to a text file containing the URLs to scrape, one URL per line.
  • output_directory (optional): The directory where the scraped content will be saved. If not provided, it defaults to scraped_docs_output.
  • venv_path: The absolute path to your Python virtual environment (e.g., /home/justin/scraper/.env).
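Per the security scan notes above, the wrapper takes these as positional arguments, activates the virtual environment, and then launches the Python scraper. A minimal sketch of what such a wrapper might look like (the inner script name `scripts/zoomin_scraper.py` is an assumption; consult the skill's actual scripts/ directory):

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction of run_scraper.sh's argument handling.
# The real script ships in this skill's scripts/ directory and may differ.
set -u

URLS_FILE="${1:-}"
OUTPUT_DIR="${2:-scraped_docs_output}"   # default per the docs above
VENV_PATH="${3:-}"

if [ -z "$URLS_FILE" ] || [ -z "$VENV_PATH" ]; then
  echo "Usage: ./run_scraper.sh <urls_file> [output_dir] <venv_path>" >&2
else
  # Activate the caller's virtualenv, then run the scraper (script name assumed).
  . "$VENV_PATH/bin/activate"
  python scripts/zoomin_scraper.py "$URLS_FILE" "$OUTPUT_DIR"
fi
```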

Example:

Assuming your list of URLs is in path/to/urls.txt, you want to save the output to my_scraped_docs/, and your virtual environment is at path/to/my_venv:

./run_scraper.sh path/to/urls.txt my_scraped_docs path/to/my_venv

The script will launch a headless Chromium browser, navigate to each URL, wait for the main content to load (specifically targeting <article id="zDocsContent">), and then save the extracted text. It includes a user agent to mimic a regular browser and a small delay between requests to be polite to the server.
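The flow described above can be sketched as follows. This is an illustration, not the skill's actual code: the `article#zDocsContent` selector comes from the description above, but the function names, the filename sanitization, and the user agent string are assumptions. The Playwright import is deferred so the filename helper can be used without a browser install.

```python
import re
import time
from pathlib import Path

def url_to_filename(url: str) -> str:
    """Turn a URL into a safe output filename (hypothetical sanitization)."""
    name = re.sub(r"^https?://", "", url)
    name = re.sub(r"[^A-Za-z0-9._-]+", "_", name).strip("_")
    return name + ".txt"

def scrape_urls(urls, output_dir="scraped_docs_output", delay_seconds=1.0):
    """Fetch each URL with headless Chromium and save the article text."""
    # Deferred import: requires `pip install playwright` and `playwright install chromium`.
    from playwright.sync_api import sync_playwright

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page(user_agent="Mozilla/5.0 (X11; Linux x86_64)")
        for url in urls:
            page.goto(url, wait_until="networkidle")
            # Zoomin portals render the article body into this element.
            article = page.wait_for_selector("article#zDocsContent")
            (out / url_to_filename(url)).write_text(article.inner_text())
            time.sleep(delay_seconds)  # small delay to be polite to the server
        browser.close()
```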
