Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Parallel Extract

v1.0.0

URL content extraction via Parallel API. Extracts clean markdown from webpages, articles, PDFs, and JS-heavy sites. Use for reading specific URLs with LLM-ready output.

0 stars · 1.8k downloads · 2 current · 2 all-time
Security Scan
VirusTotal
Suspicious
View report →
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The name and description match the instructions: the skill documents using parallel-cli to extract web/PDF content and produce LLM-ready markdown. That capability legitimately requires an API key and a client CLI, so the high-level purpose is coherent. However, the registry metadata declares no required environment variables or primary credential, while the SKILL.md explicitly instructs setting PARALLEL_API_KEY, which is an inconsistency.
Instruction Scope
The instructions direct the agent/operator to run parallel-cli to fetch and persist extracted content, and to spawn sub-agents (sessions_spawn) that read files from /tmp. They also instruct saving full extracted content and preserving verbatim text. These behaviors are within the stated purpose, but the skill additionally tells the user to run a remote installer and export an API key, and the SKILL.md gives the agent permission to write files in /tmp and to spawn sessions, which increases the practical surface for data leakage if the key or the files are mishandled.
Install Mechanism
The Quickstart/prerequisites recommend running curl -fsSL https://parallel.ai/install.sh | bash — piping a remote script to bash is a high-risk install pattern. While the domain is the same as the claimed vendor (parallel.ai), downloading and executing an unchecked installer from the network is an elevated risk and should be audited before running.
Credentials
SKILL.md requires a PARALLEL_API_KEY to call the Parallel API, but the registry metadata did not declare any required env vars or a primary credential. That mismatch is a red flag: the skill will need an API key to function, and the key can grant access to extracted data and the API account — the registry should explicitly declare this and the skill should document required scopes/limits for the key.
Persistence & Privilege
The skill is instruction-only, has no install spec in the registry, and does not set always:true. It does instruct saving outputs to /tmp and spawning sessions, but it does not request persistent platform-level privileges or modification of other skills/configs. No excessive persistence is requested in metadata.
What to consider before installing
Before installing or invoking this skill:

  1. Verify the publisher and the parallel.ai URLs (docs and install.sh) independently; don't run curl | bash without inspecting the script contents.
  2. Expect to provide a PARALLEL_API_KEY. The registry should declare this as a required credential; only grant a key with the minimum scope and monitor its use.
  3. Prefer installing the CLI from a vetted package, or manually review the installer; avoid piping network scripts directly to a shell.
  4. Understand that extracted content may be written to /tmp and used to spawn sub-agents; ensure sensitive URLs or paywalled content are handled according to your privacy policy.
  5. Ask the publisher to update the skill metadata to declare required env vars and to provide a reproducible, auditable install method (package manager or checked installer).

If you cannot inspect the installer or confirm the metadata, treat the skill with caution.

Like a lobster shell, security has layers — review code before you run it.

latest vk97ape3642hf3gxyperkbv7qdn80f9c0
1.8k downloads
0 stars
1 version
Updated 3h ago
v1.0.0
MIT-0

Parallel Extract

Extract clean, LLM-ready content from URLs. Handles webpages, articles, PDFs, and JavaScript-heavy sites that need rendering.

When to Use

Trigger this skill when the user asks for:

  • "read this URL", "fetch this page", "extract from..."
  • "get the content from [URL]"
  • "what does this article say?"
  • Reading PDFs, JS-heavy pages, or paywalled content
  • Getting clean markdown from messy web pages

Use Search to discover; use Extract to read.

Quick Start

parallel-cli extract "https://example.com/article" --json

CLI Reference

Basic Usage

parallel-cli extract "<url>" [options]

Common Flags

| Flag | Description |
| --- | --- |
| --url "<url>" | URL to extract (repeatable, max 10) |
| --objective "<focus>" | Focus extraction on specific content |
| --json | Output as JSON |
| --excerpts / --no-excerpts | Include relevant excerpts (default: on) |
| --full-content / --no-full-content | Include full page content |
| --excerpts-max-chars N | Max chars per excerpt |
| --excerpts-max-total-chars N | Max total excerpt chars |
| --full-max-chars N | Max full content chars |
| -o <file> | Save output to file |

Examples

Basic extraction:

parallel-cli extract "https://example.com/article" --json

Focused extraction:

parallel-cli extract "https://example.com/pricing" \
  --objective "pricing tiers and features" \
  --json

Full content for PDFs:

parallel-cli extract "https://example.com/whitepaper.pdf" \
  --full-content \
  --json

Multiple URLs:

parallel-cli extract \
  --url "https://example.com/page1" \
  --url "https://example.com/page2" \
  --json

Default Workflow

  1. Search with an objective + keyword queries
  2. Inspect titles/URLs/dates; choose the best sources
  3. Extract the specific pages you need (top N URLs)
  4. Answer using the extracted excerpts/content
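Steps 3 and 4 can be wrapped in a small shell helper. This is a minimal sketch: extract_top_urls is a hypothetical name, the objective string is a placeholder to adapt per task, and it assumes parallel-cli is installed with PARALLEL_API_KEY exported. Only flags documented above are used.

```shell
# Sketch of step 3: extract the best URLs chosen during search.
# extract_top_urls is our own helper name, not part of the CLI;
# the --objective text is a placeholder.
extract_top_urls() {
  parallel-cli extract \
    --url "$1" \
    --url "$2" \
    --objective "pricing tiers and features" \
    --json -o /tmp/extract-research.json
}
```

Call it with the two best URLs from step 2, then answer (step 4) from /tmp/extract-research.json.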

Best-Practice Prompting

Objective

When extracting, provide context:

  • What specific information you're looking for
  • Why you need it (helps focus extraction)

Good: --objective "Find the installation steps and system requirements"

Poor: --objective "Read the page"

Response Format

Returns structured JSON with:

  • url — source URL
  • title — page title
  • excerpts[] — relevant text excerpts (if enabled)
  • full_content — complete page content (if enabled)
  • publish_date — when available
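If jq is available, the fields above can be pulled straight out of a saved --json result. A sketch, assuming the file holds one JSON object with the fields listed (the /tmp path is a placeholder):

```shell
# Print the title and excerpt count from a saved extract result.
# Assumes jq is installed and /tmp/extract.json holds one object
# with the title and excerpts[] fields described above.
jq -r '"\(.title): \(.excerpts | length) excerpts"' /tmp/extract.json
```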

Output Handling

When turning extracted content into a user-facing answer:

  • Keep content verbatim — do not paraphrase unnecessarily
  • Extract ALL list items exhaustively
  • Strip noise: nav menus, footers, ads, "click here" links
  • Preserve all facts, names, numbers, dates, quotes
  • Include URL + publish_date for transparency

Running Out of Context?

For long conversations, save results and use sessions_spawn:

parallel-cli extract "<url>" --json -o /tmp/extract-<topic>.json

Then spawn a sub-agent:

{
  "tool": "sessions_spawn",
  "task": "Read /tmp/extract-<topic>.json and summarize the key content.",
  "label": "extract-summary"
}

Error Handling

| Exit Code | Meaning |
| --- | --- |
| 0 | Success |
| 1 | Unexpected error (network, parse) |
| 2 | Invalid arguments |
| 3 | API error (non-2xx) |
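The exit codes above can drive scripting. A small sketch: handle_extract_status is a hypothetical helper name, not part of the CLI, and the messages simply echo the table.

```shell
# Map parallel-cli's documented exit codes to readable messages.
# handle_extract_status is our own helper, not shipped with the CLI.
handle_extract_status() {
  case "$1" in
    0) echo "success" ;;
    1) echo "unexpected error (network, parse)" ;;
    2) echo "invalid arguments" ;;
    3) echo "API error (non-2xx)" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}

# Typical use:
#   parallel-cli extract "https://example.com/article" --json
#   handle_extract_status "$?"
```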

Prerequisites

  1. Get an API key at parallel.ai
  2. Install the CLI:
curl -fsSL https://parallel.ai/install.sh | bash
export PARALLEL_API_KEY=your-key
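If you'd rather not pipe the installer straight into bash, one cautious pattern is to download it, read it, and only then run the inspected copy. A sketch: fetch_installer is a hypothetical helper and the /tmp path is arbitrary.

```shell
# Download the installer to a file so it can be audited before execution.
# fetch_installer is our own helper name; adjust the path as you like.
fetch_installer() {
  curl -fsSL "$1" -o /tmp/parallel-install.sh && \
    echo "saved to /tmp/parallel-install.sh; review it before running"
}

# Usage:
#   fetch_installer https://parallel.ai/install.sh
#   less /tmp/parallel-install.sh      # audit the script
#   sh /tmp/parallel-install.sh        # run only after review
#   export PARALLEL_API_KEY=your-key
```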
