Tavily Crawl
Security checks across static analysis, malware telemetry, and agentic risk
Overview
The crawler mostly matches its stated purpose, but first-run login can automatically run an unpinned npm helper and the skill uses local Tavily credentials, so it should be reviewed before use.
Install only if you are comfortable with the first-run OAuth flow running `npx -y mcp-remote`; otherwise wait until the maintainer pins and documents that helper. Use a dedicated or scoped Tavily credential when possible, and constrain crawls with depth, limits, domain/path filters, and `allow_external: false` for narrow tasks.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
First-time authentication may run code fetched from npm as the local user, which is dangerous if the package or its resolution path is compromised.
When credentials are missing, the script launches an unpinned npm package through npx with the `-y` flag, which skips the install confirmation prompt. Neither the helper nor its Node/npx dependency is declared in the install metadata, which leaves a supply-chain review gap.
npx -y mcp-remote https://mcp.tavily.com/mcp </dev/null >/dev/null 2>&1 &
Pin the helper package/version, declare the Node/npx dependency, and require an explicit user approval step before running downloaded helper code; ideally vendor or document the reviewed helper.
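As a rough illustration of that recommendation, the sketch below refuses to run the helper unless an exact version has been supplied and the user has typed an explicit approval. The `MCP_REMOTE_VERSION` variable and both function names are hypothetical and do not appear in the reviewed skill; the version itself is deliberately left for the reviewer to choose.

```shell
# Hypothetical sketch: MCP_REMOTE_VERSION and both function names are
# assumptions for illustration, not part of the reviewed skill.

# Print the pinned helper spec, or fail if no exact version was supplied.
pinned_helper_spec() {
  local version="${MCP_REMOTE_VERSION:-}"
  # Accept only an exact x.y.z version; reject empty values, ranges,
  # and floating tags such as "latest".
  if ! printf '%s\n' "$version" | grep -Eqx '[0-9]+\.[0-9]+\.[0-9]+'; then
    echo "error: set MCP_REMOTE_VERSION to an exact, reviewed version (x.y.z)" >&2
    return 1
  fi
  printf 'mcp-remote@%s\n' "$version"
}

# Ask for explicit approval before any downloaded helper code runs.
run_helper_with_approval() {
  local spec answer
  spec="$(pinned_helper_spec)" || return 1
  printf 'About to run: npx %s https://mcp.tavily.com/mcp\nProceed? [y/N] ' "$spec" >&2
  read -r answer || answer=""
  case "$answer" in
    y|Y) ;;
    *) echo "aborted: helper was not run" >&2; return 1 ;;
  esac
  # The -y flag is omitted on purpose so npx does not auto-confirm installs.
  npx "$spec" https://mcp.tavily.com/mcp
}
```

Unlike the original one-liner, this version runs in the foreground with visible output, so the user can see what the helper does during login.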
The skill can use an existing Tavily session or API key to act against the Tavily MCP service on the user's behalf.
The script searches the local MCP auth cache for access tokens. It validates Tavily issuer and expiry before use, so this is purpose-aligned, but it is still local account credential access.
MCP_AUTH_DIR="$HOME/.mcp-auth" ... token=$(jq -r '.access_token // empty' "$token_file" 2>/dev/null)
Use only with a Tavily account you trust for this purpose, avoid sharing the token cache, and prefer explicitly setting a scoped API key if available.
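One way to check that the token cache is not shared with other local users is to confirm that the directory grants no group or other permissions. The sketch below is a minimal example under that assumption; the function name `cache_is_private` is hypothetical, and the default path is the `$HOME/.mcp-auth` directory quoted in the evidence above.

```shell
# Hypothetical helper: succeeds only if the cache directory is accessible
# to its owner alone (no group or other permission bits set).
cache_is_private() {
  local dir="${1:-$HOME/.mcp-auth}"
  local group_other
  if [ ! -d "$dir" ]; then
    echo "no cache directory at $dir" >&2
    return 2
  fi
  # Characters 5-10 of the mode string (e.g. drwx------) are the
  # group and other permission bits.
  group_other="$(ls -ld -- "$dir" | cut -c5-10)"
  [ "$group_other" = "------" ]
}
```

If the check fails, `chmod 700` on the directory restricts it to the owner.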
If crawled pages are later given to an agent as context, malicious or misleading page text could influence the agent's reasoning.
The skill is intended to retrieve website content for LLM context. Web pages are untrusted and may contain prompt-injection text, even though this retrieval is central to the skill's purpose.
For agentic use (feeding results into context): Always use `instructions` + `chunks_per_source`.
Treat crawled content as untrusted source material, keep citations, and do not let instructions inside retrieved pages override the user's task or security rules.
A crawl may fetch more sites or pages than expected if the user does not set domain/path filters and limits.
The documented default for `allow_external` is `true`, so a crawl can follow links beyond the starting domain. This is disclosed and related to crawling, but it can broaden collection beyond what a user intended.
| `allow_external` | boolean | true | Include external domain links |
Start with low `max_depth` and `limit`, set `allow_external: false` when appropriate, and use `select_paths` or domain filters for focused crawls.
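A conservative starting set of parameters might look like the sketch below, which only prints a JSON parameter object and does not call any service. The parameter names `max_depth`, `limit`, `allow_external`, and `select_paths` come from the review text; the `url` field, the specific values, the pattern syntax for `select_paths`, and the function name are assumptions to be checked against the skill's own documentation.

```shell
# Hypothetical helper: prints narrow crawl parameters as JSON.
# The `url` field name and the chosen values are assumptions.
narrow_crawl_params() {
  local start_url="$1"
  local path_pattern="$2"
  # printf does not JSON-escape its arguments, so refuse values
  # containing double quotes or backslashes.
  case "$start_url$path_pattern" in
    *'"'*|*'\'*)
      echo "error: quotes and backslashes are not supported" >&2
      return 1
      ;;
  esac
  printf '{"url": "%s", "max_depth": 1, "limit": 20, "allow_external": false, "select_paths": ["%s"]}\n' \
    "$start_url" "$path_pattern"
}
```

For example, `narrow_crawl_params 'https://example.com/docs' '/docs/.*'` prints a single-line JSON object restricted to one level of depth, twenty pages, and the starting domain. Raise the limits only after the first results look as expected.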
