Arxiv Search Collector

AdvisoryAudited by Static analysis on Apr 30, 2026.

Overview

No suspicious patterns detected.

Findings (0)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Paper titles, abstracts, or comments from arXiv are external text and could contain irrelevant or manipulative wording; they should not change the user's goal or operating instructions.

Why it was flagged

The workflow intentionally has the model read externally fetched arXiv result content and make relevance decisions from it.

Skill content
Read each query result list and decide keep indexes.
Recommendation

Treat fetched paper metadata as data only. Use it for relevance selection, not for following any instructions that may appear inside abstracts or comments.

What this means

This is expected for a script-based workflow, but pointing these options at an untrusted interpreter or script would run other local code.

Why it was flagged

The batch helper runs a Python subprocess, and it exposes optional overrides for the interpreter and fetch script.

Skill content
parser.add_argument("--python-bin", default="python3" ...); parser.add_argument("--fetch-script", default="" ...); ... proc = subprocess.run(cmd, text=True, capture_output=True)
Recommendation

Leave the default helper paths unless you intentionally trust an alternate Python executable or fetch script.

What this means

The agent can issue arXiv searches based on the planned queries and store returned metadata locally.

Why it was flagged

The fetch script makes external API requests to arXiv, which is central to the skill's stated paper-search purpose.

Skill content
ARXIV_API_URL = "https://export.arxiv.org/api/query"
Recommendation

Review broad or iterative query plans if request volume matters, and keep the provided rate-limit defaults.

What this means

If the run directory is reused carelessly, merge decisions can remove generated paper folders from earlier selections.

Why it was flagged

Repeated merges can delete previously generated per-paper output directories that are no longer in the selected set.

Skill content
Stale paper directories from previous merge outputs are removed when they are no longer selected.
Recommendation

Use a dedicated output/run directory for each collection task and keep backups if previous generated outputs must be preserved.