Autoresearch Agent

Security checks across malware telemetry and agentic risk

Overview

This skill is a coherent autonomous optimization tool, but it can repeatedly edit and commit repository changes, run user-supplied shell commands, create recurring jobs, and hard-reset git state without enough safety gating.

Review before installing. Use only on a clean, dedicated branch or disposable copy, inspect config.cfg and evaluate_cmd before running, avoid sensitive files with LLM judge evaluators, and confirm you know how to stop any recurring loop. Do not run it in repositories with uncommitted work or private content you are not willing to expose to configured evaluation tools.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
Findings (36)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
# Build if needed
if "BUILD_CMD" in dir() or "BUILD_CMD" in globals():
    result = subprocess.run(BUILD_CMD, shell=True, capture_output=True)
    if result.returncode != 0:
        print(f"Build failed: {result.stderr.decode()[:200]}", file=sys.stderr)
        sys.exit(1)
Confidence
93% confidence
Finding
result = subprocess.run(BUILD_CMD, shell=True, capture_output=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
if "DOCKER_IMAGE" in dir() or "DOCKER_IMAGE" in globals():
    if "DOCKER_BUILD_CMD" in dir():
        subprocess.run(DOCKER_BUILD_CMD, shell=True, capture_output=True)
    result = subprocess.run(
        f"docker image inspect {DOCKER_IMAGE} --format '{{{{.Size}}}}'",
        shell=True, capture_output=True, text=True
    )
Confidence
95% confidence
Finding
result = subprocess.run( f"docker image inspect {DOCKER_IMAGE} --format '{{{{.Size}}}}'", shell=True, capture_output=True, text=True )
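
As a hedged illustration of how the interpolated DOCKER_IMAGE value could be kept out of a shell string, the sketch below builds an argument list instead. The helper name docker_size_argv and the validation pattern are assumptions made for this example; the pattern is deliberately conservative and is not Docker's full image-reference grammar.

```python
import re

# Conservative pattern for illustration only; it is not Docker's full
# image-reference grammar.
_IMAGE_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._/:@-]*")


def docker_size_argv(image):
    """Build an argv list for `docker image inspect`, rejecting odd names."""
    if not _IMAGE_RE.fullmatch(image):
        raise ValueError(f"unexpected image reference: {image!r}")
    # No shell is involved, so the Go template needs no extra quoting.
    return ["docker", "image", "inspect", image, "--format", "{{.Size}}"]
```

The returned list could be passed to subprocess.run without shell=True, so a value such as "img; rm -rf /" is rejected before any process starts.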

subprocess module call

Medium
Category
Dangerous Code Execution
Content
# Measure
if "DOCKER_IMAGE" in dir() or "DOCKER_IMAGE" in globals():
    if "DOCKER_BUILD_CMD" in dir():
        subprocess.run(DOCKER_BUILD_CMD, shell=True, capture_output=True)
    result = subprocess.run(
        f"docker image inspect {DOCKER_IMAGE} --format '{{{{.Size}}}}'",
        shell=True, capture_output=True, text=True
Confidence
92% confidence
Finding
subprocess.run(DOCKER_BUILD_CMD, shell=True, capture_output=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
subprocess.run(CLEAN_CMD, shell=True, capture_output=True, timeout=60)

    t0 = time.perf_counter()
    result = subprocess.run(BUILD_CMD, shell=True, capture_output=True, timeout=600)
    elapsed = time.perf_counter() - t0

    if result.returncode != 0:
Confidence
94% confidence
Finding
result = subprocess.run(BUILD_CMD, shell=True, capture_output=True, timeout=600)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
for i in range(RUNS):
    # Clean if configured
    if CLEAN_CMD:
        subprocess.run(CLEAN_CMD, shell=True, capture_output=True, timeout=60)

    t0 = time.perf_counter()
    result = subprocess.run(BUILD_CMD, shell=True, capture_output=True, timeout=600)
Confidence
92% confidence
Finding
subprocess.run(CLEAN_CMD, shell=True, capture_output=True, timeout=60)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
if system == "Linux":
    # Use /usr/bin/time for peak RSS
    result = subprocess.run(
        f"/usr/bin/time -v {COMMAND}",
        shell=True, capture_output=True, text=True, timeout=300
    )
Confidence
98% confidence
Finding
result = subprocess.run( f"/usr/bin/time -v {COMMAND}", shell=True, capture_output=True, text=True, timeout=300 )

subprocess module call

Medium
Category
Dangerous Code Execution
Content
elif system == "Darwin":
    # macOS: use /usr/bin/time -l
    result = subprocess.run(
        f"/usr/bin/time -l {COMMAND}",
        shell=True, capture_output=True, text=True, timeout=300
    )
Confidence
98% confidence
Finding
result = subprocess.run( f"/usr/bin/time -l {COMMAND}", shell=True, capture_output=True, text=True, timeout=300 )

subprocess module call

Medium
Category
Dangerous Code Execution
Content
TEST_CMD = "pytest tests/ --tb=no -q"  # Test command
# --- END CONFIG ---

result = subprocess.run(TEST_CMD, shell=True, capture_output=True, text=True, timeout=300)
output = result.stdout + "\n" + result.stderr

# Try to parse pytest output: "X passed, Y failed, Z errors"
Confidence
95% confidence
Finding
result = subprocess.run(TEST_CMD, shell=True, capture_output=True, text=True, timeout=300)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
t0 = time.time()
    try:
        with open(log_file, "w") as lf:
            result = subprocess.run(
                eval_cmd, shell=True,
                stdout=lf, stderr=subprocess.STDOUT,
                cwd=str(project_root),
Confidence
99% confidence
Finding
result = subprocess.run( eval_cmd, shell=True, stdout=lf, stderr=subprocess.STDOUT, cwd=str(project_root), timeout=hard_limi

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def run_cmd(cmd, cwd=None, timeout=None):
    """Run shell command, return (returncode, stdout, stderr)."""
    result = subprocess.run(
        cmd, shell=True, capture_output=True, text=True,
        cwd=cwd, timeout=timeout
    )
Confidence
98% confidence
Finding
result = subprocess.run( cmd, shell=True, capture_output=True, text=True, cwd=cwd, timeout=timeout )
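
A minimal sketch of a shell-free variant of such a helper, assuming the configured commands are plain argv-style strings with no pipes, redirects, or globbing. The name run_argv is hypothetical and does not appear in the scanned skill.

```python
import shlex
import subprocess


def run_argv(cmd, cwd=None, timeout=None):
    """Run a command without a shell; return (returncode, stdout, stderr).

    Hypothetical alternative to run_cmd: a string is split into an
    argument list, so shell metacharacters are passed as literal text.
    """
    argv = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    result = subprocess.run(
        argv, capture_output=True, text=True, cwd=cwd, timeout=timeout
    )
    return result.returncode, result.stdout, result.stderr
```

Commands that genuinely depend on shell features would not work through this helper, which is the trade-off for removing the injection surface.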

Lp3

Medium
Category
MCP Least Privilege
Confidence
96% confidence
Finding
The skill clearly instructs the agent to read and modify files, execute shell commands, and perform git operations, yet it declares no permissions or safety boundaries. This mismatch is dangerous because users and hosting platforms cannot accurately assess or constrain what the skill is allowed to do before it performs repository-altering actions.

Intent-Code Divergence

Medium
Confidence
96% confidence
Finding
The skill's constraints are internally inconsistent: earlier steps require reading config, strategy, history, and git metadata, while the later rule says to never read or modify files outside the target file and program.md. In an autonomous agent, contradictory authority boundaries can cause unsafe scope expansion or unreliable policy enforcement, making it easier for the agent to justify reading additional files beyond what the operator expects.
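
One way to make such a boundary unambiguous is an explicit allowlist that is checked before any file is read or written. The sketch below is an assumption about how that might look, not the skill's actual behavior; is_allowed and the file names in the usage note are hypothetical.

```python
from pathlib import Path


def is_allowed(path, project_root, allowed):
    """Return True if path resolves to one of the allowlisted files."""
    root = Path(project_root).resolve()
    # Resolve relative to the project root so "../" segments cannot escape.
    target = (root / path).resolve()
    allowed_paths = {(root / name).resolve() for name in allowed}
    return target in allowed_paths
```

For example, with allowed = ["program.md", "target.py"], the path "sub/../program.md" is accepted while "../etc/passwd" is rejected.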

Intent-Code Divergence

Medium
Confidence
97% confidence
Finding
The instruction to 'edit only the target file' conflicts with the later requirement to update program.md every 10th experiment. That contradiction weakens change-control guarantees and can let the agent rationalize modifying additional files, which is especially risky in a looped autonomous system that already performs commits automatically.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The skill explicitly instructs the agent to edit a target file, commit it, and iterate, but it does not warn the user that repository contents will be modified and committed automatically. In a tool designed for autonomous optimization loops, this omission is meaningful because users may trigger destructive or hard-to-audit changes without fully understanding that the agent will persist modifications to git history.

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The skill documents creation of a recurring autonomous job without warning about continued background execution, repeated file modifications, and repeated commits over time. This is more dangerous in context because the skill is explicitly designed to run indefinitely and optimize by making code changes, so a recurring job can amplify damage, consume resources, or introduce unnoticed repository drift long after the user stops actively supervising it.
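
A recurring loop of this kind could be bounded by a simple budget on run count and wall-clock time. The class below is a hypothetical sketch (LoopBudget is not part of the scanned skill) showing one way to express such a cap.

```python
import time


class LoopBudget:
    """Hypothetical cap on experiment count and wall-clock time."""

    def __init__(self, max_runs, max_seconds, clock=time.monotonic):
        self.max_runs = max_runs
        self.max_seconds = max_seconds
        self._clock = clock
        self._start = clock()
        self.runs = 0

    def allow_next(self):
        """Return True and count the run if both limits still hold."""
        if self.runs >= self.max_runs:
            return False
        if self._clock() - self._start >= self.max_seconds:
            return False
        self.runs += 1
        return True
```

A loop would call allow_next() before each experiment and stop as soon as it returns False; the clock parameter exists only so the time limit can be exercised without waiting.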

Vague Triggers

Medium
Confidence
88% confidence
Finding
The activation rules are broad enough to match many common requests such as 'make this better' or 'improve my prompts,' which can trigger an autonomous loop in contexts where the user did not intend repository edits or repeated command execution. In this skill, overly broad invocation is more dangerous because activation leads directly to file modification, shell execution, and potentially indefinite experimentation.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The skill advertises indefinite autonomous editing, committing, and discarding changes, including hard resets, without an upfront user-facing warning about destructive repository effects. This is risky because users may invoke it expecting optimization help but not realize it can rewrite history, lose uncommitted work, or keep making changes unattended for long periods.

Missing User Warnings

Medium
Confidence
99% confidence
Finding
The skill instructs the agent to stage files, create git commits, and execute an evaluation script, but does not present a user-facing warning or consent boundary for repository modification and code execution. In this skill's context, that is more dangerous because the agent is designed for repeated autonomous experimentation, so these actions are not incidental; they are the core loop and can continuously alter the repo and run potentially unsafe commands.

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The script reads a local file and sends its full contents to an external LLM CLI without any explicit user disclosure, consent check, or data-classification guard. In an autonomous optimization loop, this creates a real risk of unintended exfiltration of sensitive or proprietary content to a third-party service, especially if the target file is changed or repurposed.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
The evaluator reads the full target file and sends it verbatim to an external LLM CLI, which can disclose proprietary, sensitive, or regulated content to a third-party model or service. In the context of an autonomous optimization loop, this may happen repeatedly and without clear operator awareness, increasing the chance of unintended data exfiltration.

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The evaluator sends the full prompt, each test input, expected output, and generated model output to an external CLI tool without any warning, consent flow, or sensitivity checks. In an autonomous optimization loop, this can repeatedly exfiltrate proprietary prompts, test corpora, secrets embedded in cases, or sensitive outputs to a third-party model provider.
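
A sensitivity check of the kind this finding describes as missing might, in its simplest form, scan the text for secret-like patterns before anything is handed to an external tool. The sketch below is illustrative only: find_sensitive and the three patterns are assumptions for this example, and a real rule set would need to be much broader.

```python
import re

# Illustrative patterns only; a real deployment would need a broader,
# maintained rule set.
SECRET_PATTERNS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "AWS-style access key id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "key/token assignment": re.compile(
        r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+"
    ),
}


def find_sensitive(text):
    """Return the labels of every pattern that matches the text."""
    return [label for label, rx in SECRET_PATTERNS.items() if rx.search(text)]
```

An evaluator could refuse to send a file, or ask for confirmation, whenever the returned list is non-empty.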

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The marketing/content examples encourage LLM-based evaluation of user-authored materials such as headlines, email subjects, social posts, articles, prompts, and skills, but provide no warning that these workflows may send proprietary, personal, or sensitive text to external model providers. In this skill’s context, the agent is designed for autonomous repeated experimentation, which increases the chance that users will bulk-submit private materials without noticing the data handling risk.

Missing User Warnings

Medium
Confidence
84% confidence
Finding
The template explicitly recommends adding chain-of-thought instructions during prompt optimization, which can lead skill authors to request or normalize disclosure of internal reasoning. In an agent skill context, this is risky because it may encourage prompts that expose hidden reasoning traces, sensitive intermediate analysis, or policy-violating rationale instead of asking for concise answers or brief justifications.

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The script performs automatic git rollback operations such as reset --hard HEAD~1 and checkout -- . without user confirmation or safety checks. In this skill's autonomous optimization loop, that can destroy local uncommitted work or remove commits unexpectedly, especially if run in the wrong repository or against a dirty working tree.
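
A safety check before the rollback could be as small as refusing to reset when the working tree is dirty. The sketch below assumes git is on PATH; safe_rollback and working_tree_is_clean are hypothetical names, not functions from the scanned script.

```python
import subprocess


def working_tree_is_clean(porcelain_output):
    """`git status --porcelain` prints one line per changed or untracked path."""
    return porcelain_output.strip() == ""


def safe_rollback(repo_dir):
    """Hypothetical guard: refuse `git reset --hard HEAD~1` on a dirty tree."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    if not working_tree_is_clean(status.stdout):
        raise RuntimeError("uncommitted changes present; refusing to reset")
    subprocess.run(["git", "reset", "--hard", "HEAD~1"], cwd=repo_dir, check=True)
```

This does not protect the commit being removed, so it addresses only the loss of uncommitted work, not the case of running in the wrong repository.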

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The script executes an arbitrary evaluation command during setup without a strong upfront safety warning, so users may run untrusted commands without realizing it. In an autonomous experimentation skill, silent command execution is more dangerous because users may treat evaluator configuration as data rather than as code to be executed.
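
An upfront warning could take the form of a confirmation gate that prints the exact command and proceeds only on an explicit "yes". This is a sketch under that assumption; confirm_command is a hypothetical helper, and the ask parameter exists so the prompt can be replaced in non-interactive use.

```python
def confirm_command(cmd, ask=input):
    """Show the exact command and require an explicit 'yes' before it runs."""
    answer = ask(f"About to execute:\n  {cmd}\nType 'yes' to continue: ")
    return answer.strip().lower() == "yes"
```

Any other answer, including an empty one, is treated as a refusal, so the default is not to execute.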

VirusTotal

VirusTotal findings are pending for this skill version.
