ResearchClaw

Security checks across malware telemetry and agentic risk

Overview

This skill is purpose-aligned for autonomous research, but it asks the agent to auto-run a pipeline that can execute generated code locally or over SSH without enough scoping or review controls.

Review this before installing. Use only with a trusted, pinned ResearchClaw implementation. Start in simulated mode or with manual gate approvals, avoid --auto-approve, inspect generated code before execution, protect the LLM API key, and do not allow SSH remote execution unless the remote account, files, network access, and spending limits are tightly scoped.
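The advice above can be turned into a small pre-flight check that runs before the pipeline starts. This is a minimal sketch, not part of ResearchClaw: `preflight_warnings` is a hypothetical helper, and it assumes the dotted key `experiment.mode` from the skill's documentation corresponds to nested dictionaries in the loaded config.

```python
from typing import Any

# Modes named in the skill's documentation; only "simulated" runs no code.
CODE_EXECUTING_MODES = {"sandbox", "ssh_remote"}


def preflight_warnings(config: dict[str, Any], auto_approve: bool) -> list[str]:
    """Return human-readable warnings for risky settings (hypothetical helper)."""
    warnings: list[str] = []
    # Assumption: `experiment.mode` is stored as config["experiment"]["mode"].
    mode = config.get("experiment", {}).get("mode")
    if mode is None:
        warnings.append("experiment.mode is not set; confirm the default before running")
    elif mode in CODE_EXECUTING_MODES:
        warnings.append(f"experiment.mode={mode} executes generated code")
    if mode == "ssh_remote":
        warnings.append("ssh_remote runs on a remote server; scope the account, files, network, and spend")
    if auto_approve:
        warnings.append("auto-approve skips the manual gate approvals")
    return warnings


if __name__ == "__main__":
    for line in preflight_warnings({"experiment": {"mode": "ssh_remote"}}, auto_approve=True):
        print("WARNING:", line)
```

A caller might refuse to start the run whenever the returned list is non-empty, or require the user to acknowledge each warning first.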

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (6)

Vague Triggers

Medium

Confidence: 93%
Finding: The trigger conditions are broad enough to activate on ordinary research-help requests, which can unexpectedly route users into a highly autonomous workflow. In this skill, that matters because the pipeline can proceed to code generation, local subprocess execution, and remote SSH execution, so accidental invocation materially increases risk.

Missing User Warnings

High

Confidence: 97%
Finding: The skill description markets 'No babysitting required' and describes end-to-end autonomy without clearly warning that it may execute generated code locally or on a remote server. Missing disclosure reduces informed consent and makes unsafe execution more likely, particularly when combined with auto-approval and broad triggers.

Autonomous Decision Making

Medium

Category: Excessive Agency
Content:
    run_id="research-001",
    config=config,
    adapters=AdapterBundle(),
    auto_approve_gates=True,
)
# Check results
Confidence: 91%
Finding: auto_approve

Autonomous Decision Making

Medium

Category: Excessive Agency
Content:
- **Config validation error**: Run `researchclaw validate --config config.yaml`
- **LLM connection failure**: Check `llm.base_url` and API key
- **Sandbox execution failure**: Verify `experiment.sandbox.python_path` exists and has numpy installed
- **Gate rejection**: Use `--auto-approve` or manually approve at stages 5, 9, 20
## Tools Required
Confidence: 88%
Finding: auto-approve

Autonomous Decision Making

Medium

Category: Excessive Agency
Content:
- **Config validation error**: Run `researchclaw validate --config config.yaml`
- **LLM connection failure**: Check `llm.base_url` and API key
- **Sandbox execution failure**: Verify `experiment.sandbox.python_path` exists and has numpy installed
- **Gate rejection**: Use `--auto-approve` or manually approve at stages 5, 9, 20
## Tools Required
Confidence: 88%
Finding: --auto-approve

Unvalidated Output Injection

High

Category: Output Handling
Content:
| Mode | Description | Config |
|------|-------------|--------|
| `simulated` | LLM generates synthetic results (no code execution) | `experiment.mode: simulated` |
| `sandbox` | Execute generated code locally via subprocess | `experiment.mode: sandbox` |
| `ssh_remote` | Execute on remote GPU server via SSH | `experiment.mode: ssh_remote` |
### Troubleshooting
Confidence: 98%
Finding: Execute generated code
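One way to reduce the risk in this finding is to put an explicit review step between code generation and execution. The sketch below is an illustration under stated assumptions, not ResearchClaw's implementation: `run_if_approved` and its `approve` callback are hypothetical names, and running in a temporary directory with Python's `-I` flag limits accidental interference but is not a security sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_if_approved(
    code: str,
    approve: Callable[[str], bool],
    timeout: float = 30.0,
) -> Optional[subprocess.CompletedProcess]:
    """Run generated code only after `approve(code)` returns True.

    Returns None when the reviewer rejects the code (hypothetical helper).
    """
    if not approve(code):
        return None
    # Use a throwaway working directory so the script does not start
    # inside the user's project tree. This is NOT isolation from the host.
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "generated.py"
        script.write_text(code, encoding="utf-8")
        return subprocess.run(
            [sys.executable, "-I", str(script)],  # -I: isolated mode
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout,
        )


def ask_on_terminal(code: str) -> bool:
    """Show the code and require the reviewer to type 'yes'."""
    print(code)
    return input("Run this code? Type 'yes' to approve: ").strip() == "yes"
```

Calling `run_if_approved(generated_source, ask_on_terminal)` keeps a human in the loop for every execution; stronger containment (a container, a restricted user account, or no network access) would still be needed for untrusted code.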

VirusTotal

59/59 vendors flagged this skill as clean.
