ResearchClaw

Security checks across malware telemetry and agentic risk

Overview

This skill's behavior matches its stated purpose of autonomous research, but it instructs the agent to auto-run a pipeline that can execute generated code locally or over SSH without adequate scoping or review controls.

Review this before installing. Use only with a trusted, pinned ResearchClaw implementation. Start in simulated mode or with manual gate approvals, avoid --auto-approve, inspect generated code before execution, protect the LLM API key, and do not allow SSH remote execution unless the remote account, files, network access, and spending limits are tightly scoped.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Findings (6)

Vague Triggers

Medium
Confidence
93% confidence
Finding
The trigger conditions are broad enough to activate on ordinary research-help requests, which can unexpectedly route users into a highly autonomous workflow. In this skill, that matters because the pipeline can proceed to code generation, local subprocess execution, and remote SSH execution, so accidental invocation materially increases risk.

Missing User Warnings

High
Confidence
97% confidence
Finding
The skill description markets 'No babysitting required' and describes end-to-end autonomy without clearly warning that it may execute generated code locally or on a remote server. Missing disclosure reduces informed consent and makes unsafe execution more likely, particularly when combined with auto-approval and broad triggers.

Autonomous Decision Making

Medium
Category
Excessive Agency
Content
run_id="research-001",
    config=config,
    adapters=AdapterBundle(),
    auto_approve_gates=True,
)

# Check results
Confidence
91% confidence
Finding
auto_approve

Autonomous Decision Making

Medium
Category
Excessive Agency
Content
- **Config validation error**: Run `researchclaw validate --config config.yaml`
- **LLM connection failure**: Check `llm.base_url` and API key
- **Sandbox execution failure**: Verify `experiment.sandbox.python_path` exists and has numpy installed
- **Gate rejection**: Use `--auto-approve` or manually approve at stages 5, 9, 20

## Tools Required
Confidence
88% confidence
Finding
auto-approve

Autonomous Decision Making

Medium
Category
Excessive Agency
Content
- **Config validation error**: Run `researchclaw validate --config config.yaml`
- **LLM connection failure**: Check `llm.base_url` and API key
- **Sandbox execution failure**: Verify `experiment.sandbox.python_path` exists and has numpy installed
- **Gate rejection**: Use `--auto-approve` or manually approve at stages 5, 9, 20

## Tools Required
Confidence
88% confidence
Finding
--auto-approve
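
As an illustration of the manual alternative to `--auto-approve`, the sketch below shows one way a human approval step could be enforced at the gate stages quoted above (5, 9, 20). The names `GATE_STAGES` and `approve_gate` are hypothetical and are not part of ResearchClaw; how the real pipeline exposes its gates is not shown in the scanned content, so treat this as a sketch of the idea, not of the tool's API.

```python
# Hypothetical helper; not part of ResearchClaw.
GATE_STAGES = (5, 9, 20)  # gate stages named in the skill's troubleshooting notes


def approve_gate(stage: int, ask=input) -> bool:
    """Ask a human to approve a gate.

    Anything other than an explicit 'yes' counts as a rejection, so an
    empty or mistyped answer never lets the pipeline continue.
    """
    if stage not in GATE_STAGES:
        raise ValueError(f"stage {stage} is not a known gate")
    answer = ask(f"Approve gate at stage {stage}? Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"
```

Defaulting to rejection keeps a person in the loop before any stage that could lead to code execution. The `ask` parameter only exists so the prompt can be replaced in tests.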

Unvalidated Output Injection

High
Category
Output Handling
Content
| Mode | Description | Config |
|------|-------------|--------|
| `simulated` | LLM generates synthetic results (no code execution) | `experiment.mode: simulated` |
| `sandbox` | Execute generated code locally via subprocess | `experiment.mode: sandbox` |
| `ssh_remote` | Execute on remote GPU server via SSH | `experiment.mode: ssh_remote` |

### Troubleshooting
Confidence
98% confidence
Finding
Execute generated code
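
A pre-flight check along the following lines could surface this risk before a run starts. It is a minimal sketch that assumes the configuration has already been parsed into a Python dict (for example from `config.yaml`); the mode names and the `experiment.mode` key come from the table above, while `EXECUTING_MODES` and `audit_experiment_mode` are hypothetical names, not part of ResearchClaw.

```python
# Hypothetical pre-flight check; not part of ResearchClaw.
# Modes that, per the table above, run generated code.
EXECUTING_MODES = {"sandbox", "ssh_remote"}


def audit_experiment_mode(config: dict) -> list[str]:
    """Return human-readable warnings for a parsed config dict."""
    warnings = []
    # Tolerate a missing or empty `experiment` section.
    mode = (config.get("experiment") or {}).get("mode")
    if mode is None:
        warnings.append("experiment.mode is not set; the effective default is unknown")
    elif mode in EXECUTING_MODES:
        if mode == "ssh_remote":
            where = "on a remote server over SSH"
        else:
            where = "locally via subprocess"
        warnings.append(f"experiment.mode={mode} executes generated code {where}")
    elif mode != "simulated":
        warnings.append(f"experiment.mode={mode} is not a documented mode")
    return warnings
```

An empty list means the config selects `simulated`, the only listed mode that does not execute code. Any warning is a cue to inspect the generated code and the execution environment before continuing.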

VirusTotal

59/59 vendors flagged this skill as clean.
