Ralph Mode - Autonomous Development Loops

Security checks across malware telemetry and agentic risk

Overview

Ralph Mode is an autonomous coding workflow that can edit and commit project files. Its behavior is disclosed and consistent with its stated purpose.

Install this only when you want an agent to run a bounded autonomous coding loop. Use a feature branch or sandbox, set max iteration and timeout limits, keep confirmation checkpoints enabled, and review generated plans, progress files, diffs, and commits before pushing or deploying.
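
The limits recommended above can be sketched as a small loop driver. This is only an illustration and not Ralph Mode's actual implementation: `run_bounded_loop`, `step`, `confirm`, and the default values are hypothetical names and numbers chosen for the example.

```python
import time
from typing import Callable


def run_bounded_loop(
    step: Callable[[int], bool],
    max_iterations: int = 10,
    timeout_seconds: float = 600.0,
    confirm: Callable[[int], bool] = lambda i: True,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Run `step` until it reports completion or a limit is hit.

    `step(i)` is a hypothetical callback that performs one iteration and
    returns True when the work is complete. `confirm(i)` is a hypothetical
    checkpoint hook; returning False stops the loop before iteration i.
    Returns "complete", "max_iterations", "timeout", or "stopped_by_user".
    """
    start = clock()
    for i in range(1, max_iterations + 1):
        # Enforce the wall-clock limit before starting another iteration.
        if clock() - start >= timeout_seconds:
            return "timeout"
        # Confirmation checkpoint: a human or policy can halt the loop here.
        if not confirm(i):
            return "stopped_by_user"
        if step(i):
            return "complete"
    return "max_iterations"
```

Every exit path returns an explicit status, so a caller can always tell a finished run from one that hit a limit. The timeout is checked between iterations only, so a single long-running step would still need its own limit.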

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Memory Poisoning: Persistent Context Injection, Context Window Stuffing, Memory Manipulation
  • Rogue Agent: Self-Modification, Session Persistence
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Findings (3)

Vague Triggers

Severity: Medium
Confidence: 93%
Finding:
Because this is a markdown file, vague-trigger review applies. The 'Use Ralph Mode when' list relies on broad phrases like 'building features,' 'working on complex projects,' and 'prefer autonomous loops,' without clear boundaries, exclusions, or concrete trigger phrases, which could cause the skill to be invoked for many ordinary coding tasks.

Memory Manipulation

Severity: High
Category: Memory Poisoning
Content:
|--------------|-------------|------------|
| No progress logging | Parent agent cannot determine status | Mandatory PROGRESS.md |
| Silent failure | Work lost, time wasted | Explicit error logging |
| Overlapping sessions | File conflicts, corrupt state | Check/cleanup before spawn |
| Path assumptions | Wrong directory, wrong files | Explicit verification |
| No completion signal | Parent waits indefinitely | Clear COMPLETE status |
| Infinite iteration | Resource waste, no progress | Time limits + blockers |
Confidence: 90%
Finding: matched phrase "corrupt state" (in the "Overlapping sessions" row above)
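
The "Check/cleanup before spawn" mitigation in the excerpt above could look something like the following lock-file sketch. It is an assumption-laden illustration, not code from the skill: the `.ralph.lock` file name and both function names are hypothetical.

```python
import os
from pathlib import Path


def acquire_session_lock(project_dir: Path, name: str = ".ralph.lock") -> bool:
    """Atomically create a lock file; return False if one already exists.

    The lock file name is a hypothetical choice for this example.
    """
    lock_path = project_dir / name
    try:
        # O_CREAT | O_EXCL fails if the file exists, so two sessions
        # cannot both believe they own the project directory.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner to aid manual cleanup
    return True


def release_session_lock(project_dir: Path, name: str = ".ralph.lock") -> None:
    """Remove the lock file if present (requires Python 3.8+)."""
    (project_dir / name).unlink(missing_ok=True)
```

A parent agent would call `acquire_session_lock` before spawning a sub-agent and skip the spawn when it returns `False`. A stale lock left by a crashed session still needs manual or time-based cleanup, which this sketch does not attempt.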

Self-Modification

Severity: High
Category: Rogue Agent
Content:
If cron reports same status repeatedly:
1. Check PROGRESS.md was updated by sub-agent
2. If not updated → sub-agent skipped documentation step
3. Update skill: Add "MANDATORY PROGRESS.md update" to prompt
4. Manual fix: Update PROGRESS.md to reflect actual state

## Summary
Confidence: 85%
Finding: matched phrase "Update skill" (step 3 above, which tells the agent to edit its own skill prompt)
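
Steps 1 and 2 of the excerpt above (checking whether PROGRESS.md was updated) can be done read-only, without modifying the skill. The sketch below assumes the file's modification time is a good enough signal; `progress_was_updated` and `last_check` are hypothetical names, not part of Ralph Mode.

```python
from pathlib import Path


def progress_was_updated(progress_file: Path, last_check: float) -> bool:
    """Return True if the file exists and was modified after `last_check`.

    `last_check` is a Unix timestamp in seconds, for example the time of
    the previous cron run. A missing file counts as "not updated".
    """
    try:
        return progress_file.stat().st_mtime > last_check
    except FileNotFoundError:
        return False
```

Comparing a content hash instead of the modification time would also catch a file that was rewritten with identical text, at the cost of reading the file on every check.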

VirusTotal

65/65 vendors flagged this skill as clean.
