Ralph Mode - Autonomous Development Loops

Security checks across malware telemetry and agentic risk

Overview

Ralph Mode is a transparent autonomous coding workflow that can edit and commit project files, but its behavior is disclosed and aligned with its purpose.

Install this only when you want an agent to run a bounded autonomous coding loop. Use a feature branch or sandbox, set max iteration and timeout limits, keep confirmation checkpoints enabled, and review generated plans, progress files, diffs, and commits before pushing or deploying.
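The limits recommended above (maximum iterations, a timeout, and confirmation checkpoints) can be illustrated with a minimal sketch. This is not Ralph Mode's actual implementation; `run_bounded_loop`, `step`, `confirm`, and the default limits are hypothetical names and values chosen for illustration only.

```python
import time
from typing import Callable, Tuple


def run_bounded_loop(
    step: Callable[[int], bool],
    max_iterations: int = 10,
    timeout_seconds: float = 600.0,
    confirm: Callable[[int], bool] = lambda i: True,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[str, int]:
    """Run step(i) until it returns True or a limit is reached.

    Returns (reason, completed_iterations). All names here are
    hypothetical; they are not part of the skill itself.
    """
    start = clock()
    for i in range(1, max_iterations + 1):
        # Stop if the wall-clock budget is exhausted before starting a new iteration.
        if clock() - start >= timeout_seconds:
            return ("timeout", i - 1)
        # Confirmation checkpoint: a reviewer (or policy) can halt the loop.
        if not confirm(i):
            return ("stopped_by_reviewer", i - 1)
        # step() is assumed to return True once the task is finished.
        if step(i):
            return ("complete", i)
    return ("max_iterations", max_iterations)
```

In this sketch the timeout is only checked between iterations, so a single long-running step can still overrun the budget; a real setup would likely also need a per-step limit.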

SkillSpector

By NVIDIA

Vulnerability Patterns

Memory Poisoning: Persistent Context Injection, Context Window Stuffing, Memory Manipulation
Rogue Agent: Self-Modification, Session Persistence
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (3)

Vague Triggers

Medium

Confidence: 93%
Finding: Because this is a markdown file, vague-trigger review applies. The 'Use Ralph Mode when' list relies on broad phrases like 'building features,' 'working on complex projects,' and 'prefer autonomous loops,' without clear boundaries, exclusions, or concrete trigger phrases, which could cause the skill to be invoked for many ordinary coding tasks.

Memory Manipulation

High

Category: Memory Poisoning
Content:
|--------------|-------------|------------|
| No progress logging | Parent agent cannot determine status | Mandatory PROGRESS.md |
| Silent failure | Work lost, time wasted | Explicit error logging |
| Overlapping sessions | File conflicts, corrupt state | Check/cleanup before spawn |
| Path assumptions | Wrong directory, wrong files | Explicit verification |
| No completion signal | Parent waits indefinitely | Clear COMPLETE status |
| Infinite iteration | Resource waste, no progress | Time limits + blockers |
Confidence: 90%
Finding: corrupt state

Self-Modification

High

Category: Rogue Agent
Content: If cron reports same status repeatedly:
1. Check PROGRESS.md was updated by sub-agent
2. If not updated → sub-agent skipped documentation step
3. Update skill: Add "MANDATORY PROGRESS.md update" to prompt
4. Manual fix: Update PROGRESS.md to reflect actual state
## Summary
Confidence: 85%
Finding: Update skill
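
The first step of the flagged excerpt (checking whether PROGRESS.md was updated) could look roughly like the sketch below. It only reports a status and does not modify the skill or the progress file. The function name `check_progress`, the status labels, and the assumption that completion is marked by the literal text `COMPLETE` somewhere in the file are all hypothetical; the excerpt does not specify the file's format.

```python
from pathlib import Path
from typing import Optional, Tuple


def check_progress(
    path: str, last_seen_mtime: Optional[float]
) -> Tuple[str, Optional[float]]:
    """Classify a progress file as missing, complete, stale, or updated.

    Hypothetical helper: the 'COMPLETE' marker and the status labels
    are assumptions, not taken from the skill.
    """
    progress = Path(path)
    if not progress.exists():
        return ("missing", last_seen_mtime)

    mtime = progress.stat().st_mtime
    text = progress.read_text(encoding="utf-8")

    # Assumed convention: the sub-agent writes COMPLETE when it is done.
    if "COMPLETE" in text:
        return ("complete", mtime)
    # Unchanged modification time since the last check suggests
    # the sub-agent skipped its progress update.
    if last_seen_mtime is not None and mtime <= last_seen_mtime:
        return ("stale", mtime)
    return ("updated", mtime)
```

A caller would store the returned modification time and pass it back on the next scheduled check; a `stale` result is a cue for a human to review, not for the agent to rewrite its own prompt.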

VirusTotal

65/65 vendors flagged this skill as clean.
