Brainforge Autoresearch

Security checks across malware telemetry and agentic risk

Overview

This skill does what it claims: it uses LLM APIs to test and rewrite skill prompts, and it keeps result files and backups locally. Users should still treat every run as sending prompt data to an external provider and as modifying the target prompt file.

Install only if you are comfortable sending the target prompt, eval cases, and generated outputs to the LLM provider whose API key you supply. Run it on a copy or in version control, review the resulting prompt diff before committing, avoid sensitive secrets in prompts or evals, and avoid --verbose in shared terminals or CI logs.
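The "run it on a copy and review the diff" advice can be sketched as follows. This is a minimal illustration using only the Python standard library. The function names and the temporary-directory layout are hypothetical and are not part of the skill itself.

```python
import difflib
import shutil
import tempfile
from pathlib import Path


def make_working_copy(target: Path) -> Path:
    """Copy the prompt file into a fresh temp directory and return the copy's path."""
    workdir = Path(tempfile.mkdtemp(prefix="autoresearch-"))
    copy_path = workdir / target.name
    shutil.copy2(target, copy_path)
    return copy_path


def prompt_diff(original: Path, candidate: Path) -> str:
    """Return a unified diff between the original prompt and the candidate."""
    before = original.read_text(encoding="utf-8").splitlines(keepends=True)
    after = candidate.read_text(encoding="utf-8").splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            before, after, fromfile=str(original), tofile=str(candidate)
        )
    )
```

Pointing the optimizer at the path returned by make_working_copy leaves the original untouched, and the output of prompt_diff can be read by a human before anything is copied back or committed.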

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (8)

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The tool sends the target skill content, user-supplied eval inputs, and generated outputs to external LLM providers during experiments. That creates a real data-exposure risk if skills, prompts, test cases, or outputs contain proprietary, regulated, or secret material, especially because the manifest does not clearly warn that third-party transmission is central to operation.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The mutation stage transmits the complete current SKILL.md plus failing outputs and reasons to a remote LLM to produce revised content. This broadens disclosure beyond simple benchmarking into prompt rewriting, which can leak internal instructions, embedded secrets, or sensitive example data to third parties.

Context-Inappropriate Capability

Low

Confidence: 84%
Finding: The generated dashboard loads Chart.js from a public CDN at view time, introducing a supply-chain and privacy risk outside the core optimization function. If the CDN is compromised, users may execute untrusted script; if it is blocked, the dashboard loses functionality.
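One common mitigation for the compromised-CDN case is Subresource Integrity: the script tag carries a hash of the expected file, and the browser refuses to run a script that does not match. The sketch below computes such a value. It assumes the dashboard generator can be changed to emit the tag and that the exact bytes of the intended Chart.js build are available locally. The function names and the example path are hypothetical.

```python
import base64
import hashlib
import html


def sri_sha384(script_bytes: bytes) -> str:
    """Return a Subresource Integrity value (sha384, base64) for the given bytes."""
    digest = hashlib.sha384(script_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def script_tag(src: str, script_bytes: bytes) -> str:
    """Build a script tag that pins the expected content of the script at src."""
    return (
        f'<script src="{html.escape(src, quote=True)}" '
        f'integrity="{sri_sha384(script_bytes)}" '
        'crossorigin="anonymous"></script>'
    )
```

This only addresses tampering. Serving a vendored copy of the library next to the dashboard (for example from a hypothetical vendor/chart.min.js) would also cover the blocked-CDN and privacy concerns.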

Missing User Warnings

Medium

Confidence: 86%
Finding: The README explicitly describes an autonomous loop that reads, mutates, tests, and may keep changes to a target SKILL.md, and it also documents generated artifacts such as backups and results files. However, it does not clearly warn users up front that running the tool can modify the target prompt file and write multiple artifacts to disk, which can lead to unintended prompt corruption, accidental commits of generated files, or disruption of existing workflows.

Missing User Warnings

Medium

Confidence: 84%
Finding: The instructions tell the operator to copy the winning prompt over the original skill prompt, but they do not require backup, diff review, validation, or human approval before overwriting production prompt content. Because this skill performs automated prompt mutation, direct application can silently introduce regressions, unsafe behavior, or prompt-injection weaknesses into the target skill.

Missing User Warnings

Medium

Confidence: 94%
Finding: Verbose logging prints raw remote responses and may expose model outputs, error bodies, and possibly sensitive prompt-derived content to stdout/stderr or log collectors. In shared terminals, CI, or centralized logging environments, this can leak confidential data without users realizing the scope of disclosure.
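A possible way to reduce this exposure is to redact and truncate raw responses before they reach stdout or a log collector. The sketch below is illustrative only. The patterns are assumptions about what secret-like strings might look like (an "sk-" style key, a bearer token, a key=value credential), they will miss other formats, and the redact function is hypothetical rather than something the skill provides.

```python
import re

# Illustrative patterns only; real deployments need patterns for their own secrets.
_PATTERNS = [
    (re.compile(r"sk-[A-Za-z0-9_-]{16,}"), "[REDACTED]"),
    (re.compile(r"(?i)(bearer\s+)[A-Za-z0-9._~+/=-]+"), r"\1[REDACTED]"),
    (
        re.compile(r"(?i)\b((?:api[_-]?key|token|secret|password)\s*[=:]\s*)\S+"),
        r"\1[REDACTED]",
    ),
]


def redact(text: str, max_chars: int = 500) -> str:
    """Mask secret-like substrings, then truncate long bodies before logging."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    if len(text) > max_chars:
        text = text[:max_chars] + f"... [{len(text) - max_chars} chars truncated]"
    return text
```

Redaction of this kind lowers the chance of an accidental leak but does not make verbose output safe for shared terminals or CI, so the advice to avoid --verbose there still applies.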

Missing User Warnings

Medium

Confidence: 92%
Finding: The optimizer writes mutated content directly back to the target file during each experiment, which can unintentionally corrupt or replace important prompt assets if the process fails, is mispointed, or produces harmful output. Although a baseline backup is created, users are not clearly warned before destructive in-place modification occurs.
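The failure mode described here (a crash mid-write leaving a half-written prompt) can be limited by writing to a temporary file and swapping it into place. The following is a minimal sketch of that pattern, not the optimizer's actual code. The helper names and the ".baseline.bak" suffix are hypothetical.

```python
import os
import shutil
import tempfile
from pathlib import Path


def backup_once(target: Path) -> Path:
    """Create a baseline backup next to the target unless one already exists."""
    backup = target.with_name(target.name + ".baseline.bak")
    if not backup.exists():
        shutil.copy2(target, backup)
    return backup


def atomic_write(target: Path, content: str) -> None:
    """Write content to a temp file in the same directory, then swap it into place."""
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in one step, so readers see old or new content.
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Calling backup_once before the first experiment and atomic_write for each mutation keeps the baseline recoverable and avoids partially written files. It does not protect against a mispointed target path or a harmful mutation, which still need human review.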

Vague Triggers

Medium

Confidence: 90%
Finding: The skill is triggered by broad optimization-related requests and is meant to run automated prompt experiments. Without tighter scope boundaries, it can activate on generic 'improve' or 'optimize' requests that are not actually about prompt tuning, causing the agent to apply the wrong workflow or recommend unsafe or irrelevant actions. This is more dangerous for this skill in particular: its content decides when to run an automation-oriented tool and is supposed to refuse code-bug scenarios, so over-triggering can directly misroute users into the wrong remediation path.
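The kind of scope boundary this finding asks for can be illustrated with a small gate that requires both an optimization verb and an explicit prompt-related subject, and that declines requests that look like code-bug reports. This is a sketch of the idea only. The skill's real trigger is a natural-language description read by an agent, and the keyword lists and the should_trigger function here are hypothetical.

```python
import re

# Hypothetical keyword lists; a real trigger description would be tuned and tested.
_ACTION = re.compile(r"\b(optimi[sz]e|improve|tune|benchmark)\b", re.IGNORECASE)
_SUBJECT = re.compile(r"\b(prompt|skill\.md)\b", re.IGNORECASE)
_OUT_OF_SCOPE = re.compile(
    r"\b(bug|stack ?trace|exception|crash|compile error)\b", re.IGNORECASE
)


def should_trigger(request: str) -> bool:
    """Return True only for requests that are clearly about prompt tuning."""
    if _OUT_OF_SCOPE.search(request):
        return False  # code-bug scenarios belong to a different workflow
    return bool(_ACTION.search(request) and _SUBJECT.search(request))
```

With this gate, "Optimize my SKILL.md prompt" qualifies, while "Improve the performance of this SQL query" does not, because it names no prompt-related subject.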

VirusTotal

67/67 vendors flagged this skill as clean.
