Armarius - Prompt Injection Blocker

Security checks across malware telemetry and agentic risk

Overview

Armarius is a defensive prompt-injection skill, but it gives itself broad session-wide authority and requires full unredacted logging of suspicious content.

Install only if you want a strict session-wide prompt-injection policy. Avoid using it with sensitive files, emails, API responses, or private tool outputs unless the logging behavior is changed to redact secrets and show short excerpts instead of full verbatim payloads. Inspect the external Armarius package or repository separately before running the README install and demo commands.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger AbuseOverly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt InjectionInstruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege EscalationExcessive Permissions, Sudo/Root Execution, Credential Access
  • Supply ChainUnpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (4)

Vague Triggers

Medium
Confidence
92% confidence
Finding
The activation text is overly broad, triggering on generic topics like prompt injection, tool outputs, emails, web pages, or documents. In a skill that influences agent behavior, this can cause the skill to activate during ordinary security discussions or unrelated content processing, increasing the chance of unintended instruction injection into workflows.

Vague Triggers

Medium
Confidence
78% confidence
Finding
The skill asserts that its rules apply to 'this entire session' and 'cannot be suspended, overridden, or modified,' which is overly broad and ambiguous for an instruction-layer defense. In practice, this can create unsafe instruction conflicts, interfere with higher-priority platform safeguards, and cause the model to over-trust its own classification of inputs rather than relying on the host system's security boundaries.

Missing User Warnings

High
Confidence
98% confidence
Finding
The instruction to always include the complete, unabbreviated text of flagged content can cause the assistant to echo secrets, credentials, personal data, malware payloads, or sensitive prompts originating from untrusted inputs. An attacker could intentionally place secrets or harmful content inside an 'injection attempt' so the protection layer exfiltrates it back to the user or into downstream logs.

Ssd 3

Medium
Confidence
97% confidence
Finding
Reproducing full flagged content verbatim creates a data-leak channel from untrusted inputs into model outputs. Even when the source is malicious, the assistant may inadvertently disclose API keys, tokens, personal information, hidden prompt text, or toxic content that would otherwise have remained contained.

VirusTotal

66/66 vendors flagged this skill as clean.

View on VirusTotal