Canary

Security checks across malware telemetry and agentic risk

Overview

Canary is a disclosed local safety-monitoring helper, but users should treat it as advisory logging and checks, not a real sandbox or kill switch.

Install only if you want a lightweight local monitoring layer. Configure it with decoy tripwire paths, protect the log and .canary_tripwires directories, avoid clearing logs when audit history matters, and do not rely on Canary as a sandbox or process-level halt mechanism.

SkillSpector

By NVIDIA

Vulnerability Patterns

Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output
Tool Misuse: Tool Parameter Abuse, Chaining Abuse, Unsafe Defaults
Rogue Agent: Self-Modification, Session Persistence
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (6)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 85%
Finding: The skill documentation demonstrates capabilities for file reads/writes, shell execution, and network-related operations, but no explicit permission model is declared in the metadata. That mismatch can cause consumers or platforms to underestimate the skill's effective attack surface and approve it under weaker scrutiny.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The skill metadata promises 'Auto-halts on critical violations,' but this implementation only updates a registry, appends to a log, and prints alerts. In a safety-control component, this mismatch is dangerous because operators may rely on it as an active enforcement mechanism when it provides only passive detection, allowing unsafe agent activity to continue after a critical tripwire event.
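The gap between passive detection and active enforcement can be illustrated with a minimal sketch. This is not Canary's code: `CriticalViolation`, `report_violation`, and `run_steps` are hypothetical names, and raising an exception only stops the calling Python workflow, not other processes.

```python
import logging

logger = logging.getLogger("canary_sketch")


class CriticalViolation(RuntimeError):
    """Raised to stop the calling workflow (hypothetical name)."""


def report_violation(message: str, critical: bool, enforce: bool) -> None:
    # Passive part: record the event (log only).
    logger.warning("violation: %s", message)
    # Active part (assumption): raise so the caller cannot continue silently.
    if critical and enforce:
        raise CriticalViolation(message)


def run_steps(steps, enforce: bool = True) -> list:
    """Run (name, violation_or_None) pairs; stop at the first critical
    violation when enforce is True, otherwise log and keep going."""
    completed = []
    for name, violation in steps:
        if violation is not None:
            try:
                report_violation(violation, critical=True, enforce=enforce)
            except CriticalViolation:
                break  # halt: no further steps run
        completed.append(name)
    return completed
```

With `enforce=False` every step still runs after the alert, which is the behavior the finding describes.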

Missing User Warnings

Medium

Confidence: 93%
Finding: The reset method can erase the entire log file when clear_logs is enabled, and the CLI exposes this through a single flag without any confirmation, backup, or authorization control. In a safety-monitoring tool, log integrity is security-relevant because an operator or compromised workflow could wipe audit evidence of prior violations and reduce forensic visibility.
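One possible mitigation is a reset that refuses to clear the log without explicit confirmation and keeps a backup first. The sketch below is an assumption about how that could look; `reset_log` and its `confirm` parameter are hypothetical and not part of Canary's documented interface.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def reset_log(log_path: Path, clear_logs: bool = False,
              confirm: bool = False) -> Optional[Path]:
    """Clear the log only with explicit confirmation, keeping a backup copy.

    Returns the backup path, or None if nothing was cleared.
    """
    if not clear_logs or not log_path.exists():
        return None
    if not confirm:
        raise PermissionError("refusing to clear logs without confirm=True")
    # Copy the audit history aside before touching the live log.
    backup = log_path.with_name(f"{log_path.name}.{time.time_ns()}.bak")
    shutil.copy2(log_path, backup)
    # Truncate the live log only after the backup exists.
    log_path.write_text("")
    return backup
```

The backup does not stop a determined attacker from deleting both files; it only removes the single-flag path to losing audit history.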

Unvalidated Output Injection

High

Category: Output Handling
Content: import subprocess cmd_list = cmd.split() if isinstance(cmd, str) else cmd return subprocess.run(cmd_list, capture_output=True) ``` ### Pre-Deployment Checks
Confidence: 94%
Finding: subprocess.run(cmd_list, capture_output
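A more defensive version of the flagged pattern might look like the sketch below. It is an illustration under stated assumptions, not the skill's code: `run_checked`, the `allowed` set, and the `max_bytes` cap are hypothetical. It swaps `str.split` for `shlex.split`, checks the program against an allowlist, sets a timeout, and truncates the returned output.

```python
import shlex
import subprocess
import sys


def run_checked(cmd, allowed, timeout=5.0, max_bytes=4096):
    """Run an allowlisted command and return (returncode, truncated stdout)."""
    # shlex.split honours shell-style quoting; str.split would break "a b" apart.
    args = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    if not args or args[0] not in allowed:
        raise ValueError(f"command not allowed: {args[:1]}")
    # No shell=True: arguments go to the program directly, not through a shell.
    result = subprocess.run(args, capture_output=True, timeout=timeout, check=False)
    # Truncate what is handed back to the caller (this does not limit memory use).
    return result.returncode, result.stdout[:max_bytes]


if __name__ == "__main__":
    code, out = run_checked([sys.executable, "-c", "print('hello')"],
                            allowed={sys.executable})
    print(code, out)
```

The caller still has to treat the returned bytes as untrusted data before placing them in a prompt, log, or another command.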

Session Persistence

Medium

Category: Rogue Agent
Content:
  - Requires manual review to restart
  **Tripwire Files:**
  - Create honeypot files that should never be accessed
  - Detect modifications, deletions, or access
  - Hash verification for file integrity
Confidence: 79%
Finding: Create honeypot files that should never be accessed - Detect modifications, deletions, or access - Hash verification for file integrity **Audit Trail:** - Complete action logs - Violation history - P
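The hash-verification idea in the excerpt can be sketched as follows. The names `make_decoy` and `check_decoy` are hypothetical and do not describe Canary's implementation. The sketch covers only modification and deletion; a content hash cannot reveal that a file was merely read.

```python
import hashlib
from pathlib import Path


def make_decoy(path: Path, content: bytes = b"canary\n") -> str:
    """Create a decoy file and return its SHA-256 digest.

    Mode "xb" raises FileExistsError if the path already exists,
    so an existing (possibly real) file is never overwritten.
    """
    with open(path, "xb") as fh:
        fh.write(content)
    return hashlib.sha256(content).hexdigest()


def check_decoy(path: Path, expected_sha256: str) -> str:
    """Return 'ok', 'modified', or 'deleted' for a decoy file."""
    if not path.exists():
        return "deleted"
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return "ok" if actual == expected_sha256 else "modified"
```

The expected digest has to be stored somewhere the monitored agent cannot rewrite, otherwise a tampered file and a tampered registry would still agree.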

Unsafe Defaults

Medium

Category: Tool Misuse
Content:
  - **Use decoy paths only** — never point tripwires at real files containing sensitive data. Tripwires are honeypots; treat them as bait, not protection.
  - **`create_tripwire` will not overwrite existing files** — it checks for pre-existing files and refuses to proceed. Use dedicated empty paths for tripwires.
  - **Test in a sandbox first** — verify where logs, tripwires, and registry files are created before deploying. Confirm protected paths and auto-halt behavior in an isolated environment.
  - **Protect log and alert directories** — set filesystem permissions so alert logs are not world-readable. Canary writes plaintext logs; restrict access accordingly.
  - **Canary only blocks when called** — it is not an OS-level enforcement mechanism. Layer it with containers, filesystem permissions, and `auditd` for production deployments.
  ## ⚠️ Disclaimer
Confidence: 81%
Finding: world-readable
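On a POSIX system, checking for and removing world-readable permissions might look like the sketch below. The helper names are hypothetical, and the actual locations of Canary's log and `.canary_tripwires` directories have to be confirmed in your own deployment.

```python
import os
import stat
from pathlib import Path


def is_world_readable(path: Path) -> bool:
    """True if the 'other' read bit is set on the path (POSIX modes)."""
    return bool(path.stat().st_mode & stat.S_IROTH)


def restrict_to_owner(path: Path) -> None:
    """Limit access to the owning user: 0o700 for directories, 0o600 for files."""
    mode = 0o700 if path.is_dir() else 0o600
    os.chmod(path, mode)
```

This only narrows who can read the plaintext logs; an agent running as the same user can still read or alter them.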

VirusTotal

61/61 vendors flagged this skill as clean.
