Security audit

Auto Arena

Security checks across malware telemetry and agentic risk

Overview

This is a coherent model-benchmarking skill that uses external AI endpoints and local result files in ways that match its stated purpose.

Use this skill with non-sensitive benchmark data unless you are comfortable sending the content to the configured providers. Use dedicated API keys with limits, verify the py-openjudge package before installing, run it in a controlled environment, and keep or disable saved response/detail files according to your privacy needs.
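One way to act on the advice about non-sensitive benchmark data is to scan prompts for obvious secrets before anything is sent to a provider. The sketch below is illustrative only and is not part of the skill: the function name `find_sensitive` and the three patterns are assumptions, and a regex scan catches only well-known token shapes, not confidential content in general.

```python
import re

# Hypothetical patterns chosen for illustration; extend them for your own data.
SECRET_PATTERNS = {
    "sk_style_api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_sensitive(texts):
    """Return (index, pattern_name) pairs for every text matching a pattern."""
    hits = []
    for index, text in enumerate(texts):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(text):
                hits.append((index, name))
    return hits


if __name__ == "__main__":
    sample = ["Summarize this paragraph.", "My key is AKIAABCDEFGHIJKLMNOP"]
    # Report which benchmark items should be removed or redacted before a run.
    print(find_sensitive(sample))
```

An empty result means only that none of these patterns matched. The benchmark data still needs a manual review before it is sent to external endpoints.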

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Findings (1)

Missing User Warnings

Medium
Confidence: 94%
Finding
The skill explicitly instructs users to send prompts and model outputs to multiple third-party endpoints and later save artifacts such as responses, rubrics, reports, and checkpoints, but it does not prominently warn that potentially sensitive task data, generated queries, and model responses will leave the local environment and be persisted to disk. This can cause unintentional disclosure of confidential inputs, evaluation content, or proprietary outputs when users assume benchmarking is local-only.
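The missing warning described in this finding could take the form of an explicit disclosure and confirmation step before a run starts. The following is a minimal sketch under stated assumptions, not the skill's actual code: `build_disclosure_warning`, `confirm_run`, the example endpoint URL, and the output directory are all hypothetical names introduced for illustration.

```python
from pathlib import Path


def build_disclosure_warning(endpoints, output_dir):
    """Compose a plain-text warning naming each endpoint and the save location."""
    lines = ["WARNING: this run sends prompts and model responses to:"]
    lines += [f"  - {url}" for url in endpoints]
    lines.append(
        "Responses, rubrics, reports, and checkpoints will be saved under: "
        f"{Path(output_dir)}"
    )
    return "\n".join(lines)


def confirm_run(endpoints, output_dir, ask=input):
    """Print the warning and proceed only if the user answers 'yes'."""
    print(build_disclosure_warning(endpoints, output_dir))
    return ask("Type 'yes' to continue: ").strip().lower() == "yes"


if __name__ == "__main__":
    # Hypothetical endpoint and directory, for illustration only.
    if confirm_run(["https://api.example.com/v1"], "results"):
        print("Starting benchmark run.")
    else:
        print("Run cancelled; nothing was sent.")
```

Passing the prompt function as `ask` keeps the gate testable and lets a non-interactive caller supply its own consent mechanism.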

VirusTotal

53/53 vendors flagged this skill as clean.

Static analysis

No suspicious patterns detected.