Token Usage Dashboard

Security checks across malware telemetry and agentic risk

Overview

This skill is a local usage dashboard, but it bundles broader admin and report-writing features than its narrow model-summary label makes clear.

Install only if you want the broader local dashboard and reporting tool, not just a simple per-model cost summary. Keep output directories private, avoid feeding it payloads containing raw prompts or identity-linked usage unless local exports are acceptable, and treat tenant-management flags as administrative operations that can persistently change or delete access configuration.
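One way to follow the "keep output directories private" advice is to create the export directory with owner-only permissions before pointing the tool at it. The sketch below is illustrative only: the helper name `make_private_dir` is hypothetical and not part of the skill, and it assumes a POSIX system where permission bits are enforced.

```python
import os
from pathlib import Path


def make_private_dir(path: Path) -> Path:
    """Create `path` if needed and restrict it to the owning user (POSIX).

    Hypothetical helper, not part of the skill itself.
    """
    path = Path(path)
    # Parent directories, if created here, keep default permissions.
    path.mkdir(parents=True, exist_ok=True)
    # chmod explicitly, because the mode passed to mkdir is filtered by the umask.
    os.chmod(path, 0o700)
    return path
```

The returned path can then be passed to whatever output option the dashboard exposes, so generated HTML and report files are not readable by other local accounts.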

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (15)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 86%
Finding: The skill advertises and demonstrates shell execution, file input/output, opening local HTML, and invoking an external CLI, but does not declare any permissions for those capabilities. This creates a transparency and policy-enforcement gap: users or orchestrators may approve the skill assuming it is low-risk text summarization, while it can read local files, write artifacts, and execute commands against locally available tools.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 91%
Finding: The documented purpose is a narrow per-model usage summary, but the skill content indicates materially broader behavior, including dashboard generation, artifact export, analysis features, scheduled reporting, and opening a browser. This mismatch is dangerous because it can defeat user expectations and sidestep agent safety controls: a caller may invoke a seemingly simple reporting skill that actually performs broader data processing, writes files, and triggers additional actions not implied by the description.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: The README describes a much broader dashboard and administrative platform than the skill metadata claims. This capability mismatch is dangerous because an agent or user invoking a supposedly narrow model-usage summarization skill could be exposed to unexpected data processing, multi-tenant handling, report automation, and policy features that materially expand privilege and attack surface.

Context-Inappropriate Capability

High

Confidence: 94%
Finding: User, role, and dashboard-view administration are privileged control-plane functions unrelated to simple per-model usage summarization. Embedding admin capabilities in a reporting-oriented skill increases the risk of unauthorized access changes, tenant isolation mistakes, and misuse by agents that were only expected to read cost data.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: Real-time cost-control actions such as degrade, switch_model, or stop_calls go beyond reporting and can directly affect production behavior. In the context of a skill advertised as usage summarization, hidden or unexpected control actions create a serious risk of disruptive changes, denial of service, or policy manipulation if invoked improperly.

Context-Inappropriate Capability

Medium

Confidence: 83%
Finding: Scheduled report generation, retained history, and a download center expand the data lifecycle beyond on-demand summarization. This broadens exposure of potentially sensitive usage data, creates persistence concerns, and may surprise users who expect a transient local summary tool rather than a reporting system with stored artifacts.

Description-Behavior Mismatch

High

Confidence: 92%
Finding: The tests demonstrate a much broader capability surface than the skill metadata claims: full dashboard rendering, alerting, access control, multi-tenant resolution, config mutation, and scheduling. That scope expansion is dangerous because it can hide privileged behaviors from reviewers and users, increasing the chance of unauthorized data access, persistence, or operational side effects in a skill expected to only summarize local per-model usage.

Context-Inappropriate Capability

High

Confidence: 95%
Finding: Tenant configuration management allows creation, assignment, and modification of users, groups, and dashboard views, which is a privileged administrative function unrelated to simple usage summarization. If exposed through the skill, it could enable unauthorized policy changes, privilege misconfiguration, and cross-tenant data exposure, especially because configuration is persisted to disk.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: Scheduled report generation and delivery add persistence, file artifact creation, and recipient-routing behavior beyond a local ad hoc model-usage summary tool. This increases the attack surface for data leakage through generated files or unauthorized recipients, and may create unattended exfiltration paths if configuration is abused.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: This skill claims to summarize model usage from CodexBar JSON, but the file implements a much broader system: HTML dashboarding, prompt analysis, cost attribution, report automation, RBAC, and tenant administration. That unnecessary expansion increases attack surface, introduces persistent state and access-control logic, and makes it easier for a simple data-summary skill to perform higher-risk actions than users would reasonably expect.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The script includes tenant/user/view management operations that can modify configuration state on disk, despite the skill being described as a model-usage summary helper. In an agent-skill context, hidden administrative mutation capabilities are dangerous because they can alter authorization boundaries, dashboard visibility, or future data access beyond the user's apparent request.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: Scheduled report generation and persistent artifact writing exceed the stated purpose of a scriptable per-model summary tool and create a durable data-exfiltration surface on local disk. Because reports include detailed summaries, role context, tenant metadata, artifacts, and recipient information, this feature expands both the sensitivity and persistence of processed data.

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The module documentation presents the script as a local dashboard generator, but the code also performs tenant configuration mutation and scheduled artifact management. This mismatch is security-relevant because users and orchestrating agents may grant trust or permissions based on the narrower description while the code can do substantially more.

Missing User Warnings

Medium

Confidence: 89%
Finding: The scheduler writes detailed usage summaries, artifact paths, delivery status, tenant context, and recipient metadata to disk without prominent disclosure or consent in the user flow. In a local-agent setting, quietly persisting operational and potentially sensitive analytics data can create confidentiality and retention risks, especially on shared systems.
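As a rough illustration of how the retention risk described in this finding could be reduced, the sketch below masks identity-linked fields in a report record before it is written to disk. It is not taken from the skill: the function name `redact` and every key in `SENSITIVE_KEYS` are assumptions chosen for the example, and the real report schema may differ.

```python
from typing import Any

# Hypothetical field names; the skill's actual report schema is not known here.
SENSITIVE_KEYS = {"prompt", "recipients", "user_id", "tenant_id"}


def redact(record: Any) -> Any:
    """Return a copy of `record` with sensitive keys masked at any nesting depth."""
    if isinstance(record, dict):
        return {
            key: "[redacted]" if key in SENSITIVE_KEYS else redact(value)
            for key, value in record.items()
        }
    if isinstance(record, list):
        return [redact(item) for item in record]
    # Scalars (strings, numbers, None) pass through unchanged.
    return record
```

Masking by key name is deliberately blunt: it keeps aggregate fields such as cost and token counts while dropping the values most likely to identify a person or tenant.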

Missing User Warnings

Medium

Confidence: 94%
Finding: Tenant-management operations persistently rewrite the tenant configuration file without an explicit warning, confirmation step, or transactional safeguards. In practice this can silently change users, roles, groups, and dashboard access, causing privilege changes or policy drift that outlast the current session.
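The missing safeguards named in this finding (an explicit confirmation step and a transactional write) could look roughly like the following. This is a minimal sketch under stated assumptions, not the skill's code: `write_config_atomically` and its `confirmed` flag are hypothetical, and the configuration is assumed to be a JSON-serializable dictionary.

```python
import json
import os
import tempfile
from pathlib import Path


def write_config_atomically(path: Path, config: dict, confirmed: bool = False) -> None:
    """Replace the config file in one step, and only after explicit confirmation.

    Hypothetical helper illustrating the safeguards; not part of the skill.
    """
    if not confirmed:
        raise PermissionError("refusing to rewrite tenant config without confirmation")
    path = Path(path)
    # Write to a temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(config, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in a single operation, so readers see
        # either the old config or the new one, never a partial write.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

With this shape, a caller has to pass `confirmed=True` deliberately, and an interrupted write leaves the previous configuration intact instead of a truncated file.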

VirusTotal

66/66 vendors flagged this skill as clean.
