Research Claim Checker

Security checks across malware telemetry and agentic risk

Overview

This skill is a local research-review helper that reads user-provided material and can write a report, with no evidence of networking, credential use, persistence, or hidden execution.

Install only if you are comfortable with a local Python helper processing files you explicitly provide. Prefer dry-run or stdout when reviewing sensitive material, avoid pointing output at important files, and treat the result as an evidence-checking aid rather than peer review or final factual validation.
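
The advice above (prefer stdout, avoid pointing output at important files) can be sketched as follows. This is a minimal illustration and not the skill's actual code; `emit_report` and its parameters are hypothetical names chosen for the example.

```python
import sys
from pathlib import Path
from typing import Optional, TextIO


def emit_report(text: str, out_path: Optional[str] = None,
                stream: TextIO = sys.stdout) -> str:
    """Write a report to a stream by default; never overwrite an existing file.

    Hypothetical helper: the name and parameters are assumptions.
    Returns a short description of where the report went.
    """
    if out_path is None:
        stream.write(text)
        return "stdout"
    target = Path(out_path)
    # Mode "x" raises FileExistsError if the target already exists,
    # so an important file cannot be clobbered by accident.
    with target.open("x", encoding="utf-8") as handle:
        handle.write(text)
    return str(target)
```

Opening the file in exclusive-creation mode is one simple way to make an accidental overwrite fail loudly instead of silently replacing content.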

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (7)

Lp3

Medium
Category
MCP Least Privilege
Confidence
85% confidence
Finding
The skill advertises itself as a research claim checker, but it explicitly allows shell execution via python3 and implies file input/output capabilities without declaring corresponding permissions. This creates a trust and review gap: a user or orchestrator may treat the skill as low-risk text analysis while it can actually read files, write outputs, and invoke local code.
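
The gap between declared permissions and actual capabilities can be pictured as a set difference. The sketch below is illustrative only; the capability names and the `undeclared_capabilities` function are hypothetical and do not reflect how SkillSpector computes this finding.

```python
from typing import List, Set


def undeclared_capabilities(declared: Set[str], observed: Set[str]) -> List[str]:
    """Return capabilities a skill exercises but never declared.

    Hypothetical helper: capability names are assumptions for illustration.
    """
    return sorted(observed - declared)


# Example: a skill that declares nothing but reads files, writes files,
# and runs a local interpreter.
gap = undeclared_capabilities(
    declared=set(),
    observed={"file_read", "file_write", "shell_exec"},
)
```

A non-empty result is what a reviewer would want surfaced before treating the skill as low-risk text analysis.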

Tp4

High
Category
MCP Tool Poisoning
Confidence
92% confidence
Finding
The skill's description does not match its behavior: it claims to validate research evidence, yet the detected behavior includes broad directory scanning, content inspection, regex-based scanning for secrets and high-risk patterns, and generic audit/report generation. That broader behavior increases the chance of unintended data exposure, overcollection, and misuse under the guise of a benign research workflow.

Description-Behavior Mismatch

High
Confidence
95% confidence
Finding
The dispatcher implements multiple unrelated audit modes such as directory, CSV, pattern, and skill-package auditing instead of constraining behavior to research claim/evidence checking. This creates a capability mismatch: a user invoking a seemingly narrow research-analysis skill can use it to inspect arbitrary local files and repositories, increasing the risk of unauthorized data discovery and misuse in agent workflows.
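
One way to constrain a dispatcher to its advertised purpose is an explicit allowlist of modes. The sketch below is a hedged illustration, not the skill's code; `check_claims`, `ALLOWED_MODES`, and `dispatch` are hypothetical names.

```python
from typing import Callable, Dict


def check_claims(text: str) -> str:
    """Placeholder for the claim/evidence check (hypothetical)."""
    return f"checked {len(text.split())} words"


# Only modes that match the advertised purpose are registered.
ALLOWED_MODES: Dict[str, Callable[[str], str]] = {
    "claims": check_claims,
}


def dispatch(mode: str, payload: str) -> str:
    """Run a mode only if it is explicitly allowlisted."""
    handler = ALLOWED_MODES.get(mode)
    if handler is None:
        raise ValueError(f"mode {mode!r} is outside the skill's declared scope")
    return handler(payload)
```

With this shape, a request for directory, CSV, pattern, or skill-package auditing raises an error instead of silently widening what the skill can inspect.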

Context-Inappropriate Capability

Medium
Confidence
91% confidence
Finding
The skill_audit feature inspects package structure and metadata files unrelated to the stated purpose of checking whether research conclusions are supported by evidence. In the context of an agent skill, hidden extra auditing functionality is dangerous because it broadens access to local project contents and can be repurposed to enumerate files or analyze repositories under a misleading trust boundary.

Context-Inappropriate Capability

Medium
Confidence
93% confidence
Finding
The pattern scanner searches arbitrary files for secrets, private URLs, and shell-execution indicators, which is outside the expected scope of a research claim checker. Even though it only reads and reports matches, this capability can expose sensitive material from local files and makes the skill materially more dangerous because its description would not lead users to expect security scanning behavior.
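
Reporting matches verbatim is what turns a read-only scanner into an exposure risk. A minimal sketch of masking matched values before they reach a report, assuming a single illustrative pattern (`SECRET_PATTERN` and `scan_redacted` are hypothetical and not taken from the skill):

```python
import re
from typing import List, Tuple

# Illustrative pattern only; a real scanner would use many more rules.
SECRET_PATTERN = re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[:=]\s*(\S+)")


def scan_redacted(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, redacted_line) for each line with a match."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if SECRET_PATTERN.search(line):
            # Keep the key name but mask the value so the report does not leak it.
            redacted = SECRET_PATTERN.sub(
                lambda m: f"{m.group(1)}=<redacted>", line
            )
            hits.append((number, redacted))
    return hits
```

Redaction narrows what a report can leak, but it does not address the underlying finding that security scanning is outside the skill's stated scope.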

Vague Triggers

Medium
Confidence
89% confidence
Finding
The trigger examples are very broad natural-language phrases that plausibly overlap with ordinary user requests, which can cause the skill to activate in unintended contexts. In a skill-routing system, overbroad triggers increase the chance of misrouting sensitive research, review, or analysis tasks into this skill without explicit user intent, potentially causing incorrect framing or unintended processing of provided materials.
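
The difference between an overbroad trigger and an explicit one can be sketched as below. Both functions, the keyword list, and the `/claim-check` command are hypothetical and serve only to illustrate the routing risk described above.

```python
def broad_trigger(message: str) -> bool:
    """Fires on any message containing a common word (overbroad)."""
    keywords = ("check", "review", "analysis")
    lowered = message.lower()
    return any(word in lowered for word in keywords)


def explicit_trigger(message: str) -> bool:
    """Fires only when the user names the skill explicitly."""
    return message.strip().lower().startswith("/claim-check")


# An ordinary request that has nothing to do with research claims.
unrelated = "Can you review my holiday photos?"
```

Here `broad_trigger(unrelated)` is true while `explicit_trigger(unrelated)` is false, which is the misrouting behavior the finding warns about.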

Natural-Language Policy Violations

Medium
Confidence
78% confidence
Finding
The documentation is entirely in Chinese and presents Chinese-language usage/output expectations without indicating that language should follow user preference. In multi-user or multilingual environments, this can lead to unexpected language forcing, reducing reviewability and increasing the risk that users misunderstand outputs or fail to notice important caveats in the generated analysis.

VirusTotal

66 of 66 vendors reported this skill as clean.
