OpenClaw Skill Tester

Security checks across malware telemetry and agentic risk

Overview

This skill is not destructive, but it presents simulated and hard-coded test results as if they were real skill validation.

Install only if you treat this as a demo or scaffold. Do not rely on its pass/fail reports, comparison metrics, or generated summaries for quality, security, CI, or deployment decisions until it runs real target skills, records actual exit codes, and builds reports from collected results.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (6)

Tp4

High
Category
MCP Tool Poisoning
Confidence
96% confidence
Finding
The documented behavior indicates the skill may fabricate or simulate test outcomes, metrics, and reports instead of measuring real executions. In a testing framework, this is dangerous because it can create false assurance about skill quality, performance, and safety, causing unsafe or broken skills to be trusted, deployed, or approved based on misleading evidence.

Intent-Code Divergence

Medium
Confidence
96% confidence
Finding
This script advertises comparison testing but never executes the requested skill or baseline; it returns fixed simulated metrics regardless of input. In a testing framework, this can mislead users into trusting fabricated performance results, causing unsafe deployment decisions or masking regressions and security issues in the actual skill.
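For contrast with the finding above, a comparison that measures anything has to run both commands. The following is a minimal sketch, assuming the skill and the baseline can each be invoked as a command line. The `measure` and `compare` helpers and the result fields are hypothetical names chosen for illustration; they are not taken from the scanned skill.

```python
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd once and return its real exit code and wall-clock duration."""
    start = time.perf_counter()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return {"cmd": cmd, "exit_code": proc.returncode, "seconds": elapsed}


def compare(skill_cmd, baseline_cmd):
    """Measure both commands; every metric comes from a run, not a constant."""
    skill = measure(skill_cmd)
    baseline = measure(baseline_cmd)
    return {
        "skill": skill,
        "baseline": baseline,
        "both_succeeded": skill["exit_code"] == 0 and baseline["exit_code"] == 0,
    }


if __name__ == "__main__":
    # Stand-in commands for illustration only (hypothetical targets).
    ok = [sys.executable, "-c", "print('ok')"]
    failing = [sys.executable, "-c", "import sys; sys.exit(2)"]
    print(compare(ok, failing))
```

Because the metrics depend on the commands passed in, changing the input changes the result, which is the property the scanned script lacks.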

Intent-Code Divergence

Medium
Confidence
93% confidence
Finding
The script advertises functionality verification but never invokes the actual target skill; it only validates a hard-coded simulated response. This can create a false sense of assurance, allowing broken or unsafe skills to be marked as working and to pass release gates without real testing.

Description-Behavior Mismatch

Medium
Confidence
95% confidence
Finding
This file is presented as an automated functionality tester, but its logic only checks canned simulated output and uses weak string-based field validation. In a testing framework context, that is dangerous because it can conceal regressions, integration failures, or policy-violating behavior while reporting success to developers or CI systems.
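To illustrate why string-based field validation is weak, here is a minimal sketch that parses the output and checks fields structurally instead. It assumes the skill emits a JSON object; the `REQUIRED_FIELDS` schema and the `validate_response` helper are hypothetical and not part of the scanned skill.

```python
import json

# Hypothetical schema: field name -> expected Python type after parsing.
REQUIRED_FIELDS = {"status": str, "result": dict}


def validate_response(raw):
    """Return a list of problems; an empty list means the response is valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            problems.append(f"missing field: {name}")
        elif not isinstance(data[name], expected_type):
            problems.append(f"wrong type for field: {name}")
    return problems
```

A substring test such as `"status" in raw` accepts the output `{"error": "status unavailable"}`, because the word appears inside a value. The parsed check reports both required fields as missing for the same input.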

Intent-Code Divergence

High
Confidence
99% confidence
Finding
The script writes a report with fixed pass/fail counts and performance metrics regardless of the actual test outcomes, which creates falsified assurance about skill quality. In a testing framework, this can mislead operators into deploying broken or unsafe skills and can conceal failures that should block release.
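A report that reflects real outcomes derives every number from the collected results. The sketch below assumes each result is a dictionary with a `name` and an `exit_code`; that record shape and the `build_report` helper are hypothetical, chosen only to show the idea.

```python
import json


def build_report(results):
    """Summarize collected results; every count is computed from the input."""
    passed = sum(1 for r in results if r["exit_code"] == 0)
    failed = len(results) - passed
    return {
        "total": len(results),
        "passed": passed,
        "failed": failed,
        # Names of failing tests, so a failure cannot disappear from the report.
        "failures": [r["name"] for r in results if r["exit_code"] != 0],
    }


if __name__ == "__main__":
    # Illustrative sample data, not output from the scanned skill.
    sample = [
        {"name": "functionality", "exit_code": 0},
        {"name": "comparison", "exit_code": 1},
    ]
    print(json.dumps(build_report(sample), indent=2))
```

With no results collected, the report shows zero tests rather than a fixed pass count, which makes an empty run visible instead of reassuring.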

Intent-Code Divergence

Medium
Confidence
98% confidence
Finding
Every test command is followed by '|| true', which suppresses failures and allows the runner to continue as if tests succeeded. This undermines the integrity of the test process and, combined with the generated report, can hide malfunctioning or insecure skills from maintainers.
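In a POSIX shell, `cmd || true` always yields exit status 0, so the runner never sees a failure. As a contrast, here is a minimal sketch of a runner that keeps each real exit code and fails the whole run if any command fails. The `run_all` helper is a hypothetical name, and the commands in the usage section are stand-ins.

```python
import subprocess
import sys


def run_all(commands):
    """Run every command, keep each real exit code, and compute an overall status."""
    exit_codes = [
        subprocess.run(cmd, capture_output=True).returncode for cmd in commands
    ]
    # Unlike `cmd || true`, a nonzero code is preserved and fails the whole run.
    overall = 0 if all(code == 0 for code in exit_codes) else 1
    return exit_codes, overall


if __name__ == "__main__":
    # Stand-in commands for illustration only.
    commands = [
        [sys.executable, "-c", "pass"],
        [sys.executable, "-c", "import sys; sys.exit(3)"],
    ]
    codes, overall = run_all(commands)
    print(codes, overall)
    sys.exit(overall)
```

The runner still executes every command after a failure, so it keeps the "continue on error" behavior that `|| true` was presumably meant to provide, without discarding the result.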

VirusTotal

66/66 vendors flagged this skill as clean.
