Security checks across static analysis, malware telemetry, and agentic risk
Overview
The skill is mostly a coherent MCP-building guide, but its evaluation harness can let Claude automatically call any exposed MCP tool, including potentially write-capable tools, without enforcing read-only limits.
Install or read this skill only if you are comfortable reviewing and controlling its evaluation scripts. If you run them, use a test MCP server or a read-only account, avoid sensitive production data, and add tool allowlists or confirmation prompts before allowing Claude to call tools that can modify external services.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
If run against an MCP server connected to real accounts or systems, Claude could call tools that change data or perform external actions during an evaluation.
The harness exposes all tools from the connected MCP server to Claude and automatically executes the model-selected tool calls. The documentation says evaluations should be read-only, but this code does not enforce read-only/destructive annotations or require per-call approval.
tools = await connection.list_tools() ... while response.stop_reason == "tool_use": ... tool_result = await connection.call_tool(tool_name, tool_input)
Run evaluations only against test data or read-only servers, add an allowlist or annotation filter for read-only tools, and require explicit confirmation before any write-capable tool call.
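One way to apply the recommendation above is to filter the tool list before it reaches the model and to gate each call. The following is a minimal sketch, not the harness's actual code. It assumes each tool is a plain dict with a "name" key and an optional "annotations" dict carrying the MCP readOnlyHint field. The function names select_read_only_tools and approve_call are hypothetical. Annotations are self-reported by the server, so an explicit allowlist is the stronger control when the server is not fully trusted.

```python
from typing import Any, Callable, Iterable, Optional

# Assumed tool shape: {"name": str, "annotations": {"readOnlyHint": bool, ...}}
Tool = dict[str, Any]


def select_read_only_tools(
    tools: Iterable[Tool], allowlist: Optional[set[str]] = None
) -> list[Tool]:
    """Return only the tools that may be exposed to the model.

    With an allowlist, only explicitly named tools pass. Without one,
    a tool passes only if it is annotated readOnlyHint=True.
    """
    selected = []
    for tool in tools:
        if allowlist is not None:
            if tool["name"] in allowlist:
                selected.append(tool)
            continue
        annotations = tool.get("annotations") or {}
        if annotations.get("readOnlyHint") is True:
            selected.append(tool)
    return selected


def approve_call(
    tool_name: str,
    permitted: set[str],
    confirm: Callable[[str], bool],
) -> bool:
    """Allow permitted tools outright; ask the operator about anything else."""
    if tool_name in permitted:
        return True
    return confirm(tool_name)
```

In a harness like the one described, the filtered list would replace the raw result of list_tools(), and approve_call would run before each call_tool invocation, with confirm wired to an interactive prompt.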
A malicious or untrusted MCP server command could run code on the user's machine.
The helper can launch a user-specified local command for stdio MCP servers. This is normal MCP plumbing, but it means running the evaluator can execute local server code.
return stdio_client(StdioServerParameters(command=self.command, args=self.args, env=self.env))
Only use stdio mode with MCP servers and command lines you trust, and prefer isolated test environments for evaluations.
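A simple guard for the stdio case is to resolve the server command to an absolute path and compare it against a list of executables the operator has reviewed. This is a sketch under stated assumptions: resolve_trusted_command and the trusted_paths set are hypothetical and not part of the skill, and the check covers only the executable, not its arguments or environment.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def resolve_trusted_command(
    command: str,
    trusted_paths: set[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Resolve a command on PATH and refuse it unless it is explicitly trusted."""
    resolved = which(command)
    if resolved is None:
        raise FileNotFoundError(f"command not found: {command}")
    # Follow symlinks so a trusted name cannot point at an unreviewed binary.
    absolute = str(Path(resolved).resolve())
    if absolute not in trusted_paths:
        raise PermissionError(f"untrusted MCP server command: {absolute}")
    return absolute
```

The returned absolute path, rather than the bare command name, would then be passed to the stdio launcher so that a later PATH change cannot substitute a different binary.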
Running the helper may use the user's Anthropic account and associated API quota.
The evaluator uses the Anthropic SDK, which typically requires an Anthropic API credential from the user's environment. The registry metadata declares no required environment variables, so users may not notice this dependency until running the script.
client = Anthropic()
Document the required Anthropic API key and expected billing/usage behavior before running evaluations.
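Until the dependency is documented, a preflight check can make it visible before any billable request is sent. The sketch below assumes the credential is supplied through the ANTHROPIC_API_KEY environment variable, which the Anthropic Python SDK reads by default. The helper name require_anthropic_key is hypothetical.

```python
import os
from typing import Mapping, Optional


def require_anthropic_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Fail early, with a clear message, if no Anthropic API key is configured."""
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set. Running the evaluation calls the "
            "Anthropic API and consumes quota on the account that owns the key."
        )
    return key
```

Calling this before constructing the client turns a late SDK authentication error into an explicit notice about the credential and its billing implications.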
Data returned by MCP tools, potentially including private service data, may be sent to Anthropic and may also be summarized in the evaluation output.
MCP tool outputs are added to the conversation sent to Anthropic for the next model call. This is expected for a Claude-based evaluator, but it crosses a data boundary.
messages.append({"role": "user", "content": [{"type": "tool_result", ... "content": tool_response}]}) ... client.messages.create(... messages=messages, tools=tools)
Use non-sensitive fixtures where possible, redact sensitive results, and make sure users understand which provider receives tool outputs.
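Redaction can be applied to each tool response before it is appended to the conversation. The following is an illustrative sketch only: the redact function and both patterns are assumptions, cover just email addresses and a few common token prefixes, and would need to be extended for the data a given MCP server actually returns.

```python
import re

# Illustrative patterns; real deployments need rules matched to their own data.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
TOKEN = re.compile(r"\b(?:sk|ghp|xoxb)[-_][A-Za-z0-9_-]{8,}\b")


def redact(text: str) -> str:
    """Mask email addresses and token-like strings in a tool response."""
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    return TOKEN.sub("[REDACTED_TOKEN]", text)
```

Pattern-based redaction misses anything it was not written for, so it complements non-sensitive fixtures and does not replace them.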
A future package version could change behavior or introduce incompatibilities when users manually install the helper dependencies.
The helper script dependencies use lower-bound version constraints rather than pinned versions. There is no install spec, so this is not automatically installed, but manual installs may resolve to newer unreviewed package versions.
anthropic>=0.39.0
mcp>=1.1.0
Pin and verify dependency versions for reproducible evaluation runs.
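Once versions are pinned, the installed environment can be checked against the pins before an evaluation run. This is a minimal sketch: find_pin_mismatches is a hypothetical helper, and the version numbers in the usage note are placeholders taken from the lower bounds above, not reviewed or recommended releases.

```python
from importlib import metadata
from typing import Callable, Optional


def find_pin_mismatches(
    pins: dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> dict[str, tuple[str, Optional[str]]]:
    """Map each package whose installed version differs from its pin
    to a (pinned, installed) pair; installed is None if the package is absent."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

For example, find_pin_mismatches({"anthropic": "0.39.0", "mcp": "1.1.0"}) returns an empty dict only when exactly those placeholder versions are installed.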
