Swarm

Security checks across malware telemetry and agentic risk

Overview

Swarm appears purpose-built for parallel LLM work, but it exposes powerful unauthenticated local services and stores/transmits prompt data in ways users should review before installing.

Install only if you are comfortable running a local LLM orchestration daemon with provider API keys and outbound network access. Bind services to localhost or firewall them, avoid sending secrets or regulated data in prompts, periodically clear cache/metrics, review stored files under ~/.config/clawdbot, and be cautious with Docker cluster mode or benchmark scripts because they can expose task APIs or consume provider quota.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (39)

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The publishing guide says it is for the 'swarm' skill, but repeatedly points maintainers to the `node-scaling` source tree, config path, and GitHub repository. This kind of identity/path confusion can cause a maintainer or agent to publish the wrong documentation under the `swarm` slug, leading to supply-chain style mispublication, accidental overwrite, or cross-project tampering.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The script accesses a local credential file in the user's home directory as a fallback, which expands its privilege surface beyond what a simple benchmark appears to require. Even though it is 'only' reading an API key, undisclosed filesystem access to sensitive material is risky in agent/skill contexts because users may run it with ambient permissions they did not expect to grant.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The installer branding and usage text refer to 'Node Scaling' for Clawdbot, while the submitted skill is named 'swarm'. This identity mismatch can mislead users about what is being installed and lowers trust boundaries, making social-engineering or supply-chain substitution attacks easier because users may authorize execution under false assumptions.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The script installs a different project ('node-scaling' from a GitHub repository into Clawdbot skill paths) than the manifested 'swarm' skill the user expects. Installing software other than the declared package is a serious integrity violation because it can cause users or automated systems to run unreviewed code with local user privileges under the guise of a different skill.

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The post-install messaging claims 'Swarm works best...' even though the script previously cloned and configured 'node-scaling' in Clawdbot directories. This inconsistent messaging can conceal what code was actually installed and encourages users to integrate the wrong software into AGENTS.md and TOOLS.md, expanding the blast radius of a misrepresented installation.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The daemon exposes DELETE /cache with no authentication or authorization checks, allowing any party that can reach the service to clear cached prompt state. In the context of a long-running local daemon with permissive CORS, this can enable unauthorized destructive actions and denial of service against cache-dependent workflows.
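One way to close this gap is to require a shared secret on destructive routes. The sketch below is illustrative only and is not taken from the skill's code: `isAuthorized`, `guardCacheDelete`, and the bearer-token scheme are hypothetical names and assumptions, and only Node's built-in `crypto` module is used.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: true only when the request carries the expected bearer token.
function isAuthorized(headers, expectedToken) {
  const header = headers['authorization'] || '';
  const match = /^Bearer (.+)$/.exec(header);
  if (!match || !expectedToken) return false;
  // Hash both values so timingSafeEqual always compares equal-length buffers.
  const given = crypto.createHash('sha256').update(match[1]).digest();
  const expected = crypto.createHash('sha256').update(expectedToken).digest();
  return crypto.timingSafeEqual(given, expected);
}

// Hypothetical route guard: returns the HTTP status a DELETE /cache handler should send.
function guardCacheDelete(req, expectedToken) {
  if (req.method !== 'DELETE' || req.url !== '/cache') return 404;
  return isAuthorized(req.headers, expectedToken) ? 204 : 401;
}
```

A handler would clear the cache only when the guard returns 204. The token itself would need to be generated per install and kept out of any CORS-readable response.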

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The diagnostics module gathers broad host profiling data and scans for API credentials across multiple providers, which exceeds the narrow user-facing description of simply offloading LLM work to Gemini Flash workers. Even if intended for diagnostics, this creates unnecessary collection of sensitive local context and increases privacy and trust risk if the report, logs, or downstream code are ever exposed or repurposed.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: This code enumerates and validates API keys for Gemini, OpenAI, Anthropic, and Groq, despite the skill being presented as Gemini-focused. Accessing unrelated credentials violates least privilege and can surprise users by probing for secrets they did not expect this skill to touch.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The module persistently writes detailed execution metadata, including task descriptions, warnings, errors, token usage, and performance data, to a JSONL file under the user's home directory without any minimization, redaction, retention controls, or explicit consent flow. In an LLM orchestration skill, task descriptions and error content can easily contain sensitive prompts, code, secrets, internal project details, or user data, so local persistence beyond the core advertised function materially increases data exposure risk.
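The minimization this finding calls for could look roughly like the sketch below. It is an assumption-laden illustration, not the skill's actual logger: the field names in `SAFE_FIELDS`, the `description` property, and both function names are hypothetical. Free-text fields are replaced by a length and a short hash, and anything not on the allowlist (for example error text) is dropped before the JSONL line is built.

```javascript
const crypto = require('node:crypto');

// Hypothetical allowlist of numeric/enum fields considered safe to persist.
const SAFE_FIELDS = ['taskId', 'durationMs', 'tokensUsed', 'status'];

// Copy only allowlisted fields; never persist free text verbatim.
function minimizeMetric(record) {
  const out = {};
  for (const key of SAFE_FIELDS) {
    if (key in record) out[key] = record[key];
  }
  if (typeof record.description === 'string') {
    // Store the length and a truncated hash so entries can be correlated without the content.
    out.descriptionLength = record.description.length;
    out.descriptionHash = crypto
      .createHash('sha256')
      .update(record.description)
      .digest('hex')
      .slice(0, 12);
  }
  return out;
}

// Serialize one minimized record as a single JSONL line.
function toJsonlLine(record) {
  return JSON.stringify(minimizeMetric(record)) + '\n';
}
```

Retention controls (for example, rotating or deleting the file after a fixed period) would still need to be handled separately.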

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The edge-case logger stores arbitrary description and context objects directly to disk, which can capture highly sensitive runtime state, prompt contents, stack traces, credentials, or user-supplied data. Because the context field is unconstrained and written verbatim, this creates an open-ended local data collection sink that is broader than the skill's stated purpose and increases the chance of accidental sensitive-data retention.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The file behavior materially diverges from the skill's stated purpose: instead of a generic cost-saving swarm helper, it implements a specialized military-transition URL analyzer. This kind of undisclosed scope expansion is dangerous because it can mislead users about what data is processed and what actions the skill performs, reducing informed consent and trust boundaries.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The code embeds and later persists a detailed transitioning-service-member profile, including retirement timeline, rank, and interests, even though the manifest does not justify collection of this sensitive contextual data. This creates unnecessary privacy exposure and can disclose personal or employment-related information if logs, outputs, or downstream services are accessed.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The skill performs arbitrary external fetches on URLs loaded from a local file, a network capability not disclosed by the manifest. This is risky because it enables server-side request behavior against attacker-controlled destinations, potentially exposing internal services, local network resources, or creating unanticipated outbound traffic and data-transfer paths.
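A pre-fetch check along the following lines would reduce the request-forgery exposure described here. It is a minimal sketch under stated assumptions: `isFetchAllowed` and `isPrivateIPv4` are hypothetical helpers, the policy (http/https only, no loopback or private IPv4 literals, no IPv6 literals) is deliberately conservative, and it does not catch hostnames that resolve to internal addresses through DNS.

```javascript
const net = require('node:net');

// True for loopback, link-local, and RFC 1918 private IPv4 ranges.
function isPrivateIPv4(ip) {
  const [a, b] = ip.split('.').map(Number);
  return a === 0 || a === 10 || a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254);
}

// Hypothetical gate to call before fetching a URL loaded from a file.
function isFetchAllowed(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== 'https:' && url.protocol !== 'http:') return false;
  // URL.hostname keeps the brackets around IPv6 literals; strip them for net.isIP.
  const host = url.hostname.replace(/^\[|\]$/g, '');
  if (host === 'localhost' || host.endsWith('.localhost')) return false;
  const kind = net.isIP(host);
  if (kind === 4) return !isPrivateIPv4(host);
  if (kind === 6) return false; // conservative: reject every IPv6 literal
  return true;
}
```

A stricter variant would resolve the hostname first and apply the same range check to the resolved address, or restrict fetches to an explicit allowlist of domains.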

Context-Inappropriate Capability

Low

Confidence: 93%
Finding: The script writes analysis results to local disk, including user context and resource assessments, without that storage behavior being described by the manifest. Even if local-only, undisclosed persistence increases the chance of sensitive data exposure through backups, shared accounts, or later unintended reuse.

Missing User Warnings

Medium

Confidence: 91%
Finding: The README explicitly promotes enabling Gemini web search and sending prompts through a local daemon that forwards work to external LLM/search providers, but it does not clearly warn users that prompt contents may be transmitted to Google or other third-party APIs. In an agent skill context, users may pass sensitive business data, research targets, or internal text into these workflows, creating a real risk of unintended data disclosure.

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill advertises web research and HTTP endpoints that accept prompts, topics, and task data, but does not visibly warn that this information may be sent to external providers or used to fetch third-party content. In practice, users may submit proprietary, regulated, or personal data without understanding that it can leave the local environment and be processed by external services.

Missing User Warnings

Medium

Confidence: 93%
Finding: The documented cache persists LLM request/response material to disk across daemon restarts without a clear warning that sensitive prompts, outputs, or derived data may be retained locally. This increases exposure to local compromise, multi-user leakage, backups, and accidental retention beyond the user's expectations.

Missing User Warnings

Medium

Confidence: 86%
Finding: The script sends fetched webpage text to an external Gemini service without any explicit warning or consent flow at the point of transmission. In an agent skill context, this is a data-handling risk because users may not realize externally sourced or potentially sensitive content is being forwarded to a third-party model provider.

Missing User Warnings

Medium

Confidence: 89%
Finding: This benchmark script performs real write operations to a local/file-backed blackboard and to a Supabase backend, then conditionally issues a bulk delete against rows matching 'bench-%'. Even though this appears intended for testing, it can modify or remove real data when run in a configured environment, and the only warning is an inline comment rather than an explicit runtime confirmation or safe test isolation.

Missing User Warnings

Medium

Confidence: 91%
Finding: The benchmark includes an autonomous research test that instantiates a coordinator with external research enabled and then executes it, which can consume network bandwidth, third-party API quota, and potentially billable resources without a clear up-front warning or explicit confirmation. In a benchmark script, users may reasonably expect local performance testing, so silently triggering remote calls increases the risk of accidental cost and unintended outbound activity.

Missing User Warnings

Medium

Confidence: 95%
Finding: The wizard writes the API key to a local plaintext file without clearly warning the user beforehand or obtaining explicit consent for disk persistence. Even with restrictive file mode settings, local secret storage increases exposure to credential theft through backups, filesystem compromise, multi-user access mistakes, or accidental disclosure.

Missing User Warnings

Medium

Confidence: 93%
Finding: The wizard transmits the user-supplied API key to a remote provider for validation without a clear upfront disclosure that validation will make a network request using the secret. This can violate user expectations, create privacy/compliance issues, and cause unintended credential exposure if users believe input remains local during setup.

Missing User Warnings

Medium

Confidence: 79%
Finding: The 'clear' command permanently removes the entire metrics directory with 'fs.rmSync(dir, { recursive: true })' and performs no confirmation, dry run, or safety interlock. In a CLI context this can lead to accidental destruction of local data, especially if invoked by mistake, scripted incorrectly, or run in environments where users do not expect a stats viewer to have destructive behavior.
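A safety interlock for this command might be sketched as follows. The `fs.rmSync(dir, { recursive: true })` call is the one quoted in the finding; the `clearMetrics` wrapper and its `yes` and `dryRun` options are hypothetical and only illustrate the confirmation and dry-run behavior the finding says is missing.

```javascript
const fs = require('node:fs');

// Hypothetical wrapper: deletes the metrics directory only with explicit confirmation.
function clearMetrics(dir, { yes = false, dryRun = false } = {}) {
  if (!fs.existsSync(dir)) return { removed: false, reason: 'missing' };
  const entries = fs.readdirSync(dir);
  // Dry run: report what would be deleted without touching the disk.
  if (dryRun) return { removed: false, reason: 'dry-run', entries };
  // Refuse to delete unless the caller passed an explicit confirmation flag.
  if (!yes) return { removed: false, reason: 'confirmation-required', entries };
  fs.rmSync(dir, { recursive: true });
  return { removed: true, entries };
}
```

A CLI could map a `--yes` flag to `yes: true` and print the `entries` list otherwise, so an accidental invocation shows what is at stake instead of deleting it.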

Missing User Warnings

Medium

Confidence: 90%
Finding: The /tasks endpoint accepts arbitrary prompts and queues them for distribution to connected workers, but the API provides no notice, consent mechanism, or policy boundary indicating that submitted content will be forwarded to other nodes. In a distributed LLM orchestration skill, this creates a real confidentiality and data-handling risk because users may submit secrets, proprietary data, or regulated content under the assumption it remains local to the coordinator.

Missing User Warnings

Medium

Confidence: 95%
Finding: The worker sends arbitrary task prompts to Google's Gemini API via `model.generateContent(task.prompt)` without any indication in this file that prompts may contain sensitive or user-provided data. In this swarm context, tasks are fetched remotely from a coordinator and forwarded wholesale to an external third-party service, which creates a real data exposure risk if prompts contain secrets, proprietary data, or regulated information.
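One possible mitigation is a pre-send check that refuses to forward prompts that look like they contain credentials. The sketch below is a heuristic illustration only: the pattern list, `findLikelySecrets`, and `checkPromptBeforeSend` are hypothetical, and pattern matching of this kind will miss many secrets and can raise false positives.

```javascript
// Hypothetical, non-exhaustive patterns for obvious credential shapes.
const SECRET_PATTERNS = [
  { name: 'private-key-block', re: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
  { name: 'aws-access-key-id', re: /\bAKIA[0-9A-Z]{16}\b/ },
  { name: 'bearer-token', re: /\bBearer\s+[A-Za-z0-9._~+\/-]{20,}/ },
  { name: 'key-assignment', re: /\b(?:api[_-]?key|secret|password)\s*[:=]\s*\S{8,}/i },
];

// Return the names of all patterns that match the prompt text.
function findLikelySecrets(prompt) {
  return SECRET_PATTERNS.filter((p) => p.re.test(prompt)).map((p) => p.name);
}

// Hypothetical gate a worker could call before model.generateContent(task.prompt).
function checkPromptBeforeSend(prompt) {
  const hits = findLikelySecrets(prompt);
  return { ok: hits.length === 0, hits };
}
```

A worker using this gate would skip or reject the task when `ok` is false and report the matched pattern names back to the coordinator rather than the prompt itself.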

VirusTotal

65/65 vendors flagged this skill as clean.
