Asin Governance

Security checks across malware telemetry and agentic risk

Overview

The skill is not clearly harmful, but it needs review because it handles credentials, tokens, network access, and local audit data while several advertised safety controls are weaker than described.

Install only after reviewing the heartbeat and handshake components. Use it as prototype governance code, not a hard security boundary; avoid running it with real Moltbook credentials or exposing the exchange server until token handling, revocation, oracle enforcement, force overrides, and log retention are tightened.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (21)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill documentation describes capabilities that include file read/write, network access, and shell execution, but it does not declare permissions or clearly bound those powers. In an agent ecosystem, this creates a dangerous trust gap: operators may approve or invoke the skill as a governance component while it can also manipulate local files, expose services, and execute commands with little transparency.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The declared purpose frames the skill as a safety/governance layer, but the documented behavior expands into token issuance, secret handling, HTTP endpoint exposure, session hydration, Moltbook interaction, heartbeat monitoring, and adversarial tooling. That mismatch is risky because governance components are highly trusted; hiding or under-describing operational and credential-handling behavior can lead to accidental deployment in sensitive environments with more privilege than reviewers expect.

Intent-Code Divergence

High

Confidence: 99%
Finding: The code allows callers to bypass human approval gating simply by passing force=True, but it performs no authorization, authentication, or provenance check to ensure the caller is actually permitted to override policy. In a governance component that is supposed to enforce pre-flight safety checks, this turns a mandatory control into an optional client-side flag, enabling unauthorized high-risk session exchanges and downstream hydration.
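One way to close this gap is to replace the boolean flag with an approval artifact that the hydration path can verify. The following is a minimal sketch, not the skill's code: `APPROVAL_KEY`, `sign_approval`, `override_allowed`, and this `hydrate` signature are all hypothetical names, and the key is assumed to be held only by a human-approval service.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical: a key held only by the human-approval service, never by callers.
APPROVAL_KEY = b"replace-with-a-secret-from-a-secure-store"


def sign_approval(token_id: str, approver: str, key: bytes = APPROVAL_KEY) -> str:
    """Produce an approval signature bound to one token and one approver."""
    message = f"{token_id}:{approver}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def override_allowed(token_id: str, approver: str, signature: str,
                     key: bytes = APPROVAL_KEY) -> bool:
    """Return True only if the signature matches this token and approver."""
    expected = sign_approval(token_id, approver, key)
    return hmac.compare_digest(expected, signature)


def hydrate(token_id: str, checks_passed: bool,
            approval: Optional[dict] = None) -> dict:
    """Hydrate when checks pass, or when a valid signed approval is supplied."""
    if checks_passed:
        return {"success": True, "forced": False}
    if approval is not None:
        approver = approval.get("approver", "")
        signature = approval.get("signature", "")
        if override_allowed(token_id, approver, signature):
            return {"success": True, "forced": True, "approver": approver}
    return {"success": False, "error": "override not authorized"}
```

Because the signature covers the token identifier, an approval issued for one session cannot be replayed to force a different one.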

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The code claims to enforce entropy budget checks before hydration, but the implementation explicitly always permits hydration when force is not set and never verifies remaining budget. In a governance or safety-control component, this creates a policy-bypass condition where resource or rate limits exist only in documentation, allowing unrestricted session creation and weakening downstream controls.
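An enforced budget check could look roughly like the sketch below. The report does not say how the skill measures entropy or what a hydration costs, so `EntropyBudget`, `try_spend`, `hydrate_with_budget`, and the numeric cost model are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EntropyBudget:
    """Hypothetical per-node budget; the unit and cost model are assumptions."""
    remaining: float

    def try_spend(self, cost: float) -> bool:
        """Deduct cost and return True, or leave the budget intact and return False."""
        if cost < 0:
            raise ValueError("cost must be non-negative")
        if cost > self.remaining:
            return False
        self.remaining -= cost
        return True


def hydrate_with_budget(budget: EntropyBudget, cost: float) -> dict:
    """Refuse hydration when the budget cannot cover the cost."""
    if not budget.try_spend(cost):
        return {"success": False, "error": "entropy budget exhausted"}
    return {"success": True, "remaining": budget.remaining}
```

The point of the sketch is only that the deny branch exists and is reachable; a real implementation would also need to persist the budget between calls.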

Intent-Code Divergence

Medium

Confidence: 99%
Finding: The docstring and surrounding comments state that oracle consultation is required, but the code only reads safety.json and never makes a meaningful allow/deny decision based on oracle results. This is a security control bypass: sessions that should require pre-flight safety approval are hydrated anyway, especially dangerous in a component described as a governance substrate for autonomous actions.
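A fail-closed gate would turn the contents of safety.json into an explicit allow or deny decision. The sketch below assumes a schema with `verdict` and `consensus` keys and a 0.8 threshold; the report does not describe the real file layout, so those names and the `oracle_allows` helper are hypothetical.

```python
import json
from pathlib import Path


def oracle_allows(safety_path: Path, min_consensus: float = 0.8) -> tuple:
    """Return (allowed, reason). Any missing or malformed input denies."""
    try:
        data = json.loads(safety_path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        # Covers a missing file, unreadable bytes, and invalid JSON.
        return False, f"oracle result unavailable: {exc.__class__.__name__}"
    if not isinstance(data, dict):
        return False, "oracle result malformed"
    verdict = data.get("verdict")        # assumed key
    consensus = data.get("consensus")    # assumed key
    if verdict != "safe":
        return False, f"oracle verdict is {verdict!r}"
    is_number = isinstance(consensus, (int, float)) and not isinstance(consensus, bool)
    if not is_number or consensus < min_consensus:
        return False, "consensus below threshold"
    return True, "oracle approved"
```

Hydration would then proceed only when the first element of the returned tuple is True.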

Intent-Code Divergence

High

Confidence: 99%
Finding: The audit log unconditionally records oracle_result as safe with perfect consensus even when no oracle consult occurred. This falsifies security telemetry and can mislead operators, compliance processes, and incident responders into trusting actions that bypassed required controls, turning the audit trail into a mechanism for concealment of policy violations.
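An honest audit entry would derive its oracle fields from what actually ran. A minimal sketch, assuming a hypothetical `audit_record` helper and an oracle result dictionary with `verdict` and `consensus` keys (neither is confirmed by the report):

```python
import json
import time
from typing import Optional


def audit_record(session_id: str, oracle: Optional[dict], forced: bool) -> str:
    """Build one JSON audit line that reflects what actually happened."""
    if oracle is None:
        # No consult took place, so say so instead of claiming "safe".
        oracle_result, consensus = "not_consulted", None
    else:
        oracle_result = oracle.get("verdict", "unknown")
        consensus = oracle.get("consensus")
    entry = {
        "ts": int(time.time()),
        "session_id": session_id,
        "oracle_result": oracle_result,
        "consensus": consensus,
        "forced": forced,
    }
    return json.dumps(entry, sort_keys=True)
```

Recording `forced` alongside the oracle outcome lets a reviewer find every session that bypassed the pre-flight check.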

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The module claims tokens require node_id and profile validation, but neither generate() nor validate() checks any profile registry or profiles.json. This means any arbitrary node_id can obtain a locally signed token as long as the caller can run the code, undermining identity binding and allowing unauthorized identities to mint apparently valid tokens.
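Binding tokens to registered identities would require a registry lookup before signing. The sketch below assumes profiles.json maps each node_id to an object with an `enabled` flag; that schema and the `load_profiles` and `require_known_node` helpers are hypothetical.

```python
import json
from pathlib import Path


def load_profiles(path: Path) -> dict:
    """Load the registry; an unreadable or malformed file yields an empty registry."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def require_known_node(node_id: str, profiles: dict) -> dict:
    """Return the node's profile, or raise so unknown identities cannot mint tokens."""
    profile = profiles.get(node_id)
    if not isinstance(profile, dict) or not profile.get("enabled", False):
        raise PermissionError(f"node_id {node_id!r} is not a registered, enabled profile")
    return profile
```

Calling `require_known_node` at the top of both token generation and validation would make an empty or missing registry deny every request rather than allow every request.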

Intent-Code Divergence

Medium

Confidence: 99%
Finding: The code provides revoke() and is_revoked(), but validate() never checks whether the token_id has been revoked even though it claims to enforce all security constraints. As a result, revoked tokens remain usable until expiration, defeating incident response and rollback in a governance component where revocation is expected to immediately invalidate compromised tokens.
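The missing step is a revocation lookup inside validation itself. A minimal sketch follows; the method names `revoke`, `is_revoked`, and `validate` come from the finding, while the `TokenStore` class, its in-memory set, and the `expires_at` parameter are assumptions, and signature verification is left out.

```python
import time


class TokenStore:
    """Hypothetical in-memory store; the skill's real storage is not shown in the report."""

    def __init__(self) -> None:
        self._revoked = set()

    def revoke(self, token_id: str) -> None:
        self._revoked.add(token_id)

    def is_revoked(self, token_id: str) -> bool:
        return token_id in self._revoked

    def validate(self, token_id: str, expires_at: float, now=None) -> tuple:
        """Check revocation first, then expiry. Signature checks are omitted here."""
        now = time.time() if now is None else now
        if self.is_revoked(token_id):
            return False, "revoked"
        if now >= expires_at:
            return False, "expired"
        return True, "ok"
```

With this ordering, a revoked token is rejected immediately even if it has not yet expired.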

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The script performs live monitoring of a third-party social account, including fetching home activity, DMs metadata, and social feed status, which goes beyond a narrowly described governance/audit role. In an agent skill context, this expands the capability surface from local safety checks into external account surveillance, creating privacy and scope-creep risk if invoked automatically or without explicit user awareness.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The code loads Moltbook credentials from environment or disk and uses them to access a third-party API, but that capability is not clearly necessary for the stated governance purpose. In a federated agent ecosystem, undeclared credential use is dangerous because it grants external account visibility and can be repurposed later for broader actions without changing the trust boundary perceived by users.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The file is presented as a general-purpose governance and safety substrate for agent ecosystems, but the actual policy is narrowly scoped to social-platform behaviors like posting, browsing, karma, and profile updates. This mismatch can create a false sense of comprehensive protection, leaving non-social autonomous actions effectively ungated or incorrectly approved in broader agent contexts.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The karma-impact rule introduces platform reputation optimization into a safety policy, which mixes engagement objectives with governance enforcement. In practice, this can bias the system toward protecting account reputation rather than preventing harmful or unauthorized actions, weakening the integrity of safety decisions.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The consensus model estimates whether peers would accept an action using karma, engagement, timing, and culture-match signals rather than assessing legality, authorization, data sensitivity, or user safety. This makes socially acceptable but unsafe actions more likely to be approved, especially in an autonomous system that relies on this model as governance input.

Intent-Code Divergence

Low

Confidence: 91%
Finding: The script presents itself as testing whether edge cases and injection attempts pass safety checks, but multiple important checks are explicitly stubbed or simulated to always allow. This can create false assurance: operators may rely on the report to assess governance coverage even though rate limits, context validation, deduplication, transport/domain checks, and impact modeling are not actually exercised.
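A test report can avoid false assurance by giving stubbed checks their own outcome instead of counting them as passes. The sketch below is illustrative only; the `Outcome` enum, the `summarize` helper, and the check names in the test data are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    FAILED = "failed"
    NOT_EXERCISED = "not_exercised"  # the check is stubbed or simulated


def summarize(results: dict) -> dict:
    """Count outcomes; 'complete' is True only if every check ran and passed."""
    counts = {outcome.value: 0 for outcome in Outcome}
    for outcome in results.values():
        counts[outcome.value] += 1
    counts["complete"] = (
        counts["passed"] > 0
        and counts["failed"] == 0
        and counts["not_exercised"] == 0
    )
    return counts
```

A run that stubs the rate-limit check would then report `complete` as False, which tells the operator that coverage is partial.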

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The script is presented as a sandbox/replay harness, but it accepts a --commit flag and prints that the action is approved for execution. Even though this file does not itself call the API, the mismatch between 'simulation-only' framing and a real commit path can mislead agents or operators into treating a potentially state-changing workflow as harmless, weakening safety assumptions around autonomous actions.

Missing User Warnings

Medium

Confidence: 88%
Finding: The documented CLI examples encourage generating and validating session tokens directly on the command line without warning that tokens may be captured in shell history, process listings, terminal scrollback, CI logs, or audit systems. Because the skill establishes ephemeral authenticated sessions, accidental token exposure could allow unauthorized session exchange or reuse within the token lifetime.
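One mitigation is to accept the token on standard input rather than as a command-line argument, which keeps it out of process listings and, when it is piped from a file rather than typed into the command, out of shell history. A minimal sketch, assuming a hypothetical `read_token` helper that is not part of the skill:

```python
import sys
from typing import Optional, TextIO


def read_token(stream: Optional[TextIO] = None) -> str:
    """Read one token line from the stream (stdin by default) and strip whitespace."""
    stream = sys.stdin if stream is None else stream
    value = stream.readline().strip()
    if not value:
        raise ValueError("no token supplied on standard input")
    return value
```

The documentation would still need to warn that tokens can end up in terminal scrollback and CI logs if they are echoed anywhere else.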

Missing User Warnings

Medium

Confidence: 95%
Finding: The CLI prints the generated token and compact token string directly to stdout, which can leak bearer-style credentials into terminal history, logs, CI output, shell capture files, or observability pipelines. In this governance/handshake context, exposed tokens may allow unauthorized validation, session hydration, or lateral access depending on what the token grants and how other components consume it.
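Instead of printing the token, the CLI could write it to an owner-only file and print just a redacted label. The sketch below uses only standard-library calls; `write_token_file` and `redact` are hypothetical helpers, and the owner-only mode takes effect on POSIX systems.

```python
import os
from pathlib import Path


def write_token_file(token: str, path: Path) -> Path:
    """Create the file with mode 0600 and write the token; fail if it already exists."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(token)
    return path


def redact(token: str, keep: int = 4) -> str:
    """Return a short label that identifies the token without revealing it."""
    if len(token) <= keep:
        return "*" * len(token)
    return token[:keep] + "..."
```

Using `O_EXCL` means an existing file is never silently overwritten, so a stale token file has to be removed deliberately.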

Missing User Warnings

Medium

Confidence: 91%
Finding: The endpoint persistently writes request metadata including node_id, client_ip, user-derived payload hashes, session identifiers, and outcome details to a predictable local file under the user's home directory. In a multi-skill or shared-host context, this creates a privacy and security exposure because sensitive operational metadata may be retained without minimization, consent, rotation, or access control guarantees, and can aid correlation, surveillance, or post-compromise lateral analysis.
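Data minimization could start by replacing raw identifiers with keyed hashes before the record is written. A minimal sketch: the field names `node_id` and `client_ip` come from the finding, while the `minimize` helper, the key handling, and the 16-character truncation are assumptions.

```python
import hashlib
import hmac

SENSITIVE_FIELDS = ("node_id", "client_ip")  # fields named in the finding


def minimize(record: dict, key: bytes) -> dict:
    """Return a copy with sensitive identifiers replaced by truncated keyed hashes."""
    out = dict(record)
    for field in SENSITIVE_FIELDS:
        if field in out:
            raw = str(out[field]).encode()
            digest = hmac.new(key, raw, hashlib.sha256).hexdigest()
            out[field] = digest[:16]
    return out
```

Entries from the same node still correlate within one key's lifetime, which keeps the log useful for debugging. Retention and access control remain separate concerns; rotation, for instance, could be handled with the standard library's `logging.handlers.RotatingFileHandler`.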

Missing User Warnings

Medium

Confidence: 84%
Finding: The script handles sensitive API credentials and transmits them in HTTP Authorization headers, but there is no explicit disclosure, prompting, or safeguards around this sensitive operation. In agent-executed skills, silent credential use is risky because users may invoke a seemingly local governance tool that actually authenticates to an external service and exposes account metadata.
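A safeguard could make credential use conditional on explicit consent, HTTPS, and a host allowlist. The sketch below is an illustration under stated assumptions: the `auth_headers` helper, the `user_consented` flag, the placeholder host, and the Bearer scheme are all hypothetical, since the report does not show the real API host or header format.

```python
from urllib.parse import urlsplit

# Hypothetical placeholder; substitute the real API host after review.
ALLOWED_HOSTS = frozenset({"api.example.invalid"})


def auth_headers(url: str, api_key: str, user_consented: bool) -> dict:
    """Return an Authorization header only for consented HTTPS calls to allowlisted hosts."""
    if not user_consented:
        raise PermissionError("credential use was not approved by the user")
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("refusing to send credentials over a non-HTTPS URL")
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"host {parts.hostname!r} is not allowlisted")
    return {"Authorization": f"Bearer {api_key}"}  # Bearer scheme is an assumption
```

Routing every outbound request through one function like this also gives the skill a single place to disclose which external service receives the credential.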

Instruction Override

High

Category: Prompt Injection
Content:
Args:
    token: Validated SessionToken
    force: Bypass constraint checks (requires human approval)
Returns:
    {"success": True, "manifest": {...}} or
Confidence: 95%
Finding: Bypass constraint

Instruction Override

High

Category: Prompt Injection
Content:
parser = argparse.ArgumentParser(description="ASH-0.2 Resonance Manifest Engine")
parser.add_argument("--token", required=True, help="Compact token string to hydrate")
parser.add_argument("--force", action="store_true", help="Bypass constraint checks (human approval required)")
parser.add_argument("--json", action="store_true", help="Output JSON only")
args = parser.parse_args()
Confidence: 95%
Finding: Bypass constraint

VirusTotal

All 65 of 65 vendors reported this skill as clean.