Review Agent

Security checks across malware telemetry and agentic risk

Overview

This appears to be a legitimate review-coach skill, but it needs Review because it processes confidential drafts with broad bot scopes, third-party LLM calls, persistent archives, and an external installer/patch outside the reviewed package.

Install only after reviewing the external GitHub installer and OpenClaw patch. Use it only on Feishu/Lark or WeCom with per-peer isolation, restrict bot document/drive scopes, audit delivery_targets.json, and assume drafts, profiles, conversations, dissent, and summaries may be sent to OpenRouter and stored locally unless you change the deployment.
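
One way to begin the delivery_targets.json audit is to flatten the file into location/value pairs so that every configured recipient is visible at a glance. The sketch below is deliberately schema-agnostic, because the file's actual layout is not described in this report; the helper names (`iter_leaves`, `audit_targets`) and the sample keys are hypothetical.

```python
import json
from typing import Any, Iterator


def iter_leaves(node: Any, path: str = "$") -> Iterator[tuple[str, Any]]:
    """Yield (location, value) for every scalar in a parsed JSON document."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_leaves(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_leaves(value, f"{path}[{index}]")
    else:
        yield path, node


def audit_targets(raw_json: str) -> list[tuple[str, str]]:
    """Return every string value with its location, for manual review."""
    leaves = iter_leaves(json.loads(raw_json))
    return [(location, value) for location, value in leaves if isinstance(value, str)]


# Hypothetical sample; the real file's keys may differ.
sample = '{"responder": {"email": "lead@example.com"}, "cc": ["boss@example.com"]}'
findings = audit_targets(sample)
for location, value in findings:
    print(f"{location} -> {value}")
```

Listing every string rather than guessing which keys hold recipients means an unexpected target cannot hide under an unfamiliar key name.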

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (28)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill declares no explicit permissions while instructing the agent to execute Python scripts, read and write session files, invoke external tooling for document/audio ingestion, and use networked Feishu/Lark and update-check functionality. This creates a hidden capability surface that weakens policy enforcement and user consent, especially because the skill may run in the main agent on channels without per-peer isolation.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The documented purpose is a review coach, but the described behavior extends into update checks, profile validation, config/credential resolution, attachment extraction/transcription, and external-tool orchestration. That mismatch can mislead operators about what the skill can access and transmit, increasing the risk of unauthorized data access, credential exposure, and unexpected network activity.

Description-Behavior Mismatch

Medium

Confidence: 83%
Finding: The skill is presented as a coaching workflow, but it also publishes reviewed material to shared Lark docs and messages multiple parties on close. This expands the data dissemination boundary beyond the requester and can leak sensitive draft content if sharing targets, permissions, or recipient assumptions are wrong.

Description-Behavior Mismatch

Medium

Confidence: 83%
Finding: The persona describes a document-merge stage that can generate a revised draft or directly edit external documents, which exceeds a narrow 'review coach / Q&A' expectation and creates capability expansion risk. In this skill context, that is meaningful because users may share sensitive pre-meeting materials expecting critique, not automated rewriting or mutation of source documents.

Intent-Code Divergence

High

Confidence: 90%
Finding: The file says the agent must not write answers for the requester, but later process guidance requires 'specific modification suggestions' and even replacement text. This contradiction is dangerous because it can cause the agent to drift from critique into authorship, undermining user intent boundaries and increasing the chance that sensitive business content is rewritten or fabricated by the system.

Intent-Code Divergence

High

Confidence: 92%
Finding: The Q&A loop instructs the agent to send direct rewrite guidance such as 'change X to Y,' which conflicts with the earlier hard rule to only ask questions and not provide answers. In a workflow handling drafts and decision briefs, this inconsistency can cause unauthorized ghostwriting behavior and make downstream recipients think the requester's own reasoning was supplied by the agent.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The documented design allows a requester-facing review tool to automatically distribute session outputs to third parties and archival locations, including bosses and local storage. That materially expands data flow beyond the direct user interaction boundary and creates a real confidentiality risk, especially because summaries, finals, conversations, annotations, and dissent may contain sensitive internal content.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The configuration explicitly supports sending review artifacts to boss/admin recipients and storing full session records, but that behavior is not clearly necessary for a 'review coach' and is not bounded by user authorization in the README. This creates a risk of covert escalation or oversharing of drafts, feedback, and internal deliberation to parties the requester may not expect.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The documented workflow goes beyond simple review coaching and includes persistent ingestion, archival, session closing, delivery to targets, and dashboard updates. Those side effects expand the trust boundary and can move or expose sensitive draft content in ways not clearly disclosed by the skill description, increasing the risk of unintended data handling and overreach.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The skill instructs fetching external documents and processing PDFs, images, and audio through local or external tools, including link-based retrieval from Lark and Google Drive. For a review coach, these capabilities materially widen access to user data and can pull in sensitive content from external systems without strong justification, scoping guarantees, or consent prompts.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The code assembles session metadata, responder/requester identifiers, profile excerpts, draft materials, findings, and dissent logs into a prompt and sends them to OpenRouter. For a review-coach skill handling potentially sensitive meeting prep and internal documents, this is a real data-exposure issue because content that appears local is transferred to a third-party LLM service without minimization in this file.

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: The script reads user metadata from a global `~/.review-agent/users/<open_id>/meta.json` store for both responder and requester, extending access beyond the current session directory. In a multi-user agent environment, cross-user metadata lookup increases privacy risk and broadens trust boundaries, especially when those names/profile details are later included in prompts sent externally.
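
A narrower lookup could refuse identifiers that are not part of the current session and strip profile fields before they reach any outbound prompt. The following is a minimal sketch under stated assumptions: the `users/<open_id>/meta.json` layout comes from the finding, while `meta_path`, `minimize`, and the field names in `PROMPT_SAFE_FIELDS` are hypothetical and not taken from the skill's code.

```python
from pathlib import Path

# Hypothetical allow-list; the real meta.json fields are not documented here.
PROMPT_SAFE_FIELDS = ("display_name", "role")


def meta_path(root: Path, open_id: str, participants: set[str]) -> Path:
    """Resolve a user's meta.json only if that user belongs to this session."""
    if open_id not in participants:
        raise PermissionError(f"{open_id!r} is not a participant in this session")
    return root / "users" / open_id / "meta.json"


def minimize(meta: dict, allowed: tuple[str, ...] = PROMPT_SAFE_FIELDS) -> dict:
    """Drop every field that is not explicitly allowed into an outbound prompt."""
    return {key: meta[key] for key in allowed if key in meta}


# Example: only the allow-listed field survives minimization.
profile = {"display_name": "Ana", "email": "ana@example.com"}
print(minimize(profile))
```

An allow-list is used instead of a deny-list so that a newly added metadata field stays out of prompts until someone decides it belongs there.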

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The ingest logic automatically fetches Google Docs content from URLs discovered in user-provided text, expanding the skill from local attachment normalization into remote content retrieval. In a review-coaching context, that broadens data access and can unintentionally pull in third-party or overprivileged content without clear boundary checks or user confirmation.
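
A confirmation step could separate link discovery from link retrieval, so that nothing is fetched until the requester approves each document. The sketch below illustrates that split only; `find_gdoc_ids` and `approved_fetches` are hypothetical names, and the skill's real ingest code may detect URLs differently.

```python
import re

# Matches Google Docs document links and captures the document ID.
_GDOC_URL = re.compile(r"https://docs\.google\.com/document/d/([A-Za-z0-9_-]+)")


def find_gdoc_ids(text: str) -> list[str]:
    """Return unique Google Docs IDs in order of first appearance."""
    seen: dict[str, None] = {}
    for match in _GDOC_URL.finditer(text):
        seen.setdefault(match.group(1), None)
    return list(seen)


def approved_fetches(text: str, approved_ids: set[str]) -> list[str]:
    """Keep only the documents the requester has explicitly approved."""
    return [doc_id for doc_id in find_gdoc_ids(text) if doc_id in approved_ids]


message = (
    "Draft: https://docs.google.com/document/d/abc123/edit "
    "Appendix: https://docs.google.com/document/d/xyz_9/edit "
    "Draft again: https://docs.google.com/document/d/abc123"
)
print(find_gdoc_ids(message))
print(approved_fetches(message, {"xyz_9"}))
```

With this shape, the discovered IDs can be shown to the requester first, and only the approved subset is passed to whatever retrieval tool the deployment uses.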

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The script sends the draft content, accepted findings, and responder/profile context to OpenRouter, an external third-party LLM service. In a review-coaching workflow, those materials can contain sensitive business plans, internal documents, personal notes, or manager-specific information, and the skill description does not clearly disclose this external data transfer, so users may unknowingly expose confidential data.

Missing User Warnings

Medium

Confidence: 88%
Finding: The persona allows a 'direct' mode that edits Lark or Google documents via API but does not require an explicit user-facing warning, confirmation step, or authorization boundary. In this skill context, direct external-document mutation is especially risky because review materials may be business-critical and shared across platforms; unintended edits could alter records, leak information, or overwrite user-owned content.

Missing User Warnings

Medium

Confidence: 94%
Finding: The workflow says summaries, final materials, and dissent logs are automatically delivered to the Responder and Requester without an explicit privacy or data-sharing notice. Because this skill processes drafts, proposals, agendas, and possibly sensitive attachments, automatic forwarding can expose confidential business information or internal disagreements to recipients the user did not fully realize would receive them.

Missing User Warnings

Medium

Confidence: 94%
Finding: The README describes distributing potentially sensitive materials through DM, email, and local archives without any explicit privacy warning, consent mechanism, or data-sharing disclosure. Users could reasonably believe they are interacting with a private coaching agent while the system retains and forwards full records, creating significant privacy and trust harm.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill persistently stores original inputs, normalized content, conversation history, annotations, dissent, metadata, and errors on disk, but the user-facing description does not warn the briefer that their drafts and replies will be retained. For pre-meeting material that may contain confidential business information, undisclosed persistence materially raises privacy and compliance risk.

Missing User Warnings

Medium

Confidence: 94%
Finding: The skill may transfer user content to external tools or services for OCR, transcription, document retrieval, and delivery, yet the description does not tell users their materials could leave the immediate chat environment. Because briefing drafts may contain sensitive internal strategy or personnel information, undisclosed external transfer significantly increases confidentiality and third-party exposure risk.

Natural-Language Policy Violations

Medium

Confidence: 95%
Finding: The profile instructs the agent to default to Chinese when the Requester’s language is ambiguous, which can cause responses in an unintended language without explicit user choice. In a review workflow handling potentially sensitive business drafts, this creates privacy, usability, and misdelivery risks if content is translated, mirrored, or exposed to the wrong audience expectations.

Missing User Warnings

Medium

Confidence: 96%
Finding: `build_synthesis_brief` constructs a prompt containing session details, material excerpts, user identity data, and audit/dissent content, then submits it to an external LLM API. There is no evidence in this file of user-facing disclosure, consent gating, or data-classification checks, which is risky for pre-meeting review data that may include confidential business information.

Missing User Warnings

Medium

Confidence: 93%
Finding: This code sends the requester's conversation history, submitted materials, and responder profile content to OpenRouter without any disclosure, consent gate, or minimization visible in this path. In a review/pre-meeting workflow, these inputs are likely to contain sensitive internal plans, personal data, or confidential business information, so silent transfer to a third-party model provider creates a real data exposure and compliance risk.

Missing User Warnings

Medium

Confidence: 94%
Finding: The code sends up to 6,000 characters of the final brief plus accepted findings to OpenRouter, an external LLM service, whenever --verify-final is used. In a review-coach skill handling drafts, proposals, and meeting materials, this can expose sensitive business content or personal data to a third party without any explicit consent check, redaction, or local-only fallback in this code path.
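
The missing consent check and redaction could take roughly the following shape, applied to the text before it leaves the host. This is a sketch under stated assumptions, not the skill's code: `prepare_outbound` and `ConsentRequired` are hypothetical, the 6,000-character cap mirrors the figure in the finding, and email masking stands in for whatever redaction a deployment actually needs.

```python
import re

MAX_CHARS = 6000  # mirrors the 6,000-character cap described in the finding
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ConsentRequired(Exception):
    """Raised when outbound transfer is attempted without explicit consent."""


def prepare_outbound(text: str, consent_given: bool, limit: int = MAX_CHARS) -> str:
    """Gate on consent, mask email addresses, then enforce the length cap."""
    if not consent_given:
        raise ConsentRequired("external verification needs explicit consent")
    redacted = _EMAIL.sub("[redacted-email]", text)
    return redacted[:limit]


print(prepare_outbound("Contact ana@example.com before Friday.", consent_given=True))
```

Redaction runs before truncation so that an address straddling the cut-off is masked whole rather than leaked in part.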

Missing User Warnings

Medium

Confidence: 91%
Finding: The code fetches Google Docs content server-side through a CLI without any explicit user-facing warning or consent in that execution path. This can surprise users, leak the fact that server credentials can access linked documents, and retrieve content beyond what the requester expected the skill to process.

Missing User Warnings

Medium

Confidence: 95%
Finding: At the LLM invocation point, the script transmits normalized document content and responder standards/profile text externally without any user-facing warning, confirmation, or in-band notice. Because this happens during document merging, users may reasonably believe their drafts remain inside the agent environment, making the undisclosed outbound transfer a real privacy and compliance risk.

VirusTotal

66/66 vendors flagged this skill as clean.
