baml-codegen

Security checks across malware telemetry and agentic risk

Overview

This is a coherent BAML code-generation skill. The usual caution applies to generated code, local generation hooks, and data sent to external LLM providers.

Install this if you want an agent to generate BAML project files. Review generated diffs before committing, inspect any on_generate hook before running baml-cli generate, use trusted MCP servers, and do not process sensitive invoices, receipts, resumes, medical records, images, or internal documents through external model providers unless you have appropriate consent, redaction, and provider controls.
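One way to act on the advice to inspect any on_generate hook is to list every such line before running baml-cli generate. The sketch below is illustrative only: it assumes the hook appears on a single line as on_generate followed by a command, and that BAML sources live in .baml files under a directory you pass in. The function names are hypothetical and not part of any BAML tooling.

```python
import re
from pathlib import Path

# Assumption: a hook is written on one line as `on_generate <command>`.
HOOK_PATTERN = re.compile(r"^\s*on_generate\s+(.*)$")


def find_hooks_in_text(text):
    """Return (line number, hook text) for each on_generate line in text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = HOOK_PATTERN.match(line)
        if match:
            hits.append((number, match.group(1).strip()))
    return hits


def find_hooks_in_tree(root):
    """Scan every .baml file under root and collect (path, line, hook text)."""
    results = []
    for path in sorted(Path(root).rglob("*.baml")):
        text = path.read_text(encoding="utf-8")
        for number, hook in find_hooks_in_text(text):
            results.append((str(path), number, hook))
    return results
```

Calling find_hooks_in_tree on the project's BAML source directory and reading each reported command by hand is the point of the exercise; the script only locates the hooks and does not judge whether they are safe.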

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep

Findings (8)

Missing User Warnings

Medium

Confidence: 96%
Finding: The reference guide repeatedly shows configuration for third-party LLM providers and API-key usage, but does not warn users that prompts, documents, images, audio, and other inputs may be transmitted off-host to external services. In an agent skill context, this omission is security-relevant because users may apply the examples to sensitive data and unintentionally exfiltrate secrets, source code, customer data, or regulated content.

Missing User Warnings

Medium

Confidence: 93%
Finding: The FastAPI invoice example extracts PDF text from an uploaded file and sends the full contents to an external LLM service without any notice, consent flow, minimization, or redaction guidance. Invoice PDFs commonly contain sensitive financial and personal data, so example code that normalizes silent third-party transmission can lead developers to deploy privacy-impacting behavior by default.

Missing User Warnings

Low

Confidence: 84%
Finding: The sentiment API example forwards arbitrary user-submitted text directly to an external LLM without warning that the text may leave the application boundary. While sentiment inputs are often lower sensitivity than invoices or receipts, users may still submit personal, confidential, or regulated content, making the omission a real privacy risk in sample code.

Missing User Warnings

Medium

Confidence: 95%
Finding: The RAG example sends retrieved document contents, including quoted excerpts, to an external LLM with no disclosure that internal documents may be exposed outside the system. In practice, RAG corpora often contain proprietary, customer, legal, or regulated information, so omission of privacy and data-governance safeguards is materially risky.

Missing User Warnings

Medium

Confidence: 94%
Finding: The receipt analysis endpoint reads uploaded images and submits them to an external vision model without warning about disclosure of image contents and embedded purchase/payment data. Receipts can contain names, merchants, dates, partial card details, and location data, so this example encourages silent third-party processing of sensitive financial information.

Missing User Warnings

Medium

Confidence: 92%
Finding: The examples show raw user-provided text being sent directly to external model providers without any accompanying privacy notice, consent guidance, or suggestion to minimize/redact sensitive content first. In a code-generation/pattern library context, users may copy these patterns into production and unknowingly transmit invoices, contracts, resumes, or other regulated data to third-party APIs.
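As a rough illustration of the "minimize/redact" step this finding says is missing, the sketch below masks a few obvious identifiers before text is handed to any external provider. The patterns and the redact name are assumptions made for this example; they are deliberately simple, will miss many kinds of sensitive data, and are not a substitute for a vetted PII detection tool or for consent and provider controls.

```python
import re

# Illustrative patterns only; they catch a few common formats and nothing more.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    # US Social Security number written as 3-2-4 digits.
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace each match of the patterns above with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

In an endpoint like the ones described in these findings, a call such as redact(user_text) would sit between receiving the input and invoking the model, so that only the masked text crosses the application boundary.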

Missing User Warnings

High

Confidence: 97%
Finding: Listing medical records as a supported extraction use case without any warning materially increases the risk that developers will process highly sensitive health information through external LLM APIs without appropriate safeguards. This can create serious privacy, regulatory, and contractual exposure because health data often requires heightened handling, consent, and vendor controls.

Missing User Warnings

Medium

Confidence: 89%
Finding: The documentation encourages receipt, invoice, resume, and form extraction from images and documents but does not warn that these inputs commonly contain sensitive personal, financial, or identity data. In a code-generation skill, this omission can lead downstream users to send regulated or confidential data to external LLM providers without considering consent, data minimization, retention, or redaction requirements.

VirusTotal

64/64 vendors flagged this skill as clean.
