ia-agent-native-architecture

Security checks across malware telemetry and agentic risk

Overview

This documentation-only skill is coherent, but it teaches powerful agent designs with several under-scoped examples that users should review before adopting.

Install only if you want architecture guidance for powerful agent systems and will apply your own security review. Treat the examples as conceptual, not production-ready: add workspace path containment, allowlists, redaction, retention limits, explicit user consent, human approval for public/destructive/deployment actions, and immutable safety boundaries before implementing these patterns.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (24)

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The document explicitly recommends removing file-read restrictions and shows an example where the agent can 'read anything'. That guidance broadens agent access beyond least-privilege and can expose secrets, credentials, personal data, and unrelated workspace content if adopted in a real system.

Intent-Code Divergence

Medium

Confidence: 99%
Finding: The TypeScript examples use user-supplied paths directly with fs.readFile, fs.writeFile, fs.readdir, and fs.appendFile, despite describing the tools as operating on a workspace. Without resolving paths against a fixed root and verifying containment, an agent can access or modify arbitrary files reachable by the process, enabling path traversal and breakout from the intended workspace.
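The containment check this finding calls for can be sketched as follows. This is a minimal illustration, not the skill's own code: `WORKSPACE_ROOT` and `resolveInWorkspace` are hypothetical names, and the check is purely lexical, so a real implementation would also need to resolve symlinks (for example with `fs.realpath`) before trusting the result.

```typescript
import * as path from "node:path";

// Hypothetical workspace root; a real system would load this from configuration.
const WORKSPACE_ROOT = path.resolve("/srv/agent-workspace");

/**
 * Resolve a user-supplied path against the workspace root and reject
 * anything that lands outside it. Lexical check only: symlinks are not followed.
 */
function resolveInWorkspace(userPath: string, root: string = WORKSPACE_ROOT): string {
  const resolved = path.resolve(root, userPath);
  const relative = path.relative(root, resolved);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`Path escapes workspace: ${userPath}`);
  }
  return resolved;
}
```

A tool handler would call `resolveInWorkspace(input.path)` first and pass only the returned value to `fs.readFile` or `fs.writeFile`, so both `../` sequences and absolute paths are rejected before any file access happens.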

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The Swift examples say paths are relative to Documents/, but they only append the provided path and do not validate the resolved location. A crafted relative path such as '../' sequences may escape the intended directory, so the example overstates safety and can mislead implementers into exposing broader file access than intended.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The document promotes a 'prompt-native' pattern but its example gives an agent operational authority to sync site content, commit, and push changes. In an agent-architecture skill, this normalizes autonomous file modification and deployment behavior without explicit approval gates, which can lead to unintended or unsafe changes if reused as a design template.

Intent-Code Divergence

Medium

Confidence: 83%
Finding: The claim that iteration requires 'no code changes' is paired with an example that still performs content updates and pushes commits, which can mislead developers into underestimating the operational risk of the pattern. This mismatch increases the chance that agents are granted deployment-capable workflows under the guise of simple prompt editing.

Natural-Language Policy Violations

Medium

Confidence: 90%
Finding: The guidance treats clarification as a failure for requests the agent 'should understand,' and this can push implementers to suppress clarifying questions even when a request is ambiguous or potentially impactful. In agent-native systems with tool access, that increases the chance of unintended writes, publication, or destructive actions being taken without confirming user intent.

Missing User Warnings

Medium

Confidence: 93%
Finding: The document explicitly recommends that an agent update public site content and push changes to trigger deployment, but it does not warn that the source material may be untrusted user feedback. In an agent-native architecture, this can lead to automatic publication of abusive, sensitive, defamatory, or prompt-injected content unless moderation, sanitization, and approval controls are added.

Missing User Warnings

Medium

Confidence: 78%
Finding: The document recommends injecting detailed user library contents, recent activity, and profile data directly into system prompts, which can unnecessarily expose sensitive user data to the model and to downstream tools or logs. While this is presented as a product pattern rather than an attack, the lack of guidance on minimization, consent, or redaction creates a real privacy and data-exposure risk in agent deployments.

Missing User Warnings

Medium

Confidence: 95%
Finding: The document explicitly recommends persisting user interests, recent activity, preferences, and guidelines in a shared `context.md` file across sessions, but it does not pair that guidance with minimization, consent, retention, or access-control requirements. In an agent-native file-based system, this creates a realistic privacy risk because sensitive behavioral data may accumulate in readable files, be synced to other devices/services, or be exposed to other tools, agents, or users with filesystem access.

Missing User Warnings

Medium

Confidence: 89%
Finding: The document explicitly recommends exposing powerful primitives like file read/write, shell execution, storage, and arbitrary HTTP requests as the default starting point, while only later briefly noting that restriction may be needed for security or integrity reasons. In an agent-native architecture context, this broad capability set can materially increase the blast radius of prompt injection, model mistakes, or unsafe agent decisions, especially because the guidance frames openness as the default.

Missing User Warnings

High

Confidence: 90%
Finding: The guidance explicitly promotes broad dynamic capability discovery for sensitive APIs like HealthKit and includes access to characteristic types and generic read/write patterns under the principle of giving agents whatever a user can do. In a security-sensitive domain, this can normalize overbroad access, increase data exfiltration risk, and encourage designs where an agent can enumerate and access highly sensitive health or personal data without strong minimization, consent, or purpose restrictions.

Missing User Warnings

Medium

Confidence: 91%
Finding: The document states that LLM calls run in the cloud and that privacy is preserved by default, but it does not clearly warn that prompts, tool outputs, and user/task data sent to the LLM may leave the device. That omission can mislead implementers into designing flows that transmit sensitive local data off-device without adequate disclosure, minimization, or consent controls.

Missing User Warnings

Medium

Confidence: 92%
Finding: The document recommends iCloud Documents as the default shared workspace for agent data while explicitly noting that files are visible in Files.app and sync automatically across devices, but it does not frame this as a privacy/security risk or require user consent and data classification before use. In an agent-native app, journals, chats, context files, and research artifacts can contain sensitive prompts, derived insights, or personal data, so making cloud-synced, user-visible storage the default increases the chance of unintentional disclosure and persistence.

Missing User Warnings

Medium

Confidence: 94%
Finding: The logging examples explicitly record raw user requests, initial session prompts, tool usage, and iteration counts without any mention of minimization, redaction, consent, retention limits, or access controls. In an agent-native product, prompts often contain sensitive personal, business, or credential-adjacent data, so normalizing raw prompt logging can create a privacy and data-governance vulnerability if copied into production designs.
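One way to address this finding is to redact and truncate prompts before they reach a log. The sketch below is an assumption-laden illustration: the `AgentLogEntry` shape, the `MAX_PREVIEW_CHARS` limit, and the two patterns are all hypothetical, and the patterns are far from a complete secret or PII detector.

```typescript
// Hypothetical log entry shape; field names are illustrative.
interface AgentLogEntry {
  sessionId: string;
  toolName: string;
  iteration: number;
  promptPreview: string; // redacted and truncated, never the raw prompt
}

const MAX_PREVIEW_CHARS = 80;

// Illustrative patterns only; a real deployment needs a reviewed, broader set.
const SENSITIVE_TEXT_PATTERNS: RegExp[] = [
  /\b(?:api[_-]?key|token|password|secret)\s*[:=]\s*\S+/gi, // key=value style credentials
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, // email addresses
];

function redact(text: string): string {
  let out = text;
  for (const pattern of SENSITIVE_TEXT_PATTERNS) {
    out = out.replace(pattern, "[REDACTED]");
  }
  return out;
}

function buildLogEntry(
  sessionId: string,
  toolName: string,
  iteration: number,
  rawPrompt: string,
): AgentLogEntry {
  // Redact before truncating so a secret is never cut in half and left partially visible.
  const redacted = redact(rawPrompt);
  const promptPreview =
    redacted.length > MAX_PREVIEW_CHARS
      ? redacted.slice(0, MAX_PREVIEW_CHARS) + "..."
      : redacted;
  return { sessionId, toolName, iteration, promptPreview };
}
```

Redaction of this kind complements, but does not replace, the retention limits, consent, and access controls the finding lists.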

Missing User Warnings

Medium

Confidence: 91%
Finding: The example instructs an agent to create or move files based on its own judgment and loop until completion, but it provides no safety guardrails, confirmation step, or warning that filesystem state may be modified. In an agent-native architecture context, this normalizes autonomous file mutation and can lead to unintended overwrites, destructive reorganization, or modification of sensitive user data if copied into production as-is.

Missing User Warnings

Medium

Confidence: 94%
Finding: The guidance normalizes unrestricted file access without any warning about privacy, confidentiality, or secret leakage risks. In documentation for agent architecture, this is dangerous because implementers may copy the pattern directly and deploy agents that can inspect arbitrary files not necessary for the task.

Missing User Warnings

Medium

Confidence: 85%
Finding: The document encourages self-modifying behavior and frames agent code/prompt changes and redeploys as a natural extension of developer power, but it does not pair that guidance with an explicit warning about risks to user data, service integrity, or irreversible changes. In a skill about agent-native architecture, that omission is material because readers may implement powerful mutation/deployment workflows without sufficient operator consent, backup, or blast-radius controls.

Missing User Warnings

Medium

Confidence: 92%
Finding: The sample guardrail allows immediate writes to any non-code file, which is unsafe because configuration files, secrets, policies, manifests, prompts, and operational data often live outside traditional code paths. An agent could silently modify runtime configuration or data stores and cause outages, policy bypass, or persistent compromise without human review.
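A default-deny alternative to the "any non-code file" rule might look like the sketch below. The directory prefixes and sensitive-file patterns are assumptions chosen for illustration; the point is that writes are allowed only inside an explicit allowlist and everything else is routed to human review.

```typescript
type WriteDecision = "allow" | "needs_approval" | "deny";

// Hypothetical policy: these prefixes and patterns are illustrative assumptions.
const ALLOWED_WRITE_PREFIXES = ["notes/", "drafts/"];
const SENSITIVE_PATH_PATTERNS: RegExp[] = [
  /(^|\/)\.env(\.|$)/, // .env, .env.local, ...
  /\.(ya?ml|json|toml|ini)$/i, // common configuration formats
  /(^|\/)(secrets?|config|prompts?|policies)\//i, // sensitive directories
];

function classifyWrite(relativePath: string): WriteDecision {
  const normalized = relativePath.replace(/\\/g, "/");
  // Absolute paths and traversal segments are never writable.
  if (normalized.startsWith("/") || normalized.split("/").includes("..")) {
    return "deny";
  }
  // Sensitive files need review even inside an allowed directory.
  if (SENSITIVE_PATH_PATTERNS.some((p) => p.test(normalized))) {
    return "needs_approval";
  }
  if (ALLOWED_WRITE_PREFIXES.some((prefix) => normalized.startsWith(prefix))) {
    return "allow";
  }
  // Default: anything not explicitly allowed goes to a human.
  return "needs_approval";
}
```

Under this policy `notes/todo.md` is written immediately, while `notes/settings.json`, `.env`, and `src/index.ts` all wait for approval.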

Missing User Warnings

Medium

Confidence: 90%
Finding: The toolset exposes commit, push, pull, deploy, rollback, and restart operations with no explicit warning about destructive or irreversible operational consequences. In a self-modifying agent architecture, presenting these as normal tools without strong cautions and policy constraints increases the chance that implementers grant agents production-changing authority too broadly.
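The policy constraint this finding asks for can be sketched as an approval gate in front of high-impact tools. Everything here is hypothetical: the `ToolName` list mirrors the operations named in the finding, which tools require approval is a policy choice, and the approver is synchronous only to keep the sketch short (a real one would typically be asynchronous and wait on a human).

```typescript
type ToolName = "read_file" | "commit" | "push" | "deploy" | "rollback" | "restart";
type ToolResult = { status: "executed" | "rejected"; tool: ToolName };

// Hypothetical approver: returns true only if a human confirmed the action.
type Approver = (tool: ToolName, summary: string) => boolean;

// Assumption: which tools count as high-impact is a policy decision.
const REQUIRES_APPROVAL: ReadonlySet<ToolName> = new Set<ToolName>([
  "push",
  "deploy",
  "rollback",
  "restart",
]);

function runTool(
  tool: ToolName,
  summary: string,
  action: () => void,
  approve: Approver,
): ToolResult {
  // High-impact tools run only after explicit approval; others run directly.
  if (REQUIRES_APPROVAL.has(tool) && !approve(tool, summary)) {
    return { status: "rejected", tool };
  }
  action();
  return { status: "executed", tool };
}
```

Keeping the gate in the tool dispatcher rather than in the prompt means the agent cannot talk its way past it: a rejected `deploy` never reaches `action()`.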

Missing User Warnings

Medium

Confidence: 80%
Finding: The guidance encourages storing agent-generated and user-edited content in a shared, user-visible, automatically synced iCloud workspace, but the nearby design recommendations emphasize convenience more than informed consent and data classification. This can lead developers to sync sensitive prompts, chats, profiles, or research artifacts to cloud-backed storage without clear user awareness of privacy and retention implications.

Missing User Warnings

Medium

Confidence: 92%
Finding: The example includes direct instructions to update site files and commit/push to deploy, but provides no warning, confirmation step, rollback guidance, or disclosure that the agent is making externally visible changes. In a skill specifically about agent-native architectures, this is dangerous because readers may adopt the example verbatim and enable autonomous publishing behavior.

Ssd 3

Medium

Confidence: 97%
Finding: The `context.md` pattern semantically instructs the agent to retain learned facts about the user and their activity in natural-language form across sessions. That is dangerous because freeform summaries are hard to audit, hard to selectively delete, and prone to containing more sensitive information than intended, which increases leakage, over-retention, and misuse risk if the file is read by another component or synced externally.
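A structured alternative to freeform summaries makes the audit and deletion problems described here more tractable. The sketch below is one possible shape, not something the skill prescribes: `ContextEntry`, its fields, and the per-entry `ttlDays` retention limit are all hypothetical.

```typescript
type ContextCategory = "preference" | "activity";

// Hypothetical record: one auditable fact per entry instead of a prose summary.
interface ContextEntry {
  key: string;
  value: string;
  category: ContextCategory;
  recordedAt: number; // epoch milliseconds
  ttlDays: number; // retention limit for this entry
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function isExpired(entry: ContextEntry, now: number): boolean {
  return now - entry.recordedAt > entry.ttlDays * MS_PER_DAY;
}

// Drop entries past their retention limit; run before every load or save.
function pruneExpired(entries: ContextEntry[], now: number): ContextEntry[] {
  return entries.filter((entry) => !isExpired(entry, now));
}

// Selective deletion: remove one fact without touching the rest.
function forget(entries: ContextEntry[], key: string): ContextEntry[] {
  return entries.filter((entry) => entry.key !== key);
}
```

Because each fact is a separate keyed record, a user request such as "forget my reading history" maps to a `forget` call rather than a manual edit of prose.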

Ssd 3

Medium

Confidence: 93%
Finding: The example workflow reinforces the same risky pattern by normalizing ongoing updates of user-specific 'learnings' into `context.md` as part of standard operation. Examples are powerful implementation guidance; here, they make the retention behavior more likely to be copied directly without safeguards, increasing the chance that sensitive user data is continuously accumulated in plain files.

Ssd 3

High

Confidence: 97%
Finding: This section frames unrestricted file reads as acceptable and even desirable agent behavior ('After: Full capability' and 'Agent can read anything'). In the context of agent-native architectures, that advice is especially risky because agents may operate autonomously and at scale, turning overbroad read access into systemic data-exfiltration and privacy exposure risk.

VirusTotal

64/64 vendors flagged this skill as clean.
