Self-Improving Science

Security checks across malware telemetry and agentic risk

Overview

This skill is a transparent, local research-logging helper with optional reminders. Users should take care before enabling global hooks or cross-session sharing.

Installing it is reasonable if you want local research-memory logging. Keep learning files free of raw datasets, credentials, patient data, and proprietary samples. Before enabling hooks, prefer project-level scoped matchers over global empty matchers, and review any proposed changes to AGENTS.md, SOUL.md, TOOLS.md, model cards, or governance docs before letting them persist.
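As a rough illustration of the advice to keep credentials out of learning files, the sketch below scans text for a few credential-like patterns before it is logged. The pattern set and the `scan_text` helper are assumptions made for this example, not part of the skill, and a handful of regular expressions cannot catch every secret, let alone datasets or patient data.

```python
import re

# Hypothetical, deliberately small pattern set; a real review would use a
# dedicated secret scanner with a much broader rule base.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "assigned_secret": re.compile(
        r"(?i)\b(api[_-]?key|token|password)\b\s*[:=]\s*\S+"
    ),
}


def scan_text(text):
    """Return (line_number, pattern_name) pairs for credential-like lines."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


if __name__ == "__main__":
    sample = "Observation: buffer pH drifted overnight\napi_key = abc123\n"
    for lineno, name in scan_text(sample):
        print(f"line {lineno}: possible {name}")
```

A check like this could run by hand before a learning entry is committed; it is a convenience filter, not a guarantee.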

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (12)

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The skill says it logs scientific learnings, but it also instructs promotion into AGENTS.md, SOUL.md, TOOLS.md, model cards, and governance documents. That expands authority from note-taking into modifying agent-control and project-policy artifacts, which can alter future agent behavior and persist changes beyond the original task.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: Automatic skill extraction turns logged learnings into new reusable skills, effectively generating new prompt/instruction artifacts from prior content. This exceeds passive logging and introduces a self-extension mechanism that can propagate mistakes, unsafe instructions, or prompt-injection content into future reusable skills.

Context-Inappropriate Capability

Medium

Confidence: 83%
Finding: Cross-session transcript reading and messaging are not necessary for a local learning log and expand the skill's access to potentially sensitive information from other sessions. Even with advisory language about trusted environments, documenting these capabilities in the skill broadens its effective authority and increases the blast radius of misuse.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The hook setup executes shell commands automatically on every prompt submission and optionally after tool use, which is much broader than needed for a markdown logger. Automatic command execution materially increases risk because it creates a persistent trigger path where future prompts or tool output can cause repeated script execution without fresh user approval.

Intent-Code Divergence

Medium

Confidence: 85%
Finding: The ownership rule claims stackable mode writes only to `.learnings/science/`, but earlier sections instruct edits to many other files such as AGENTS.md and related governance artifacts. This inconsistency can mislead reviewers and users about write boundaries, making it harder to reason about what the skill may modify.
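One way to reason about the write boundary described in this finding is to test whether a proposed write target resolves inside `.learnings/science/`. The sketch below is a minimal illustration under that assumption; `is_within_boundary` is a hypothetical helper, not something the skill ships.

```python
from pathlib import Path


def is_within_boundary(target, root=".learnings/science"):
    """Return True if target resolves to root itself or a path beneath it."""
    # resolve() normalises ".." segments and follows symlinks where they
    # exist, so a path that escapes the directory is not accepted.
    root_path = Path(root).resolve()
    target_path = Path(target).resolve()
    return target_path == root_path or root_path in target_path.parents


if __name__ == "__main__":
    for candidate in (
        ".learnings/science/2024-notes.md",  # hypothetical file name
        "AGENTS.md",
        ".learnings/science/../../AGENTS.md",
    ):
        print(candidate, is_within_boundary(candidate))
```

Under this check, edits to AGENTS.md or other governance files fall outside the stated boundary, which is the inconsistency the finding points at.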

Vague Triggers

Medium

Confidence: 95%
Finding: An empty matcher causes the hook to fire for every user prompt, not just science-related situations. Overbroad triggering increases the likelihood of unnecessary command execution, prompt-context pollution, and accidental activation in unrelated tasks where the user did not intend this skill to run.
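To make the empty-matcher issue concrete, the sketch below walks a hook configuration and reports events whose entries have a missing or blank matcher. The configuration shape (a `hooks` mapping from event name to a list of entries, each with a `matcher` string), the event names, and the script paths are assumptions for illustration; check the actual settings file of the agent tool in use.

```python
from typing import Any


def find_empty_matchers(settings: dict[str, Any]) -> list[str]:
    """Return hook event names that have an entry with no effective matcher."""
    flagged = []
    for event, entries in settings.get("hooks", {}).items():
        for entry in entries:
            # A missing, null, or whitespace-only matcher is treated here
            # as "fires on everything" (an assumption of this sketch).
            matcher = entry.get("matcher") or ""
            if not matcher.strip():
                flagged.append(event)
    return flagged


# Hypothetical configuration; the command paths are placeholders.
example = {
    "hooks": {
        "UserPromptSubmit": [
            {"matcher": "", "hooks": [{"type": "command", "command": "./activator.sh"}]}
        ],
        "PostToolUse": [
            {"matcher": "Bash", "hooks": [{"type": "command", "command": "./detector.sh"}]}
        ],
    }
}

if __name__ == "__main__":
    print(find_empty_matchers(example))
```

Running a check like this against both project-level and user-level settings would surface each of the overbroad triggers listed in these findings.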

Vague Triggers

Medium

Confidence: 95%
Finding: The advanced hook repeats the same empty matcher problem, so both prompt submission and tool-use pathways can activate too broadly. This compounds the risk by creating multiple automatic execution surfaces that are insufficiently constrained to the declared use case.

Vague Triggers

Medium

Confidence: 92%
Finding: The empty matcher causes the activator hook to run on every prompt, which is broader than the skill description suggests and increases the chance of unnecessary context injection across unrelated tasks. While the hook appears intended as a convenience feature rather than a covert persistence mechanism, broad always-on triggering can create privacy, prompt-scope, and operational risk because it executes automatically in all sessions for that configuration.

Vague Triggers

High

Confidence: 97%
Finding: The user-level configuration combines global persistence with an empty matcher, making the hook execute for all prompts in all sessions. This is more dangerous than the project-level example because it establishes broad automatic execution outside the intended scientific context, increasing the chance of data exposure through hook input/environment handling and normalizing an overly persistent agent behavior.

Vague Triggers

Medium

Confidence: 90%
Finding: Although labeled as minimal setup, this example still uses an empty matcher and therefore triggers on every prompt. Reducing the number of hooks lowers overhead, but it does not solve the core overbroad activation issue, so users may incorrectly assume this is a safer default when it still applies globally within the configured environment.

Vague Triggers

Medium

Confidence: 91%
Finding: The Codex configuration repeats the same empty-matcher pattern, causing the hook to run for all prompts in that tool as well. This propagates the same broad-trigger risk across another agent environment, increasing the chance that users adopt insecure defaults in multiple systems.

Missing User Warnings

Medium

Confidence: 89%
Finding: The document advertises `sessions_history` and `sessions_send` capabilities without a clear privacy boundary, consent model, or warning that session transcripts may contain sensitive prompts, data, or credentials. In an agent framework that is exposed to prompt injection and shares workspace context, encouraging cross-session transcript access increases the risk of unintended disclosure and propagation of sensitive research or workspace contents.

VirusTotal

65/65 vendors flagged this skill as clean.
