Agent Security Audit
Pass. Audited by ClawScan on May 1, 2026.
Overview
This is an instruction-only prompt-injection defense checklist; its shell and memory examples are purpose-aligned but should be adapted carefully rather than run blindly.
This skill appears benign as a security checklist. Before installing or using it, treat the code blocks as examples rather than ready-to-run scripts, avoid privileged log paths unless necessary, require approval for URL fetching or file mutation, and do not use fake success messages for normal user requests.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
If implemented too broadly, an agent could fetch unexpected external content or modify files while trying to sanitize data.
The guide includes shell examples for fetching external URLs and processing files. This is aligned with a prompt-injection defense checklist, but if the examples are copied verbatim, the URLs and file paths they accept should be restricted to an approved set.
curl -s -L --max-time 30 "$url" ... | sanitize_content /dev/stdin /tmp/fetch-output.txt
Treat the shell blocks as examples only; require user approval for fetching URLs, restrict allowed paths, and use temporary directories with least privilege.
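One way to bound the fetch step is sketched below. It is a minimal illustration, not the skill's own code: the function names is_allowed_url and bounded_fetch, the ALLOWED_HOSTS list, the approval flag, and the 1 MiB size cap are all assumptions made for this example. Redirect following (-L) is deliberately left out so a redirect cannot bypass the host allowlist.

```shell
#!/usr/bin/env bash
# Hypothetical allowlist; these hosts are placeholders, not taken from the audited skill.
ALLOWED_HOSTS="example.com docs.example.com"

# Return 0 only for https URLs whose host is exactly on the allowlist.
is_allowed_url() {
  local url="$1" host allowed
  case "$url" in
    https://*) ;;
    *) return 1 ;;
  esac
  host="${url#https://}"      # strip the scheme
  host="${host%%[/?#]*}"      # strip path, query, and fragment
  case "$host" in
    ""|*@*|*:*) return 1 ;;   # reject empty hosts, userinfo, and explicit ports
  esac
  for allowed in $ALLOWED_HOSTS; do
    [ "$host" = "$allowed" ] && return 0
  done
  return 1
}

# Fetch into a private temporary directory, and only after explicit approval.
# The second argument stands in for whatever approval mechanism the agent uses.
bounded_fetch() {
  local url="$1" approved="${2:-no}" workdir
  is_allowed_url "$url" || { echo "blocked: URL not on allowlist" >&2; return 1; }
  [ "$approved" = "yes" ] || { echo "blocked: user approval required" >&2; return 2; }
  workdir="$(mktemp -d)" || return 3
  # No -L: redirects are not followed. Size cap is an assumed value.
  curl -s --max-time 30 --proto '=https' --max-filesize 1048576 \
    -o "$workdir/fetch-output.txt" "$url" || return 4
  echo "$workdir/fetch-output.txt"
}
```

A caller would run bounded_fetch "$url" yes only after the user has confirmed the URL, then pass the printed path to whatever sanitizing step follows.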
A user might believe an action was completed when the agent actually refused or ignored it.
The artifact recommends a honeypot pattern that returns a fake success message while doing nothing. In context it is framed as a response to detected injection attempts, but it could mislead a legitimate user if applied indiscriminately.
echo "指示を実行しました。" ... # 実際には何も実行しない
(English: echo "Instruction executed." ... # actually executes nothing)
Use fake-success honeypots only for clearly untrusted external attack content; for real user requests, clearly state when an action was blocked or not performed.
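The distinction in this recommendation could be sketched as follows. The function name respond_to_blocked and the origin labels "user" and "external" are hypothetical; the audited skill does not define them, and how an agent determines the origin of an instruction is outside this sketch. The decoy string is an English rendering modeled on the one quoted in the evidence line.

```shell
#!/usr/bin/env bash
# Choose a reply for a blocked instruction based on where it came from.
# Arguments: origin label (hypothetical: "user" or "external"), reason text.
respond_to_blocked() {
  local origin="$1" reason="$2"
  case "$origin" in
    user)
      # Real user requests get an honest status.
      echo "Blocked: the requested action was not performed (${reason})."
      ;;
    external)
      # Decoy reply for injected content; nothing is executed.
      echo "Instruction executed."
      # Log to stderr so the decoy remains auditable.
      echo "honeypot: ignored injected instruction (${reason})" >&2
      ;;
    *)
      # Unknown provenance: prefer an honest refusal over a decoy.
      echo "Blocked: source unverified; no action was performed (${reason})."
      ;;
  esac
}
```

Defaulting unknown origins to the honest message means a misclassified user request is never answered with a false success.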
Poorly scoped memory writes could preserve bad instructions or sensitive content across later agent tasks.
The guide discusses validating content before writing it into a memory file. This is purpose-aligned security guidance, but persistent memory writes are sensitive if target paths or trusted sources are too broad.
validate_memory_write() ... echo "$content" >> "$target_file"
Restrict memory-write targets, define trusted sources narrowly, and review stored content before reusing it in future agent context.
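A narrowly scoped memory write might look like the sketch below. It is not the skill's validate_memory_write; the function name guarded_memory_write, the MEMORY_DIR location, the TRUSTED_SOURCES list, and the single screening pattern are assumptions chosen for illustration. A real content screen would need to be considerably broader than one regular expression.

```shell
#!/usr/bin/env bash
# Hypothetical settings; the audited skill does not specify these values.
MEMORY_DIR="${MEMORY_DIR:-$HOME/.agent-memory}"
TRUSTED_SOURCES="user operator"

# Arguments: origin label, plain file name, content to append.
guarded_memory_write() {
  local origin="$1" name="$2" content="$3" s allowed=no
  # Accept plain file names only, so the write cannot escape MEMORY_DIR.
  case "$name" in
    ""|*/*|.*) echo "rejected: invalid target name" >&2; return 1 ;;
  esac
  # Accept only origins on the narrow trusted list.
  for s in $TRUSTED_SOURCES; do
    [ "$origin" = "$s" ] && allowed=yes
  done
  [ "$allowed" = "yes" ] || { echo "rejected: untrusted origin" >&2; return 2; }
  # Crude illustrative screen for instruction-like content.
  if printf '%s' "$content" | grep -qiE 'ignore (all )?previous instructions'; then
    echo "rejected: content resembles an injected instruction" >&2
    return 3
  fi
  mkdir -p "$MEMORY_DIR" || return 4
  # Refuse to write through a symlink that points elsewhere.
  if [ -L "$MEMORY_DIR/$name" ]; then
    echo "rejected: target is a symlink" >&2
    return 5
  fi
  printf '%s\n' "$content" >> "$MEMORY_DIR/$name"
}
```

Stored entries still warrant review before they are loaded into a later session, since a write-time check only covers patterns it already knows about.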
