Prompt Hardening

v1.0.0

硬化 agent prompt、system prompt、SOUL.md、AGENTS.md、cron prompt 使 LLM 可靠遵循指令。触发词:agent 不听话、忽略规则、绕过约束、prompt 优化、指令合规、规则强化、prompt 硬化、LLM 不遵守、模型违规、creative circumve...

0· 50·1 current·1 all-time
by_silhouette@lanyasheng
MIT-0
Download zip
LicenseMIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Benign
medium confidence
Purpose & Capability
Name/description (prompt hardening) match the provided artifacts: SKILL.md documents 16 hardening patterns, references, a simple audit script, and a smoke test. There are no env vars, binaries, or installs that are unrelated to auditing/rewriting prompts.
Instruction Scope
SKILL.md primarily instructs the operator/agent to read target prompt files and run scripts/audit.sh to produce a 16-point audit and suggested rewrites. This is within scope. Two caveats: (1) SKILL.md repeatedly says to 'identify model history violations' but doesn't define where or how to obtain model violation history (could imply reading logs or conversation history) — that is ambiguous and may require operator guidance to avoid overbroad data access; (2) SKILL.md explicitly states the skill is advisory and should not modify prompts automatically, which reduces risk if followed.
Install Mechanism
No install spec — instruction-only plus two small code files. Nothing downloaded from the network or installed on the host during skill activation.
Credentials
The skill requests no environment variables, credentials, or config paths. The actions described (reading prompt files and running a local audit script) are proportionate to the stated purpose.
Persistence & Privilege
always is false and there are no indications the skill modifies other skills or system-wide settings. The skill can be invoked autonomously by agents (platform default) but it does not request elevated or persistent privileges.
Assessment
This skill appears to do what it says: static guidance and a small local audit script for hardening prompts. Before installing or running it: (1) Review the audit script locally — it contains several shell-logic bugs (quoting/expansion issues) so its results may be unreliable; run it in a safe sandbox or inspect and fix it first. (2) Note SKILL.md asks you to 'identify model history violations' but doesn't specify which logs or data to use — don't let the agent start reading unrelated logs or private data without explicit operator consent. (3) The skill is advisory and says it will not auto-modify prompts; insist on manual operator approval before applying any changes. (4) If you plan to use automated enforcement, pair prompt hardening with code-level/tool hooks (the skill itself recommends that) rather than relying solely on prompt edits. If you want extra assurance, ask the author for clarity on how 'model history' should be obtained and for a corrected audit.sh implementation.

Like a lobster shell, security has layers — review code before you run it.

latestvk97bqae5cwwak1txjzhbb1ct1x84cbt0

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

Comments