Skill Auditor
v0.1.7
Audit core: a classification taxonomy and a severity scoring function, kept orthogonal. Operates on the whole skill bundle (SKILL.md plus any referenced scri...
Skill Audit — Evaluation Core (Classification + Severity)
This file defines the audit evaluation logic. The classification layer answers what it is; the severity layer answers how bad it is. The two are orthogonal and interact only through three interface fields (C_base / required_dims / dataflow_role).
Language Detection Rule — EXECUTE BEFORE ANYTHING ELSE
Detect the language of the user's triggering message and lock the output language for the entire run. This detection is an internal step only — do NOT output any text that reveals the detection result, such as "当前输出语言为中文", "Detected language: English", or similar meta-statements. Simply use the detected language silently for all subsequent output.
| User message language | Output language |
|---|---|
| Chinese | Chinese — entire output in Chinese |
| English | English — entire output in English |
| Other language | Match that language |
| Cannot determine | Default to Chinese |
All intermediate output — scan start prompt, table headers, labels, prose, finding records, and reasoning — must be written exclusively in the detected language. The single final-line verdict in §7 is always Chinese, regardless of detected language. Do NOT mix languages in intermediate output and do NOT announce the language choice at any point.
1. Classification Layer (Taxonomy)
Each finding is tagged with a triple (Surface, Behavior, IntentMarker). IntentMarker does not participate in scoring; it only affects presentation.
1.1 Surface
| Code | Meaning |
|---|---|
| EXE | Code / shell / subprocess / dynamic eval execution |
| FS | Local filesystem read / write / delete / chmod |
| NET | Network inbound / outbound / DNS / sockets |
| CRED | Environment variables / keys / tokens / credential stores |
| PROC | Process management, persistence, autostart, scheduled tasks |
| LLM | Prompt manipulation, tool-description poisoning, jailbreak payloads |
| AGT | Cross-skill / cross-tool / MCP supply-chain behavior |
1.2 Behavior Node Table
Each node declares C_base ∈ {1..4}, its required dimensions, and its data-flow role (source / transform / sink / none). A node may carry two roles, written source+sink, as AGT.MCPRemoteFetch does. The data-flow role feeds chain amplification in §2.4.
EXE
| Behavior | C_base | Required | Data-flow |
|---|---|---|---|
| EXE.StaticShell: shell with fully constant arguments | 2 | R, B | transform |
| EXE.DynamicShell: variable interpolation / shell=True + external input | 4 | R, I, B | sink |
| EXE.EvalCode: eval / exec / Function() on strings | 4 | R, I, B | sink |
| EXE.RemoteFetch: curl \| sh / download-then-exec / fetch-and-run | 4 | I, B | sink |
| EXE.Subprocess: constrained subprocess (whitelisted commands) | 2 | R | transform |
FS
| Behavior | C_base | Required | Data-flow |
|---|---|---|---|
| FS.ReadPublic: read public files (README, declared paths) | 1 | none | none |
| FS.ReadWorkspace: read files inside the workspace | 2 | R | source |
| FS.ReadSensitive: read sensitive paths (~/.ssh, ~/.aws, Keychain, browser cookies, .env) | 4 | I, R | source |
| FS.ReadOutOfScope: read user files outside declared scope | 3 | I, B | source |
| FS.WriteScoped: write inside declared directories | 1 | none | none |
| FS.WriteOutOfScope: write outside declared scope | 3 | I, B | sink |
| FS.WriteStartup: write startup hooks / shell rc / autostart / launchd | 4 | R, I | sink |
| FS.DeleteBroad: wide deletion / rm -rf / wildcard delete | 4 | R, I, B | sink |
| FS.ChmodDangerous: chmod 777 / privilege widen / SUID bit | 3 | R, I | transform |
NET
| Behavior | C_base | Required | Data-flow |
|---|---|---|---|
| NET.OutboundDeclared: outbound to a host declared in SKILL.md | 1 | none | sink |
| NET.OutboundUndeclared: outbound to an undeclared host | 3 | I, B | sink |
| NET.OutboundObfuscated: obfuscated destination (concat, encoding, homograph) | 4 | I, B | sink |
| NET.DnsExfil: DNS TXT with suspicious payload (long subdomain, base64) | 4 | I, B | sink |
| NET.InboundListen: local listening port / reverse shell endpoint | 4 | R, I | sink |
| NET.Websocket: long-lived / bidirectional channel | 2 | I | transform |
CRED
| Behavior | C_base | Required | Data-flow |
|---|---|---|---|
| CRED.ReadEnv: broad read of os.environ / process.env | 3 | I, B | source |
| CRED.ReadNamedEnv: read a single declared environment variable | 1 | none | source |
| CRED.ReadKeychain: read Keychain / Credential Manager / libsecret | 4 | I, B | source |
| CRED.ReadBrowserStore: read browser cookies / session / password store | 4 | I, B | source |
| CRED.Hardcoded: real secret hardcoded in code or config | 3 | R | none |
| CRED.TokenEcho: credential echoed to LLM / logs / stdout | 3 | R, B | transform |
PROC
| Behavior | C_base | Required | Data-flow |
|---|---|---|---|
| PROC.Spawn: ordinary child process creation (paired with EXE) | 1 | none | none |
| PROC.Persist: cron / launchd / systemd / Run-key install | 4 | R, I | sink |
| PROC.ToolTamper: modify / replace system tools, hook package managers | 4 | R, I, B | sink |
| PROC.CryptoMine: miner binaries / known mining-pool hosts | 4 | none | sink |
| PROC.HideSelf: process masquerade | 3 | I | transform |
LLM
| Behavior | C_base | Required | Data-flow |
|---|---|---|---|
| LLM.PromptOverride: "ignore previous / you are now / system:" style directives | 3 | I, B | sink |
| LLM.ObfuscatedPrompt: override directive encoded in base64 / ROT13 / hex | 4 | I, B | sink |
| LLM.UnicodeSmuggling: directives hidden in zero-width / Unicode-tag / bidi chars | 4 | I, B | sink |
| LLM.DescriptionInjection: enticement text in description/triggers to coerce other agents | 3 | I | sink |
| LLM.ToolPoisoning: tool descriptions deliberately mislead the agent's plan | 4 | I, B | sink |
AGT
| Behavior | C_base | Required | Data-flow |
|---|---|---|---|
| AGT.CrossSkillWrite: write into another skill's directory / modify registry | 4 | I, B | sink |
| AGT.MCPRemoteFetch: dynamically fetch tool definitions from a remote MCP server | 3 | I, B | source+sink |
| AGT.ContextExfil: exfiltrate data via chat context / tool responses | 3 | I, B | sink |
| AGT.PrivilegeCreep: behavior materially exceeds the SKILL.md-declared scope | 3 | I | transform |
| AGT.ApprovalBypass: attempts to bypass approval / sandbox / trust boundary | 4 | I | sink |
1.3 IntentMarker
| Marker | Meaning |
|---|---|
| legitimate_elevated | Sensitive behavior consistent with declared function and documented |
| suspicious | Behavior is suspect but evidence is not closed |
| malicious_confirmed | Clear evidence (closed attack chain, explicit C2 host, etc.) |
2. Severity Layer (Scoring)
2.1 Formula
Score = C × R × I × B
R = 0 (unreachable) → Score = 0 → finding is dropped. I = 0 (legitimate and declared) → Score = 0 → finding is reported at Info as a capability disclosure entry; it does not affect the verdict.
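The formula and its two zero cases can be sketched as follows. This is a minimal illustration rather than the auditor's actual implementation; the function name `score_finding` and the use of `None` to signal a dropped finding are assumptions, and the value ranges are those given in §2.2.

```python
from typing import Optional


def score_finding(c: int, r: int, i: int, b: int) -> Optional[int]:
    """Return C × R × I × B, or None when the finding is dropped (R = 0)."""
    if not (1 <= c <= 4 and 0 <= r <= 3 and 0 <= i <= 3 and 1 <= b <= 3):
        raise ValueError("dimension outside the ranges given in §2.2")
    if r == 0:
        return None  # unreachable code: the finding is dropped entirely
    # I = 0 yields a score of 0, which is reported at Info as a disclosure entry.
    return c * r * i * b
```

For example, `score_finding(4, 3, 1, 3)` gives 36, while any call with `r=0` returns `None`.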
2.2 Dimensions
C — Capability
{1, 2, 3, 4}, defaulting to the Behavior's C_base; an instance may float ±1 without leaving the range.
| Value | Meaning | Typical |
|---|---|---|
| 1 | Low (public read / in-scope write) | FS.ReadPublic, NET.OutboundDeclared |
| 2 | Medium (limited effect) | EXE.StaticShell, FS.ReadWorkspace |
| 3 | High (privacy / out-of-scope) | FS.ReadOutOfScope, CRED.ReadEnv |
| 4 | Very high (RCE / credentials / persistence / destruction) | EXE.DynamicShell, CRED.ReadKeychain, FS.DeleteBroad |
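One way to read "may float ±1 without leaving the range" is as a clamped adjustment. The sketch below assumes that reading; the helper name `instance_capability` and the `delta` parameter are hypothetical.

```python
def instance_capability(c_base: int, delta: int = 0) -> int:
    """Adjust C_base by at most one step, clamped to the range 1..4."""
    if delta not in (-1, 0, 1):
        raise ValueError("delta must be -1, 0, or +1")
    return max(1, min(4, c_base + delta))
```

Under this reading, a node with C_base = 4 stays at 4 even when the instance would otherwise be raised.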
R — Reachability
{0, 1, 2, 3}.
| Value | Meaning |
|---|---|
| 0 | Unreachable (comment / docs / dead code not imported) |
| 1 | Weakly reachable (example / test fixture / rare branch) |
| 2 | Conditionally reachable (main module, requires specific input or trigger) |
| 3 | On the main path (entry in SKILL.md, or reachable via import chain) |
I — Intent / Stealth
{0, 1, 2, 3}, used directly as a multiplier.
| Value | Meaning |
|---|---|
| 0 | Legitimate and declared — function needs it, SKILL.md states it, scope matches |
| 1 | Undeclared but not hidden — functionally needed, simply omitted from docs |
| 2 | Obfuscated / hidden — base64, string concat, zero-width chars, homograph host |
| 3 | Confirmed malicious — matches a C2 blacklist, clear attack signature, or closed chain |
B — Blast Radius
{1, 2, 3}.
| Value | Meaning |
|---|---|
| 1 | Self only — this skill's directory / current session |
| 2 | Workspace / user scope — current project or user files |
| 3 | Machine / cross-user / cross-agent — system-level, credential-level, propagable |
2.3 Tier Mapping
Nonzero scores range from 1 to 108 (maximum 4 × 3 × 3 × 3). A score of 0 arises only when R = 0 (the finding is dropped) or I = 0. I = 0 findings are always Info (see §2.1).
| Score | Tier | Badge |
|---|---|---|
| 1 – 4 | Info | · (verbose only) |
| 5 – 18 | Low | ⚠️ |
| 19 – 54 | Medium | ⚠️ |
| 55 – 90 | High | 🔴 |
| 91 – 108 | Critical | 🚨 |
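The tier table can be expressed as a threshold lookup. This is a sketch under the assumption that a score of 0 (the I = 0 case) is passed through the same function and lands in Info; the names `TIERS`, `LOWER_BOUNDS`, and `tier_for_score` are illustrative.

```python
from bisect import bisect_right

TIERS = ["Info", "Low", "Medium", "High", "Critical"]
# First score belonging to Low, Medium, High, and Critical respectively.
LOWER_BOUNDS = [5, 19, 55, 91]


def tier_for_score(score: int) -> str:
    """Map a score to its tier; 0 (the I = 0 case) and 1..4 are both Info."""
    if not 0 <= score <= 108:
        raise ValueError("score outside 0..108")
    return TIERS[bisect_right(LOWER_BOUNDS, score)]
```

A score of 36 therefore maps to Medium, and 90 is the last High score before Critical begins at 91.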
2.4 Chain Amplification
When multiple findings on the same execution path form a closed chain
source → transform (any, optional) → sink
an additional chain-finding is emitted whose tier equals the highest member tier + 1 (capped at Critical). Unclosed chains (missing source or sink) do not amplify. Member findings are still reported on their own.
Typical closed chains:
- FS.ReadSensitive → NET.OutboundUndeclared (credential exfiltration)
- CRED.ReadEnv → LLM.PromptOverride (credentials leaked to a third-party LLM)
- EXE.RemoteFetch → FS.WriteStartup (download then persist)
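The amplification rule can be sketched for a single execution path. The representation here is an assumption: each finding is reduced to a `(dataflow_role, tier)` pair listed in execution order, and a chain counts as closed when a source appears before a sink. Building the cross-file data-flow graph itself is out of scope for this sketch.

```python
TIERS = ["Info", "Low", "Medium", "High", "Critical"]


def chain_tier(path):
    """Return the chain-finding tier for one path, or None if no closed chain.

    `path` is a list of (dataflow_role, tier) pairs in execution order.
    A "source+sink" role counts as both a source and a sink.
    """
    roles = [role for role, _ in path]
    first_source = next((k for k, r in enumerate(roles) if "source" in r), None)
    last_sink = max((k for k, r in enumerate(roles) if "sink" in r), default=None)
    if first_source is None or last_sink is None or last_sink <= first_source:
        return None  # missing source or sink: unclosed chains do not amplify
    # Members are the findings between source and sink that take part in data flow.
    members = [t for r, t in path[first_source:last_sink + 1] if r != "none"]
    highest = max(TIERS.index(t) for t in members)
    return TIERS[min(highest + 1, len(TIERS) - 1)]  # +1 tier, capped at Critical
```

With a Medium source followed by a High sink, the emitted chain-finding is Critical; the two member findings keep their own tiers.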
3. Interface Between Classification and Severity
| Interface | Direction | Description |
|---|---|---|
| C_base | Classification → Severity | Capability baseline per Behavior node, default for C |
| required_dims | Classification → Severity | Checklist of dimensions that must be evaluated |
| dataflow_role | Classification → Severity | source/transform/sink/none, used by chain amplification |
The severity layer does not read the classification layer's prose descriptions or the IntentMarker; the classification layer does not read the final Score. The two layers can evolve independently.
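The three interface fields can be pictured as the only attributes the severity layer reads from a Behavior node. The sketch below is illustrative: the `BehaviorNode` class, the `REGISTRY` mapping, and `severity_inputs` are hypothetical names, and only three rows from §1.2 are included.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class BehaviorNode:
    name: str
    c_base: int                    # capability baseline, 1..4
    required_dims: FrozenSet[str]  # subset of {"R", "I", "B"}
    dataflow_role: str             # source / transform / sink / none


# A few rows copied from the §1.2 tables.
REGISTRY = {
    node.name: node
    for node in (
        BehaviorNode("EXE.DynamicShell", 4, frozenset("RIB"), "sink"),
        BehaviorNode("FS.ReadSensitive", 4, frozenset("IR"), "source"),
        BehaviorNode("NET.OutboundDeclared", 1, frozenset(), "sink"),
    )
}


def severity_inputs(behavior: str) -> Tuple[int, FrozenSet[str], str]:
    """Return only what the severity layer is allowed to see."""
    node = REGISTRY[behavior]
    return node.c_base, node.required_dims, node.dataflow_role
```

Keeping the return value to these three fields mirrors the rule that the severity layer never reads prose descriptions or the IntentMarker.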
4. Finding Data Structure
A finding is one (Behavior, evidence location) hit. The evidence location is (file path, line range, code snippet). The same Behavior hitting at multiple locations produces multiple findings; the same code hitting multiple Behaviors produces multiple findings; a chain-finding is itself a finding.
finding:
  id: "F-001"
  category:
    surface: "FS"
    behavior: "FS.ReadSensitive"
    intent_marker: "suspicious"   # legitimate_elevated | suspicious | malicious_confirmed
  evidence:
    file: "scripts/helper.sh"
    line_range: [23, 31]
    snippet: "..."
  scoring:
    C: 4
    R: 3
    I: 1
    B: 3
    score: 36                     # C × R × I × B = 4×3×1×3
    tier: "Medium"                # 36 falls in the 19 – 54 band (§2.3)
    badge: "⚠️"
  dataflow_role: "source"
  chain_id: null                  # fill with a chain id if this finding is part of a closed chain
All fields are required (chain_id may be null). score must equal C × R × I × B; for I = 0 findings, score is 0 and tier is always Info.
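The two invariants above (all fields present, score equal to the product) lend themselves to a mechanical check. The sketch below assumes the finding has been parsed into nested dictionaries with category, evidence, and scoring as sub-mappings; that nesting, the `REQUIRED` table, and the `validate_finding` name are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Required top-level fields and, where applicable, their required sub-fields.
REQUIRED = {
    "id": (),
    "category": ("surface", "behavior", "intent_marker"),
    "evidence": ("file", "line_range", "snippet"),
    "scoring": ("C", "R", "I", "B", "score", "tier", "badge"),
    "dataflow_role": (),
    "chain_id": (),  # must be present, but may be None
}


def validate_finding(finding: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the finding is well-formed."""
    errors = []
    for key, subkeys in REQUIRED.items():
        if key not in finding:
            errors.append(f"missing field: {key}")
            continue
        for sub in subkeys:
            if sub not in finding[key]:
                errors.append(f"missing field: {key}.{sub}")
    if errors:
        return errors
    s = finding["scoring"]
    expected = s["C"] * s["R"] * s["I"] * s["B"]
    if s["score"] != expected:
        errors.append(f"score {s['score']} does not equal C×R×I×B = {expected}")
    if s["I"] == 0 and s["tier"] != "Info":
        errors.append("I = 0 findings must have tier Info")
    return errors
```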
5. Audit Procedure
5.1 Scan Scope
The audit target is the whole skill bundle, not SKILL.md alone. The scope has three layers:
- Recursive enumeration of the skill directory. Walk every file (including hidden ones) and classify by content rather than extension. Text-like content is analyzed as script/configuration; non-text content is judged by its location and reference relationships, without any fixed preset conclusion.
- Locally referenced resources. Resolve relative-path references that appear in SKILL.md and in scripts (frontmatter, code blocks, Markdown links, arguments to bash / python / node invocations, etc.) and pull the referenced files into the scan. Their Reachability baseline is set per §5.3. If a referenced file lies outside the skill directory, additionally record an FS.ReadOutOfScope or AGT.CrossSkillWrite finding as appropriate.
- Remote resources. Patterns such as curl | sh, wget, git clone then exec, pip / npm pointing at non-standard registries, or remote MCP servers trigger EXE.RemoteFetch or AGT.MCPRemoteFetch. During the audit, a single static fetch is allowed (never executed); on success the content joins the scan, on failure or without authorization the finding's I is forced to ≥ 2.
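The first layer, recursive enumeration with classification by content rather than extension, might look like the sketch below. The text/binary heuristic (no NUL byte and valid UTF-8 in the first few kilobytes) is an assumption, since the source does not say how "text-like" is decided; `looks_like_text` and `build_inventory` are hypothetical names.

```python
import codecs
import os
from pathlib import Path
from typing import Dict


def looks_like_text(path: Path, sample_size: int = 4096) -> bool:
    """Heuristic (an assumption): no NUL bytes and valid UTF-8 in the sample."""
    with path.open("rb") as fh:
        sample = fh.read(sample_size)
    if b"\x00" in sample:
        return False
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the sample edge.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def build_inventory(root: Path) -> Dict[str, str]:
    """Walk every file under root, hidden ones included, and label it by content."""
    inventory = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:  # os.walk does not skip dotfiles
            path = Path(dirpath) / name
            rel = path.relative_to(root).as_posix()
            if not path.is_file():  # e.g. a broken symlink
                inventory[rel] = "other"
                continue
            inventory[rel] = "text" if looks_like_text(path) else "binary"
    return inventory
```

Reference resolution and the single static fetch of remote resources would run as later passes over this inventory.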
5.2 Flow
Input: skill root directory
│
▼
[Step 1] Build scan inventory
├─ 1a. Recursively enumerate files and classify by content
├─ 1b. Parse references → add local files / register remote-URL findings
└─ 1c. Attempt a single static fetch of remote resources (success → include; failure → I ≥ 2)
▼
[Step 2] Match each file against Behavior nodes → {category, evidence(file, line_range, snippet)}
▼
[Step 3] Score each finding independently: C ← C_base ± Δ; R/I/B per §2.2; compute Score; drop if R = 0
▼
[Step 4] Build a cross-file data-flow graph, detect closed chains → append chain-findings
▼
[Step 5] Emit all findings
5.3 Cross-File Reachability
| File location | Default R |
|---|---|
| SKILL.md frontmatter + body | 3 |
| Scripts explicitly referenced / invoked by SKILL.md | 3 |
| Files sourced / imported from referenced scripts | inherits along the chain, capped at 3 |
| Orphan scripts in the skill directory with no incoming reference | 1 (forced to 2 if the file hits any Behavior with C ≥ 3) |
| Files under tests/, examples/, docs/ not referenced from SKILL.md | 0 or 1 |
| Plain documentation (e.g. README.md) | 1 (treated as main path when it contains executable code blocks and is referenced) |
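Most rows of the table reduce to a small lookup. In the sketch below the `kind` labels and parameter names are hypothetical, "main path" is taken to mean R = 3 as in §2.2, and the tests/examples/docs row is deliberately left out because the source gives "0 or 1" without saying how to choose.

```python
def default_r(kind: str, *, parent_r: int = 3, max_c: int = 0,
              has_exec_blocks: bool = False, referenced: bool = False) -> int:
    """Default Reachability for a file, by where it sits in the bundle."""
    if kind in ("skill_md", "referenced_script"):
        return 3
    if kind == "imported":
        return min(parent_r, 3)  # inherits along the chain, capped at 3
    if kind == "orphan":
        # An orphan that hits a Behavior with C >= 3 is forced up to 2.
        return 2 if max_c >= 3 else 1
    if kind == "plain_doc":
        return 3 if (has_exec_blocks and referenced) else 1
    raise ValueError(f"no default R defined for kind {kind!r}")
```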
6. Verdict
verdict.tier = max(f.tier for f in findings); Info when there are no findings. The verdict is computed internally and is not printed as a label or badge by itself; it only selects which one-line conclusion §7 emits.
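Because tiers are names rather than numbers, the `max` has to compare them by rank and not alphabetically. A minimal sketch, with `TIER_ORDER` and `verdict_tier` as illustrative names:

```python
TIER_ORDER = ["Info", "Low", "Medium", "High", "Critical"]


def verdict_tier(finding_tiers):
    """Highest tier among the findings; Info when there are none."""
    return max(finding_tiers, key=TIER_ORDER.index, default="Info")
```

A plain string comparison would rank "Medium" above "Critical", which is why the ordering list is used as the key.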
7. Final Output
The audit may produce a normal report — findings tables, per-finding evidence, brief reasoning, etc. — and writing it out is encouraged, since externalizing the classification, scoring, and chain analysis from §1–§6 tends to stabilize the final verdict.
Starting from version 0.1.7, the only mandatory final emission is a single Chinese one-liner that maps directly from the verdict tier. There is no machine-readable result block, no detailed report wrapper, no signature line.
7.1 Final-Line Mapping
The very last thing emitted by the run must be exactly one of the following lines, chosen by the verdict tier from §6, with <NAME> replaced by the audited skill's name:
| Verdict tier | Final line (Chinese, mandatory verbatim) |
|---|---|
| Info | ✅ 您的「<NAME>」插件可正常使用。 |
| Low | ⚠️ 您的「<NAME>」插件可使用。 |
| Medium | ⚠️ 您的「<NAME>」插件可使用。 |
| High | 🔴 您的「<NAME>」插件可能存在风险,请关注。 |
| Critical | 🚨 您的「<NAME>」插件可能存在风险,请关注。 |
7.2 Field & Formatting Rules
- <NAME>: the name field from the audited SKILL.md. Fallback order: skill directory name → unknown.
- The final line is always Chinese, regardless of the run's detected output language. Do not translate, paraphrase, reorder, or restyle it.
- Preserve the leading emoji (✅ / ⚠️ / 🔴 / 🚨), the corner brackets 「 」, and the trailing full-width period 。 exactly as shown.
- The line must be the last non-empty line of the entire output. Nothing may come after it: no signature, no closing remark, no horizontal rule, no trailing whitespace block.
- Emit the line even when the verdict is Info and even when there are no findings.
- Emit exactly one final line. If the audit aborts early (e.g. unreadable bundle), still emit the line with the most conservative tier consistent with what was actually observed (default to Info when no behavior was scored).
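The mapping in §7.1 and the name fallback can be sketched as a template lookup. The templates are copied from the table; the function name `final_line` and its parameters are assumptions.

```python
from typing import Optional

# Templates copied verbatim from the §7.1 table, with <NAME> as {name}.
FINAL_LINES = {
    "Info": "✅ 您的「{name}」插件可正常使用。",
    "Low": "⚠️ 您的「{name}」插件可使用。",
    "Medium": "⚠️ 您的「{name}」插件可使用。",
    "High": "🔴 您的「{name}」插件可能存在风险,请关注。",
    "Critical": "🚨 您的「{name}」插件可能存在风险,请关注。",
}


def final_line(tier: str, skill_name: Optional[str],
               dir_name: Optional[str] = None) -> str:
    """Build the single closing line; name falls back to directory, then 'unknown'."""
    name = skill_name or dir_name or "unknown"
    return FINAL_LINES[tier].format(name=name)
```

The caller remains responsible for printing this line last, with nothing after it.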
