AgentDojo

v0.1.0

Daily low-token, safety-first upskilling loop for OpenClaw multi-agent teams. Runs configurable micro-drills, scores quality, and produces a compact daily di...

MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (daily upskilling loop) aligns with the provided SKILL.md, config files, drills, scoring rubric, threat model, and templates. Declared capabilities (drill selection, scoring, daily digest, limited web fetch/read tools) are coherent for an upskilling/orchestration skill.
Instruction Scope
SKILL.md gives a narrow, well-scoped runtime contract: load local config, enforce caps, pick drills, run isolated sessions, score, and persist reports/audit events. It explicitly treats external web content as untrusted and requires source scoring/cross-checks and limits on fetches/writes; there are no instructions to access unrelated system credentials or arbitrary filesystem locations beyond the run/report/state paths listed.
Install Mechanism
Instruction-only skill with no install spec and no code files — nothing is downloaded or written by an install step. This is the lowest-risk installation profile.
Credentials
The skill declares no required environment variables, no primary credential, and no external config paths. The drills allow web_search/web_fetch/read, which is appropriate for sourcing external content; the config imposes concrete caps (max fetches, source scoring, cross-check) that keep this network access proportional to the stated purpose.
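As an illustration of what enforcing such caps could look like, here is a minimal Python sketch. The skill ships no code, so `RunBudget`, `BudgetExceeded`, and the field names are hypothetical and are not part of the skill or of the OpenClaw runtime.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a drill asks for more operations than its cap allows."""


@dataclass
class RunBudget:
    # Hypothetical cap names; the skill's real config keys may differ.
    max_fetches: int
    max_writes: int
    fetches: int = 0
    writes: int = 0

    def charge_fetch(self) -> None:
        """Count one web fetch, refusing it once the cap is reached."""
        if self.fetches >= self.max_fetches:
            raise BudgetExceeded(f"fetch cap of {self.max_fetches} reached")
        self.fetches += 1

    def charge_write(self) -> None:
        """Count one file write, refusing it once the cap is reached."""
        if self.writes >= self.max_writes:
            raise BudgetExceeded(f"write cap of {self.max_writes} reached")
        self.writes += 1
```

A runtime following this pattern would call `charge_fetch()` before each fetch, so an over-budget request fails before any network access happens rather than after.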
Persistence & Privilege
The skill sets always:false and uses normal autonomous invocation settings. It writes run records, reports, and audit events to relative paths under the state/report directories (as documented). It does not request system-wide configuration changes or other skills' credentials. Confirm that these relative paths resolve inside a sandboxed workspace to avoid accidentally overwriting unrelated data.
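One way a host could check that relative paths stay inside the workspace is sketched below in Python. This is an assumption about how a sandbox might be written, not the platform's actual mechanism; `resolve_inside_workspace` and `ALLOWED_DIRS` are hypothetical names, and the directory names are taken from the review's mention of reports/ and state/.

```python
from pathlib import Path

# Directory names taken from the review; adjust to the skill's documented paths.
ALLOWED_DIRS = ("reports", "state")


def resolve_inside_workspace(workspace: Path, relative: str) -> Path:
    """Resolve a skill-supplied path and reject anything outside the allowed dirs."""
    root = workspace.resolve()
    # resolve() collapses ".." segments and follows existing symlinks, so
    # traversal attempts are judged by where they actually land.
    target = (root / relative).resolve()
    allowed_roots = [root / name for name in ALLOWED_DIRS]
    # Path.is_relative_to requires Python 3.9 or newer.
    if not any(target.is_relative_to(allowed) for allowed in allowed_roots):
        raise PermissionError(f"write outside workspace refused: {relative}")
    return target
```

Under this sketch, `resolve_inside_workspace(ws, "reports/digest.md")` returns a path under the workspace, while `"reports/../../etc/passwd"` or an absolute path raises `PermissionError`.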
Scan Findings in Context
[prompt-injection-ignore-previous-instructions] expected: The phrase/pattern 'ignore-previous-instructions' (and other injection-related markers) appears in docs/threat-model and sourcePolicy as detection/mitigation guidance. That triggered the pre-scan detector but is expected and appropriate for a safety-focused skill rather than an indication of malicious intent.
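To make the finding concrete, the Python sketch below shows the kind of pattern matching that flags such phrases. The patterns are illustrative only; the skill's actual signal list is not shown in this review, and `injection_signals` and `should_block` are hypothetical helpers.

```python
import re

# Illustrative patterns only; not the skill's or the scanner's real list.
INJECTION_PATTERNS = [
    re.compile(
        r"ignore\s+(all\s+)?(the\s+)?(previous|prior|above)\s+instructions",
        re.IGNORECASE,
    ),
    re.compile(
        r"disregard\s+(all\s+)?(the\s+)?(previous|prior|above)\s+(instructions|rules)",
        re.IGNORECASE,
    ),
]


def injection_signals(text: str) -> list[str]:
    """Return the regex source of every pattern that matches the text."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(text)]


def should_block(text: str, block_on_signals: bool = True) -> bool:
    """Block fetched content when blocking is enabled and any signal matches."""
    return block_on_signals and bool(injection_signals(text))
```

A matcher like this cannot tell an attack from documentation that quotes the attack phrase, which is why a threat-model document can trip a pre-scan detector without being malicious.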
Assessment
This skill appears coherent and well scoped, but take these precautions before enabling it in production:
1) Run a short pilot with the conservative profile and monitor the first runs and audit events.
2) Ensure the runtime enforces 'isolatedSessionsOnly' and that the agent's sandbox prevents writes outside the skill workspace (reports/, state/).
3) Verify which implementation of web_fetch/web_search the platform provides and whether those endpoints are trusted or proxied; limit network access if you don't want agents fetching arbitrary URLs.
4) Keep budget and maxFetch/maxWrites conservative initially to avoid unexpected costs or data leakage.
5) Note that the pre-scan prompt-injection flag is expected (the skill documents injection detection), but you should still confirm blockOnPromptInjectionSignals is enforced at runtime.
If you want higher assurance, ask the publisher for an implementation (code) or a trusted provenance/source for the skill before enabling autonomous runs.
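Some of the precautions above could be automated as a pre-flight check before a pilot run. The Python sketch below assumes the settings are available as a flat dictionary; that layout, the `preflight_problems` helper, and the ceiling values are hypothetical, while the key names are the ones quoted in the assessment.

```python
# Flags the assessment says must be enforced at runtime.
REQUIRED_TRUE = ("isolatedSessionsOnly", "blockOnPromptInjectionSignals")

# Hypothetical pilot ceilings; pick values that suit your own budget.
PILOT_CEILINGS = {"maxFetch": 5, "maxWrites": 3}


def preflight_problems(config: dict) -> list[str]:
    """Return human-readable problems; an empty list means the check passed."""
    problems = []
    for key in REQUIRED_TRUE:
        if config.get(key) is not True:
            problems.append(f"{key} must be true")
    for key, ceiling in PILOT_CEILINGS.items():
        value = config.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value > ceiling:
            problems.append(f"{key} must be an integer <= {ceiling}")
    return problems
```

A check like this only confirms what the config declares. Whether the runtime actually enforces those settings still has to be verified by observing the first runs and their audit events.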


latest: vk973ecqg43qnfp24knmcwz2hxx81sfnj
