Lieutenant - AI Agent Security
v1.0.0 · AI agent security and trust verification. Scan messages, agent cards, and A2A (agent-to-agent) communications for prompt injection, jailbreaks, and malicious patterns. Use when protecting agents from attacks, verifying external agents, or scanning untrusted content.
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan (OpenClaw)
Verdict: Suspicious (medium confidence)

Purpose & Capability
The name and description (scanning text and A2A agent cards for prompt injection and jailbreaks) align with the included CLI scripts and examples. The ability to call a TrustAgents API and to use OpenAI for semantic detection is consistent with the declared features.
Instruction Scope
If run with the --api flag, the scripts POST the scanned text, or an entire agent card, to an external TrustAgents API (a default URL on up.railway.app). The scripts also prepend PROJECT_ROOT/"src" to sys.path, where PROJECT_ROOT = SCRIPT_DIR.parent.parent.parent, i.e. three levels above the script; in some runtimes this can allow imports from outside the skill package. Example text in SKILL.md contains prompt-injection phrases (e.g., "Ignore all previous instructions"); that is expected for sample inputs, but the pre-scan flagged it, and it could confuse automated evaluators. Overall, the instructions can transmit potentially sensitive input off-host and can reach code outside the local bundle.
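The path manipulation described above follows a common pattern; a minimal sketch of what such a script likely does (a reconstruction from the review, not the skill's actual code):

```python
# Hypothetical reconstruction of the sys.path pattern the review describes.
# Variable names mirror the review; this is not the skill's verbatim code.
import sys
from pathlib import Path

SCRIPT_DIR = Path(__file__).resolve().parent
PROJECT_ROOT = SCRIPT_DIR.parent.parent.parent          # three levels above the script
sys.path.insert(0, str(PROJECT_ROOT / "src"))           # anything under <root>/src is now importable
```

Because the inserted path sits outside the skill bundle, whatever the host runtime happens to have at that location wins import resolution, which is exactly the risk flagged above.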
Install Mechanism
No formal install spec is included in the registry metadata. The README recommends cloning an external GitHub repo and running pip install -r requirements.txt, or installing pip install agent-trust-sdk. That is a typical install flow, but it requires pulling third-party code (github.com/jd-delatorre/trustlayer / agent-trust-sdk) and installing its dependencies: verify those sources before running.
Credentials
The skill declares no required environment variables but documents optional ones: TRUSTAGENTS_API_KEY, TRUSTAGENTS_API_URL, OPENAI_API_KEY, LIEUTENANT_STRICT. These are reasonable for the advertised features (external reputation API and optional semantic checks), but using them will cause scanned content or API keys to be sent to external services. Only supply API keys if you trust the target services; do not send sensitive payloads to the TrustAgents API unless you're comfortable with that service.
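How a tool like this would typically read that optional configuration, as a sketch; the variable names come from the skill's docs, but the dictionary keys and defaults here are illustrative assumptions:

```python
# Optional configuration the skill documents. Setting any of these causes
# outbound calls, so leave them unset unless you trust the target services.
import os

cfg = {
    "api_key": os.environ.get("TRUSTAGENTS_API_KEY"),      # sent to the TrustAgents API if set
    "api_url": os.environ.get("TRUSTAGENTS_API_URL"),      # overrides the default railway.app endpoint
    "openai_key": os.environ.get("OPENAI_API_KEY"),        # enables the semantic-detection mode
    "strict": os.environ.get("LIEUTENANT_STRICT") == "1",  # stricter matching (per the docs)
}
```

Note that with none of the variables set, the optional features stay off, which is the safe default posture the review recommends.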
Persistence & Privilege
The skill does not request always:true, does not modify other skills' config, and does not declare persistent system-level privileges. It is user-invocable and can be invoked autonomously (platform default), which is expected for a skill of this type.
Scan Findings in Context
[ignore-previous-instructions] expected: SKILL.md and examples intentionally include prompt-injection test strings (e.g., "Ignore all previous instructions") because the skill's purpose is to detect such patterns. Presence is expected for sample inputs, but the pre-scan flagged this as a possible attempt to manipulate an evaluator — include this in your threat model when reviewing the skill.
What to consider before installing
This skill appears to do what it says, but exercise caution before installing or running it on sensitive data.
Key things to check before use:
- Do not run with --api (TrustAgents API) if you don't want scanned text or full agent cards transmitted to the external service; the default API host is agent-trust-infrastructure-production.up.railway.app. Verify the operator and privacy policy of that service first.
- Avoid supplying your OPENAI_API_KEY or other secrets to this tool unless you trust the code and the environment; semantic mode may cause outbound calls.
- Inspect or vendor the referenced packages (the trustlayer repo / agent-trust-sdk) before pip installing to ensure no surprise behavior.
- Note the scripts add a parent-level "src" path to sys.path (three levels up). In some runtimes this can allow importing modules outside the skill bundle — run in a sandbox or inspect how the runtime lays out skill files to ensure it won't import unexpected host code.
- Because SKILL.md includes many example attack strings, automated evaluators may be confused; manually review the included scanner implementation (the underlying lieutenant.scanner) before trusting results.
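The checks above can be partly automated before anything is executed. A quick audit sketch (the regex targets and sample file are illustrative, not the skill's actual contents):

```python
# Flag lines in a downloaded bundle that mention an outbound host or mutate
# sys.path, so they can be reviewed by hand before the code is ever run.
import re
import tempfile
from pathlib import Path

SUSPECT = re.compile(r"railway\.app|sys\.path|requests\.post")

def audit(root: Path):
    """Return (filename, line_no, text) for every suspicious line under root."""
    hits = []
    for py in sorted(root.rglob("*.py")):
        for i, line in enumerate(py.read_text().splitlines(), 1):
            if SUSPECT.search(line):
                hits.append((py.name, i, line.strip()))
    return hits

# Demo on a synthetic file standing in for a skill script.
with tempfile.TemporaryDirectory() as d:
    sample = Path(d) / "scan.py"
    sample.write_text(
        "import sys\n"
        'sys.path.insert(0, "src")\n'
        'API = "https://example.up.railway.app"\n'
    )
    hits = audit(Path(d))
```

Running the demo flags the sys.path mutation and the outbound URL while ignoring the harmless import line, which is the kind of shortlist worth reading before granting the skill any credentials.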
If you need higher assurance: run the code in an isolated environment, inspect the full "src" package that implements ThreatScanner, or ask the skill author/publisher for the source repository so you can audit the upstream code and the TrustAgents API behavior.

Like a lobster shell, security has layers: review code before you run it.
latest: vk974bxpyc9wcj44e4y5p61h6ds80mjxw
