Hermes Agent Health Check

v1.1.2

Audit a NousResearch/hermes-agent checkout or fork for Hermes-specific runtime-contract drift, command-surface splits, memory/skill/gateway health, and agent...


by @huangrichao2020

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for huangrichao2020/hermes-agent-health-check.


Install the skill "Hermes Agent Health Check" (huangrichao2020/hermes-agent-health-check) from ClawHub.
Skill page: https://clawhub.ai/huangrichao2020/hermes-agent-health-check
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install hermes-agent-health-check

ClawHub CLI


npx clawhub@latest install hermes-agent-health-check

Security Scan

Capability signals

Crypto · Requires OAuth token · Requires sensitive credentials

These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.

VirusTotal

Benign


OpenClaw

Benign

medium confidence

✓

Purpose & Capability

The name, description, README, and SKILL.md all consistently describe an architecture-and-health scanner for NousResearch/hermes-agent checkouts. The instructions (install hermescheck and run it against a repo path) are aligned with that stated purpose; nothing in the package requires unrelated credentials or binaries.

ℹ

Instruction Scope

The runtime instructions are narrowly focused: install the hermescheck package and run it against a Hermes Agent checkout, producing local report files (audit_results.json, audit_report.md). The instructions do not request unrelated env vars or system-wide reads. However, running the recommended commands will cause third-party code to read the target repo contents (intended) and write report files; those reports can contain sensitive evidence (e.g., discovered secrets), so you should not run it directly against production repositories with unredacted secrets.

ℹ

Install Mechanism

The skill is instruction-only (no install spec embedded), but the Quick Start tells users to 'pip install hermescheck' (PyPI) and run it. Installing and executing a PyPI package runs third-party code on your system, which is normal for developer tools but carries standard supply-chain risk. The README points to a GitHub origin, which helps verification. Risk is moderate: verify package ownership, inspect the source, or run in an isolated VM/virtualenv.

✓

Credentials

The skill declares no required env vars, binaries, or config paths, which is proportional to a static/structural code scanner. Be aware that hermescheck scanners look for patterns related to network calls, hidden LLM invocations, exec/eval, etc.; the scanner itself could be extended to make network calls or require credentials in some profiles, but nothing in SKILL.md requests unrelated secrets.

✓

Persistence & Privilege

The skill does not request persistent presence (always:false), does not declare config paths, and is user-invocable. There is no evidence it attempts to modify other skills or system-wide agent settings. Autonomous invocation is allowed by platform default but is not combined with other red flags here.

Assessment

This skill is coherent and appears to do what it says: run the hermescheck scanner against a Hermes Agent repo. The main operational risk is installing and executing a third-party Python package from PyPI. Before running:
(1) inspect the hermescheck source in its GitHub repo and/or pin a known-good release;
(2) install and run it in an isolated environment (virtualenv, container, or VM);
(3) run it on a copy of the repo or a sanitized snapshot if your repo contains secrets, since scan output can include evidence of secrets;
(4) prefer running from a local clone (python -m hermescheck ./path) over blindly pip-installing system-wide;
(5) if you plan to let an autonomous agent invoke this skill, restrict that agent's scope and review any generated report files before sharing them externally.
If you want higher assurance, provide the hermescheck package source for manual review or run the tool in a fully offline, sandboxed environment.
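The isolation advice above (pin a reviewed release, use a virtualenv, scan a copy of the repo) can be sketched as a small command planner. This is a minimal illustration, not part of hermescheck: the helper names `venv_bin` and `isolated_run_plan`, the directory names, and the `version` argument are all hypothetical, and it assumes the package installs a `hermescheck` console script as the Quick Start suggests. It only builds the commands so you can read them before running anything.

```python
import os
import sys
from pathlib import Path


def venv_bin(venv_dir: Path, name: str) -> Path:
    """Path of an executable inside a virtualenv (POSIX or Windows layout)."""
    if os.name == "nt":
        return venv_dir / "Scripts" / f"{name}.exe"
    return venv_dir / "bin" / name


def isolated_run_plan(venv_dir, repo_copy, version=None):
    """Build (but do not run) the commands for an isolated hermescheck run.

    `version` is a placeholder for whichever release you have reviewed;
    no specific version is recommended here.
    """
    venv_dir, repo_copy = Path(venv_dir), Path(repo_copy)
    requirement = "hermescheck" if version is None else f"hermescheck=={version}"
    return [
        # 1. Create a dedicated virtualenv with the current interpreter.
        [sys.executable, "-m", "venv", str(venv_dir)],
        # 2. Install the (optionally pinned) package into that virtualenv only.
        [str(venv_bin(venv_dir, "pip")), "install", requirement],
        # 3. Scan a copy or sanitized snapshot, not the live repository.
        [str(venv_bin(venv_dir, "hermescheck")), str(repo_copy)],
    ]


if __name__ == "__main__":
    # Hypothetical paths; print the plan for review instead of executing it.
    for cmd in isolated_run_plan(".hermescheck-venv", "./hermes-agent-copy"):
        print(" ".join(cmd))
```

Each returned command is an argument list, so it can be passed to `subprocess.run(cmd, check=True)` once you are satisfied with what it does.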

Like a lobster shell, security has layers — review code before you run it.

Tags: agent-audit, hermes-agent, latest

49 downloads

0 stars

1 version

Updated 2d ago


MIT-0

Hermes Agent Health Check

Audit the architecture and health of a Hermes Agent checkout, fork, or deployment support repo.

Hermes Agent has a connected runtime: agent loop, command registry, CLI, TUI, gateway, skills, memory, cron, tools, plugins, and terminal environments. hermescheck helps keep those surfaces aligned.

When to Use

You are preparing a Hermes Agent PR and want a repeatable architecture review
A Hermes fork works in CLI but not gateway, TUI, skills, cron, or plugins
A new slash command risks drifting across surfaces
A tool or environment change needs clearer capability boundaries
Memory, session search, or skill behavior regressed after a refactor
Startup paths or background jobs became hard to reason about

Quick Start

pip install hermescheck
hermescheck /path/to/hermes-agent

Produces audit_results.json and audit_report.md.

The 12-Layer Stack

#	Layer	What Goes Wrong
1	System prompt	Conflicting instructions, instruction bloat
2	Session history	Stale context from previous turns
3	Long-term memory	Pollution across sessions
4	Distillation	Compressed artifacts re-entering as pseudo-facts
5	Active recall	Redundant re-summary layers wasting context
6	Tool selection	Wrong tool routing, model skips required tools
7	Tool execution	Hallucinated execution — claims to call but doesn't
8	Tool interpretation	Misread or ignored tool output
9	Answer shaping	Format corruption in final response
10	Platform rendering	UI/API/CLI mutates valid answers
11	Hidden repair loops	Silent fallback/retry agents running second LLM pass
12	Persistence	Expired state or cached artifacts reused as live evidence

Audit Scanners

#	Scanner	Severity	What It Catches
1	Hardcoded Secrets	critical	API keys, tokens, credentials in source code
2	Tool Enforcement Gap	high	"Must use tool X" in prompt but no code validation
3	Hidden LLM Calls	high	Secret second-pass LLM calls in fallback/repair loops
4	Unrestricted Code Execution	critical	exec(), eval(), subprocess(shell=True) without sandbox
5	Static Bug Inference	high	Code-level bug patterns inferred without runtime execution
6	Token Usage Budget	high	Large default context windows, full-history prompts, missing thrift controls
7	Memory Lifecycle Governance	medium	Memory without types, lifecycle, retrieval budgets, decay, or evidence pointers
8	RAG Pipeline Governance	medium	Retrieval without chunk, top-k, rerank, ingestion, or context budget controls
9	Self-Evolution Capability	high	Learning loops without external signals, source reading, constraint fit, safe landing, or verification
10	Loop Safety Budget	high	Tool/agent loops without max-iteration, retry budget, stuck-job, or duplicate-call controls
11	Plugin / Remote Tool Boundary	high	Executable plugins and MCP/OpenAPI tools without sandbox, schema, allowlist, or approval boundaries
12	Output Pipeline Mutation	medium	Response transformation corrupting correct answers
13	Missing Observability	medium	No tracing, logging, cost tracking, or audit trail
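To make the scanner table more concrete, here is a minimal sketch of what a check in the spirit of scanner 4 (Unrestricted Code Execution) could look like. It is an illustration built on Python's standard `ast` module, not hermescheck's actual implementation; the function name `find_unrestricted_exec` and the sample input are hypothetical, and a real scanner would also need to consider sandboxing context, aliases, and non-literal arguments.

```python
import ast


def find_unrestricted_exec(source: str, filename: str = "<string>"):
    """Return sorted (line, description) pairs for risky calls in Python source."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Direct calls to the exec()/eval() builtins.
        if isinstance(func, ast.Name) and func.id in {"exec", "eval"}:
            findings.append((node.lineno, f"{func.id}() call"))
        # Any call passing the literal keyword shell=True (e.g. subprocess.run).
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                findings.append((node.lineno, "call with shell=True"))
    return sorted(findings)


# Hypothetical input: two risky calls and one safe call.
SAMPLE = '''import subprocess
result = eval(user_input)
subprocess.run(cmd, shell=True)
subprocess.run(["ls", "-l"])
'''

if __name__ == "__main__":
    for line, description in find_unrestricted_exec(SAMPLE):
        print(f"line {line}: {description}")
```

Because the check parses the source rather than running it, it matches the static, no-execution style the table describes for these scanners.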

Severity Model

Level	Meaning
`critical`	Agent can confidently produce wrong operational behavior
`high`	Agent frequently degrades correctness or stability
`medium`	Correctness usually survives but output is fragile or wasteful
`low`	Mostly cosmetic or maintainability issues

Fix Strategy

Default fix order (code-first, not prompt-first):

1. Code-gate tool requirements: enforce in code, not just prompt text
2. Remove or narrow hidden repair agents: make fallback explicit with contracts
3. Reduce context duplication: same info arriving through prompt + history + memory + distillation
4. Tighten memory admission: user corrections > agent assertions
5. Tighten distillation triggers: don't compress what shouldn't be compressed
6. Reduce rendering mutation: pass-through, don't transform
7. Convert to typed JSON envelopes: structured internal flow, not freeform prose
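Two of the fixes above, code-gating tool requirements and moving to typed envelopes, can be sketched together. The example below is an assumption-laden illustration rather than Hermes Agent or hermescheck code: `ToolCall`, `TurnEnvelope`, `ToolRequirementError`, `enforce_required_tools`, and the tool name `session_search` are hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolCall:
    name: str
    ok: bool  # True only if the tool actually ran and returned a result


@dataclass
class TurnEnvelope:
    """Typed record of one agent turn, passed around instead of freeform prose."""
    answer: str
    tool_calls: list = field(default_factory=list)


class ToolRequirementError(RuntimeError):
    """Raised when a turn claims an answer without the required tool evidence."""


def enforce_required_tools(envelope: TurnEnvelope, required: set) -> TurnEnvelope:
    """Reject the turn unless every required tool was successfully executed."""
    executed = {call.name for call in envelope.tool_calls if call.ok}
    missing = sorted(required - executed)
    if missing:
        raise ToolRequirementError(f"required tools not executed: {missing}")
    return envelope


if __name__ == "__main__":
    good = TurnEnvelope("found 3 sessions", [ToolCall("session_search", ok=True)])
    enforce_required_tools(good, {"session_search"})  # passes silently

    bad = TurnEnvelope("found 3 sessions")  # answer text with no tool evidence
    try:
        enforce_required_tools(bad, {"session_search"})
    except ToolRequirementError as err:
        print(err)
```

The gate inspects recorded tool calls rather than the answer text, so a response that merely claims to have used a tool is rejected in code regardless of what the prompt says.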

Report Schema

Reports follow a formal JSON Schema (see references/report-schema.json) with:

overall_health: critical_risk | high_risk | medium_risk | low_risk
findings: array of severity-ranked issues with evidence refs
maturity_score: positive signal ledger, penalty ledger, score formula, and expected recovery directions
ordered_fix_plan: prioritized fix steps with rationale
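As a rough sketch of consuming a report, the snippet below summarizes findings by severity. The top-level keys `overall_health`, `findings`, and `ordered_fix_plan` come from the list above; the per-finding `severity` key, the helper names, and the sample data are assumptions for illustration, so check references/report-schema.json for the real field layout before relying on it.

```python
import json
from collections import Counter

# Severity levels as listed in the Severity Model table.
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


def summarize_report(report: dict) -> dict:
    """Count findings per severity (assumes each finding has a 'severity' key)."""
    findings = report.get("findings", [])
    counts = Counter(item.get("severity", "unknown") for item in findings)
    return {
        "overall_health": report.get("overall_health"),
        "counts": {level: counts.get(level, 0) for level in SEVERITY_ORDER},
        "fix_steps": len(report.get("ordered_fix_plan", [])),
    }


def load_report(path: str = "audit_results.json") -> dict:
    """Read the JSON report that the Quick Start run writes to disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Hypothetical in-memory report; swap in load_report() for a real run.
    sample = {
        "overall_health": "high_risk",
        "findings": [{"severity": "high"}, {"severity": "medium"}, {"severity": "high"}],
        "ordered_fix_plan": [{"step": "example"}],
    }
    print(json.dumps(summarize_report(sample), indent=2))
```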

Anti-Patterns to Avoid

❌ Saying "the model is weak" without falsifying the wrapper first
❌ Saying "memory is bad" without showing the contamination path
❌ Letting a clean current state erase a dirty historical incident
❌ Treating markdown prose as a trustworthy internal protocol
❌ Accepting "must use tool" in prompt text when code never enforces it

GitHub: https://github.com/huangrichao2020/hermescheck
