Install
openclaw skills install portkey-guardrailsPortkey-inspired guardrails for OpenClaw: 5 configurable rules that block prompt injection, redact PII, flag off-scope responses, enforce agent budgets, and warn on context length. Runs as a workspace hook — no external service required. Implemented from reading Portkey's open-source LLM gateway and building the patterns natively.
openclaw skills install portkey-guardrailsLast used: 2026-04-03 Status: Active (live in production)
A workspace hook that brings Portkey-style guardrails into OpenClaw natively — no external service, no API key, no Portkey account.
Five sequential guardrail rules run on every inbound and outbound message:
| Rule | Layer | Default | What it catches |
|---|---|---|---|
| G-01 Prompt Injection | Input | block | "Ignore all previous instructions", DAN mode, base64-encoded overrides |
| G-02 PII Leakage | Output | redact | Australian phone numbers, email addresses, credit card patterns, TFN |
| G-03 Off-Scope Filter | Output | flag | NSFW, competitor-disparaging, political content in agent responses |
| G-04 Budget Guard | Input | block | Blocks agent dispatch when agent is in red budget state |
| G-05 Context Length | Input | warn | Warns when estimated token count exceeds 90% of model context window |
Rules are sequential and fail-fast: the first non-passing rule stops the chain. All non-pass events are audit-logged.
This skill was built by studying Portkey's open-source LLM gateway and implementing the same patterns natively inside OpenClaw's hook system. We do not use the Portkey SDK or service — this is a "reference architecture" adoption: read the source, understand the patterns, build your own version that fits your stack.
Install the skill, then enable the hook:
openclaw skills install portkey-guardrails
openclaw hooks enable portkey-guardrails
Restart the gateway to load the hook:
openclaw gateway restart
To customise guardrail behaviour per agent, add an agent-config.yaml to each agent directory under agents/<name>/agent-config.yaml. Example:
version: "1"
agent: kit
guardrails:
inherit_defaults: true
overrides:
- id: G-01
severity: flag # downgrade from block to flag for Kit
- id: G-03
enabled: false # Kit can discuss any topic
All non-pass events are appended to:
agents/<agentId>/guardrails-audit.md
The skill also ships a full declarative YAML config system for per-agent reliability settings (retries, fallbacks, timeouts, cache hints). See rules/config-schema.yaml for the full schema.
Phase 3 includes an embedding-based semantic cache using local Ollama (nomic-embed-text) + SQLite. Requires Ollama running locally with the nomic-embed-text model pulled. Cache degrades gracefully if Ollama is unavailable.
ollama pull nomic-embed-text
If the guardrails module fails to load for any reason, the hook exits cleanly without blocking dispatch. Your gateway keeps running.
portkey-guardrails/
├── SKILL.md # This file
├── CHANGELOG.md
├── hook/
│ ├── HOOK.md # Hook metadata
│ └── handler.ts # Hook implementation
├── rules/
│ ├── G-01-prompt-injection.ts
│ ├── G-02-pii-leakage.ts
│ ├── G-03-off-scope-filter.ts
│ ├── G-04-budget-guard.ts
│ ├── G-05-context-length.ts
│ └── config-schema.yaml
└── tests/
└── cases.yaml
tsx TypeScript execution)