Portkey Guardrails

Portkey-inspired guardrails for OpenClaw: five configurable rules that block prompt injection, redact PII, flag off-scope responses, enforce agent budgets, and warn on context length. Runs as a workspace hook; no external service required. Implemented by studying Portkey's open-source LLM gateway and rebuilding its patterns natively.

Install

openclaw skills install portkey-guardrails

Last used: 2026-04-03
Status: Active (live in production)


A workspace hook that brings Portkey-style guardrails into OpenClaw natively — no external service, no API key, no Portkey account.

What this skill does

Five guardrail rules run in sequence. Each rule applies to one layer: inbound messages (Input) or outbound messages (Output).

| Rule | Layer | Default | What it catches |
|---|---|---|---|
| G-01 Prompt Injection | Input | block | "Ignore all previous instructions", DAN mode, base64-encoded overrides |
| G-02 PII Leakage | Output | redact | Australian phone numbers, email addresses, credit card patterns, TFN |
| G-03 Off-Scope Filter | Output | flag | NSFW, competitor-disparaging, political content in agent responses |
| G-04 Budget Guard | Input | block | Blocks agent dispatch when agent is in red budget state |
| G-05 Context Length | Input | warn | Warns when estimated token count exceeds 90% of model context window |
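
To make the G-05 threshold concrete, here is a rough sketch of a 90% context-window check. The four-characters-per-token heuristic, the function names, and the result shape are assumptions for illustration only; the skill's actual estimator in rules/G-05-context-length.ts may work differently.

```typescript
// Hypothetical sketch of a G-05-style check; not the skill's actual code.
type Verdict = "pass" | "warn";

interface ContextCheck {
  verdict: Verdict;
  estimatedTokens: number;
  limit: number; // token count above which the warning triggers
}

// Assumption: roughly 4 characters per token, a common heuristic for English text.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

function checkContextLength(
  text: string,
  contextWindow: number,
  threshold = 0.9, // 90% of the model context window, as in the table
): ContextCheck {
  const estimatedTokens = estimateTokens(text);
  const limit = Math.floor(contextWindow * threshold);
  return {
    verdict: estimatedTokens > limit ? "warn" : "pass",
    estimatedTokens,
    limit,
  };
}
```

With a 1,000-token window, a 4,000-character message is estimated at 1,000 tokens and triggers a warning, while a 400-character message passes.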

Rules are sequential and fail-fast: the first non-passing rule stops the chain. All non-pass events are audit-logged.
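
The fail-fast behaviour can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the code in hook/handler.ts: the `Rule` interface, the `runChain` function, the `audit` callback, and the two demo rules are all hypothetical.

```typescript
// Hypothetical sketch of a sequential, fail-fast rule chain; names are illustrative.
type Severity = "pass" | "warn" | "flag" | "redact" | "block";

interface RuleResult {
  ruleId: string;
  severity: Severity;
  reason?: string;
}

interface Rule {
  id: string;
  layer: "input" | "output";
  check(message: string): RuleResult;
}

interface ChainOutcome {
  result: RuleResult;       // first non-passing result, or a final pass
  rulesEvaluated: string[]; // ids of the rules that actually ran
}

function runChain(
  rules: Rule[],
  layer: "input" | "output",
  message: string,
  audit: (r: RuleResult) => void,
): ChainOutcome {
  const rulesEvaluated: string[] = [];
  for (const rule of rules) {
    if (rule.layer !== layer) continue; // skip rules for the other layer
    const result = rule.check(message);
    rulesEvaluated.push(rule.id);
    if (result.severity !== "pass") {
      audit(result);                    // every non-pass event is logged
      return { result, rulesEvaluated }; // fail fast: later rules never run
    }
  }
  return { result: { ruleId: "none", severity: "pass" }, rulesEvaluated };
}

// Two toy rules standing in for G-01 and G-05.
const demoRules: Rule[] = [
  {
    id: "G-01",
    layer: "input",
    check: (m) =>
      /ignore all previous instructions/i.test(m)
        ? { ruleId: "G-01", severity: "block", reason: "injection phrase" }
        : { ruleId: "G-01", severity: "pass" },
  },
  {
    id: "G-05",
    layer: "input",
    check: () => ({ ruleId: "G-05", severity: "pass" }),
  },
];
```

In this sketch a message containing the injection phrase is blocked by G-01 and G-05 is never evaluated, while a clean message passes through both rules.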

Background

This skill was built by studying Portkey's open-source LLM gateway and implementing the same patterns natively inside OpenClaw's hook system. We do not use the Portkey SDK or service — this is a "reference architecture" adoption: read the source, understand the patterns, build your own version that fits your stack.

How to use

Install the skill, then enable the hook:

openclaw skills install portkey-guardrails
openclaw hooks enable portkey-guardrails

Restart the gateway to load the hook:

openclaw gateway restart

Per-agent configuration

To customise guardrail behaviour for a specific agent, add an agent-config.yaml file to that agent's directory, at agents/<name>/agent-config.yaml. Example:

version: "1"
agent: kit

guardrails:
  inherit_defaults: true
  overrides:
    - id: G-01
      severity: flag   # downgrade from block to flag for Kit
    - id: G-03
      enabled: false   # Kit can discuss any topic
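
One way to picture how such overrides could be resolved against the defaults is sketched below. The field names mirror the YAML example, and the default severities come from the rule table; the `resolveRules` function and the treatment of `inherit_defaults: false` (rules without an override are dropped) are assumptions, since the real behaviour is defined by rules/config-schema.yaml and the hook itself.

```typescript
// Hypothetical sketch of resolving per-agent overrides; not the skill's actual loader.
type Severity = "block" | "redact" | "flag" | "warn";

interface RuleSetting { id: string; severity: Severity; enabled: boolean }
interface RuleOverride { id: string; severity?: Severity; enabled?: boolean }
interface AgentGuardrails { inherit_defaults: boolean; overrides: RuleOverride[] }

// Default severities taken from the rule table in this document.
const DEFAULTS: RuleSetting[] = [
  { id: "G-01", severity: "block", enabled: true },
  { id: "G-02", severity: "redact", enabled: true },
  { id: "G-03", severity: "flag", enabled: true },
  { id: "G-04", severity: "block", enabled: true },
  { id: "G-05", severity: "warn", enabled: true },
];

function resolveRules(config: AgentGuardrails): RuleSetting[] {
  const overrides = new Map<string, RuleOverride>();
  for (const o of config.overrides) overrides.set(o.id, o);

  return DEFAULTS
    // Assumption: when inherit_defaults is false, rules without an override are dropped.
    .filter((d) => config.inherit_defaults || overrides.has(d.id))
    .map((d) => {
      const o = overrides.get(d.id);
      return {
        id: d.id,
        severity: o?.severity ?? d.severity, // override wins, else default
        enabled: o?.enabled ?? d.enabled,
      };
    });
}

// The Kit configuration from the YAML example, already parsed into an object.
const kitConfig: AgentGuardrails = {
  inherit_defaults: true,
  overrides: [
    { id: "G-01", severity: "flag" },
    { id: "G-03", enabled: false },
  ],
};
```

Under these assumptions, `resolveRules(kitConfig)` yields all five rules, with G-01 downgraded to flag and G-03 disabled.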

Audit log

All non-pass events are appended to:

agents/<agentId>/guardrails-audit.md
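
The entry format of the audit log is not specified here, so the following is only a sketch of what an append-only writer might look like. The `AuditEvent` shape, the one-line-per-event Markdown format, and the function names are assumptions; the `append` callback stands in for a file append such as Node's `fs.appendFileSync`.

```typescript
// Hypothetical sketch of an append-only audit entry; the real log format may differ.
interface AuditEvent {
  timestamp: Date;
  agentId: string;
  ruleId: string;
  severity: "warn" | "flag" | "redact" | "block";
  reason: string;
}

// Path pattern from this document: agents/<agentId>/guardrails-audit.md
function auditLogPath(agentId: string): string {
  return `agents/${agentId}/guardrails-audit.md`;
}

// Assumption: one Markdown list item per event, with whitespace collapsed
// so that each event stays on a single line.
function formatAuditEntry(e: AuditEvent): string {
  const reason = e.reason.replace(/\s+/g, " ").trim();
  return `- ${e.timestamp.toISOString()} | ${e.ruleId} | ${e.severity} | ${reason}\n`;
}

// `append` is injected so the sketch stays independent of any file-system API.
function recordEvent(
  e: AuditEvent,
  append: (path: string, text: string) => void,
): void {
  append(auditLogPath(e.agentId), formatAuditEntry(e));
}
```

Appending rather than rewriting keeps earlier entries intact, which is the property an audit trail needs.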

Declarative config layer

The skill also ships a full declarative YAML config system for per-agent reliability settings (retries, fallbacks, timeouts, cache hints). See rules/config-schema.yaml for the full schema.

Semantic cache (optional)

Phase 3 includes an embedding-based semantic cache built on a local Ollama instance (nomic-embed-text) and SQLite. It requires Ollama to be running locally with the nomic-embed-text model pulled. The cache degrades gracefully if Ollama is unavailable.

ollama pull nomic-embed-text
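
As an illustration of the idea, the sketch below looks up a cached response by cosine similarity between embeddings and treats an embedding failure as a cache miss. It is not the skill's implementation: the in-memory array stands in for the SQLite store, the `embed` callback stands in for a call to the local Ollama model (which would be asynchronous in practice), and the 0.92 threshold is an arbitrary assumption.

```typescript
// Hypothetical sketch of an embedding-based cache lookup; not the skill's implementation.
interface CacheEntry {
  embedding: number[];
  response: string;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

function lookup(
  prompt: string,
  entries: CacheEntry[],
  embed: (text: string) => number[], // stand-in for the embedding backend
  threshold = 0.92,                  // assumed similarity cut-off
): string | null {
  let query: number[];
  try {
    query = embed(prompt);
  } catch {
    // Embedding backend unavailable: degrade to a cache miss instead of failing.
    return null;
  }
  let best: CacheEntry | null = null;
  let bestScore = threshold;
  for (const entry of entries) {
    const score = cosineSimilarity(query, entry.embedding);
    if (score >= bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  return best ? best.response : null;
}
```

Returning `null` on an embedding error means the caller simply falls through to a normal model call, which is one plausible reading of "degrades gracefully".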

Fail-open design

If the guardrails module fails to load for any reason, the hook exits cleanly without blocking dispatch. Your gateway keeps running.
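
A fail-open wrapper along these lines might look like the sketch below. The `GuardrailsModule` interface, the `createHook` function, and the synchronous `load` callback are assumptions made to keep the example self-contained; a real handler would more likely load the module with an asynchronous dynamic import.

```typescript
// Hypothetical sketch of a fail-open hook wrapper; names are illustrative.
type Decision = { action: "pass" } | { action: "block"; reason: string };

interface GuardrailsModule {
  evaluate(message: string): Decision;
}

function createHook(
  load: () => GuardrailsModule,            // stand-in for loading the real module
  log: (msg: string) => void = () => {},
): (message: string) => Decision {
  let guardrails: GuardrailsModule | null = null;
  try {
    guardrails = load();
  } catch (err) {
    // Fail open: record the problem and let messages through unchecked.
    log(`guardrails failed to load: ${String(err)}`);
  }
  return (message) =>
    guardrails ? guardrails.evaluate(message) : { action: "pass" };
}
```

The trade-off is deliberate: a broken guardrails module costs protection but not availability, so the load failure should be logged somewhere visible.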

Files

portkey-guardrails/
├── SKILL.md                          # This file
├── CHANGELOG.md
├── hook/
│   ├── HOOK.md                       # Hook metadata
│   └── handler.ts                    # Hook implementation
├── rules/
│   ├── G-01-prompt-injection.ts
│   ├── G-02-pii-leakage.ts
│   ├── G-03-off-scope-filter.ts
│   ├── G-04-budget-guard.ts
│   ├── G-05-context-length.ts
│   └── config-schema.yaml
└── tests/
    └── cases.yaml

Requirements

  • Node.js 18+ (for tsx TypeScript execution)
  • OpenClaw workspace hook system enabled
  • Ollama (optional; only needed for the Phase 3 semantic cache)