Promptguard

Detects prompt injection attacks in text and returns a risk score along with the names of the detected patterns.

Install

openclaw skills install promptguard

PromptGuard

A security API that scans text for common prompt injection patterns and returns a risk score. Designed for AI agents that process untrusted text input from external sources.

What It Detects

  • Instruction override attempts
  • HTML comment injection
  • Zero-width unicode characters
  • Delimiter-based attacks
  • Role switching tokens
  • System prompt extraction attempts

Installation

Install the server's Python dependencies:

pip install fastapi uvicorn pydantic

Usage

Start the server:

uvicorn promptguard.app:app --port 8000

Then send a POST request:

curl -X POST http://localhost:8000/v1/scan \
  -H "Content-Type: application/json" \
  -d '{"text": "What is the weather in London today?"}'

Response (clean text):

{
  "risk_score": 0.0,
  "patterns_detected": [],
  "input_length": 36
}
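An agent written in Python can make the same call without shelling out to curl. The sketch below uses only the standard library; the helper names `build_scan_request` and `scan` are hypothetical, and the base URL assumes the server is running locally on port 8000 as started above.

```python
import json
import urllib.request


def build_scan_request(text, base_url="http://localhost:8000"):
    """Build the POST request for /v1/scan without sending it."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/scan",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scan(text, base_url="http://localhost:8000", timeout=10.0):
    """Send the text to the scan endpoint and return the decoded JSON body."""
    request = build_scan_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires a running server):
# result = scan("What is the weather in London today?")
# print(result["risk_score"], result["patterns_detected"])
```

Building the request in a separate function from sending it makes the payload easy to inspect or unit-test without a live server.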

Request

Field   Type     Required   Description
text    string   yes        Text to scan (1-100,000 chars)
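Because the endpoint accepts between 1 and 100,000 characters, a caller may want to check the length before sending. The following is a client-side sketch only: the helper names are hypothetical, and it assumes the limit is counted in Python string characters (code points), which the server may or may not match exactly.

```python
MIN_CHARS = 1
MAX_CHARS = 100_000  # documented upper bound for the "text" field


def validate_text(text):
    """Raise if the text falls outside the documented length range."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    if not MIN_CHARS <= len(text) <= MAX_CHARS:
        raise ValueError(
            f"text must be {MIN_CHARS}-{MAX_CHARS} characters, got {len(text)}"
        )
    return text


def chunk_text(text, size=MAX_CHARS):
    """Split an over-long document into pieces that each fit the limit."""
    return [text[i:i + size] for i in range(0, len(text), size)]
```

Chunking is a workaround rather than a documented feature: an injection that straddles a chunk boundary could be missed, so overlapping the chunks slightly may be worth considering.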

Response

Field               Type      Description
risk_score          decimal   0.0 (safe) to 1.0 (high risk)
patterns_detected   list      Names of detected patterns
input_length        integer   Length of input text
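An agent typically needs to turn these fields into a decision. The sketch below parses a response body into a typed object and applies a cutoff. The `ScanResult` class, the helper names, and the 0.5 threshold are assumptions for illustration; the source does not recommend a specific threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanResult:
    risk_score: float
    patterns_detected: list
    input_length: int


def parse_scan_response(payload):
    """Convert a decoded JSON response into a ScanResult, checking the range."""
    # float() accepts both JSON numbers and numeric strings.
    score = float(payload["risk_score"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"risk_score out of range: {score}")
    return ScanResult(
        risk_score=score,
        patterns_detected=list(payload["patterns_detected"]),
        input_length=int(payload["input_length"]),
    )


def should_block(result, threshold=0.5):
    """Assumed policy: block when the score reaches the threshold."""
    return result.risk_score >= threshold
```

A stricter deployment might block on any detected pattern regardless of score, while a more permissive one might only log the result. The right cutoff depends on how much untrusted text the agent handles.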