RedPincer — AI Red Team Suite

AI/LLM red team testing skill. Point at any LLM API endpoint and run automated security assessments. 160+ attack payloads across prompt injection, jailbreak,...

MIT-0 · Free to use, modify, and redistribute. No attribution required.

⭐ 0 · 326 · 3 current installs · 3 all-time installs

by@rustyorb

MIT-0

Security Scan

VirusTotal

Suspicious

View report →

OpenClaw

Suspicious

high confidence

Purpose & Capability

The declared purpose (red-team testing of LLM endpoints) matches the instructions to provide an endpoint and API key and run attacks. However, SKILL.md instructs cloning and running a GitHub project (npm ci, npm run dev) while metadata only requires node and npm — it omits git even though git clone is used. The companion autonomous tool (RedClaw) is mentioned, which expands scope and should be explicit in metadata if intended.

Instruction Scope

The SKILL.md tells users/agents to clone an external repo and run npm scripts that will execute unreviewed code. It asks for LLM endpoints and API keys (expected) but also instructs running a Next.js server with -H 0.0.0.0, which can expose a web UI and potentially keys to the network. The file claims 'All client-side — your API keys stay local' yet instructs starting server components — this is a contradictory instruction that affects where credentials live and how requests may be proxied.

Install Mechanism

No formal install spec is provided; instead SKILL.md recommends cloning https://github.com/rustyorb/pincer and running npm ci / npm run dev. That is effectively an install-from-GitHub workflow without integrity checks. Cloning and running unvetted third-party code presents a high install risk (arbitrary code executed via npm scripts).

Credentials

The skill declares no required env vars, which is consistent with an interactive UI, but it expects users to supply LLM endpoints and API keys at runtime. The SKILL.md claims keys remain local, yet running a server on 0.0.0.0 or using server-side Next.js could cause keys to be used or proxied server-side. The skill does not explain where keys are stored or whether they are ever transmitted to third parties; that lack of clarity is disproportionate to the declared 'client-side' guarantee.

ℹ

Persistence & Privilege

always is false and the skill does not request persistent system-level privileges. Autonomous invocation is allowed (default), which is normal; however, the companion RedClaw autonomous agent mentioned in the docs indicates potential for automated campaigns if the user later installs/uses that tool — be aware of automated attack capability but this by itself is not an immediate privilege escalation.

What to consider before installing

This skill appears to be a red-team tool but contains several red flags you should address before running it: 1) Verify provenance — the registry entry lacks a homepage and source is 'unknown'; inspect the GitHub repo (https://github.com/rustyorb/pincer) yourself. 2) Do not run npm ci / npm run dev until you review package.json and all scripts and dependencies; run in an isolated environment (container or VM) and as a non-root user. 3) The SKILL.md uses git clone but metadata does not list git as required — ensure your environment matches actual instructions or adjust the instructions. 4) The doc claims 'all client-side' but instructs starting a Next.js server (npx next start -H 0.0.0.0) — confirm whether API keys are ever proxied server-side and avoid binding to 0.0.0.0 on untrusted networks; prefer localhost-only or a browser-only build. 5) If you must test, run initial scans (npm audit, static analysis) and host the app in a sandbox before supplying any real API keys; consider using throwaway keys or scope-limited accounts. 6) Ensure you have explicit authorization to test any target systems; this tool is for authorized testing only. If you want a safer evaluation, provide the repository URL and package.json so the code can be reviewed for network calls, telemetry, and server-side behavior.

Like a lobster shell, security has layers — review code before you run it.

Current versionv1.0.0

Download zip

latestvk973jd81zg90m3q1y89e9tsz7d81w6dallm-securityvk973jd81zg90m3q1y89e9tsz7d81w6dapentestvk973jd81zg90m3q1y89e9tsz7d81w6dared-teamvk973jd81zg90m3q1y89e9tsz7d81w6dasecurityvk973jd81zg90m3q1y89e9tsz7d81w6da

License

MIT-0

Free to use, modify, and redistribute. No attribution required.

Termshttps://spdx.org/licenses/MIT-0.html

Runtime requirements

🦞 Clawdis

Binsnode, npm

SKILL.md

RedPincer — AI/LLM Red Team Suite

Automated security testing for language models. Point at any LLM API endpoint, select attack modules, and run assessments with real-time results and exportable reports.

⚠️ For authorized security testing and research only. Only test systems you own or have explicit permission to audit.

Quick Start

# Clone and install
git clone https://github.com/rustyorb/pincer.git {baseDir}/redpincer
cd {baseDir}/redpincer
npm ci

# Run
npm run dev
# Dashboard at http://localhost:3000

For production:

npm run build
npx next start -H 0.0.0.0 -p 3000

What It Tests

Category	Payloads	Description
💉 Prompt Injection	40	Instruction override, delimiter confusion, indirect injection, payload smuggling
🔓 Jailbreak	40	Persona splitting, gradual escalation, hypothetical framing, roleplay exploitation
🔍 Data Extraction	40	System prompt theft, training data probing, membership inference, embedding extraction
🛡️ Guardrail Bypass	40	Output filter evasion, multi-language bypass, homoglyph tricks, context overflow

Total: 160 base payloads × 20 variant transforms = 3,200 test permutations

Supported Providers

OpenAI  ·  Anthropic  ·  OpenRouter  ·  Any OpenAI-compatible endpoint

Features

Attack Engine

160+ payloads across 4 categories
Model-specific attacks (GPT, Claude, Llama variants)
20 variant transforms (unicode, encoding, case rotation, etc.)
Attack chaining with template variables ({{previous_response}})
AI-powered payload generation — uses the target LLM to generate novel attacks against itself
Stop/cancel running attacks instantly

Analysis & Reporting

Heuristic response classifier with context-aware analysis
Reduced false positives — detects "explain then refuse" patterns
Vulnerability heatmap — visual category × severity matrix
Custom scoring rubrics with weighted grades (A+ to F)
Verbose 10-section pen-test reports with appendices
Multi-target comparison — side-by-side security profiles
Regression testing — save baselines, track fixes over time

Advanced Tools

Tool	What It Does
Compare	Same payloads against 2-4 targets simultaneously
Adaptive	Analyzes weaknesses, generates targeted follow-ups
Heatmap	Visual matrix of vulnerability rates by category/severity
Regression	Save baseline → re-run later → detect fixes or regressions
Scoring	Custom rubrics with weighted category/severity/classification scores
Chains	Multi-step attacks with `{{previous_response}}` templates
Payload Editor	Create custom payloads with syntax highlighting + AI generation

Usage Workflow

1. Configure Target → Add LLM endpoint + API key + model
2. Select Categories → Pick attack types to test
3. Run Attack      → Stream results in real-time
4. Review Results  → Heuristic classification + severity scores
5. Adaptive        → Auto-generate follow-up attacks on weaknesses
6. Generate Report → Export comprehensive findings as Markdown

Architecture

All client-side — no server components, your API keys stay local
NDJSON streaming — real-time results during attack runs
Heuristic analysis — pattern-matching classifier (no LLM-based grading = no extra cost)
Zustand + localStorage — state persists across sessions

Companion Tool: RedClaw

For autonomous multi-strategy campaigns (CLI/TUI), see RedClaw — the autonomous red-teaming agent framework.

RedPincer = web dashboard, manual + automated testing
RedClaw = autonomous CLI agent, adaptive multi-strategy campaigns
Together = complete LLM security testing suite

Built by @rustyorb — Crack open those guardrails. 🦞

Files

1 total

Select a file

Select a file to preview.

Comments

Loading comments…