Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Compression Monitor

v1.0.0

Detect behavioral drift in persistent AI agents after context compression events. Use when a long-running agent has compressed its context (compaction, trunc...

Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The stated goal (measuring ghost lexicon, CCS, and tool-call drift after compression) is coherent with the listed probes and framework integrations. However, the SKILL.md references many Python scripts and integration modules (ghost_lexicon.py, behavioral_probe.py, ccs_harness.py, smolagents_integration.py, etc.) that are not present in the skill bundle and would need to exist on the host for the instructions to work. The skill also does not declare any required binary (python) even though its runtime examples use python — an inconsistency between claimed capability and declared requirements.
Instruction Scope
Instructions tell the agent to run local Python scripts, read session logs (pre_session.txt/post_session.txt), and actively probe agents (e.g., HTTP requests to an agent URL). These actions require file system and network access and assume specific local files and modules exist. Because the skill bundle contains no code, following the instructions would either fail or prompt the user/agent to fetch and run external code — a meaningful scope expansion that should be explicit. The instructions also allow active probing of an agent endpoint, which can interact with services on localhost or networked hosts; that is expected for the skill's purpose but is not declared as a required capability.
Install Mechanism
There is no install spec (instruction-only), which is lower risk in itself. However, the SKILL.md assumes the presence of specific scripts and integration modules. That creates a practical dependency on fetching code from the referenced GitHub homepage or elsewhere. The lack of an explicit install step or provenance for the required scripts means a user following the instructions may download and execute third-party code without guidance — increasing operational risk.
Credentials
requires.env and required binaries are empty, yet runtime instructions assume the ability to read local session logs, run Python, and make network requests to agent URLs. The skill asks for access to potentially sensitive artifacts (session logs, agent endpoints) without declaring or justifying that access. The absence of declared requirements (e.g., PYTHON, paths to logs) is disproportionate to the operational needs implied by the instructions.
Persistence & Privilege
The skill is not always-enabled and uses normal autonomous invocation defaults. It does not request persistent elevated privileges or claim to modify other skills or global agent settings.
What to consider before installing
This skill looks like a legitimate monitoring concept, but the package is instruction-only and contains no code files despite referencing many local Python scripts and integration modules. Before using it: (1) inspect the linked GitHub repo to ensure the referenced scripts and modules actually exist and review their contents; (2) confirm you have and trust the Python code you will run — do not blindly execute downloaded scripts; (3) run any downloaded code in an isolated environment (container or sandbox) and audit network/file accesses the scripts perform; (4) ensure you’re comfortable allowing the skill to read session logs and probe agent endpoints (these can contain sensitive data or interact with internal services); and (5) ask the publisher or maintainers to provide an explicit install spec and a manifest of required binaries/env vars so the skill’s declared requirements match its runtime behavior.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements: none declared

📊 Clawdis
Latest: vk9788w1y0217gmznnavsptyyjd83zgvb
88 downloads · 0 stars · 1 version · Updated 2w ago
v1.0.0 · MIT-0

Compression Monitor

Detect when a persistent AI agent has silently changed behavior after context compression.

The Problem

Agents compress their history when context fills up. After compression, the agent continues running but may have silently lost:

  • Precise vocabulary ("ghost terms") that anchored its reasoning
  • Risk constraints or compliance anchors present at session start
  • Tool call patterns and behavioral tendencies from earlier in the session

The agent reports no change. Benchmarks don't catch it. The behavior is different.

Three Measurement Signals

ghost_lexicon.py     → vocabulary decay: which precise terms vanished post-compaction?
behavioral_probe.py  → active probing: query before/after compression, score semantic shift
ccs_harness.py       → CCS benchmark: full Constraint Consistency Score run (mock or live)

All three are output-only — no instrumentation inside the agent or model required.
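At its core, the ghost-lexicon signal is a set difference over vocabulary: which precise terms appear in the pre-compaction log but not afterwards? The bundle ships no code, so the following is a minimal sketch of that idea only — the tokenization, the length threshold, and the function names are illustrative assumptions, not the actual ghost_lexicon.py:

```python
import re

def extract_terms(text: str, min_len: int = 4) -> set[str]:
    """Lowercase word-like tokens long enough to count as precise vocabulary."""
    return {w for w in re.findall(r"[a-zA-Z_]+", text.lower()) if len(w) >= min_len}

def ghost_terms(before: str, after: str) -> set[str]:
    """Terms present pre-compaction that vanished from the post-compaction log."""
    return extract_terms(before) - extract_terms(after)

pre = "Respect the max_drawdown limit and the compliance anchor at all times."
post = "Keep trading within limits."
print(sorted(ghost_terms(pre, post)))
# → ['anchor', 'compliance', 'limit', 'max_drawdown', 'respect', 'times']
```

Note that a real implementation would need domain-aware term extraction; a plain token diff flags incidental wording changes ("limit" vs. "limits") alongside genuinely lost anchors.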

Quick Start

# Run a CCS benchmark (no API key required in mock mode)
python ccs_harness.py --mock

# Check ghost term decay in a session log
python ghost_lexicon.py --before pre_session.txt --after post_session.txt

# Active probe: query agent before and after a compaction event
python behavioral_probe.py --agent-url http://localhost:8080 --probe-file probes.json
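The scoring step behind the active probe — ask the same question before and after compaction, then measure how far the answers moved — can be sketched with a crude lexical proxy. Everything below is an assumption: the real behavioral_probe.py presumably scores semantic shift with an embedding model, not difflib:

```python
from difflib import SequenceMatcher

def drift_score(answer_before: str, answer_after: str) -> float:
    """0.0 = identical responses; values near 1.0 = completely different."""
    return 1.0 - SequenceMatcher(None, answer_before, answer_after).ratio()

before = "I will never exceed the 2% risk limit per trade."
after = "Position sizing follows standard guidelines."
print(f"drift = {drift_score(before, after):.2f}")
```

The same structure applies regardless of the similarity backend: fix a probe set, capture paired responses across the compaction boundary, and report the per-probe drift.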

Framework Integrations

Ready-to-use wrappers for existing agent frameworks — no changes to the framework required:

Framework             Module                           Integration Point
smolagents            smolagents_integration.py        step_callbacks — detects consolidation via history-length delta
Semantic Kernel       semantic_kernel_integration.py   ChatHistorySummarizationReducer / ChatHistoryTruncationReducer wrappers
LangChain/DeepAgents  deepagents_integration.py        Filesystem-based compaction detection
CAMEL                 camel_integration.py             ChatAgent truncation boundary hook
Anthropic Agent SDK   sdk_compaction_hook_demo.py      OnCompaction hook pattern

smolagents example

from smolagents import CodeAgent, HfApiModel
from smolagents_integration import BehavioralFingerprintMonitor

agent = CodeAgent(tools=[], model=HfApiModel())
monitor = BehavioralFingerprintMonitor(
    agent=agent,
    history_drop_threshold=5,
    verbose=True
)
result = agent.run("Your long-horizon task...")
print(monitor.report())
# → CCS: 0.87 | Ghost terms: 2 | Tool call drift: 0.12
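The monitor class above is not in the bundle, but the history-length-delta trick the integration table describes is simple to sketch in a framework-agnostic way: if the number of stored steps drops sharply between callbacks, a consolidation event likely occurred. Class and method names here are assumptions that mirror the `history_drop_threshold` parameter:

```python
class CompactionDetector:
    """Flags a likely compaction when agent history shrinks between steps."""

    def __init__(self, drop_threshold: int = 5):
        self.drop_threshold = drop_threshold
        self.last_len = 0
        self.events = []  # (length_before, length_after) pairs

    def on_step(self, history_len: int) -> None:
        # History normally grows; a large drop signals consolidation.
        if self.last_len - history_len >= self.drop_threshold:
            self.events.append((self.last_len, history_len))
        self.last_len = history_len

det = CompactionDetector(drop_threshold=5)
for n in [3, 8, 15, 22, 4, 9]:  # history shrinks 22 → 4: compaction
    det.on_step(n)
print(det.events)
# → [(22, 4)]
```

In a real integration this callback would also trigger the before/after probes, since the compaction boundary is exactly where drift measurement must straddle.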

Interpreting Results

CCS Score    Interpretation
> 0.90       Minimal drift — agent behaving consistently
0.75–0.90    Moderate drift — worth investigating
< 0.75       Significant drift — verify critical constraints still active

Ghost term count > 0 is a flag, especially for domain-specific terms that anchor constraints (risk parameters, compliance anchors, operational rules).
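Applied mechanically, the bands in the table reduce to a small lookup. The handling of the exact boundaries at 0.90 and 0.75 below is an assumption, since the table only gives open-ended ranges:

```python
def interpret_ccs(score: float) -> str:
    """Map a Constraint Consistency Score onto the bands in the table above."""
    if score > 0.90:
        return "minimal drift"
    if score >= 0.75:
        return "moderate drift"
    return "significant drift"

for s in (0.95, 0.82, 0.60):
    print(s, interpret_ccs(s))
```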

When to Use This Skill

  • You have a long-running agent that performs compaction or context rotation
  • You want to verify an agent's behavioral consistency after a session boundary
  • You need a measurement layer alongside your memory system (retrieval accuracy ≠ behavioral consistency)
  • You want to instrument a specific framework's compaction boundary without modifying it
