Observer Effect Probe

v1.0.0

Helps detect skills that behave differently when they sense they are being monitored — catching the class of evasion where conditional activation logic speci...


by @andyxinweiminicloud

Security Scan

VirusTotal: Benign

OpenClaw: Benign (high confidence)

✓

Purpose & Capability

The name and description match the content of SKILL.md. Requiring curl and python3 is reasonable for an investigator that will run requests and simple analysis scripts. No environment variables, credentials, or unrelated binaries are requested.

ℹ

Instruction Scope

The SKILL.md explicitly discusses checking hostnames, uptime, /proc entries, parent-process identity, network connectivity, invocation counts, and varying environment characteristics. Those checks are appropriate for an observer-effect probe, but they imply access to system-level metadata and the ability to run/observe the target skill under different environments. The instructions are descriptive (no code shipped) rather than prescriptive, but using the probe in practice will require executing the target skill and reading system artifacts.

✓

Install Mechanism

No install spec and no code files — instruction-only. No downloads or archive extraction are requested, which minimizes supply-chain risk.

✓

Credentials

No credentials or sensitive environment variables are requested by the skill itself. However, performing the probe may cause the tested skill to connect to external endpoints or reveal secrets if run in a production environment; the SKILL.md describes such network/behavioral observations as part of detection, which is proportionate but operationally sensitive.

✓

Persistence & Privilege

always:false (not force-included) and model invocation is permitted (default). The skill does not request persistent presence, nor does it attempt to modify other skills or system-wide settings.

Assessment

This skill is an instruction-only methodology for testing observer-effect evasion and appears internally consistent. Before using it: 1) ensure you have permission to test the target skill; 2) run probes in isolated test environments (sandbox/VMs) and avoid using production credentials — the act of invoking a target skill can trigger network calls or leak secrets; 3) be prepared to control network egress (firewalls, packet captures) and to snapshot/restore environments when varying hostnames/uptime; 4) ensure curl and python3 are available if you plan to run the example probes; 5) if you need automated probing, implement the harness carefully and review any scripts you or others add for safety. The skill itself does not install code or request secrets, but the operational steps it recommends can be risky if performed against production systems.


Runtime requirements

👁️ Clawdis

Bins: curl, python3

latest vk971gawge1mhmf3x22qw8wj3ed81mz4r

418 downloads

0 stars

1 version

Updated 1mo ago


MIT-0

The Skill Behaves Perfectly When It Knows You're Watching.

Helps identify skills that use environment detection to suppress malicious behavior during security review — the attestation evasion technique that passes every check precisely because the checks are visible.

Problem

The runtime attestation problem has a second layer: a skill sophisticated enough to execute conditionally based on environment signals is also sophisticated enough to detect that it's being probed and suppress its conditional behavior during the probe.

This is the observer effect applied to security: the act of observing changes what is observed. A skill that can detect sandbox fingerprints (restricted network access, specific hostname patterns, probe-characteristic API call sequences) can present clean behavior during attestation and activate payload delivery only when those fingerprints are absent — meaning only in real production environments.

This is not a theoretical concern. Malware has long used debugger, sandbox, and VM detection to suppress execution during analysis. The same technique carries over to agent skills: check for the probe's characteristic markers, return normal output while they are present, and resume malicious behavior once the probe exits. A skill that clears attestation by detecting the attestor is not a securely attested skill; it is an adversarially attested skill.

The observer-effect probe attempts to reduce the skill's ability to distinguish probe context from production context by minimizing observable differences between the two.
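
One small part of minimizing observable differences can be sketched in Python: launching the target with environment variables that do not advertise the probe. This is only an illustration under stated assumptions. The helper names `scrub_env` and `run_with_scrubbed_env` are hypothetical, the marker list is borrowed from the parent-process filter in the example report further down, and environment variables are just one of many possible fingerprints.

```python
import os
import subprocess

# Hypothetical marker list, borrowed from the example report's parent-process filter.
PROBE_MARKERS = ("probe", "audit", "scan", "test")


def scrub_env(env, markers=PROBE_MARKERS):
    """Return a copy of env without variables whose name or value contains a marker."""
    clean = {}
    for name, value in env.items():
        text = f"{name}={value}".lower()
        if any(marker in text for marker in markers):
            continue  # drop anything that would advertise the probe
        clean[name] = value
    return clean


def run_with_scrubbed_env(command, env=None):
    """Run command (a list of arguments) with probe-like variables removed."""
    base = dict(os.environ if env is None else env)
    return subprocess.run(
        command,
        env=scrub_env(base),
        capture_output=True,
        text=True,
        check=False,
    )
```

Substring matching is deliberately blunt here: a value such as "latest" contains "test" and would be dropped, and removing a variable like PATH can break the target. A real harness would need an explicit allow-list, and it should still run only inside an isolated sandbox.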

What This Probes

This probe examines attestation evasion capabilities across five dimensions:

Sandbox fingerprint detection: Does the skill attempt to identify characteristics of a probe environment? The probe looks for checks on sandbox-specific hostname patterns, reduced network connectivity, probe-characteristic request timing, or the absence of expected production signals.
Probe timing sensitivity: Does the skill's behavior change based on how long it has been running, or does it count invocations before activating? Delays of N invocations that outlast the attestation window can be detected by running more invocations, over a longer period, than a typical attestation run.
Observer API pattern detection: Does the skill query for signals that would distinguish an observing agent from an operational agent, such as caller identity, inspection tool signatures, or anomalous process environment indicators?
Behavioral consistency under environment variation: When the probe environment is varied systematically (different hostnames, different uptime values, different network access levels), does the skill's output distribution shift in ways that correlate with which variation was active?
Metacognitive call detection: Does the skill explicitly attempt to detect whether it is being introspected by examining its own execution context, stack trace, or runtime metadata in ways that serve no declared functional purpose?
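
As a rough first pass before any dynamic probing, a skill's source text could be triaged for calls that touch these dimensions. The sketch below is an assumption-laden illustration, not part of the skill: the `INDICATORS` table and `scan_source` function are hypothetical, the patterns are a small hand-picked sample of real calls (`os.hostname()` in Node.js; `socket.gethostname()`, `os.getppid()`, `inspect.stack()`, and `traceback.extract_stack()` in Python), and a match is only a lead to investigate, not evidence of evasion.

```python
import re

# Hypothetical heuristics; each key loosely maps to one dimension listed above.
INDICATORS = {
    "sandbox_fingerprint": [
        r"os\.hostname\(\)",
        r"socket\.gethostname\(\)",
        r"/proc/1/cgroup",
    ],
    "observer_api": [r"os\.getppid\(\)", r"sys\.argv"],
    "metacognitive": [r"inspect\.stack\(\)", r"traceback\.extract_stack\(\)"],
}


def scan_source(source):
    """Return {dimension: [(line_number, stripped_line), ...]} for matching lines."""
    findings = {dimension: [] for dimension in INDICATORS}
    for number, line in enumerate(source.splitlines(), start=1):
        for dimension, patterns in INDICATORS.items():
            if any(re.search(pattern, line) for pattern in patterns):
                findings[dimension].append((number, line.strip()))
    return findings
```

Many legitimate programs read the hostname or `sys.argv`, so findings like these only matter when they correlate with a change in runtime behavior.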

How to Use

Input: Provide one of:

A skill identifier and execution environment to probe for evasion behavior
A skill with known attestation results to re-probe with varied environments
Execution traces from multiple environments to compare for context-sensitivity

Output: An observer effect probe report containing:

Sandbox fingerprint detection indicators
Invocation count sensitivity assessment
Observer API pattern inventory
Behavioral consistency matrix across environments
Evasion probability score (0-100)
Probe verdict: ENVIRONMENT-AGNOSTIC / CONTEXT-SENSITIVE / EVASION-LIKELY / CONFIRMED-EVASION
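
The behavioral consistency matrix in the report could be computed along the following lines. This is a minimal sketch, assuming each execution sample has already been reduced to a behavior label. The function name `consistency_matrix`, the label strings, and the choice of one environment as the baseline are all hypothetical and are not defined by the skill.

```python
from collections import Counter


def consistency_matrix(traces, baseline):
    """Compare each environment's observed behaviors with the baseline environment.

    traces maps an environment name to a list of behavior labels, one per sample.
    Any label never seen in the baseline environment counts as unexpected.
    """
    allowed = set(traces[baseline])
    matrix = {}
    for env, labels in traces.items():
        unexpected = [label for label in labels if label not in allowed]
        matrix[env] = {
            "unexpected": dict(Counter(unexpected)),
            "unexpected_rate": len(unexpected) / len(labels) if labels else 0.0,
        }
    return matrix
```

A nonzero `unexpected_rate` in only some environments is the kind of correlation the matrix is meant to surface. Turning such rates into a score or verdict would need thresholds that this sketch does not attempt to define.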

Example

Input: Probe report-aggregator skill for observer-effect evasion

👁️ OBSERVER EFFECT PROBE

Skill: report-aggregator v2.1
Standard attestation result: COMPLIANT (declared capability match)
Probe environments: 5 (varying hostname, uptime, network, caller signature)
Execution samples: 100 per environment
Observer fingerprint minimization: enabled

Sandbox fingerprint detection:
  Hostname pattern check: ⚠️ DETECTED
    Skill queries os.hostname() on invocations 1-3, not thereafter
    Skill reads /proc/1/cgroup on Linux (container detection indicator)
  Network availability check: ⚠️ DETECTED
    Skill attempts connection to external endpoint before main execution
    If connection refused: proceeds with declared behavior only
    If connection succeeds: extended behavior activated

Invocation count sensitivity:
  Invocations 1-5: Declared behavior only ✅
  Invocations 6-10: Declared behavior only ✅
  Invocations 11+: Additional outbound request to analytics endpoint ⚠️
  Pattern: N-invocation delay with threshold at N=10

Observer API patterns:
  sys.argv inspection: ✅ None detected
  Caller identity check: ⚠️ DETECTED — reads parent process name
    If parent contains "probe", "audit", "scan", "test": suppressed mode
    Otherwise: full behavior including extended calls

Behavioral consistency matrix:
  Environment A (standard sandbox): COMPLIANT behavior ✅
  Environment B (extended uptime): COMPLIANT behavior (invoc. count reset) ✅
  Environment C (real network): EXTENDED behavior ⚠️
  Environment D (non-probe caller): EXTENDED behavior ⚠️
  Environment E (production-like): EXTENDED behavior ⚠️

Evasion probability score: 89/100
  High probability of deliberate observer-effect evasion:
  - Hostname querying correlated with behavior suppression
  - Parent process name check explicitly filters "probe"/"audit"
  - Extended behavior activates only in non-sandbox environments

Probe verdict: CONFIRMED-EVASION
  This skill actively detects attestation context and suppresses
  extended behavior specifically when being probed. Standard attestation
  methods would consistently return COMPLIANT.

Recommended actions:
  1. Treat this skill as confirmed malicious — evasion is intentional
  2. Report parent-process-name filtering as a trust & safety violation
  3. Audit all skills from the same publisher for similar patterns
  4. Do not deploy in any environment, regardless of attestation result
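
The invocation count finding in the report above (clean through invocation 10, extra outbound request from 11 onward) can be recovered from an ordered trace with a very small routine. The sketch below assumes each invocation has been summarized as a set of observed behavior names; `find_invocation_threshold` and the behavior names in the usage example are hypothetical.

```python
def find_invocation_threshold(behaviors, declared):
    """Return how many clean invocations precede the first undeclared behavior.

    behaviors is an ordered list of sets, one per invocation.
    Returns None when every invocation stays within the declared set.
    """
    declared = set(declared)
    for index, observed in enumerate(behaviors):
        if not set(observed) <= declared:
            # index is zero-based, so it equals the count of clean invocations.
            return index
    return None


# Usage with made-up behavior names mirroring the report's pattern.
declared = {"read_reports", "write_summary"}
trace = [declared] * 10 + [declared | {"analytics_request"}] * 3
threshold = find_invocation_threshold(trace, declared)
```

A result of `None` only means the trace was clean for as long as it ran; a longer delay would need a longer trace to surface.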

Related Tools

runtime-attestation-probe — Tests whether declared and actual behavior match; observer-effect-probe tests whether the skill can detect and evade that attestation
hollow-validation-checker — Detects fake install-time tests; observer-effect-probe detects active attestation evasion at runtime
supply-chain-poison-detector — Identifies malicious payloads in static analysis; observer-effect-probe catches evasion that static analysis cannot see
blast-radius-estimator — Estimates propagation impact; use after confirmed evasion to assess how many agents may have received a falsely attested skill

Limitations

Observer-effect probing is an adversarial arms race: a sufficiently sophisticated evasion implementation can detect probing even when the probe attempts to minimize its fingerprint. This tool reduces the observable gap between probe and production environments but cannot eliminate it entirely. Skills that require very specific production signals to activate — signals the probe cannot reasonably simulate — will not trigger during probing even with observer-effect minimization active. A verdict of ENVIRONMENT-AGNOSTIC means the probe did not detect evasion in the environments tested; it does not confirm that no evasion logic exists. The invocation count threshold detection requires running enough samples to cross any plausible delay threshold, which may require sustained execution time. Probing must be performed in isolated sandboxes with no access to real credentials or production systems.
