Scout

Structured OSINT research on people, companies, and organizations. Use when the user wants a provenance-backed brief, entity resolution across public sources...

Install

openclaw skills install ocas-scout

Scout

Scout conducts lawful OSINT research on people, companies, and organizations, assembling provenance-backed briefs where every claim carries a source reference, retrieval timestamp, and direct quote. It works through a tiered source waterfall — public web first, then rate-limited registries, then paid databases only with explicit permission — collecting no more than the stated research goal requires.

When to use

Research a person and build a source-backed brief
Do background research on a company using public sources
Resolve whether two profiles are the same person with cited sources
Compile what is publicly knowable about a subject
Expand a quick lookup into an auditable brief

When not to use

Illegal intrusion into private systems
Credential theft or bypassing access controls
Covert surveillance
Speculative doxxing
Topic research without a person/org focus — use Sift

Responsibility boundary

Scout owns lawful OSINT research on people and organizations with provenance-backed output.

Scout does not own: general topic research (Sift), image processing (Look), knowledge graph writes (Elephas), social graph (Weave), communications (Dispatch).

Commands

scout.research.start — begin a new research request with subject and goal
scout.research.expand --tier <1|2|3> — escalate to a higher source tier
scout.brief.render — generate the final markdown brief with findings and sources
scout.brief.render_pdf — optional PDF brief generation
scout.status — return current research state
scout.journal — write journal for the current run; called at end of every run
scout.update — pull latest from GitHub source; preserves journals and data

Invariants

Legality-first — only publicly available sources without bypassing access controls
Minimization — collect only what the research goal requires
Provenance for every claim — at least one source reference with URL, retrieval timestamp, and quote
Paid sources require explicit permission — Tier 3 needs a recorded PermissionGrant
No doxxing by default — private details suppressed unless explicitly permitted
Uncertainty must be surfaced — incomplete identity resolution stated clearly
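
The provenance invariant can be illustrated with a small check. This is a minimal sketch, not Scout's actual validator: the `SourceRef` class, its field names, and the assumption that timestamps are ISO 8601 strings are all hypothetical (the authoritative schema lives in references/scout_schemas.md).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SourceRef:
    """Hypothetical shape of one source reference attached to a claim."""
    url: str
    retrieved_at: str  # assumed to be an ISO 8601 timestamp
    quote: str


def has_provenance(refs: list[SourceRef]) -> bool:
    """Return True if at least one reference carries a URL, a parseable timestamp, and a quote."""
    for ref in refs:
        if not ref.url.strip() or not ref.quote.strip():
            continue  # URL and quote must both be non-empty
        try:
            datetime.fromisoformat(ref.retrieved_at)
        except ValueError:
            continue  # timestamp is missing or malformed
        return True
    return False
```

A claim whose reference list fails this check would be dropped or flagged rather than retained in the brief.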

Input contract

ResearchRequest requires: request_id, as_of, subject (type, name, aliases, known_locations, known_handles), goal, constraints (time_budget_minutes, minimize_pii).

Read references/scout_schemas.md for exact schema.
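
As an illustration of the required fields listed above, the following sketch checks a request dictionary for missing keys. The example values (including `"company"` as a subject type) and the helper name `missing_fields` are assumptions for illustration; value types and allowed enumerations are defined in references/scout_schemas.md, not here.

```python
REQUIRED_TOP = ("request_id", "as_of", "subject", "goal", "constraints")
REQUIRED_SUBJECT = ("type", "name", "aliases", "known_locations", "known_handles")
REQUIRED_CONSTRAINTS = ("time_budget_minutes", "minimize_pii")


def missing_fields(request: dict) -> list[str]:
    """List required ResearchRequest fields that are absent, using dotted names for nested ones."""
    missing = [key for key in REQUIRED_TOP if key not in request]
    nested = (("subject", REQUIRED_SUBJECT), ("constraints", REQUIRED_CONSTRAINTS))
    for parent, keys in nested:
        child = request.get(parent)
        if isinstance(child, dict):
            missing += [f"{parent}.{key}" for key in keys if key not in child]
    return missing


# Hypothetical request; every value here is a placeholder.
example_request = {
    "request_id": "req-0001",
    "as_of": "2025-01-15T10:30:00+00:00",
    "subject": {
        "type": "company",
        "name": "Example Corp",
        "aliases": [],
        "known_locations": [],
        "known_handles": [],
    },
    "goal": "Summarize public filings and leadership",
    "constraints": {"time_budget_minutes": 30, "minimize_pii": True},
}
```

Calling `missing_fields(example_request)` returns an empty list; a request lacking `goal` would return `["goal"]`.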

Research workflow

Normalize request and subject identity inputs
Resolve likely identity matches conservatively
Run Tier 1 public-source collection
Record provenance for every retained claim
Compile preliminary findings with confidence levels
Escalate to Tier 2 only if enabled and useful
Escalate to Tier 3 only after explicit permission grant is recorded
Generate brief with findings, uncertainty, and source log
Store request, findings, sources, and decisions locally
Emit Signal files for confirmed entities and relationships to ~/openclaw/db/ocas-elephas/intake/{signal_id}.signal.json. Use Signal schema from spec-ocas-shared-schemas.md. One file per entity or relationship with sufficient confidence.
Write journal via scout.journal

When minimize_pii=true, suppress unnecessary sensitive details in the final brief.

Source waterfall

Read references/scout_source_waterfall.md for full tier logic.

Tier 1 — public web, official sites, news, filings, public social profiles. Automatic.
Tier 2 — rate-limited sources, registries, extended datasets. Only if enabled and useful.
Tier 3 — paid OSINT providers, background databases. Requires explicit permission grant.
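
The tier rules above can be sketched as a single gate function. This is an illustrative reading, not the logic in references/scout_source_waterfall.md: it assumes Tier 1 is always allowed, that "enabled" refers to `waterfall.enabled_tiers` in config.json, and that Tier 3 additionally needs `paid_sources.enabled` plus a recorded PermissionGrant (represented here by a plain boolean).

```python
def tier_allowed(tier: int, config: dict, has_permission_grant: bool = False) -> bool:
    """Decide whether a source tier may be used under the given config (assumed semantics)."""
    if tier not in (1, 2, 3):
        raise ValueError(f"unknown tier: {tier}")
    if tier == 1:
        return True  # Tier 1 runs automatically
    enabled = tier in config.get("waterfall", {}).get("enabled_tiers", [])
    if tier == 2:
        return enabled
    # Tier 3: assumed to need the tier enabled, paid sources on, and an explicit grant
    paid = config.get("paid_sources", {}).get("enabled", False)
    return enabled and paid and has_permission_grant
```

Under these assumptions, a config with `enabled_tiers` of `[1, 2]` and paid sources disabled rejects Tier 3 even when a grant exists. Whether Scout treats the config flags and the grant as independent requirements is not stated in this file.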

Output requirements

Markdown brief with: Executive Summary, Identity Resolution Notes, Findings, Risk and Uncertainty, Source Log. Every finding carries source-backed provenance.
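
A rough sketch of the brief skeleton follows. The heading levels, the placeholder text, and the `render_brief` helper are assumptions made for illustration; the real layout is defined by references/scout_brief_template.md.

```python
SECTIONS = (
    "Executive Summary",
    "Identity Resolution Notes",
    "Findings",
    "Risk and Uncertainty",
    "Source Log",
)


def render_brief(title: str, bodies: dict[str, str]) -> str:
    """Assemble a markdown brief with the five required sections in a fixed order."""
    lines = [f"# {title}", ""]
    for section in SECTIONS:
        # Sections without content still appear, so gaps stay visible to the reader
        body = bodies.get(section, "_No content recorded._")
        lines += [f"## {section}", "", body, ""]
    return "\n".join(lines)
```

Emitting every section even when empty is a design choice in this sketch: it keeps missing identity notes or an empty source log from going unnoticed.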

Inter-skill interfaces

Scout writes Signal files to Elephas intake: ~/openclaw/db/ocas-elephas/intake/{signal_id}.signal.json

Emit one Signal file per confirmed entity or high-confidence relationship discovered during research. Use the Signal schema from spec-ocas-shared-schemas.md. Elephas decides promotion.

See spec-ocas-interfaces.md for signal format.
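
Writing a Signal file to the intake directory might look like the sketch below. The payload is treated as an opaque dictionary because the Signal schema is defined in spec-ocas-shared-schemas.md and is not reproduced here. The write-then-rename step is an added precaution so a reader never sees a half-written file; it is not something this document requires.

```python
import json
from pathlib import Path

# Intake path as given above; the parameter below allows overriding it
INTAKE_DIR = Path.home() / "openclaw" / "db" / "ocas-elephas" / "intake"


def write_signal(signal_id: str, payload: dict, intake_dir: Path = INTAKE_DIR) -> Path:
    """Write one Signal payload to {signal_id}.signal.json and return the final path."""
    intake_dir.mkdir(parents=True, exist_ok=True)
    target = intake_dir / f"{signal_id}.signal.json"
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    tmp.replace(target)  # rename into place once the content is complete
    return target
```

One call per confirmed entity or high-confidence relationship matches the "one file per signal" rule stated above.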

Storage layout

~/openclaw/data/ocas-scout/
  config.json
  requests.jsonl
  sources.jsonl
  findings.jsonl
  decisions.jsonl
  briefs/
  reports/

~/openclaw/journals/ocas-scout/
  YYYY-MM-DD/
    {run_id}.json

Default config.json:

{
  "skill_id": "ocas-scout",
  "skill_version": "2.3.0",
  "config_version": "1",
  "created_at": "",
  "updated_at": "",
  "waterfall": {
    "enabled_tiers": [1, 2]
  },
  "paid_sources": {
    "enabled": false
  },
  "brief": {
    "format": "markdown"
  },
  "retention": {
    "days": 90,
    "max_records": 10000
  }
}
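
The storage layout above could be addressed with helpers along these lines. This is a sketch under assumptions: `root` stands in for `~/openclaw`, and the record contents are placeholders since the record schemas are defined elsewhere.

```python
import json
from datetime import date
from pathlib import Path


def journal_path(root: Path, run_id: str, day: date) -> Path:
    """Build journals/ocas-scout/YYYY-MM-DD/{run_id}.json under the given root."""
    return root / "journals" / "ocas-scout" / day.isoformat() / f"{run_id}.json"


def append_jsonl(path: Path, record: dict) -> None:
    """Append one record as a single JSON line, creating parent directories if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

For example, `append_jsonl(root / "data" / "ocas-scout" / "findings.jsonl", finding)` adds one finding without rewriting earlier lines.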

OKRs

Universal OKRs from spec-ocas-journal.md apply to all runs.

skill_okrs:
  - name: verified_claim_ratio
    metric: fraction of findings with at least one verified source reference
    direction: maximize
    target: 0.70
    evaluation_window: 30_runs
  - name: entity_resolution_accuracy
    metric: fraction of identity resolutions confirmed correct
    direction: maximize
    target: 0.90
    evaluation_window: 30_runs
  - name: source_diversity
    metric: median unique source domains per brief
    direction: maximize
    target: 6
    evaluation_window: 30_runs
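
Two of these metrics can be computed as sketched below. The finding shape (a `sources` list whose items carry a `verified` flag) and the use of URL hostnames as "source domains" are assumptions for illustration; the actual record fields are defined in the schema references.

```python
from statistics import median
from urllib.parse import urlparse


def verified_claim_ratio(findings: list[dict]) -> float:
    """Fraction of findings with at least one source marked verified (hypothetical field names)."""
    if not findings:
        return 0.0
    verified = sum(
        1 for finding in findings
        if any(src.get("verified") for src in finding.get("sources", []))
    )
    return verified / len(findings)


def source_diversity(briefs: list[list[str]]) -> float:
    """Median count of unique hostnames across briefs, each brief given as a list of source URLs."""
    counts = []
    for urls in briefs:
        hosts = {urlparse(url).hostname for url in urls}
        hosts.discard(None)  # ignore URLs without a hostname
        counts.append(len(hosts))
    return median(counts) if counts else 0.0
```

With a 30-run window, these functions would be applied to the findings and briefs from the most recent 30 runs and compared against the targets above.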

Optional skill cooperation

Weave — read social graph (read-only) for identity context
Elephas — optionally emit Signal files for Chronicle promotion
Sift — may use Sift for web searches during research

Journal outputs

Observation Journal — research runs producing findings
Research Journal — structured multi-source research sessions

Visibility

public

Initialization

On first invocation of any Scout command, run scout.init:

Create ~/openclaw/data/ocas-scout/ and all subdirectories (briefs/, reports/)
Write default config.json with ConfigBase fields if absent
Create empty JSONL files: requests.jsonl, sources.jsonl, findings.jsonl, decisions.jsonl
Create ~/openclaw/journals/ocas-scout/
Ensure ~/openclaw/db/ocas-elephas/intake/ exists (create if missing)
Register cron job scout:update if not already present (check openclaw cron list first)
Log initialization as a DecisionRecord in decisions.jsonl
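
The filesystem part of these steps can be sketched as an idempotent function. This is a minimal illustration, assuming `root` stands in for `~/openclaw`; cron registration and the DecisionRecord entry are left out because they depend on the openclaw CLI and on schemas not shown in this file.

```python
import json
from pathlib import Path

JSONL_FILES = ("requests.jsonl", "sources.jsonl", "findings.jsonl", "decisions.jsonl")


def init_storage(root: Path, default_config: dict) -> list[str]:
    """Create the Scout directories and files under root; return the paths that were created."""
    created = []
    data = root / "data" / "ocas-scout"
    dirs = [
        data / "briefs",
        data / "reports",
        root / "journals" / "ocas-scout",
        root / "db" / "ocas-elephas" / "intake",
    ]
    for directory in dirs:
        if not directory.exists():
            directory.mkdir(parents=True)
            created.append(str(directory))
    config = data / "config.json"
    if not config.exists():  # never overwrite an existing config
        config.write_text(json.dumps(default_config, indent=2), encoding="utf-8")
        created.append(str(config))
    for name in JSONL_FILES:
        path = data / name
        if not path.exists():
            path.touch()
            created.append(str(path))
    return created
```

Running it a second time returns an empty list, which makes it safe to call on every command invocation.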

Background tasks

Job name	Mechanism	Schedule	Command
`scout:update`	cron	`0 0 * * *` (midnight daily)	`scout.update`

openclaw cron add --name scout:update --schedule "0 0 * * *" --command "scout.update" --sessionTarget isolated --lightContext true --timezone America/Los_Angeles

Self-update

scout.update pulls the latest package from the source: URL in this file's frontmatter. Runs silently — no output unless the version changed or an error occurred.

Read source: from frontmatter → extract {owner}/{repo} from URL
Read local version from skill.json
Fetch remote version: gh api "repos/{owner}/{repo}/contents/skill.json" --jq '.content' | base64 -d | python3 -c "import sys,json;print(json.load(sys.stdin)['version'])"
If remote version equals local version → stop silently

Download and install:

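# Use a scratch directory; the name WORKDIR avoids clobbering TMPDIR, which mktemp and other tools read
WORKDIR=$(mktemp -d)
# Download the tarball of the main branch
gh api "repos/{owner}/{repo}/tarball/main" > "$WORKDIR/archive.tar.gz"
mkdir "$WORKDIR/extracted"
# Strip the single top-level directory that wraps the archive contents
tar xzf "$WORKDIR/archive.tar.gz" -C "$WORKDIR/extracted" --strip-components=1
# Copy the new package over the current skill directory, then clean up
cp -R "$WORKDIR/extracted/"* ./
rm -rf "$WORKDIR"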

On failure → retry once. If second attempt fails, report the error and stop.
On success, output exactly: I updated Scout from version {old} to {new}
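
The first and last decision points of the update procedure can be sketched as two small helpers. The function names are hypothetical, and the example URL in the usage note is a placeholder rather than Scout's actual source. The comparison is plain equality because the procedure above only asks whether the versions are equal, not which one is newer.

```python
from urllib.parse import urlparse


def owner_repo(source_url: str) -> tuple[str, str]:
    """Extract (owner, repo) from a GitHub-style URL such as https://github.com/owner/repo."""
    parts = [part for part in urlparse(source_url).path.split("/") if part]
    if len(parts) < 2:
        raise ValueError(f"cannot extract owner/repo from {source_url!r}")
    repo = parts[1][:-4] if parts[1].endswith(".git") else parts[1]
    return parts[0], repo


def update_needed(local_version: str, remote_version: str) -> bool:
    """Return True when the remote version string differs from the local one."""
    return local_version.strip() != remote_version.strip()
```

For instance, `owner_repo("https://github.com/example-owner/example-repo")` yields `("example-owner", "example-repo")`, which can then be substituted into the `gh api` paths.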

Support file map

File	When to read
`references/scout_schemas.md`	Before creating requests, findings, or briefs
`references/scout_source_waterfall.md`	Before tier selection or escalation decisions
`references/scout_brief_template.md`	Before rendering briefs
`references/journal.md`	Before scout.journal; at end of every run