Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Sift

v2.3.0

Web search, research synthesis, fact verification, and entity extraction. The system's general research engine. Use for topic research, web lookups, fact-che...

0 · 243 · 0 current · 0 all-time
by Indigo Karasu (@indigokarasu)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for indigokarasu/ocas-sift.

Prompt preview: Install & Setup
Install the skill "Sift" (indigokarasu/ocas-sift) from ClawHub.
Skill page: https://clawhub.ai/indigokarasu/ocas-sift
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install ocas-sift

ClawHub CLI


npx clawhub@latest install ocas-sift
Security Scan
VirusTotal
Suspicious
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The skill's name/description (web research, synthesis, fact verification, entity extraction) aligns with the declared optional credentials (Brave, Exa, Tavily) and with the filesystem reads/writes (journals, data, Elephas intake). Reading Thread/Chronicle context and using tiered search providers is consistent with the research functionality.
Instruction Scope
Runtime instructions ask the agent to persist sessions, write journals, and emit Signal files to ~/openclaw/db/ocas-elephas/intake/ for entity promotion, behavior that makes researched data persistent and hands it off to another skill (Elephas). The instructions also mention reading conversation context, Chronicle, and potentially geolocation from other components; those sources are not listed as explicit credentials, but the skill will consume contextual and system data if available. SKILL.md also instructs self-update behavior (pull latest from GitHub), and the README claims to register a midnight cron for automatic updates. These are side effects beyond pure query/lookup work and expand the runtime scope.
Install Mechanism
There is no formal install spec in the registry, but SKILL.md and README include an 'openclaw skill install https://github.com/indigokarasu/sift' line and describe automatic self-updates via a cron job. The registry provides no packaged install instructions or vetted release URL; self-update/auto-install behavior implies fetching code from GitHub at runtime, which increases risk and is not fully declared in the registry metadata.
Credentials
The skill lists optional API keys for the search/semantic providers (brave_search_api_key, exa_api_key, tavily_api_key) which are proportional to its function. No unrelated secrets are requested. However, it will write extracted entities and decisions to local intake/journal paths — this may surface or persist sensitive content into the system knowledge pipeline (Elephas/Chronicle), so users should consider whether that data flow is acceptable.
Persistence & Privilege
Although always:false, the skill claims to register a daily 'sift:update' cron job and to persist session/journal/entity files under the user's home directory. Scheduled self-updates and persistent writing into shared intake directories represent lasting changes and a broader blast radius (automatic downloads, ongoing background behavior, and cross-skill data flows). These persistent actions are not fully explicit in the registry install metadata.
What to consider before installing
  • Data persistence: Sift writes journals, session data, and Signal files to ~/openclaw/... and emits extracted entities to the Elephas intake; sensitive queries can become persistent artifacts and may be promoted into a shared knowledge graph. If you handle sensitive data, decide whether to allow these write locations or to restrict them.
  • Self-update behavior: SKILL.md and the README describe automatic self-updates via a cron job that pulls from GitHub. The registry has no formal install package, so the skill expects to fetch code externally at runtime. Decide whether you want a skill that can download and update itself automatically; this increases risk, and you should review the upstream GitHub repo before enabling it.
  • External network calls: the skill calls free search providers (Brave/DuckDuckGo/SearXNG) and optional paid semantic providers (Exa/Tavily). Optional API keys are proportionate to this purpose; provide only keys you trust. For offline or air-gapped usage, withhold provider keys and limit the external tiers.
  • Cross-skill interactions: Sift writes to the Elephas intake and may read Thread/Chronicle context. Confirm you trust those other skills and that their intake/promotion behavior is acceptable.
  • Checklist before install: inspect the upstream GitHub repository referenced in SKILL.md, confirm you accept automatic writes to ~/openclaw paths, decide whether to allow cron-based self-updates, and restrict or withhold API keys if you want to limit external queries.

Because of the undeclared self-update/cron behavior and persistent cross-skill writes, this skill is coherent with its purpose but carries non-trivial persistence and remote-fetch risks. Review the upstream source and decide whether to disable auto-updates or restrict filesystem paths before enabling.

Like a lobster shell, security has layers — review code before you run it.

latest: vk9755dkvm4qen2da0qtdexfq9183rszm
243 downloads
0 stars
3 versions
Updated 3h ago
v2.3.0
MIT-0

Sift

Sift is the system's general research engine, retrieving and synthesizing information from the web across a tiered source hierarchy — internal knowledge first, then free web search, then rate-limited semantic research providers for deep work. It evaluates source reliability through cross-source agreement scoring, extracts structured entities from retrieved content, and emits enrichment candidates to Chronicle so researched knowledge accumulates over time.

When to use

  • Web search and research synthesis on any topic
  • Fact verification across multiple sources with consensus scoring
  • Document summarization and structured entity extraction
  • Comparison research across products, technologies, or options
  • Deep research sessions with multi-source threading

When not to use

  • OSINT investigations on individuals — use Scout
  • Image-to-action processing — use Look
  • Pattern analysis on the knowledge graph — use Corvus
  • Communications and message drafting — use Dispatch

Sift never performs OSINT investigations on individuals. If the primary entity of a query is a person, Scout should be invoked.

Responsibility boundary

Sift owns web research, fact verification, and structured entity extraction.

Sift does not own: person-focused OSINT (Scout), image processing (Look), knowledge graph writes (Elephas), pattern analysis (Corvus), social graph (Weave).

Commands

  • sift.search — execute a search query with automatic tier selection and query rewriting
  • sift.research — run a multi-source research session producing a structured research journal
  • sift.verify — fact-check a specific claim across multiple sources with consensus scoring
  • sift.summarize — summarize a document or URL with structured entity extraction
  • sift.extract — extract entities, claims, statistics, and relationships from content
  • sift.thread.list — list active research threads with entity overlap detection
  • sift.status — return current state: active threads, quota usage, source reputation summary
  • sift.journal — write journal for the current run; called at end of every run
  • sift.update — pull latest from GitHub source; preserves journals and data

Response modes

Sift classifies query depth automatically:

  • quick_answer — simple factual lookups, single-source sufficient
  • comparison — multi-source comparison with structured output
  • research — deep multi-session investigation with threading
  • document_analysis — URL or document-focused extraction

Users may override with phrases like "quick answer", "deep dive", "compare", or "summarize".
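The override check above amounts to a phrase lookup before falling back to the automatic classification. A minimal sketch (the function and its name are illustrative, not from the skill source):

```python
# Maps the override phrases listed above to Sift's response modes.
OVERRIDES = {
    "quick answer": "quick_answer",
    "deep dive": "research",
    "compare": "comparison",
    "summarize": "document_analysis",
}

def classify_query(query: str, auto_mode: str = "quick_answer") -> str:
    """Return a user-overridden response mode, else the automatic one."""
    q = query.lower()
    for phrase, mode in OVERRIDES.items():
        if phrase in q:
            return mode
    return auto_mode
```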

Search tier selection

  • Tier 1 — Internal Knowledge: LLM knowledge, conversation context, Chronicle if available.
  • Tier 2 — Free Web Search: Brave Search API, SearXNG, DuckDuckGo. Default for all queries.
  • Tier 3 — Semantic Research: Exa, Tavily. Deep research with sparse sources only. Quota-limited.

Read references/search_tiers.md for provider details and escalation criteria.
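A minimal sketch of the tier choice described above, assuming a used-today counter and the tier3_daily_limit default of 50 from config.json (the function name and exact escalation rules are hypothetical; references/search_tiers.md is authoritative):

```python
def select_tier(query_depth: str, tier3_used_today: int,
                tier3_daily_limit: int = 50) -> int:
    """Pick a search tier per the hierarchy above.

    Tier 2 (free web search) is the default; Tier 3 (semantic research)
    is reserved for deep research and is quota-limited.
    """
    if query_depth == "quick_answer":
        return 1  # try internal knowledge first for simple lookups
    if query_depth == "research" and tier3_used_today < tier3_daily_limit:
        return 3  # deep research may escalate, within the daily quota
    return 2
```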

Source reputation model

Sift maintains per-domain trust scores based on: cross-source agreement, contradiction frequency, historical accuracy, structured data quality, citation frequency.
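The page names the input signals but not the scoring formula; one plausible sketch is an exponential moving average over agree/contradict events:

```python
def update_trust(score: float, agreed: bool, weight: float = 0.1) -> float:
    """Move a per-domain trust score toward 1.0 on cross-source agreement
    and toward 0.0 on contradiction. The EMA form and the 0.1 weight are
    assumptions, not Sift's documented formula."""
    target = 1.0 if agreed else 0.0
    return (1 - weight) * score + weight * target
```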

Structured extraction rules

When pages are retrieved, extract: entities (with type from shared ontology), claims, statistics, relationships, citations. Each extraction includes confidence level.

Extracted entities are emitted as enrichment candidates for Elephas.
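An illustrative shape for one extraction record and the med-or-better confidence filter (the authoritative Signal schema lives in spec-ocas-shared-schemas.md, so these field names are assumptions):

```python
from dataclasses import dataclass

@dataclass
class Extraction:
    kind: str         # "entity", "claim", "statistic", "relationship", "citation"
    value: str
    confidence: str   # "low", "med", "high"
    source_url: str   # every extraction keeps a source reference
    entity_type: str = ""  # from the shared ontology, for entities

def promotable(items: list) -> list:
    """Only med+ confidence extractions become Elephas enrichment candidates."""
    return [e for e in items if e.confidence in ("med", "high")]
```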

Run completion

After every Sift command that produces results:

  1. Persist session, entities, sources, and decisions to local JSONL files
  2. For each extracted entity or relationship with confidence >= med: write a Signal file to ~/openclaw/db/ocas-elephas/intake/{signal_id}.signal.json. Use Signal schema from spec-ocas-shared-schemas.md.
  3. Write journal via sift.journal
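Step 2 above can be sketched as a single file write; the real payload must follow the Signal schema in spec-ocas-shared-schemas.md, and intake_dir stands in for ~/openclaw/db/ocas-elephas/intake/:

```python
import json
import uuid
from pathlib import Path

def emit_signal(entity: dict, intake_dir: Path) -> Path:
    """Write one Signal file for Elephas intake.

    The {signal_id}.signal.json naming comes from this page; the payload
    fields here are placeholders for the real schema."""
    intake_dir.mkdir(parents=True, exist_ok=True)
    signal_id = uuid.uuid4().hex
    path = intake_dir / f"{signal_id}.signal.json"
    path.write_text(json.dumps({"signal_id": signal_id, "payload": entity}))
    return path
```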

Chronicle interaction

Sift never writes directly to Chronicle. It emits enrichment candidates via Signal files to ~/openclaw/db/ocas-elephas/intake/{signal_id}.signal.json. Elephas decides promotion.

Inter-skill interfaces

Sift writes Signal files to Elephas intake: ~/openclaw/db/ocas-elephas/intake/{signal_id}.signal.json

Sift may read from Thread (when present) for recent browsing context to improve query rewriting. This is a cooperative read, not a dependency.

See spec-ocas-interfaces.md for signal format.

Storage layout

~/openclaw/data/ocas-sift/
  config.json
  sessions.jsonl
  threads.jsonl
  entities.jsonl
  sources.jsonl
  decisions.jsonl
  reports/

~/openclaw/journals/ocas-sift/
  YYYY-MM-DD/
    {run_id}.json

Default config.json:

{
  "skill_id": "ocas-sift",
  "skill_version": "2.3.0",
  "config_version": "1",
  "created_at": "",
  "updated_at": "",
  "search": {
    "default_tier": 2,
    "tier3_daily_limit": 50
  },
  "retention": {
    "days": 30,
    "max_records": 10000
  }
}
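How Sift enforces the retention block is not specified on this page; one plausible reading, dropping records older than retention.days and then capping at max_records newest, looks like:

```python
import datetime as dt

def prune(records: list, days: int = 30, max_records: int = 10000,
          now=None) -> list:
    """Apply the retention policy from config.json to a list of records,
    each carrying an ISO-8601 `created_at` field (field name assumed)."""
    now = now or dt.datetime.now(dt.timezone.utc)
    cutoff = now - dt.timedelta(days=days)
    kept = [r for r in records
            if dt.datetime.fromisoformat(r["created_at"]) >= cutoff]
    kept.sort(key=lambda r: r["created_at"])  # oldest first
    return kept[-max_records:]                # keep the newest N
```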

OKRs

Universal OKRs from spec-ocas-journal.md apply to all runs.

skill_okrs:
  - name: source_accuracy
    metric: fraction of extracted facts confirmed by cross-source agreement
    direction: maximize
    target: 0.85
    evaluation_window: 30_runs
  - name: tier3_quota_compliance
    metric: fraction of days where Tier 3 usage stays within daily limit
    direction: maximize
    target: 1.0
    evaluation_window: 30_runs
  - name: entity_extraction_precision
    metric: fraction of extracted entities with valid source reference
    direction: maximize
    target: 0.90
    evaluation_window: 30_runs
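Each OKR metric reduces to a simple fraction over the evaluation window; for example, source_accuracy might be computed like this (the `confirmed` field name is an assumption):

```python
def source_accuracy(facts: list) -> float:
    """Fraction of extracted facts confirmed by cross-source agreement,
    measured against the 0.85 target over a 30-run window."""
    if not facts:
        return 0.0
    return sum(1 for f in facts if f.get("confirmed")) / len(facts)
```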

Optional skill cooperation

  • Elephas — emit Signal files for Chronicle promotion after every extraction
  • Thread — may read recent browsing context for query rewriting (cooperative, not required)
  • Weave — may use Weave for entity disambiguation
  • Chronicle — may read Chronicle (read-only) for entity context

Journal outputs

  • Observation Journal — search and extraction runs
  • Research Journal — structured multi-source research sessions

Initialization

On first invocation of any Sift command, run sift.init:

  1. Create ~/openclaw/data/ocas-sift/ and subdirectories (reports/)
  2. Write default config.json with ConfigBase fields if absent
  3. Create empty JSONL files: sessions.jsonl, threads.jsonl, entities.jsonl, sources.jsonl, decisions.jsonl
  4. Create ~/openclaw/journals/ocas-sift/
  5. Ensure ~/openclaw/db/ocas-elephas/intake/ exists (create if missing)
  6. Register cron job sift:update if not already present (check openclaw cron list first)
  7. Log initialization as a DecisionRecord in decisions.jsonl
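Steps 1 through 5 above amount to idempotent directory and file creation; a minimal sketch, with cron registration and the DecisionRecord step omitted and `home` standing in for ~/openclaw:

```python
import json
from pathlib import Path

JSONL_FILES = ["sessions.jsonl", "threads.jsonl", "entities.jsonl",
               "sources.jsonl", "decisions.jsonl"]

def sift_init(home: Path) -> None:
    """First-run setup; safe to call repeatedly."""
    data = home / "data" / "ocas-sift"
    (data / "reports").mkdir(parents=True, exist_ok=True)
    cfg = data / "config.json"
    if not cfg.exists():  # only seed defaults when absent (step 2)
        cfg.write_text(json.dumps({"skill_id": "ocas-sift",
                                   "config_version": "1"}))
    for name in JSONL_FILES:      # step 3: empty JSONL files
        (data / name).touch()
    (home / "journals" / "ocas-sift").mkdir(parents=True, exist_ok=True)
    (home / "db" / "ocas-elephas" / "intake").mkdir(parents=True, exist_ok=True)
```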

Background tasks

Job name     Mechanism  Schedule                    Command
sift:update  cron       0 0 * * * (midnight daily)  sift.update
openclaw cron add --name sift:update --schedule "0 0 * * *" --command "sift.update" --sessionTarget isolated --lightContext true --timezone America/Los_Angeles

Self-update

sift.update pulls the latest package from the source: URL in this file's frontmatter. Runs silently — no output unless the version changed or an error occurred.

  1. Read source: from frontmatter → extract {owner}/{repo} from URL
  2. Read local version from skill.json
  3. Fetch remote version: gh api "repos/{owner}/{repo}/contents/skill.json" --jq '.content' | base64 -d | python3 -c "import sys,json;print(json.load(sys.stdin)['version'])"
  4. If remote version equals local version → stop silently
  5. Download and install:
    TMPDIR=$(mktemp -d)
    gh api "repos/{owner}/{repo}/tarball/main" > "$TMPDIR/archive.tar.gz"
    mkdir "$TMPDIR/extracted"
    tar xzf "$TMPDIR/archive.tar.gz" -C "$TMPDIR/extracted" --strip-components=1
    cp -R "$TMPDIR/extracted/"* ./
    rm -rf "$TMPDIR"
    
  6. On failure → retry once. If second attempt fails, report the error and stop.
  7. Output exactly: I updated Sift from version {old} to {new}
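The retry-once rule in step 6 can be sketched as a small wrapper, with `action` standing in for the download-and-install steps above:

```python
def retry_once(action):
    """Run an update attempt; on failure, retry exactly once.

    A second failure propagates to the caller, which should report the
    error and stop, as step 6 requires."""
    try:
        return action()
    except Exception:
        return action()
```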

Visibility

public

Support file map

File                         When to read
references/schemas.md        Before creating sessions, threads, or extraction records
references/search_tiers.md   Before tier selection or escalation
references/query_rewrite.md  Before query rewriting
references/journal.md        Before sift.journal; at end of every run
