Install

openclaw skills install osint-investigator

ClawHub Security found sensitive or high-impact capabilities. Review the scan results before using.
Deep OSINT (Open Source Intelligence) investigations. Use when the user wants to research, find, or investigate any person, place, organisation, username, domain, IP address, phone number, image, vehicle, or object using publicly available information. Triggers on phrases like "find information on", "investigate", "look up", "who is", "trace this", "dig into", "OSINT search", "background check", or any request to gather open-source intelligence about a target. Performs deep multi-source analysis across web search, social media, DNS/WHOIS, image search, maps, public records, and more — returning a structured intelligence report.
Multi-source open-source intelligence gathering. Identify the target type, run all applicable modules, then produce a structured report.
Before running any module, classify the target: person, place, organisation, username, domain, IP address, email, phone number, image, vehicle, or object.
Run ALL applicable modules in parallel. Never stop after one source.
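The classification step can be sketched as a simple pattern check. This is illustrative only (the category names and regexes are assumptions, not part of the skill), and real targets often need human judgment:

```python
import re

def classify_target(target: str) -> str:
    """Roughly classify an OSINT target string by shape."""
    t = target.strip()
    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", t):
        return "email"
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", t):
        return "ip"
    # Phone check runs after IP so dotted quads are not misread as numbers
    if re.fullmatch(r"\+?[\d\s().-]{7,}", t):
        return "phone"
    if re.fullmatch(r"[a-z0-9-]+(\.[a-z0-9-]+)+", t, re.IGNORECASE):
        return "domain"
    if re.fullmatch(r"@?[A-Za-z0-9_.-]+", t):
        return "username"
    return "person_or_org"
```

Ambiguous inputs (a bare handle could be a username or a surname) should trigger modules for every plausible type.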
Web search (web_search tool): run at minimum 5–8 targeted queries per target. Vary operators:
"full name" site:linkedin.com
"username" -site:twitter.com
target filetype:pdf
target inurl:profile
"target" "email" OR "contact" OR "phone"
target site:reddit.com
target site:github.com
Follow top URLs with web_fetch to extract full content.
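The baseline query set above can be generated mechanically from the target string; a minimal sketch (the operator list simply mirrors the examples above):

```python
def build_dorks(target: str) -> list[str]:
    """Expand a target into the baseline web_search query set."""
    return [
        f'"{target}" site:linkedin.com',
        f'"{target}" -site:twitter.com',
        f'{target} filetype:pdf',
        f'{target} inurl:profile',
        f'"{target}" "email" OR "contact" OR "phone"',
        f'{target} site:reddit.com',
        f'{target} site:github.com',
    ]
```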
whois <domain>
dig <domain> ANY
dig <domain> MX
dig <domain> TXT
nslookup <domain>
host <domain>
Also fetch: https://rdap.org/domain/<domain> via web_fetch
curl -s https://ipinfo.io/<ip>/json
curl -s https://ip-api.com/json/<ip>
Also check: https://www.shodan.io/host/<ip> via web_fetch
Check all platforms via web_fetch (just check HTTP status + page title — don't need to load full content for existence checks):
https://github.com/<username>
https://twitter.com/<username>
https://instagram.com/<username>
https://reddit.com/user/<username>
https://tiktok.com/@<username>
https://youtube.com/@<username>
https://linkedin.com/in/<username>
https://medium.com/@<username>
https://pinterest.com/<username>
https://twitch.tv/<username>
https://steamcommunity.com/id/<username>
https://keybase.io/<username>
https://t.me/<username> (Telegram)

For each confirmed platform profile, use web_fetch to extract the full profile details.
For Twitter/X: also run web_search for site:twitter.com "<target>" and check Nitter mirrors.
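Building the candidate URL list for the existence checks is pure string templating; a sketch covering a subset of the platforms above:

```python
PLATFORM_TEMPLATES = {
    "github": "https://github.com/{u}",
    "twitter": "https://twitter.com/{u}",
    "instagram": "https://instagram.com/{u}",
    "reddit": "https://reddit.com/user/{u}",
    "tiktok": "https://tiktok.com/@{u}",
    "keybase": "https://keybase.io/{u}",
}

def candidate_profiles(username: str) -> dict[str, str]:
    """Map platform name -> profile URL to probe with web_fetch."""
    u = username.strip().lstrip("@")
    return {name: tpl.format(u=u) for name, tpl in PLATFORM_TEMPLATES.items()}
```

Each returned URL then only needs an HTTP status and page-title check, as noted above.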
# Use web_fetch or browser for:
# Google Maps search
https://maps.googleapis.com/maps/api/geocode/json?address=<address>&key=<key>
# Or use goplaces skill if available
# Streetview metadata check
https://maps.googleapis.com/maps/api/streetview/metadata?location=<lat,lng>&key=<key>
Also search: web_search for "<target location>" site:maps.google.com OR site:wikimapia.org OR site:openstreetmap.org
Finding images of a person (no image provided):

- web_search for "<name>" site:linkedin.com — LinkedIn og:image often returns the profile photo URL directly
- https://www.gravatar.com/<md5>.json
- web_search for "<name>" filetype:jpg OR filetype:png
- web_fetch to pull og:image from any confirmed profile pages

Reverse image search (image URL or local file provided):
# Direct URL-based reverse search (use web_fetch):
https://yandex.com/images/search?rpt=imageview&url=<image_url>
https://tineye.com/search?url=<image_url>
# Google Lens (requires browser tool):
https://lens.google.com/uploadbyurl?url=<image_url>
# For avatars and profile images — extract URL then feed into:
# 1. Yandex (best for face matching, indexes more than Google)
# 2. TinEye (exact match/copy detection)
# 3. Google Lens via browser tool
EXIF / Metadata extraction (if file is available locally):
exiftool <image> # full metadata dump
exiftool -gps:all <image> # GPS coordinates only
exiftool -DateTimeOriginal <image> # when photo was taken
Online tools: web_fetch https://www.metadata2go.com or https://www.pic2map.com
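exiftool prints GPS coordinates in degrees/minutes/seconds by default (exiftool -n gives decimal directly). When only the DMS form is available, converting it to decimal degrees for map links is simple arithmetic; a sketch, assuming exiftool's usual output format:

```python
import re

def dms_to_decimal(dms: str) -> float:
    """Convert an exiftool-style DMS string, e.g. 51 deg 30' 26.00" N,
    to decimal degrees (negative for S/W hemispheres)."""
    m = re.match(r"(\d+) deg (\d+)' ([\d.]+)\" ([NSEW])", dms.strip())
    if not m:
        raise ValueError(f"unrecognised DMS string: {dms!r}")
    deg, mins, secs, hemi = m.groups()
    value = int(deg) + int(mins) / 60 + float(secs) / 3600
    return -value if hemi in "SW" else value
```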
Photo geolocation (no EXIF GPS):
- web_search to identify the region
- https://www.suncalc.org to estimate time & location
- browser tool

When searching for a person by image from social media: web_fetch the profile page and look for og:image or <img> src in the rendered HTML.

# Breach/exposure check
curl -s "https://haveibeenpwned.com/api/v3/breachedaccount/<email>" -H "hibp-api-key: <key>"
# Format validation + domain MX check
dig $(echo <email> | cut -d@ -f2) MX
# Gravatar (hashed MD5 of email)
curl -s "https://www.gravatar.com/<md5_hash>.json"
Also: web_search for "<email>" site:pastebin.com OR site:ghostbin.com
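The Gravatar hash above is the MD5 of the trimmed, lower-cased email address; a sketch of building the lookup URL:

```python
import hashlib

def gravatar_url(email: str) -> str:
    """Gravatar profile JSON URL: MD5 of the trimmed, lower-cased address."""
    h = hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()
    return f"https://www.gravatar.com/{h}.json"
```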
# Carrier / region lookup
curl -s "https://phonevalidation.abstractapi.com/v1/?api_key=<key>&phone=<number>"
Also: web_search for "<phone_number>" and check site:truecaller.com, site:whitepages.com
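Before hitting the validation API, it helps to strip formatting from the number, keeping digits plus a leading "+". A rough sketch (not a real E.164 validator):

```python
import re

def normalise_phone(raw: str) -> str:
    """Strip spaces, parentheses, and dashes; keep one leading '+'."""
    digits = re.sub(r"[^\d+]", "", raw.strip())
    plus = "+" if digits.startswith("+") else ""
    return plus + digits.replace("+", "")
```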
Use web_fetch on:

- https://opencorporates.com/companies?q=<name>
- https://find-and-update.company-information.service.gov.uk/search?q=<name>
- https://linkedin.com/company/<slug>
- web_search for site:crunchbase.com "<company>"

web_search queries:
"<target>" filetype:pdf OR filetype:xlsx OR filetype:docx
"<target>" site:pastebin.com
"<target>" site:github.com password OR secret OR key
"<target>" site:trello.com OR site:notion.so
# Wayback Machine
curl -s "https://archive.org/wayback/available?url=<url>"
web_fetch "https://web.archive.org/web/*/<url>" for snapshots
# Google Cache via web_search: cache:<url>
Always produce a structured report. Adapt sections to what was found:
# OSINT Report: <Target>
**Date:** <UTC timestamp>
**Target Type:** <classification>
**Query:** <original user request>
## Identity Summary
[Key identifying information — name, aliases, age, location, nationality]
## Online Presence
[Confirmed profiles with URLs, follower counts, activity level]
## Contact & Technical
[Email addresses, phone numbers, domains, IPs]
## Location Intelligence
[Known locations, addresses, coordinates, map links]
## Corporate / Organisational Links
[Companies, roles, affiliations]
## Historical Data
[Archived content, old usernames, past locations]
## Document & Data Exposure
[Public documents, paste sites, leak mentions]
## Image Intelligence
[Profile photos, reverse image results, photo metadata]
## Confidence & Gaps
[Confidence level per finding — High/Medium/Low; list gaps]
## Sources
[All URLs consulted]
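The report template above could be emitted from a small helper so the section order stays fixed across investigations; a sketch (function and constant names are illustrative):

```python
from datetime import datetime, timezone

SECTIONS = [
    "Identity Summary", "Online Presence", "Contact & Technical",
    "Location Intelligence", "Corporate / Organisational Links",
    "Historical Data", "Document & Data Exposure", "Image Intelligence",
    "Confidence & Gaps", "Sources",
]

def report_skeleton(target: str, target_type: str, query: str) -> str:
    """Markdown skeleton matching the template above; drop sections
    with no findings before rendering."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    head = (f"# OSINT Report: {target}\n"
            f"**Date:** {stamp}\n"
            f"**Target Type:** {target_type}\n"
            f"**Query:** {query}\n")
    return head + "".join(f"\n## {s}\n" for s in SECTIONS)
```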
Config is stored at: <skill_dir>/config/osint_config.json (chmod 600, auto-created on first save).
The agent configures everything conversationally — no terminal script needed. When the user says they want to add credentials, configure PDF output, or set up an API key, follow the flow below.
When the user wants to configure the skill, ask them questions directly in chat and write the answers to the config file yourself using the write tool.
Step 1 — Ask what they want to configure:
"What would you like to set up? I can configure:
- Platform credentials (Instagram, Twitter/X, LinkedIn, Facebook)
- API keys (Google Maps, Shodan, HaveIBeenPwned, Hunter.io, AbstractAPI Phone)
- PDF report output (on/off, save location)"
Step 2 — Collect the values (ask one platform at a time):
Step 3 — Write the config:
# Read existing config (or start fresh)
import json, os
cfg_path = "<skill_dir>/config/osint_config.json"
os.makedirs(os.path.dirname(cfg_path), exist_ok=True)
cfg = json.load(open(cfg_path)) if os.path.exists(cfg_path) else {"platforms": {}, "output": {}}
# Example: save Twitter bearer token
cfg["platforms"]["twitter"] = {"configured": True, "method": "api_key", "bearer_token": "<VALUE>"}
# Example: enable PDF
cfg["output"]["pdf_enabled"] = True
cfg["output"]["pdf_output_dir"] = "~/Desktop"
# Write back
with open(cfg_path, "w") as f:
json.dump(cfg, f, indent=2)
os.chmod(cfg_path, 0o600)
Alternatively, use the write tool directly — no need to run Python.
| Platform | Fields | What It Unlocks |
|---|---|---|
| Instagram | username, password | Profile content behind login wall — use a burner account |
| Twitter/X | bearer_token (+ optional api_key, api_secret) | Full tweet/profile/search via API v2 (free tier works) |
| LinkedIn | username (email), password | Profile scraping — use sparingly, heavily rate-limited |
| Facebook | email, password | Public profile/group content |
| Google Maps | api_key | Geocoding, Place Search, Street View metadata |
| Shodan | api_key | Deep IP/host intelligence |
| HaveIBeenPwned | api_key | Email breach lookups ($3.95/mo at haveibeenpwned.com/API/Key) |
| Hunter.io | api_key | Email discovery by domain (free: 25 req/mo at hunter.io/api-keys) |
| AbstractAPI Phone | api_key | Phone carrier/region lookup (app.abstractapi.com/api/phone-validation) |
# Read config and extract a value in one line:
BEARER=$(python3 -c "import json; c=json.load(open('<skill_dir>/config/osint_config.json')); print(c['platforms']['twitter']['bearer_token'])")
# Then use it:
curl -s -H "Authorization: Bearer $BEARER" \
"https://api.twitter.com/2/users/by/username/<handle>?user.fields=description,location,created_at,public_metrics"
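A slightly safer variant of the one-liner above, returning a default instead of raising when the config file or key path is missing (path and key names as in the examples above):

```python
import json

def read_config_value(cfg_path: str, *keys, default=None):
    """Walk nested keys in the JSON config, returning default on any miss."""
    try:
        with open(cfg_path) as f:
            node = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default
    for k in keys:
        if not isinstance(node, dict) or k not in node:
            return default
        node = node[k]
    return node
```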
# Profile lookup
curl -s -H "Authorization: Bearer $BEARER" \
"https://api.twitter.com/2/users/by/username/<handle>?user.fields=description,location,created_at,public_metrics,entities"
# Recent tweets
curl -s -H "Authorization: Bearer $BEARER" \
"https://api.twitter.com/2/users/<user_id>/tweets?max_results=10&tweet.fields=created_at,geo,entities"
# Search recent tweets
curl -s -H "Authorization: Bearer $BEARER" \
"https://api.twitter.com/2/tweets/search/recent?query=<query>&max_results=10"
curl -s "https://api.shodan.io/shodan/host/<ip>?key=$SHODAN_KEY"
curl -s "https://api.shodan.io/dns/resolve?hostnames=<domain>&key=$SHODAN_KEY"
curl -s "https://api.hunter.io/v2/domain-search?domain=<domain>&api_key=$HUNTER_KEY"
curl -s "https://api.hunter.io/v2/email-verifier?email=<email>&api_key=$HUNTER_KEY"
curl -s "https://haveibeenpwned.com/api/v3/breachedaccount/<email>" \
-H "hibp-api-key: $HIBP_KEY" -H "User-Agent: osint-investigator"
curl -s "https://maps.googleapis.com/maps/api/geocode/json?address=<address>&key=$GMAPS_KEY"
curl -s "https://maps.googleapis.com/maps/api/place/textsearch/json?query=<query>&key=$GMAPS_KEY"
curl -s "https://maps.googleapis.com/maps/api/streetview/metadata?location=<lat,lng>&key=$GMAPS_KEY"
python3 -c "import json; c=json.load(open('<skill_dir>/config/osint_config.json')); print(c.get('output',{}).get('pdf_enabled', False))"
Write the markdown report to a temp file, then run the shell wrapper (self-installs fpdf2 if missing):
cat > /tmp/osint_report.md << 'ENDREPORT'
<full markdown report>
ENDREPORT
bash <skill_dir>/scripts/generate_pdf.sh \
--input /tmp/osint_report.md \
--target "Target Name" \
--output ~/Desktop
The wrapper (generate_pdf.sh) will:

- check that fpdf2 is installed, and install it automatically if not
- run generate_pdf.py with the same arguments
- print PDF saved: /path/to/OSINT_Name_20260225_1035.pdf

No setup needed by the user — works on any machine with Python 3 + pip.
Confidence is detected automatically from the text of each section/paragraph/table row — just include the word in your report and the PDF will colour-code it:

- [HIGH] — verified from multiple reliable sources
- [MED] — likely correct, single or unverified source
- [LOW] — possible match, little corroborating evidence
- [UNVERIFIED] — user-provided context, not independently confirmed

When the user says "turn on PDF reports" or "disable PDF output", set cfg["output"]["pdf_enabled"] to true or false.

Bundled files:

- references/osint-sources.md — curated OSINT databases, APIs, and search operators by category
- references/social-platforms.md — platform-specific extraction tips and URL patterns
- scripts/generate_pdf.py — PDF generator (requires fpdf2, auto-installed via shell wrapper)
- scripts/generate_pdf.sh — shell wrapper; self-installs fpdf2, then calls generate_pdf.py
- config/osint_config.json — live config (auto-created on first write, chmod 600)