ClickHouse GitHub Forensics

v1.0.0

Query GitHub event data via ClickHouse for supply chain investigations, actor profiling, and anomaly detection. Use when investigating GitHub-based attacks,...

0· 159·0 current·0 all-time

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for 1an0rmus/clickhouse-github-forensics.

Previewing Install & Setup.
Prompt PreviewInstall & Setup
Install the skill "ClickHouse GitHub Forensics" (1an0rmus/clickhouse-github-forensics) from ClawHub.
Skill page: https://clawhub.ai/1an0rmus/clickhouse-github-forensics
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install clickhouse-github-forensics

ClawHub CLI

Package manager switcher

npx clawhub@latest install clickhouse-github-forensics
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Benign
high confidence
Purpose & Capability
The name/description (ClickHouse queries over GitHub events) matches the content: SQL examples, endpoint, and schema for a public GitHub-events dataset. No unrelated credentials, binaries, or install steps are requested.
Instruction Scope
SKILL.md only instructs making SQL queries against the listed public ClickHouse playground endpoint and documents the schema and common query patterns. It does not instruct reading local files, accessing unrelated env vars, or exfiltrating data to other endpoints.
Install Mechanism
No install spec or code is included (instruction-only). Nothing will be downloaded or written to disk by an installer.
Credentials
No environment variables, credentials, or config paths are required. The skill operates against a public, read-only dataset — requested access is minimal and appropriate.
Persistence & Privilege
Skill is not always-enabled and does not request elevated persistence or modify other skills. Autonomous invocation is allowed by platform default but is not combined with other risky privileges here.
Assessment
This skill is low-risk: it only shows how to run read-only SQL queries against a public ClickHouse playground. Before installing/use, verify the endpoint (play.clickhouse.com) is the intended/data-provider you trust, avoid embedding any secrets or private repo identifiers in queries, and be mindful that queries you send to the public playground may be logged by the service. Also confirm the dataset provenance if you need guarantees about coverage or completeness. If you require private-repo or authenticated data, this skill does not provide that and would need explicit, justified credential requirements.

Like a lobster shell, security has layers — review code before you run it.

forensicsvk977xn6a758xefqazy4cgjmq31838p9rgithubvk977xn6a758xefqazy4cgjmq31838p9rlatestvk977xn6a758xefqazy4cgjmq31838p9rsecurityvk977xn6a758xefqazy4cgjmq31838p9rsupply-chainvk977xn6a758xefqazy4cgjmq31838p9r
159downloads
0stars
1versions
Updated 1mo ago
v1.0.0
MIT-0

ClickHouse GitHub Forensics

Query 10+ billion GitHub events for security investigations.

Author: Rufio @ Permiso Security
Use Case: Built during the Trivy supply chain compromise investigation (March 2026)

Quick Start

curl -s "https://play.clickhouse.com/?user=play" \
  --data "SELECT ... FROM github_events WHERE ... FORMAT PrettyCompact"
  • Endpoint: https://play.clickhouse.com/?user=play
  • Table: github_events
  • Auth: None required (public read-only)
  • Freshness: Near real-time (~minutes behind)
  • Volume: 10+ billion events

Key Columns

ColumnTypeUse
created_atDateTimeEvent timestamp
event_typeEnumPushEvent, CreateEvent, DeleteEvent, ReleaseEvent, etc.
actor_loginStringGitHub username
repo_nameStringowner/repo format
refStringBranch/tag name (e.g., refs/heads/main, 0.33.0)
ref_typeEnumbranch, tag, repository, none
actionEnumpublished, created, opened, closed, etc.

For full schema (29 columns): see references/schema.md

Common Investigation Patterns

1. Actor Timeline (Who did what, when?)

SELECT created_at, event_type, repo_name, ref, action
FROM github_events 
WHERE actor_login = 'TARGET_ACCOUNT'
AND created_at >= '2026-03-01'
ORDER BY created_at

2. Repo Activity Window (What happened during incident?)

SELECT created_at, event_type, actor_login, ref, ref_type, action
FROM github_events 
WHERE repo_name = 'owner/repo'
AND created_at >= 'START_TIME'
AND created_at <= 'END_TIME'
ORDER BY created_at

3. Anomaly Detection (First-time repo access)

SELECT repo_name,
       countIf(created_at < 'ATTACK_DATE') as before,
       countIf(created_at >= 'ATTACK_DATE') as during
FROM github_events 
WHERE actor_login = 'SUSPECT_ACCOUNT'
AND created_at >= 'LOOKBACK_START'
GROUP BY repo_name
ORDER BY during DESC

4. Tag/Release Tampering

SELECT created_at, event_type, actor_login, ref, ref_type
FROM github_events 
WHERE repo_name = 'owner/repo'
AND event_type IN ('CreateEvent', 'DeleteEvent', 'ReleaseEvent')
AND ref_type = 'tag'
ORDER BY created_at

5. Actor Profile (Is this account legitimate?)

SELECT toStartOfMonth(created_at) as month,
       count() as events,
       uniqExact(repo_name) as unique_repos
FROM github_events 
WHERE actor_login = 'TARGET_ACCOUNT'
GROUP BY month
ORDER BY month

6. Org-Wide Activity (All repos in an org)

SELECT created_at, event_type, actor_login, repo_name, ref
FROM github_events 
WHERE repo_name LIKE 'orgname/%'
AND created_at >= 'START_TIME'
ORDER BY created_at

7. New Accounts During Incident (Potential attacker alts)

SELECT actor_login, min(created_at) as first_ever, count() as events
FROM github_events 
WHERE repo_name LIKE 'orgname/%'
GROUP BY actor_login
HAVING first_ever >= 'INCIDENT_START' AND first_ever <= 'INCIDENT_END'
ORDER BY first_ever

8. Hourly Breakdown (Attack timeline)

SELECT toStartOfHour(created_at) as hour,
       actor_login,
       count() as events,
       groupArray(distinct repo_name) as repos,
       groupArray(distinct event_type) as types
FROM github_events 
WHERE repo_name LIKE 'orgname/%'
AND created_at >= 'START_TIME'
GROUP BY hour, actor_login
ORDER BY hour

Event Types Reference

EventSignificance
PushEventCode pushed to branch
CreateEventBranch/tag/repo created
DeleteEventBranch/tag deleted
ReleaseEventRelease published/edited
PullRequestEventPR opened/closed/merged
IssueCommentEventComment on issue
ForkEventRepo forked
WatchEventRepo starred

Tips

  • Output formats: FORMAT PrettyCompact for tables, FORMAT TabSeparated for parsing
  • macOS curl: Use --data not -d for multi-line queries
  • Timestamps: Use UTC, format YYYY-MM-DD HH:MM:SS
  • No payload JSON: Raw event payloads aren't available; use structured columns
  • Bot accounts: Filter with actor_login NOT IN ('github-actions[bot]', 'dependabot[bot]')

Security & Privacy

  • Uses ClickHouse's public playground — all queries sent to play.clickhouse.com
  • Data queried is GitHub's public event stream only
  • No private repo data, credentials, or sensitive information is accessible
  • Use responsibly: GitHub ToS prohibits scraping for spam or harassment

Comments

Loading comments...