Skill v0.1.0

ClawScan security

Variant Annotation · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Review · Mar 13, 2026, 9:36 AM

Verdict: Review
Confidence: medium
Model: gpt-5-mini
Summary: The skill's stated purpose (annotating variants using ClinVar/dbSNP and producing ACMG-backed reports) matches the code's use of NCBI E-utilities, but there are coherence issues. SKILL.md promises many external data sources and predictions (gnomAD, SIFT, PolyPhen, CADD, etc.), while the packaging, dependency list, and visible code show only NCBI queries, with no declared API keys or extra clients. This mismatch and the incomplete dependency declaration warrant caution.
Guidance: This skill appears to be a legitimate variant-annotation tool that queries public NCBI endpoints (ClinVar and dbSNP). However, before installing or running it, consider the following:
- Privacy: Variant strings and any patient-associated identifiers you pass to the tool will be sent to public databases (NCBI E-utilities). Do not feed it identifiable patient data that you cannot share publicly.
- Implementation gaps: SKILL.md promises population databases (gnomAD/ExAC/1000G) and functional predictors (SIFT, PolyPhen, CADD), but the manifest and requirements do not document how those are fetched. Review the full scripts/main.py to confirm which external APIs the code uses and whether additional API keys or local data files are required.
- Missing dependency/install instructions: requirements.txt is minimal. If you plan to run the script, inspect main.py and run it in an isolated environment (virtualenv or container) to identify missing Python packages and avoid inadvertently executing unreviewed code system-wide.
- Rate limits and API keys: The code supports an optional NCBI API key to increase rate limits. The skill does not declare how to supply that key; store any API key securely and verify the CLI/argument behavior before providing secrets.
- Security hygiene: Run the code on non-production data initially and review all network endpoints in the full source. If you need to use data that must remain private, evaluate running a local install of the necessary databases or use tools that permit offline annotation.
If you want, I can (a) scan the remainder of scripts/main.py for hidden endpoints or suspicious code, (b) list the exact places where the implementation diverges from SKILL.md, or (c) suggest a safe step-by-step procedure for running the tool in a sandboxed environment.
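The advice to review all network endpoints in the source can be partly automated. The following is a minimal sketch, not part of the skill: it assumes endpoints appear as literal URL strings in the source, so it would miss hosts that are assembled at runtime. The function name `find_endpoints` and the sample text are hypothetical stand-ins for the real contents of scripts/main.py.

```python
import re
from urllib.parse import urlsplit

# Deliberately simple pattern: literal http(s)/ftp URLs up to a quote,
# whitespace, angle bracket, or closing parenthesis.
URL_PATTERN = re.compile(r"""(?:https?|ftp)://[^\s'"<>)]+""")

def find_endpoints(source_text):
    """Return the sorted, de-duplicated host names of URLs found in source_text."""
    hosts = set()
    for url in URL_PATTERN.findall(source_text):
        host = urlsplit(url).hostname  # lower-cased host, or None if absent
        if host:
            hosts.add(host)
    return sorted(hosts)

# Hypothetical sample standing in for the text of scripts/main.py.
sample = '''
BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
# bulk data: ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/
'''

print(find_endpoints(sample))
```

Any host in the result that is not an expected NCBI domain would be a reason to read the surrounding code before running it.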

Review Dimensions

Purpose & Capability: note · The skill claims to aggregate ClinVar, dbSNP, gnomAD/ExAC/1000G population frequencies, and functional predictions (SIFT, PolyPhen, CADD). The included Python code clearly implements NCBI E-utilities queries for ClinVar and dbSNP (expected and coherent). However, requirements.txt lists only 'dataclasses', and there are no explicit clients, endpoints, or install steps for gnomAD, ExAC, SIFT, PolyPhen, or CADD in the manifest or referenced files. That mismatch suggests either an incomplete implementation or missing declared dependencies (not necessarily malicious, but inconsistent).
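One way to check the dependency mismatch described above without executing the script is to parse its imports and compare them with requirements.txt. This is a rough sketch under stated assumptions: it treats import names and distribution names as identical (not always true, for example `bs4` versus `beautifulsoup4`), and it relies on `sys.stdlib_module_names`, which exists only in Python 3.10 and later. The sample source and the `requests` import in it are hypothetical, not taken from the skill.

```python
import ast
import re
import sys

def imported_top_level_modules(source_text):
    """Collect top-level module names from absolute imports in source_text."""
    modules = set()
    for node in ast.walk(ast.parse(source_text)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            modules.add(node.module.split(".")[0])
    return modules

def undeclared_imports(source_text, requirements_text):
    """Return imports that are neither standard library nor listed in requirements."""
    declared = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            # Keep only the name before any version specifier or extras marker.
            name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0]
            declared.add(name.lower().replace("-", "_"))
    # Empty on Python < 3.10, where stdlib modules would then be reported too.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return sorted(
        m for m in imported_top_level_modules(source_text)
        if m not in stdlib and m.lower() not in declared
    )

# Hypothetical inputs; the real files are scripts/main.py and requirements.txt.
sample_source = "import json\nimport requests\nfrom dataclasses import dataclass\n"
sample_requirements = "dataclasses\n"

print(undeclared_imports(sample_source, sample_requirements))
```

Because the source is only parsed, never imported or run, this check is safe to perform before deciding whether to execute the script at all.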
Instruction Scope: ok · SKILL.md instructs the agent to call the VariantAnnotator class or run scripts/main.py and to query public databases (ClinVar/dbSNP). The runtime instructions do not ask the agent to read unrelated system files, access other credentials, or transmit data to unknown endpoints. The reference docs do show example FTP/wget commands (ClinVar FTP), but the primary runtime examples use the packaged script and NCBI E-utilities, which is appropriate for the stated purpose.
Install Mechanism: ok · There is no install specification (instruction-only skill plus a local script), so nothing will be automatically downloaded or executed at install time. That lowers install-time risk. The only file that might require additional packages is scripts/main.py, but requirements.txt is minimal and no install step is provided; this is an inconsistency but not an install-time hazard.
Credentials: note · The skill declares no required environment variables or credentials. The code accepts an optional NCBI api_key parameter for rate-limit increases and sets a User-Agent header for requests; SKILL.md and the manifest do not document providing an API key via an environment variable or config. This is not dangerous by itself, but the lack of declared optional credentials (e.g., NCBI_API_KEY) is an omission and should be clarified. No broad system credentials or unrelated secrets are requested.
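To illustrate what documented handling of the optional key could look like, here is a hedged sketch that builds an E-utilities ESearch URL and adds `api_key` only when one is supplied. The environment variable name `NCBI_API_KEY` follows the example named above and is an assumption, as is the helper `build_esearch_url`; neither is confirmed to exist in the skill. The `db`, `term`, `retmode`, and `api_key` query parameters are standard E-utilities parameters.

```python
import os
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def build_esearch_url(db, term, api_key=None):
    """Build an ESearch URL, attaching an API key only if one is available.

    The key is taken from the argument first, then from the (assumed)
    NCBI_API_KEY environment variable, so it never needs to be hard-coded.
    """
    params = {"db": db, "term": term, "retmode": "json"}
    key = api_key or os.environ.get("NCBI_API_KEY")
    if key:
        params["api_key"] = key
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"

# Without a key the request still works, only at the lower anonymous rate limit.
os.environ.pop("NCBI_API_KEY", None)
print(build_esearch_url("clinvar", "BRCA1[gene]"))
```

Reading the key from the environment keeps it out of command history and source files; note that the resulting URL contains the key, so it should not be written to logs unredacted.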
Persistence & Privilege: ok · The skill is not always-enabled, requires no configuration paths, and does not request persistent elevated privileges. It appears to be a normal, on-demand tool with no evidence that it modifies other skills or system-wide settings.