Install
openclaw skills install personal-genomics-analysis
Analyze consumer DNA data from WeGene, 23andMe, AncestryDNA, VCF, BAM, or CRAM files. Generate evidence-based reports covering health risks, pharmacogenomics, ancestry, nutrition, exercise traits, and supplement guidance. Runs locally and keeps raw genetic data on the user's machine.
This skill guides you through a structured, multi-phase workflow for analyzing consumer genetic testing data and producing actionable health insights. The workflow is interactive: you gather information from the user at key decision points rather than making assumptions.
The analysis pipeline is designed to be:
Read references/supported_formats.md for detailed format specifications. In brief:
| Platform | File Type | Key Characteristics |
|---|---|---|
| WeGene | TSV (.txt) | rsid \t chromosome \t position \t genotype |
| 23andMe | TSV (.txt) | # rsid \t chromosome \t position \t genotype (comment header with #) |
| AncestryDNA | TSV (.txt) | rsid \t chromosome \t position \t allele1 \t allele2 (separate allele columns) |
| VCF | .vcf / .vcf.gz | Standard VCF v4.x, may contain WGS or chip data |
| CRAM/BAM | .cram / .bam | Alignment files for variant verification, depth analysis |
Write a Python script that:
{rsid: genotype_string}
chr:pos for position-based lookups
When both chip data (WeGene/23andMe) and WGS (VCF) are available, use a dual-source lookup strategy: check chip data first (faster), then fall back to the VCF by rsid or chr:pos. This maximizes coverage, since chip and WGS data may cover different variant sets.
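The loading step and the dual-source lookup described above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual script: the function names `load_chip_genotypes` and `lookup` are hypothetical, the VCF dictionaries are assumed to have been built elsewhere, and matching by chr:pos assumes the chip file and the VCF use the same reference genome build.

```python
import csv


def load_chip_genotypes(lines):
    """Parse WeGene/23andMe/AncestryDNA-style TSV lines.

    Returns two dicts: {rsid: genotype_string} and {"chr:pos": genotype_string}.
    """
    by_rsid, by_pos = {}, {}
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines, '#' comment lines, and a plain 'rsid' header row.
        if not row or row[0].startswith("#") or row[0].lower() == "rsid":
            continue
        if len(row) < 4:
            continue  # malformed line; ignore rather than guess
        rsid, chrom, pos = row[0], row[1], row[2]
        # AncestryDNA keeps the two alleles in separate columns.
        genotype = row[3] + row[4] if len(row) >= 5 else row[3]
        by_rsid[rsid] = genotype
        by_pos[f"{chrom}:{pos}"] = genotype
    return by_rsid, by_pos


def lookup(rsid, chr_pos, chip, vcf_by_rsid, vcf_by_pos):
    """Check chip data first, then the VCF by rsid, then the VCF by chr:pos.

    Returns (genotype, source), or (None, None) if the variant is not covered.
    """
    if rsid in chip:
        return chip[rsid], "chip"
    if rsid in vcf_by_rsid:
        return vcf_by_rsid[rsid], "vcf_rsid"
    if chr_pos in vcf_by_pos:
        return vcf_by_pos[chr_pos], "vcf_pos"
    return None, None
```

Returning the source alongside the genotype lets a report show which file each call came from, which helps when chip and WGS results disagree.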
Read references/snp_database.md for the curated SNP database organized by category.
The database covers ~120 clinically relevant SNPs across these categories:
Each SNP entry includes: gene, variant name, risk allele, condition/trait, evidence level, PMID reference, and a plain-language explanation.
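One way to hold such an entry in code is a small record type. The sketch below is an assumption about shape only: the class name `SnpEntry`, its field names, the added `rsid` key, and the helper `count_risk_alleles` are all hypothetical, and the helper assumes single-base alleles reported on the same strand as the database entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpEntry:
    rsid: str            # lookup key (assumed; not listed among the fields above)
    gene: str
    variant_name: str
    risk_allele: str     # single base, e.g. "A"
    condition: str       # condition or trait
    evidence_level: str
    pmid: str            # PMID reference
    explanation: str     # plain-language explanation


def count_risk_alleles(entry, genotype):
    """Return 0, 1, or 2 copies of the risk allele, or None for a no-call.

    Genotypes containing anything other than A/C/G/T (for example "--",
    or indel codes) are treated as not interpretable in this sketch.
    """
    if not genotype or any(base not in "ACGT" for base in genotype):
        return None
    return genotype.count(entry.risk_allele)
```

Returning `None` rather than 0 for a no-call keeps "not tested" distinct from "no risk allele found" in the report.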
Generate a Python analysis script that:
Generate an interactive HTML report with:
Follow the user's language (Chinese or English) for all report text.
This is the critical interactive phase. After presenting initial results:
Ask the user about:
This information is essential because genetic risk is only part of the picture. A person with a family history of early heart attack AND multiple CAD risk SNPs faces very different odds than someone with the same SNPs but no family history.
Based on the user's health profile, conduct a targeted deep-dive. Read
references/deep_risk_snps.md for extended SNP panels organized by disease pathway:
For each category relevant to the user:
If the user has provided alignment files:
Note: samtools may need to be compiled from source in sandboxed environments.
See references/tool_setup.md for instructions.
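Before attempting any alignment-file step, it may help to check whether samtools is reachable at all. The sketch below uses only the standard library; the function names are hypothetical and it does not verify the samtools version or whether the binary actually runs.

```python
import shutil


def find_samtools():
    """Return the full path to samtools if it is on PATH, else None."""
    return shutil.which("samtools")


def alignment_tools_status():
    """Return a short human-readable status line for the user."""
    path = find_samtools()
    if path is None:
        return "samtools not found on PATH; see references/tool_setup.md"
    return f"samtools available at {path}"
```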
For whole-genome sequencing data:
Based on all gathered information, produce actionable recommendations.
Read references/supplement_guide.md for evidence-based supplement recommendations
mapped to genetic findings. The guide covers:
Always organize supplements into tiers:
Based on the risk profile, suggest:
Offer to generate:
Every report MUST include a clear disclaimer: genetic analysis provides risk estimates, not diagnoses. Results should be discussed with a qualified healthcare provider. Consumer genetic testing has limitations in coverage and accuracy compared to clinical-grade testing.
Follow the user's language. If the user writes in Chinese, produce reports in Chinese. If in English, use English. For SNP names and gene symbols, always keep the standard scientific nomenclature regardless of language.
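A very rough way to pick the report language from the user's message is to look for Chinese characters. The sketch below is a heuristic under stated assumptions: the function name `detect_report_language` is hypothetical, only the CJK Unified Ideographs block (U+4E00 to U+9FFF) is checked, and Japanese text containing kanji would also be classified as Chinese.

```python
def detect_report_language(user_text):
    """Return "zh" if the text contains any CJK ideograph, else "en"."""
    for ch in user_text:
        if "\u4e00" <= ch <= "\u9fff":
            return "zh"
    return "en"
```

Gene symbols and rsIDs are plain ASCII, so a Chinese message that mentions MTHFR is still detected as Chinese and the symbol itself can be passed through unchanged.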
Don't try to do everything at once. The workflow is designed as a conversation:
Each phase should end with a clear handoff to the user before proceeding.