Pharmaclaw Chemistry Query

Chemistry agent skill for PubChem API queries (compound info/properties, structures/SMILES/images, synthesis routes/references) + RDKit cheminformatics (SMIL...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the included scripts: PubChem/ChEMBL/PubMed queries, RDKit processing, visualization, retrosynthesis, and reaction templates. Declared external APIs and system dependency (optional Java/OPSIN) are used by the code, and there are no unrelated credentials or surprising binaries requested.
Instruction Scope
SKILL.md instructs use for compound lookups, RDKit analysis, and retrosynthesis, which the scripts implement. It also mentions chaining outputs to a 'pharma-pharmacology-agent' (a usage suggestion rather than an autonomy requirement). The scripts run subprocesses and access only local skill files, call public APIs (PubChem/ChEMBL/NCBI), and write visualizations under a local viz directory; they do not attempt to read or exfiltrate unrelated system files or require secrets.
Install Mechanism
There is no formal install spec; code runs as-is. One runtime behavior worth noting: opsin_name_to_smiles.py will auto-download an OPSIN JAR from a GitHub release on first use. The download is verified with a pinned SHA-256 checksum in the script, reducing risk. No other remote arbitrary downloads or unverified installers are present.
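The pinned-checksum check described above can be sketched as follows. This is a minimal illustration of the technique, not the code in opsin_name_to_smiles.py; the function name `verify_sha256` and the sample bytes are assumptions made for the example.

```python
import hashlib
import hmac


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True only if SHA-256(data) equals the pinned hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest does a constant-time comparison of the two digests
    return hmac.compare_digest(actual, expected_hex.strip().lower())


# Illustrative only: a real pin is a constant stored in the script,
# not a digest computed from the bytes being checked.
payload = b"example jar bytes"
pinned = hashlib.sha256(payload).hexdigest()
print(verify_sha256(payload, pinned))         # True
print(verify_sha256(payload + b"x", pinned))  # False
```

The download should be written to disk only after the check passes, so a tampered file never becomes executable content.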
Credentials
The skill declares no required environment variables or credentials and the code does not attempt to access hidden tokens. External APIs used are public (no key required). The lack of requested secrets is proportionate to its stated functionality.
Persistence & Privilege
always is false and the skill does not request permanent system-wide presence or modify other skills' configurations. It writes images/reports into its own viz/scripts directories and runs local subprocesses — expected for this type of tool.
Assessment
This skill appears coherent and implements what it claims: PubChem/ChEMBL/PubMed queries plus RDKit-based analysis, drawing, and retrosynthesis. Before installing or running it, consider: (1) it can generate multi-step synthesis routes and contains named reaction templates and experimental conditions — if your organization restricts assistance for lab synthesis, review the content carefully; (2) the OPSIN JAR is downloaded at runtime from a GitHub release but the script verifies a pinned SHA-256 (good practice); (3) the skill runs multiple Python scripts as subprocesses and writes visualization files into a viz directory under the skill — check file-write locations and sanitize any sensitive environment where you run it; and (4) no secrets are required. If you need higher assurance, review the included files (particularly templates.json and rdkit_reaction.py) for content you consider sensitive and run in an isolated environment (container or sandbox).


Current version: v2.0.1

SKILL.md

Chemistry Query Agent v1.4.1

Overview

Full-stack chemistry toolkit combining PubChem data retrieval with RDKit molecule processing, visualization, analysis, retrosynthesis, and synthesis planning. All outputs are structured JSON for easy downstream chaining. Generates PNG/SVG images on demand.

Key capabilities:

  • PubChem compound lookup (info, structure, synthesis refs, similarity search)
  • RDKit molecular properties (MW, logP, TPSA, HBD/HBA, rotatable bonds, aromatic rings)
  • 2D molecule visualization (PNG/SVG)
  • BRICS retrosynthesis with recursive depth control
  • Multi-step synthesis route planning
  • Forward reaction simulation with SMARTS templates
  • Morgan fingerprints and similarity/substructure search
  • 21 named reaction templates (Suzuki, Heck, Grignard, Wittig, Diels-Alder, etc.)

Quick Start

# PubChem compound info
exec python scripts/query_pubchem.py --compound "aspirin" --type info

# Molecular properties from SMILES
exec python scripts/rdkit_mol.py --smiles "CC(=O)Oc1ccccc1C(=O)O" --action props

# Retrosynthesis
exec python scripts/rdkit_mol.py --target "CC(=O)Oc1ccccc1C(=O)O" --action retro --depth 2

# Full chain (name → props + draw + retro)
exec python scripts/chain_entry.py --input-json '{"name": "caffeine", "context": "user"}'

Scripts

scripts/query_pubchem.py

PubChem REST API queries with automatic name→CID resolution and timeout handling.

--compound <name|CID> --type <info|structure|synthesis|similar> [--format smiles|inchi|image|json] [--threshold 80]
  • info: Formula, MW, IUPAC name, InChIKey (JSON)
  • structure: SMILES, InChI, image URL, or full JSON
  • synthesis: Synonyms/references for a compound
  • similar: Similar compounds by 2D fingerprint (top 20)
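The name→CID resolution and info lookup can be sketched against the public PubChem PUG REST interface. This is not the code in query_pubchem.py: the helper names `cid_lookup_url`, `property_url`, and `fetch_json` are assumptions, and the 15-second default mirrors the timeout mentioned in the changelog.

```python
import json
import urllib.request
from urllib.parse import quote

PUG_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
INFO_PROPS = ("MolecularFormula", "MolecularWeight", "IUPACName", "InChIKey")


def cid_lookup_url(name: str) -> str:
    """URL that resolves a compound name to PubChem CIDs."""
    return f"{PUG_BASE}/compound/name/{quote(name, safe='')}/cids/JSON"


def property_url(cid: int, props=INFO_PROPS) -> str:
    """URL that returns the requested properties for one CID."""
    return f"{PUG_BASE}/compound/cid/{int(cid)}/property/{','.join(props)}/JSON"


def fetch_json(url: str, timeout: float = 15.0) -> dict:
    """GET a PUG REST URL and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Building the URLs needs no network access; 2244 is the CID of aspirin.
print(cid_lookup_url("aspirin"))
print(property_url(2244))
```

A caller would typically pass the first URL to `fetch_json`, take a CID from the returned identifier list, and then request the property URL for that CID.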

scripts/rdkit_mol.py

RDKit cheminformatics engine. Resolves names via PubChem automatically.

--smiles <SMILES> --action <props|draw|fingerprint|similarity|substruct|xyz|react|retro|plan>
Action       Description                                Key Args
props        MW, logP, TPSA, HBD, HBA, rotB, aromRings  --smiles
draw         2D PNG/SVG (300×300)                       --smiles --output file.png --format png|svg
retro        BRICS recursive retrosynthesis             --target <SMILES|name> --depth N
plan         Multi-step retro route                     --target <SMILES|name> --steps N
react        Forward reaction via SMARTS                --reactants "smi1 smi2" --smarts "<SMARTS>"
fingerprint  Morgan fingerprint bitvector               --smiles --radius 2
similarity   Tanimoto similarity scoring                --query_smiles --target_smiles "smi1,smi2"
substruct    Substructure matching                      --query_smiles --target_smiles "smi1,smi2"
xyz          3D coordinates (MMFF optimized)            --smiles
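The Tanimoto score used by the similarity action is the size of the intersection of two fingerprints divided by the size of their union. The sketch below computes it on plain Python sets of "on" bit indices so it runs without RDKit; the bit sets are made up, and the helper names `tanimoto` and `rank_by_similarity` are assumptions rather than functions from rdkit_mol.py.

```python
from typing import Iterable, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient |a ∩ b| / |a ∪ b| over sets of 'on' bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(
    query: Set[int], targets: Iterable[Tuple[str, Set[int]]]
) -> List[Tuple[str, float]]:
    """Score each (label, bits) target against the query, best match first."""
    scored = [(label, tanimoto(query, bits)) for label, bits in targets]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up bit sets standing in for Morgan fingerprints.
query = {1, 2, 3}
targets = [("A", {2, 3, 4}), ("B", {1, 2, 3}), ("C", {7, 8})]
print(rank_by_similarity(query, targets))  # [('B', 1.0), ('A', 0.5), ('C', 0.0)]
```

With real molecules the bit sets would come from a Morgan fingerprint at the chosen radius; the scoring step is unchanged.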

scripts/chain_entry.py

Standard agent chain interface. Accepts {"smiles": "...", "context": "..."} or {"name": "...", "context": "..."}. Returns unified JSON with props, visualization, and retrosynthesis.

python scripts/chain_entry.py --input-json '{"name": "sotorasib", "context": "user"}'

Output schema:

{
  "agent": "chemistry-query",
  "version": "1.4.0",
  "smiles": "<canonical>",
  "status": "success|error",
  "report": {"props": {...}, "draw": {...}, "retro": {...}},
  "risks": [],
  "viz": ["path/to/image.png"],
  "recommend_next": ["pharmacology", "toxicology"],
  "confidence": 0.95,
  "warnings": [],
  "timestamp": "ISO8601"
}
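A downstream agent may want to check a chain result before consuming it. The following is a minimal sketch of such a check, built only from the field names in the schema above. The function name `validate_chain_output`, the type expectations, the [0, 1] range for confidence, and the assumption that every key is present even on errors are not confirmed by the source.

```python
from typing import Any, Dict, List

# Field names come from the schema; the expected types are assumptions.
REQUIRED_KEYS = {
    "agent": str, "version": str, "smiles": str, "status": str,
    "report": dict, "risks": list, "viz": list, "recommend_next": list,
    "confidence": (int, float), "warnings": list, "timestamp": str,
}


def validate_chain_output(payload: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in payload:
            problems.append(f"missing key: {key}")
        elif not isinstance(payload[key], expected):
            problems.append(f"wrong type for {key}")
    if "status" in payload and payload["status"] not in ("success", "error"):
        problems.append("status must be 'success' or 'error'")
    conf = payload.get("confidence")
    if isinstance(conf, (int, float)) and not 0.0 <= conf <= 1.0:
        problems.append("confidence outside [0, 1]")
    return problems
```

Returning a list of problems instead of raising lets the caller decide whether a partial result is still usable.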

scripts/templates.json

21 named reaction templates with SMARTS, expected yields, conditions, and references. Includes: Suzuki, Heck, Buchwald-Hartwig, Grignard, Wittig, Diels-Alder, Click, Sonogashira, Negishi, and more.

Chaining

  1. Name → Full Profile: chain_entry.py with {"name": "ibuprofen"} → props + draw + retro
  2. Chemistry → Pharmacology: Output feeds directly into pharma-pharmacology-agent
  3. Retro + Viz: Get precursors, then draw each one
  4. Suzuki Test: --action react --reactants "c1ccccc1Br c1ccccc1B(O)O" --smarts "[c:1][Br].[c:2][B](O)O>>[c:1]-[c:2]"
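The first chaining pattern can be driven from another Python process. This is a sketch under assumptions: that chain_entry.py prints a single JSON document to stdout, and that a 60-second timeout is reasonable. The helper names `build_chain_command` and `run_chain` are hypothetical.

```python
import json
import subprocess
import sys
from typing import Any, Dict, List


def build_chain_command(
    payload: Dict[str, Any], script: str = "scripts/chain_entry.py"
) -> List[str]:
    """Argument list for chain_entry.py; passing a list avoids shell quoting."""
    return [sys.executable, script, "--input-json", json.dumps(payload)]


def run_chain(payload: Dict[str, Any], timeout: float = 60.0) -> Dict[str, Any]:
    """Run the chain script and parse its JSON stdout (needs the skill's files)."""
    proc = subprocess.run(
        build_chain_command(payload), capture_output=True, text=True, timeout=timeout
    )
    return json.loads(proc.stdout)


cmd = build_chain_command({"name": "ibuprofen", "context": "user"})
print(cmd[1:])
# ['scripts/chain_entry.py', '--input-json', '{"name": "ibuprofen", "context": "user"}']
```

The parsed result can then be handed to the next agent, for example by reading its recommend_next list.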

Tested With

All features verified end-to-end with RDKit 2024.03+:

Molecule        SMILES                        Tests Passed
Caffeine        CN1C=NC2=C1C(=O)N(C(=O)N2C)C  info, structure, props, draw, retro, plan, chain
Aspirin         CC(=O)Oc1ccccc1C(=O)O         info, structure, props, draw, retro, plan, chain
Sotorasib       PubChem name lookup           info, structure, props, draw, retro, chain
Ibuprofen       PubChem name lookup           info, structure, props, chain
Invalid SMILES  XXXINVALID                    Graceful JSON error
Empty input     {}                            Graceful JSON error

Resources

  • references/api_endpoints.md — PubChem API endpoint reference and rate limits
  • scripts/rdkit_reaction.py — Legacy reaction module
  • scripts/chembl_query.py, scripts/pubmed_search.py, scripts/admet_predict.py — Additional query modules

scripts/advanced_chem.py

Advanced cheminformatics engine with 6 Tier 1 capabilities.

--action <standardize|descriptors|scaffold|mcs|mmpa|chemspace> --smiles <SMILES> [options]
Action       Description                                                                               Key Args
standardize  Salt stripping, charge normalization, tautomer enumeration                                --smiles
descriptors  217+ molecular descriptors (RDKit full set), QED, SA Score, Lipinski/Veber rules          --smiles --descriptor_set all|druglike|physical|topological
scaffold     Murcko scaffold extraction, generic scaffolds, diversity analysis, R-group decomposition  --smiles or --target_smiles "smi1,smi2,..." --rgroup_core <SMARTS>
mcs          Maximum Common Substructure across 2+ molecules                                           --target_smiles "smi1,smi2,..."
mmpa         Matched Molecular Pair Analysis: find single-point transformations                        --target_smiles "smi1,smi2,..."
chemspace    Chemical space visualization (PCA/t-SNE/UMAP scatter plot PNG)                            --target_smiles "smi1,smi2,..." --method pca|tsne|umap --output plot.png

Examples:

# Standardize a salt form
python scripts/advanced_chem.py --action standardize --smiles "[Na+].CC(=O)[O-]"

# Full descriptors (217+)
python scripts/advanced_chem.py --action descriptors --smiles "CC(=O)Oc1ccccc1C(=O)O" --descriptor_set all

# Scaffold diversity of a set
python scripts/advanced_chem.py --action scaffold --target_smiles "CC(=O)Oc1ccccc1C(=O)O,CN1C=NC2=C1C(=O)N(C(=O)N2C)C,CC(C)Cc1ccc(cc1)C(C)C(=O)O"

# MCS of aspirin and benzoic acid
python scripts/advanced_chem.py --action mcs --target_smiles "CC(=O)Oc1ccccc1C(=O)O,c1ccccc1C(=O)O"

# Matched molecular pairs
python scripts/advanced_chem.py --action mmpa --target_smiles "c1ccc(CC(=O)O)cc1,c1ccc(CCC(=O)O)cc1"

# Chemical space PCA plot
python scripts/advanced_chem.py --action chemspace --target_smiles "CC(=O)Oc1ccccc1C(=O)O,CN1C=NC2=C1C(=O)N(C(=O)N2C)C,c1ccccc1" --method pca --output space.png
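The Lipinski and Veber rules reported by the descriptors action are simple threshold checks on a handful of properties. The sketch below applies the commonly cited cutoffs to a property dictionary. The key names (`mw`, `logp`, `hbd`, `hba`, `rotb`, `tpsa`) and the function name `rule_violations` are assumptions and may not match the JSON keys that advanced_chem.py emits; the sample values are rough and illustrative only.

```python
from typing import Dict, List


def rule_violations(props: Dict[str, float]) -> List[str]:
    """List Lipinski (rule of five) and Veber violations for one molecule."""
    checks = [
        ("Lipinski: MW > 500", props["mw"] > 500),
        ("Lipinski: logP > 5", props["logp"] > 5),
        ("Lipinski: HBD > 5", props["hbd"] > 5),
        ("Lipinski: HBA > 10", props["hba"] > 10),
        ("Veber: rotatable bonds > 10", props["rotb"] > 10),
        ("Veber: TPSA > 140", props["tpsa"] > 140),
    ]
    return [label for label, failed in checks if failed]


# Rough values for a small aspirin-like molecule (illustrative, not computed here).
small = {"mw": 180.2, "logp": 1.3, "hbd": 1, "hba": 3, "rotb": 3, "tpsa": 63.6}
# A made-up large, lipophilic molecule.
large = {"mw": 720.0, "logp": 6.4, "hbd": 2, "hba": 9, "rotb": 14, "tpsa": 120.0}
print(rule_violations(small))  # []
print(rule_violations(large))
```

Keeping the thresholds in one table makes it easy to see which rule a molecule fails rather than only a pass/fail flag.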

Changelog

v2.0.0 (2026-02-28)

  • NEW: advanced_chem.py with 6 Tier 1 cheminformatics capabilities
    • Molecular Standardization & Tautomer Enumeration (salt stripping, charge normalization, canonical tautomers)
    • Extended Descriptors (217+ RDKit descriptors, QED, SA Score, Lipinski, Veber)
    • Scaffold Analysis (Murcko, generic scaffolds, diversity ratio, R-group decomposition)
    • Maximum Common Substructure (rdFMCS with coverage per molecule)
    • Matched Molecular Pair Analysis (rdMMPA fragmentation, transformation detection)
    • Chemical Space Visualization (PCA/t-SNE/UMAP with matplotlib scatter plots)
  • Dependencies: scikit-learn, matplotlib (added)

v1.4.1 (2026-02-25)

  • Security hardening: input sanitization for all subprocess calls (SMILES, compound names, output paths)
  • Added _sanitize_input() — length limits, null-byte rejection for all user inputs
  • Added _sanitize_output_path() — prevents path traversal, restricts extensions, blocks arbitrary file writes
  • Added shell metacharacter rejection in resolve_target()
  • Added SMILES validation via RDKit in chem_ui.py before subprocess calls
  • Added compound input validation in query_pubchem.py (length/null-byte checks)
  • Added timeout to resolve_target() PubChem subprocess call
  • Addresses VirusTotal "suspicious" classification for argument injection vectors
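The hardening steps listed for v1.4.1 can be illustrated with a short sketch. This is not the skill's `_sanitize_input()` or `_sanitize_output_path()`: the length limit, the allowed extensions, and the set of rejected shell metacharacters are all assumptions chosen for the example.

```python
import os

MAX_INPUT_LEN = 2000             # assumed limit
ALLOWED_EXTS = {".png", ".svg"}  # assumed extension whitelist
SHELL_META = set(";|&`$<>\n\r")  # assumed set; ordinary SMILES contain none of these


def sanitize_input(value: str, max_len: int = MAX_INPUT_LEN) -> str:
    """Reject empty, oversized, null-byte, or shell-metacharacter input."""
    if not isinstance(value, str) or not value.strip():
        raise ValueError("input must be a non-empty string")
    if len(value) > max_len:
        raise ValueError("input too long")
    if "\x00" in value:
        raise ValueError("null byte in input")
    if SHELL_META & set(value):
        raise ValueError("shell metacharacter in input")
    return value.strip()


def sanitize_output_path(path: str, base_dir: str) -> str:
    """Resolve path under base_dir; reject traversal and unexpected extensions."""
    base = os.path.realpath(base_dir)
    full = os.path.realpath(os.path.join(base, path))
    if os.path.commonpath([base, full]) != base:
        raise ValueError("path escapes the output directory")
    if os.path.splitext(full)[1].lower() not in ALLOWED_EXTS:
        raise ValueError("extension not allowed")
    return full


print(sanitize_input("CC(=O)Oc1ccccc1C(=O)O"))
```

Resolving the path before comparing it with the base directory is what defeats "../" sequences and absolute paths; a plain string prefix check would not.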

v1.4.0 (2026-02-14)

  • Fixed PubChem SMILES/InChI endpoint (property/CanonicalSMILES/TXT)
  • Fixed chain_entry.py HTML entity corruption
  • Fixed brics_retro to handle BRICSDecompose string output correctly
  • Added request timeouts (15s) to all PubChem calls
  • Graceful error handling for invalid SMILES and empty input
  • Updated chain output version and schema
  • Comprehensive end-to-end testing

v1.3.0

  • RDKit props NoneType fixes, invalid SMILES graceful errors
  • React fix: ReactionFromSmarts import
  • Name resolution via PubChem for all RDKit actions

v1.2.0

  • BRICS retrosynthesis + 21 reaction templates library
  • Multi-step synthesis planning
