Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Literature Review

Conduct comprehensive, systematic literature reviews using multiple academic databases (PubMed, arXiv, bioRxiv, Semantic Scholar, etc.). This skill should be...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 current installs · 0 all-time installs
Security Scan
VirusTotal
Benign
View report →
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The code and SKILL.md align with a literature-review purpose: multiple scripts query Semantic Scholar, OpenAlex, CrossRef, PubMed/E-utilities, deduplicate, and generate PDFs. However, the skill's declared requirements list no environment variables or credentials while the code reads SEMANTIC_SCHOLAR_API_KEY, OPENALEX_API_KEY and USER_EMAIL from the environment — an inconsistency. The SKILL.md also mandates use of an external 'scientific-schematics' skill and references 'gget' and 'bioservices' for searches, which is plausible but not declared in metadata.
Instruction Scope
Runtime instructions require: (a) generating 1–2 AI schematics using a separate 'scientific-schematics' skill (marked as MANDATORY), (b) using multiple external tools/skills (gget, bioservices, datacommons-client), and (c) optionally scraping Google Scholar. The SKILL.md references a script 'scripts/generate_schematic.py' and commands to run it, but that file is not present in the manifest — this is a notable mismatch. The instructions also encourage Google Scholar scraping ('manual or careful scraping') which raises legal/terms-of-service and rate-limiting concerns. The agent is allowed to run Bash per allowed-tools, and the code calls external network APIs and shell tools (pandoc/xelatex) — expected for the task but broad in scope.
Install Mechanism
No install spec (instruction-only install) — lower disk-write risk. But the package includes executable Python scripts that will run subprocesses (pandoc, xelatex) and perform network requests. There is no automated installer, so any missing helper (e.g., generate_schematic.py) would have to be provided or fetched later; the absence of an install step combined with a missing referenced script is an operational red flag (could be a broken/incomplete package or a hint that code will be downloaded at runtime).
Credentials
Declared requirements list no env vars, yet the scripts read SEMANTIC_SCHOLAR_API_KEY and OPENALEX_API_KEY and use USER_EMAIL (falling back to CLAWDBOT_EMAIL or 'anonymous@example.org'). The registry metadata does not declare these optional keys. The skill will function without them (it uses fallbacks), but it sends USER_EMAIL in User-Agent headers, which could expose an email address taken from the environment. Declaring no credentials in metadata while the code optionally accepts API keys is inconsistent and could cause surprise credential exposure or unexpected network behavior.
Persistence & Privilege
The 'always' flag is false, and there are no requested config paths or modifications to other skills. The skill does not request persistent or global privileges in its metadata. It does permit autonomous invocation (the default), which increases blast radius if the skill were malicious, but no additional persistence privileges are requested.
What to consider before installing
Key things to check before installing:

  • Environment variables: The code reads SEMANTIC_SCHOLAR_API_KEY, OPENALEX_API_KEY, and USER_EMAIL, but the skill metadata declares no required env vars. Confirm whether you need to supply API keys, and whether you are comfortable exposing an email via USER_EMAIL/CLAWDBOT_EMAIL (it is used in User-Agent headers).
  • Missing referenced script: SKILL.md instructs you to run scripts/generate_schematic.py, but that file is not present in the manifest. Ask the publisher where generate_schematic.py lives, or whether the schematic step is implemented elsewhere. A missing file could indicate an incomplete package or that additional code will be fetched at runtime.
  • Mandatory external skill: SKILL.md requires a separate 'scientific-schematics' skill to generate figures. Confirm what that skill is, who publishes it, and what permissions or credentials it requires before allowing automatic invocation.
  • Google Scholar scraping: The instructions encourage manual or careful scraping of Google Scholar, which has no official API. Scraping violates Google Scholar's terms of service and may result in blocking; decide whether you want automation that encourages it.
  • External network calls and subprocesses: The scripts make network requests to public APIs (Semantic Scholar, OpenAlex, CrossRef, PubMed/E-utilities, EuropePMC) and call subprocess tools (pandoc, xelatex). Ensure your runtime environment allows these, and run the code in a sandbox if you need to audit behavior first.
  • Outputs and provenance: Because the skill aggregates many sources, verify DOIs and citation resolution (the repo includes verify_citations.py) and check how deduplication is implemented if you depend on completeness and accuracy for publication work.
  • Low risk tolerance: Do not install until the author fixes the metadata mismatches (declares env vars and lists the required helpers), provides the missing schematic-generation script or drops the mandatory schematic requirement, and documents the provenance and permissions of the 'scientific-schematics' skill.

If you proceed: run the scripts in an isolated environment, supply API keys deliberately (not secrets you cannot rotate), and confirm the schematic tool and any other external skills before granting the agent autonomous invocation that might call them.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.3.0
Download zip
latest: vk97fj5rs8qxbz1hpczfgnyg5kd83yh0c

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

Literature Review

Overview

Conduct systematic, comprehensive literature reviews following rigorous academic methodology. Search multiple literature databases, synthesize findings thematically, verify all citations for accuracy, and generate professional output documents in markdown and PDF formats.

This skill integrates with multiple scientific skills for database access (gget, bioservices, datacommons-client) and provides specialized tools for citation verification, result aggregation, and document generation.

When to Use This Skill

Use this skill when:

  • Conducting a systematic literature review for research or publication
  • Synthesizing current knowledge on a specific topic across multiple sources
  • Performing meta-analysis or scoping reviews
  • Writing the literature review section of a research paper or thesis
  • Investigating the state of the art in a research domain
  • Identifying research gaps and future directions
  • Requiring verified citations and professional formatting

Visual Enhancement with Scientific Schematics

⚠️ MANDATORY: Every literature review MUST include at least one AI-generated figure (preferably two or three) created using the scientific-schematics skill.

This is not optional. Literature reviews without visual elements are incomplete. Before finalizing any document:

  1. Generate at minimum ONE schematic or diagram (e.g., PRISMA flow diagram for systematic reviews)
  2. Prefer 2-3 figures for comprehensive reviews (search strategy flowchart, thematic synthesis diagram, conceptual framework)

How to generate figures:

  • Use the scientific-schematics skill to generate AI-powered publication-quality diagrams
  • Simply describe your desired diagram in natural language
  • Nano Banana Pro will automatically generate, review, and refine the schematic

How to generate schematics:

python scripts/generate_schematic.py "your diagram description" -o figures/output.png

The AI will automatically:

  • Create publication-quality images with proper formatting
  • Review and refine through multiple iterations
  • Ensure accessibility (colorblind-friendly, high contrast)
  • Save outputs in the figures/ directory

When to add schematics:

  • PRISMA flow diagrams for systematic reviews
  • Literature search strategy flowcharts
  • Thematic synthesis diagrams
  • Research gap visualization maps
  • Citation network diagrams
  • Conceptual framework illustrations
  • Any complex concept that benefits from visualization

For detailed guidance on creating schematics, refer to the scientific-schematics skill documentation.


Core Workflow

Literature reviews follow a structured, multi-phase workflow:

Phase 1: Planning and Scoping

  1. Define Research Question: Use PICO framework (Population, Intervention, Comparison, Outcome) for clinical/biomedical reviews

    • Example: "What is the efficacy of CRISPR-Cas9 (I) for treating sickle cell disease (P) compared to standard care (C)?"
  2. Establish Scope and Objectives:

    • Define clear, specific research questions
    • Determine review type (narrative, systematic, scoping, meta-analysis)
    • Set boundaries (time period, geographic scope, study types)
  3. Develop Search Strategy:

    • Identify 2-4 main concepts from research question
    • List synonyms, abbreviations, and related terms for each concept
    • Plan Boolean operators (AND, OR, NOT) to combine terms
    • Select minimum 3 complementary databases
  4. Set Inclusion/Exclusion Criteria:

    • Date range (e.g., last 10 years: 2015-2024)
    • Language (typically English, or specify multilingual)
    • Publication types (peer-reviewed, preprints, reviews)
    • Study designs (RCTs, observational, in vitro, etc.)
    • Document all criteria clearly

Phase 2: Systematic Literature Search

  1. Multi-Database Search:

    Select databases appropriate for the domain:

    Biomedical & Life Sciences:

    • Use gget skill: gget search pubmed "search terms" for PubMed/PMC
    • Use gget skill: gget search biorxiv "search terms" for preprints
    • Use bioservices skill for ChEMBL, KEGG, UniProt, etc.

    General Scientific Literature:

    • Search arXiv via direct API (preprints in physics, math, CS, q-bio)
    • Search Semantic Scholar via API (200M+ papers, cross-disciplinary)
    • Use Google Scholar for comprehensive coverage (manual or careful scraping)

    Specialized Databases:

    • Use gget alphafold for protein structures
    • Use gget cosmic for cancer genomics
    • Use datacommons-client for demographic/statistical data
    • Use specialized databases as appropriate for the domain
  2. Document Search Parameters:

    ## Search Strategy

    ### Database: PubMed
    - **Date searched**: 2024-10-25
    - **Date range**: 2015-01-01 to 2024-10-25
    - **Search string**:
      ("CRISPR"[Title] OR "Cas9"[Title]) AND ("sickle cell"[MeSH] OR "SCD"[Title/Abstract]) AND 2015:2024[Publication Date]
    - **Results**: 247 articles

    Repeat for each database searched.

  3. Export and Aggregate Results:

    • Export results in JSON format from each database
    • Combine all results into a single file
    • Use scripts/search_databases.py for post-processing:
      python scripts/search_databases.py combined_results.json \
        --deduplicate \
        --format markdown \
        --output aggregated_results.md
      

Phase 3: Screening and Selection

  1. Deduplication:

    python scripts/search_databases.py results.json --deduplicate --output unique_results.json
    
    • Removes duplicates by DOI (primary) or title (fallback); a sketch of this matching logic follows at the end of this phase
    • Document the number of duplicates removed
  2. Title Screening:

    • Review all titles against inclusion/exclusion criteria
    • Exclude obviously irrelevant studies
    • Document number excluded at this stage
  3. Abstract Screening:

    • Read abstracts of remaining studies
    • Apply inclusion/exclusion criteria rigorously
    • Document reasons for exclusion
  4. Full-Text Screening:

    • Obtain full texts of remaining studies
    • Conduct detailed review against all criteria
    • Document specific reasons for exclusion
    • Record final number of included studies
  5. Create PRISMA Flow Diagram:

    Initial search: n = X
    ├─ After deduplication: n = Y
    ├─ After title screening: n = Z
    ├─ After abstract screening: n = A
    └─ Included in review: n = B
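
The bundled scripts/search_databases.py performs the deduplication step. As a rough illustration only (this is not the script's actual code, and the "doi"/"title" field names are assumptions about the JSON schema), DOI-first, title-fallback matching might look like this:

def deduplicate(records):
    """Drop duplicates, matching by DOI first, then by normalized title."""
    seen_dois, seen_titles = set(), set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        title = " ".join((rec.get("title") or "").lower().split())
        if doi and doi in seen_dois:
            continue  # duplicate by DOI
        if not doi and title and title in seen_titles:
            continue  # no DOI available; duplicate by normalized title
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        unique.append(rec)
    return unique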
    

Phase 4: Data Extraction and Quality Assessment

  1. Extract Key Data from each included study:

    • Study metadata (authors, year, journal, DOI)
    • Study design and methods
    • Sample size and population characteristics
    • Key findings and results
    • Limitations noted by authors
    • Funding sources and conflicts of interest
  2. Assess Study Quality:

    • For RCTs: Use Cochrane Risk of Bias tool
    • For observational studies: Use Newcastle-Ottawa Scale
    • For systematic reviews: Use AMSTAR 2
    • Rate each study: High, Moderate, Low, or Very Low quality
    • Consider excluding very low-quality studies
  3. Organize by Themes:

    • Identify 3-5 major themes across studies
    • Group studies by theme (studies may appear in multiple themes)
    • Note patterns, consensus, and controversies

Phase 5: Synthesis and Analysis

  1. Create Review Document from template:

    cp assets/review_template.md my_literature_review.md
    
  2. Write Thematic Synthesis (NOT study-by-study summaries):

    • Organize Results section by themes or research questions
    • Synthesize findings across multiple studies within each theme
    • Compare and contrast different approaches and results
    • Identify consensus areas and points of controversy
    • Highlight the strongest evidence

    Example structure:

    #### 3.3.1 Theme: CRISPR Delivery Methods
    
    Multiple delivery approaches have been investigated for therapeutic
    gene editing. Viral vectors (AAV) were used in 15 studies^1-15^ and
    showed high transduction efficiency (65-85%) but raised immunogenicity
    concerns^3,7,12^. In contrast, lipid nanoparticles demonstrated lower
    efficiency (40-60%) but improved safety profiles^16-23^.
    
  3. Critical Analysis:

    • Evaluate methodological strengths and limitations across studies
    • Assess quality and consistency of evidence
    • Identify knowledge gaps and methodological gaps
    • Note areas requiring future research
  4. Write Discussion:

    • Interpret findings in broader context
    • Discuss clinical, practical, or research implications
    • Acknowledge limitations of the review itself
    • Compare with previous reviews if applicable
    • Propose specific future research directions

Phase 6: Citation Verification

CRITICAL: All citations must be verified for accuracy before final submission.

  1. Verify All DOIs:

    python scripts/verify_citations.py my_literature_review.md
    

    This script (a simplified sketch appears at the end of this phase):

    • Extracts all DOIs from the document
    • Verifies each DOI resolves correctly
    • Retrieves metadata from CrossRef
    • Generates verification report
    • Outputs properly formatted citations
  2. Review Verification Report:

    • Check for any failed DOIs
    • Verify author names, titles, and publication details match
    • Correct any errors in the original document
    • Re-run verification until all citations pass
  3. Format Citations Consistently:

    • Choose one citation style and use throughout (see references/citation_styles.md)
    • Common styles: APA, Nature, Vancouver, Chicago, IEEE
    • Use verification script output to format citations correctly
    • Ensure in-text citations match reference list format
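
As a rough illustration of what DOI verification involves, the sketch below extracts DOIs with a regular expression and resolves each against the public CrossRef REST API. It is not the bundled verify_citations.py, whose matching rules and report format may differ:

import re
import requests

DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')

def verify_dois(text):
    """Resolve every DOI found in the text against CrossRef; return a report dict."""
    report = {}
    for doi in sorted({m.rstrip(".,;)") for m in DOI_PATTERN.findall(text)}):
        resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
        if resp.ok:
            meta = resp.json()["message"]
            report[doi] = {"ok": True, "title": (meta.get("title") or [""])[0]}
        else:
            report[doi] = {"ok": False, "status": resp.status_code}
    return report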

Phase 7: Document Generation

  1. Generate PDF:

    python scripts/generate_pdf.py my_literature_review.md \
      --citation-style apa \
      --output my_review.pdf
    

    Options:

    • --citation-style: apa, nature, chicago, vancouver, ieee
    • --no-toc: Disable table of contents
    • --no-numbers: Disable section numbering
    • --check-deps: Check if pandoc/xelatex are installed
  2. Review Final Output:

    • Check PDF formatting and layout
    • Verify all sections are present
    • Ensure citations render correctly
    • Check that figures/tables appear properly
    • Verify table of contents is accurate
  3. Quality Checklist:

    • All DOIs verified with verify_citations.py
    • Citations formatted consistently
    • PRISMA flow diagram included (for systematic reviews)
    • Search methodology fully documented
    • Inclusion/exclusion criteria clearly stated
    • Results organized thematically (not study-by-study)
    • Quality assessment completed
    • Limitations acknowledged
    • References complete and accurate
    • PDF generates without errors

Database-Specific Search Guidance

PubMed / PubMed Central

Access via gget skill:

# Search PubMed
gget search pubmed "CRISPR gene editing" -l 100

# Search with filters
# Use PubMed Advanced Search Builder to construct complex queries
# Then execute via gget or direct Entrez API

Search tips:

  • Use MeSH terms: "sickle cell disease"[MeSH]
  • Field tags: [Title], [Title/Abstract], [Author]
  • Date filters: 2020:2024[Publication Date]
  • Boolean operators: AND, OR, NOT
  • See MeSH browser: https://meshb.nlm.nih.gov/search
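
If you prefer direct API access over gget, PubMed can also be queried through NCBI E-utilities. A minimal sketch using the esearch endpoint (the query string is illustrative; an NCBI API key is optional but raises rate limits):

import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_search(query, retmax=100):
    """Return (total hit count, list of PMIDs) for a PubMed query."""
    resp = requests.get(
        ESEARCH,
        params={"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"},
        timeout=30,
    )
    resp.raise_for_status()
    result = resp.json()["esearchresult"]
    return int(result["count"]), result["idlist"]

count, pmids = pubmed_search('("CRISPR"[Title] OR "Cas9"[Title]) AND "sickle cell"[MeSH]')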

bioRxiv / medRxiv

Access via gget skill:

gget search biorxiv "CRISPR sickle cell" -l 50

Important considerations:

  • Preprints are not peer-reviewed
  • Verify findings with caution
  • Check if the preprint has since been published (see the CrossRef lookup sketch below)
  • Note preprint version and date
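
One way to check publication status is a bibliographic query against CrossRef. A minimal sketch (title matching is fuzzy, so treat hits as candidates to confirm manually):

import requests

def find_published_version(preprint_title):
    """Query CrossRef for journal articles whose metadata matches a preprint title."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": preprint_title, "rows": 3},
        timeout=30,
    )
    resp.raise_for_status()
    return [
        (item.get("DOI"), (item.get("title") or [""])[0])
        for item in resp.json()["message"]["items"]
        if item.get("type") == "journal-article"
    ]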

arXiv

Access via direct API or WebFetch:

# Example search categories:
# q-bio.QM (Quantitative Methods)
# q-bio.GN (Genomics)
# q-bio.MN (Molecular Networks)
# cs.LG (Machine Learning)
# stat.ML (Machine Learning Statistics)

# Search format: category AND terms
search_query = "cat:q-bio.QM AND ti:\"single cell sequencing\""
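
A minimal sketch of running such a query against the public arXiv Atom API (export.arxiv.org) and extracting titles with the standard library:

import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

def arxiv_search(search_query, max_results=25):
    """Run a query against the arXiv Atom API and return entry titles."""
    url = "http://export.arxiv.org/api/query?" + urllib.parse.urlencode(
        {"search_query": search_query, "max_results": max_results}
    )
    with urllib.request.urlopen(url, timeout=30) as resp:
        root = ET.parse(resp).getroot()
    return [
        entry.findtext("atom:title", default="", namespaces=ATOM).strip()
        for entry in root.findall("atom:entry", ATOM)
    ]

titles = arxiv_search('cat:q-bio.QM AND ti:"single cell sequencing"')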

Semantic Scholar

Access via the direct API (an API key raises rate limits, but the free tier works without one); a request sketch follows the list below:

  • 200M+ papers across all fields
  • Excellent for cross-disciplinary searches
  • Provides citation graphs and paper recommendations
  • Use for finding highly influential papers
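
A minimal request sketch against the Semantic Scholar Graph API. The SEMANTIC_SCHOLAR_API_KEY name matches the environment variable the security scan reports the bundled scripts read; the key is optional on the free tier:

import os
import requests

def s2_search(query, limit=20):
    """Search the Semantic Scholar Graph API; an API key is optional."""
    headers = {}
    key = os.environ.get("SEMANTIC_SCHOLAR_API_KEY")
    if key:
        headers["x-api-key"] = key
    resp = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={"query": query, "limit": limit,
                "fields": "title,year,externalIds,citationCount"},
        headers=headers,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("data", [])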

Specialized Biomedical Databases

Use appropriate skills:

  • ChEMBL: bioservices skill for chemical bioactivity
  • UniProt: gget or bioservices skill for protein information
  • KEGG: bioservices skill for pathways and genes
  • COSMIC: gget skill for cancer mutations
  • AlphaFold: gget alphafold for protein structures
  • PDB: gget or direct API for experimental structures

Citation Chaining

Expand search via citation networks:

  1. Forward citations (papers citing key papers):

    • Use Google Scholar "Cited by"
    • Use Semantic Scholar or OpenAlex APIs
    • Identifies newer research building on seminal work (see the sketch after this list)
  2. Backward citations (references from key papers):

    • Extract references from included papers
    • Identify highly cited foundational work
    • Find papers cited by multiple included studies
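
A sketch of forward chaining via the Semantic Scholar Graph API (OpenAlex offers equivalent functionality; pagination is omitted for brevity):

import requests

def forward_citations(paper_id, limit=100):
    """List papers citing paper_id (e.g., "DOI:10.xxxx/yyyy") via Semantic Scholar."""
    resp = requests.get(
        f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}/citations",
        params={"fields": "title,year,externalIds", "limit": limit},
        timeout=30,
    )
    resp.raise_for_status()
    return [row["citingPaper"] for row in resp.json().get("data", [])]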

Citation Style Guide

Detailed formatting guidelines are in references/citation_styles.md. Quick reference:

APA (7th Edition)

  • In-text: (Smith et al., 2023)
  • Reference: Smith, J. D., Johnson, M. L., & Williams, K. R. (2023). Title. Journal, 22(4), 301-318. https://doi.org/10.xxx/yyy

Nature

  • In-text: Superscript numbers^1,2^
  • Reference: Smith, J. D., Johnson, M. L. & Williams, K. R. Title. Nat. Rev. Drug Discov. 22, 301-318 (2023).

Vancouver

  • In-text: Superscript numbers^1,2^
  • Reference: Smith JD, Johnson ML, Williams KR. Title. Nat Rev Drug Discov. 2023;22(4):301-18.

Always verify citations with verify_citations.py before finalizing.

Best Practices

Search Strategy

  1. Use multiple databases (minimum 3): Ensures comprehensive coverage
  2. Include preprint servers: Captures latest unpublished findings
  3. Document everything: Search strings, dates, result counts for reproducibility
  4. Test and refine: Run pilot searches, review results, adjust search terms

Screening and Selection

  1. Use clear criteria: Document inclusion/exclusion criteria before screening
  2. Screen systematically: Title → Abstract → Full text
  3. Document exclusions: Record reasons for excluding studies
  4. Consider dual screening: For systematic reviews, have two reviewers screen independently

Synthesis

  1. Organize thematically: Group by themes, NOT by individual studies
  2. Synthesize across studies: Compare, contrast, identify patterns
  3. Be critical: Evaluate quality and consistency of evidence
  4. Identify gaps: Note what's missing or understudied

Quality and Reproducibility

  1. Assess study quality: Use appropriate quality assessment tools
  2. Verify all citations: Run verify_citations.py script
  3. Document methodology: Provide enough detail for others to reproduce
  4. Follow guidelines: Use PRISMA for systematic reviews

Writing

  1. Be objective: Present evidence fairly, acknowledge limitations
  2. Be systematic: Follow structured template
  3. Be specific: Include numbers, statistics, effect sizes where available
  4. Be clear: Use clear headings, logical flow, thematic organization

Common Pitfalls to Avoid

  1. Single database search: Misses relevant papers; always search multiple databases
  2. No search documentation: Makes review irreproducible; document all searches
  3. Study-by-study summary: Lacks synthesis; organize thematically instead
  4. Unverified citations: Leads to errors; always run verify_citations.py
  5. Too broad search: Yields thousands of irrelevant results; refine with specific terms
  6. Too narrow search: Misses relevant papers; include synonyms and related terms
  7. Ignoring preprints: Misses latest findings; include bioRxiv, medRxiv, arXiv
  8. No quality assessment: Treats all evidence equally; assess and report quality
  9. Publication bias: Only positive results published; note potential bias
  10. Outdated search: Field evolves rapidly; clearly state search date

Example Workflow

Complete workflow for a biomedical literature review:

# 1. Create review document from template
cp assets/review_template.md crispr_sickle_cell_review.md

# 2. Search multiple databases using appropriate skills
# - Use gget skill for PubMed, bioRxiv
# - Use direct API access for arXiv, Semantic Scholar
# - Export results in JSON format

# 3. Aggregate and process results
python scripts/search_databases.py combined_results.json \
  --deduplicate \
  --rank citations \
  --year-start 2015 \
  --year-end 2024 \
  --format markdown \
  --output search_results.md \
  --summary

# 4. Screen results and extract data
# - Manually screen titles, abstracts, full texts
# - Extract key data into the review document
# - Organize by themes

# 5. Write the review following template structure
# - Introduction with clear objectives
# - Detailed methodology section
# - Results organized thematically
# - Critical discussion
# - Clear conclusions

# 6. Verify all citations
python scripts/verify_citations.py crispr_sickle_cell_review.md

# Review the citation report
cat crispr_sickle_cell_review_citation_report.json

# Fix any failed citations and re-verify
python scripts/verify_citations.py crispr_sickle_cell_review.md

# 7. Generate professional PDF
python scripts/generate_pdf.py crispr_sickle_cell_review.md \
  --citation-style nature \
  --output crispr_sickle_cell_review.pdf

# 8. Review final PDF and markdown outputs

Integration with Other Skills

This skill works seamlessly with other scientific skills:

Database Access Skills

  • gget: PubMed, bioRxiv, COSMIC, AlphaFold, Ensembl, UniProt
  • bioservices: ChEMBL, KEGG, Reactome, UniProt, PubChem
  • datacommons-client: Demographics, economics, health statistics

Analysis Skills

  • pydeseq2: RNA-seq differential expression (for methods sections)
  • scanpy: Single-cell analysis (for methods sections)
  • anndata: Single-cell data (for methods sections)
  • biopython: Sequence analysis (for background sections)

Visualization Skills

  • matplotlib: Generate figures and plots for review
  • seaborn: Statistical visualizations

Writing Skills

  • brand-guidelines: Apply institutional branding to PDF
  • internal-comms: Adapt review for different audiences

Resources

Bundled Resources

Scripts:

  • scripts/verify_citations.py: Verify DOIs and generate formatted citations
  • scripts/generate_pdf.py: Convert markdown to professional PDF
  • scripts/search_databases.py: Process, deduplicate, and format search results

References:

  • references/citation_styles.md: Detailed citation formatting guide (APA, Nature, Vancouver, Chicago, IEEE)
  • references/database_strategies.md: Comprehensive database search strategies

Assets:

  • assets/review_template.md: Complete literature review template with all sections

Dependencies

Required Python Packages

pip install requests  # For citation verification

Required System Tools

# For PDF generation
brew install pandoc  # macOS
apt-get install pandoc  # Linux

# For LaTeX (PDF generation)
brew install --cask mactex  # macOS
apt-get install texlive-xetex  # Linux

Check dependencies:

python scripts/generate_pdf.py --check-deps

Summary

This literature-review skill provides:

  1. Systematic methodology following academic best practices
  2. Multi-database integration via existing scientific skills
  3. Citation verification ensuring accuracy and credibility
  4. Professional output in markdown and PDF formats
  5. Comprehensive guidance covering the entire review process
  6. Quality assurance with verification and validation tools
  7. Reproducibility through detailed documentation requirements

Conduct thorough, rigorous literature reviews that meet academic standards and provide comprehensive synthesis of current knowledge in any domain.

Files

9 total

Loading comments…