species_identification_sequence_blast_annotation_tool

v1.0.0

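Provides BLAST-based species annotation for FASTA sequences and for the top ASVs in an OTU table, with support for mapping files, a configurable request delay, and resuming interrupted runs.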

by Dong Zhao @zd200572

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for zd200572/blast-species-identification.

Install the skill "species_identification_sequence_blast_annotation_tool" (zd200572/blast-species-identification) from ClawHub.
Skill page: https://clawhub.ai/zd200572/blast-species-identification
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install blast-species-identification

ClawHub CLI

npx clawhub@latest install blast-species-identification

Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The skill description (BLAST-based species annotation) matches the included Python scripts which call NCBI BLAST via Biopython. Minor inconsistency: SKILL.md refers to 'blast_annotation_tool.py' while the repository provides 'blast_annotation.py' — the filename mismatch should be verified but does not imply malicious behavior.
Instruction Scope
SKILL.md instructs reading FASTA/OTU files and calling BLAST; the scripts read those files, run NCBIWWW.qblast, parse results, and write CSV summaries. They do not access unrelated system files, environment variables, or external endpoints beyond NCBI BLAST.
Install Mechanism
No automated install spec is provided (instruction-only). SKILL.md asks users to 'pip install biopython', which is appropriate. There is no download/extract of remote code in the install step; included scripts will run locally.
Credentials
The skill requires no credentials or environment variables. Network access to NCBI BLAST is necessary and expected for the stated purpose. No unrelated secrets, keys, or config paths are requested.
Persistence & Privilege
The skill does not request permanent presence (always:false) and does not modify other skills or system-wide configuration. It runs as a local script when invoked.
Assessment
This skill appears to do exactly what it claims: extract ASV sequences from FASTA/OTU inputs, send them to NCBI BLAST via Biopython, and write CSV results. Before installing or running, check: (1) confirm the filename mismatch (SKILL.md references blast_annotation_tool.py but the repository has blast_annotation.py) and adjust calls accordingly; (2) you will transmit sequence data to NCBI—ensure this is acceptable for privacy/compliance; (3) install Biopython (pip install biopython) and be mindful of NCBI rate limits (use --delay); (4) inspect the hardcoded SAMPLE_TOP_ASV mapping in blast_annotation.py — it may be specific to a dataset and could need updating for your data. If you need the skill to run without network access, it will not work because it relies on NCBI's online qblast service.

Like a lobster shell, security has layers — review code before you run it.

latest vk971s11tjtx0tha36j3ktbr85h8371dx
164 downloads
0 stars
1 version
Updated 1mo ago
v1.0.0
MIT-0

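BLAST Species Annotation Tool Skill

Description

Provides a usage guide and quick invocation for the BLAST species annotation tools. It includes two main tools:

  • blast_annotation_tool.py - runs BLAST annotation on the sequences in a given FASTA file
  • top_asv_blast.py - extracts the top ASVs from an OTU table and runs BLAST annotation on them

Install Dependencies

pip install biopython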

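Tool 1: blast_annotation_tool.py

Basic Usage

# Basic usage: input FASTA file and output directory
python3 blast_annotation_tool.py sequences.fasta results/

# Use a file that maps sequence IDs to sample names
python3 blast_annotation_tool.py sequences.fasta results/ --mapping mapping.csv

# Skip results that already exist (resume an interrupted run)
python3 blast_annotation_tool.py sequences.fasta results/ --skip-existing

Parameters

  • input: path to the input FASTA file (required)
  • output: path to the output directory (required)
  • --mapping, -m: file mapping sequence IDs to sample names (CSV format)
  • --delay, -d: delay in seconds between BLAST requests (default: 3)
  • --hits, -n: number of BLAST hits to keep per sample (default: 10)
  • --skip-existing, -s: skip result files that already exist (default: False)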
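
For readers who want to see how these options fit together, here is a minimal argparse sketch that mirrors the parameter list above. It is an illustration only, not the skill's actual source: the function name build_parser, the help strings, and the choice of float for --delay are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the skill's own code)."""
    parser = argparse.ArgumentParser(
        description="BLAST annotation for the sequences in a FASTA file"
    )
    parser.add_argument("input", help="path to the input FASTA file")
    parser.add_argument("output", help="path to the output directory")
    parser.add_argument("--mapping", "-m", default=None,
                        help="CSV file mapping sequence IDs to sample names")
    # Assumption: the delay is parsed as a float so fractional seconds are allowed.
    parser.add_argument("--delay", "-d", type=float, default=3,
                        help="delay in seconds between BLAST requests")
    parser.add_argument("--hits", "-n", type=int, default=10,
                        help="number of BLAST hits to keep per sample")
    parser.add_argument("--skip-existing", "-s", action="store_true",
                        help="skip result files that already exist")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["sequences.fasta", "results/", "--delay", "5", "-s"])
print(args.input, args.output, args.delay, args.hits, args.skip_existing)
```

Because --skip-existing is a store_true flag, it defaults to False, which matches the documented default.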

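Tool 2: top_asv_blast.py

Basic Usage

# Basic usage
python3 top_asv_blast.py taxa_table.xls rep.fasta results/

# Skip results that already exist (resume an interrupted run)
python3 top_asv_blast.py taxa_table.xls rep.fasta results/ --skip-existing

# Custom parameters
python3 top_asv_blast.py taxa_table.xls rep.fasta results/ --delay 5 --hits 20

Parameters

  • otu_table: OTU table file (.xls, .tsv, .csv) (required)
  • fasta: FASTA file of representative sequences (required)
  • output: path to the output directory (required)
  • --top-n, -n: number of top ASVs to extract per sample (default: 1)
  • --delay, -d: delay in seconds between BLAST requests (default: 3)
  • --hits: number of BLAST hits to keep per ASV (default: 10)
  • --skip-existing, -s: skip result files that already exist (default: False)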

Input File Formats

FASTA File

>ASV1
TAGGGAATCTTCCGCAATGGACGAAAGTCTGACGGAGCAACGCCGCGTGAG...
>ASV2
TAGGGAATCTTCCGCAATGGACGAAAGTCTGACGGAGCAACGCCGCGTGAG...
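
To make the FASTA layout concrete, the sketch below parses records of this shape into a dictionary using only the standard library. The security scan indicates the scripts rely on Biopython, so the tools presumably read the file differently; read_fasta is a hypothetical helper, and the short sequences are made-up stand-ins for real ASV sequences.

```python
from typing import Dict, Iterable, List, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {sequence_id: sequence} (illustrative helper)."""
    records: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token after '>'.
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            records[current] = []
        elif current is not None:
            # Sequence lines may wrap, so collect the pieces and join later.
            records[current].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


# Made-up example sequences, wrapped across lines as FASTA files often are.
example = """>ASV1
TAGGGAATCTTC
CGCAATGG
>ASV2
TAGGGAAT
""".splitlines()

sequences = read_fasta(example)
print(sequences)
```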

Mapping File (optional)

CSV format; the first column is the sequence ID and the second column is the sample name:

ASV1,D1-8
ASV2,J2-8
ASV3,D3-8
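
A mapping file of this shape can be loaded into a lookup table in a few lines. The sketch below assumes there is no header row, as in the example above; load_mapping is a hypothetical name, and falling back to the sequence ID when no mapping exists is an illustrative choice rather than documented behavior.

```python
import csv
import io
from typing import Dict, TextIO


def load_mapping(handle: TextIO) -> Dict[str, str]:
    """Read a two-column CSV into {sequence_id: sample_name} (illustrative helper)."""
    mapping: Dict[str, str] = {}
    for row in csv.reader(handle):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or malformed rows
        mapping[row[0].strip()] = row[1].strip()
    return mapping


# In-memory stand-in for mapping.csv, using the rows shown above.
mapping = load_mapping(io.StringIO("ASV1,D1-8\nASV2,J2-8\nASV3,D3-8\n"))

# Assumption for illustration: an unmapped ID falls back to the ID itself.
label_known = mapping.get("ASV2", "ASV2")
label_unknown = mapping.get("ASV9", "ASV9")
print(label_known, label_unknown)
```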

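OTU Table Format

  • First column: ASV/OTU ID
  • Middle columns: per-sample sequence counts (duplicate samples are merged automatically)
  • Last column: Taxonomy annotation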
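
The column layout above suggests how the top ASVs per sample could be picked. The following is a rough sketch under several assumptions that the skill page does not confirm: the table is tab-delimited text with a header row, duplicate samples are columns that share the same name, and merging means summing their counts. The function name top_asvs_per_sample is hypothetical.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, TextIO, Tuple


def top_asvs_per_sample(
    handle: TextIO, top_n: int = 1, delimiter: str = "\t"
) -> Dict[str, List[Tuple[str, float]]]:
    """Return the top_n (asv_id, count) pairs for each sample (illustrative helper)."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    sample_names = header[1:-1]  # middle columns hold the per-sample counts
    counts: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for row in reader:
        if len(row) != len(header):
            continue  # skip malformed rows
        asv_id = row[0]
        for name, value in zip(sample_names, row[1:-1]):
            # Assumption: columns with the same name are merged by summing.
            counts[name][asv_id] += float(value)
    result: Dict[str, List[Tuple[str, float]]] = {}
    for name, per_asv in counts.items():
        ranked = sorted(per_asv.items(), key=lambda item: item[1], reverse=True)
        result[name] = ranked[:top_n]
    return result


# Made-up table: sample D1 appears twice and is merged before ranking.
table = (
    "ASV_ID\tD1\tD1\tJ2\tTaxonomy\n"
    "ASV1\t10\t5\t1\tk__Bacteria\n"
    "ASV2\t2\t3\t40\tk__Bacteria\n"
)
top = top_asvs_per_sample(io.StringIO(table), top_n=1)
print(top)
```

With top_n=1 this corresponds to the documented default of --top-n, where only the most abundant ASV of each sample is sent to BLAST.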

Output Files

  • One BLAST result CSV file per sample
  • Summary table (blast_summary.csv)
  • Top ASV information table (top_asv_info.csv)

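Notes

  1. A network connection is required to reach the NCBI BLAST service
  2. Each alignment may take anywhere from a few seconds to tens of seconds
  3. Use the --delay parameter to avoid sending requests too frequently
  4. Use --skip-existing to resume an interrupted run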
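
Notes 3 and 4 describe a simple loop: wait between requests, and skip any sequence whose result file is already on disk. The sketch below shows that pattern with the BLAST call passed in as a function, so it runs without network access. The names annotate_all and run_blast and the result file naming are hypothetical. In real use run_blast would wrap Biopython's NCBIWWW.qblast, which the security scan says the scripts call, but the program and database they pass are not stated on this page.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable, Dict, List


def annotate_all(
    sequences: Dict[str, str],
    output_dir: str,
    run_blast: Callable[[str], str],
    delay: float = 3.0,
    skip_existing: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run run_blast per sequence; return the IDs actually submitted (illustrative)."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    submitted: List[str] = []
    for seq_id, sequence in sequences.items():
        result_path = out / f"{seq_id}_blast.csv"  # hypothetical file naming
        if skip_existing and result_path.exists():
            continue  # resume: this sequence was finished in an earlier run
        if submitted:
            sleep(delay)  # pause between consecutive requests
        result_path.write_text(run_blast(sequence), encoding="utf-8")
        submitted.append(seq_id)
    return submitted


def fake_blast(sequence: str) -> str:
    """Stand-in for a real BLAST request; returns a one-line CSV."""
    return f"query_length,{len(sequence)}\n"


with tempfile.TemporaryDirectory() as tmp:
    first = annotate_all({"ASV1": "ACGT", "ASV2": "GGCC"}, tmp, fake_blast, delay=0)
    # Second run adds ASV3; the two finished sequences are skipped.
    second = annotate_all(
        {"ASV1": "ACGT", "ASV2": "GGCC", "ASV3": "TTAA"},
        tmp, fake_blast, delay=0, skip_existing=True,
    )
print(first, second)
```

Writing each result to its own file is what makes resuming cheap: the presence of the file is the only state that needs to be checked.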

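Quick Invocation

When you need BLAST species annotation, just say:

  • "Use the BLAST annotation tool"
  • "Run top ASV BLAST"
  • "BLAST species annotation guide"

I will then provide detailed parameter descriptions and usage instructions.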
