Protein Sequence Qc Pro

Review

Audited by ClawScan on May 10, 2026.

Overview

This looks like a local protein-QC workflow, but the code ignores the documented input/output paths and uses hard-coded `/root` dataset locations, so it should be reviewed before use.

Review or patch the scripts before installing. In particular, remove the hard-coded `/root/autodl-tmp/...` paths, add real argument parsing for input and output locations, and run the workflow in an isolated environment. There is no evidence of credential theft or network exfiltration, but the current artifacts are too misleading and poorly scoped for safe unattended use.

Findings (3)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Concern (High Confidence)

ASI02: Tool Misuse and Exploitation

What this means

Your chosen FASTA file or output directory may be ignored, and the workflow may read or write files in an unexpected root-level location.

Why it was flagged

The main QC pipeline is fixed to an absolute `/root` input/output tree instead of using the documented user-provided input and output paths. As a result, the files the tool reads and writes are not the ones the user specifies, and its local file scope is unclear.

Skill content

WORK_DIR = Path("/root/autodl-tmp/ou_a1d19d5984eecd78f231c50f774eddb0/ChemRxiv_QC_analysis")
INPUT_FASTA = Path("/root/autodl-tmp/ou_a1d19d5984eecd78f231c50f774eddb0/ChemRxiv_QC_analysis/input/all_ired_merged.fasta")

Recommendation

Do not run this unchanged on important systems. Update the scripts to require explicit input and output arguments, quote shell paths safely, and run in a sandboxed project directory.
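The recommendation above could be implemented along the following lines. This is a minimal sketch, not the skill's actual code: the `--input`, `--output`, and `--threads` flags are taken from the skill's own documentation, while the function names, the `clustered.fasta` output name, and the choice of passing `cd-hit` an argument list (rather than a shell string) are assumptions made for illustration.

```python
import argparse
import shlex
from pathlib import Path


def parse_args(argv=None):
    """Parse the options that the skill's documentation advertises."""
    parser = argparse.ArgumentParser(description="Protein sequence QC (sketch)")
    parser.add_argument("--input", required=True, type=Path,
                        help="FASTA file to process")
    parser.add_argument("--output", required=True, type=Path,
                        help="directory that receives all results")
    parser.add_argument("--threads", type=int, default=1,
                        help="number of worker threads")
    return parser.parse_args(argv)


def resolve_paths(args):
    """Fail early if the input is missing; create the output directory."""
    input_fasta = args.input.expanduser().resolve()
    if not input_fasta.is_file():
        raise FileNotFoundError(f"input FASTA not found: {input_fasta}")
    output_dir = args.output.expanduser().resolve()
    output_dir.mkdir(parents=True, exist_ok=True)
    return input_fasta, output_dir


def build_cdhit_command(input_fasta, output_dir, threads):
    """Build the command as a list so no shell ever parses the paths.

    The output file name is hypothetical. Pass the list to
    subprocess.run(cmd, check=True) without shell=True.
    """
    return [
        "cd-hit",
        "-i", str(input_fasta),
        "-o", str(Path(output_dir) / "clustered.fasta"),
        "-T", str(threads),
    ]


def printable(cmd):
    """Shell-quoted rendering of the command, for logging only."""
    return shlex.join(cmd)
```

A caller would run `parse_args()`, then `resolve_paths(args)`, and hand the resulting list to `subprocess.run`. Because every path comes from the arguments, nothing in the script needs to mention `/root`.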

Concern (High Confidence)

ASI09: Human-Agent Trust Exploitation

What this means

A user or agent could believe the QC results came from the requested input when the script actually read, or tried to read, a hard-coded path.

Why it was flagged

The user-facing instructions present a parameterized workflow, but the supplied main script does not implement these options and instead hard-codes a different dataset path. This mismatch can cause misplaced trust in what data was processed.

Skill content

python3 scripts/run_complete_qc.py \
    --input raw_sequences.fasta \
    --output qc_results/ \
    --threads 8

Recommendation

Treat the documentation as unreliable until the scripts are corrected and tested. Verify that outputs are generated from the intended input files before using results in publications.
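One way to verify which input produced a set of results is to record a fingerprint of the input file next to the outputs. The sketch below is an illustration only and is not part of the skill: the function names and the `input_manifest.json` file name are hypothetical, and counting `>` header lines is a simple approximation of the record count.

```python
import hashlib
import json
from pathlib import Path


def fasta_fingerprint(fasta_path):
    """Return the resolved path, SHA-256 digest, and record count of a FASTA file."""
    path = Path(fasta_path)
    digest = hashlib.sha256()
    n_records = 0
    with path.open("rb") as handle:
        for line in handle:
            digest.update(line)
            # Each FASTA record starts with a '>' header line.
            if line.startswith(b">"):
                n_records += 1
    return {
        "path": str(path.resolve()),
        "sha256": digest.hexdigest(),
        "n_records": n_records,
    }


def write_manifest(fasta_path, output_dir):
    """Write the fingerprint into the output directory (hypothetical file name)."""
    manifest = fasta_fingerprint(fasta_path)
    out = Path(output_dir) / "input_manifest.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(manifest, indent=2))
    return out
```

Comparing the recorded digest and path with the file the user intended to analyze shows whether the pipeline silently processed a different dataset.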

Note (High Confidence)

ASI04: Agentic Supply Chain Vulnerabilities

What this means

Future package versions from public repositories could behave differently or break the workflow.

Why it was flagged

The dependency installs are expected for this bioinformatics workflow, but versions are not pinned and the registry metadata reports no install spec, so dependency provenance and reproducibility are weaker.

Skill content

install:
  - id: cd-hit
    kind: conda
    package: cd-hit
    channel: bioconda
  - id: biopython
    kind: pip
    package: biopython

Recommendation

Install dependencies in a dedicated conda/virtual environment and prefer pinned, reviewed package versions.
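Once versions have been reviewed and pinned, a small check can confirm that the active environment matches them. The following is a sketch under stated assumptions: `check_pins` is a hypothetical helper, the pinned versions must be supplied by the reviewer, and the check only covers Python packages visible to `importlib.metadata`, so a conda-installed binary such as `cd-hit` would need a separate check.

```python
from importlib import metadata


def check_pins(pins, get_version=metadata.version):
    """Return (package, expected, found) tuples for every mismatch.

    `pins` maps a package name to the exact version that was reviewed.
    `found` is None when the package is not installed.
    """
    mismatches = []
    for package, expected in pins.items():
        try:
            found = get_version(package)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches.append((package, expected, found))
    return mismatches
```

Calling `check_pins({"biopython": reviewed_version})` at the start of the workflow, where `reviewed_version` is whatever release was actually audited, lets the script refuse to run when the result is non-empty.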