Protein Sequence Qc Pro
Review
Audited by ClawScan on May 10, 2026.
Overview
This looks like a local protein-QC workflow, but the code ignores the documented input/output paths and uses hard-coded `/root` dataset locations, so it should be reviewed before use.
Review or patch the scripts before installing. In particular, remove the hard-coded `/root/autodl-tmp/...` paths, add real argument parsing for input and output locations, and run the workflow in an isolated environment. There is no evidence of credential theft or network exfiltration, but the current artifacts are too misleading and poorly scoped for safe unattended use.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Your chosen FASTA file or output directory may be ignored, and the workflow may read or write files in an unexpected root-level location.
The main QC pipeline is fixed to an absolute `/root` input/output tree instead of using the documented user-provided input and output paths. This gives the tool unclear local file scope.
WORK_DIR = Path("/root/autodl-tmp/ou_a1d19d5984eecd78f231c50f774eddb0/ChemRxiv_QC_analysis")
INPUT_FASTA = Path("/root/autodl-tmp/ou_a1d19d5984eecd78f231c50f774eddb0/ChemRxiv_QC_analysis/input/all_ired_merged.fasta")
Do not run this unchanged on important systems. Update the scripts to require explicit input and output arguments, quote shell paths safely, and run in a sandboxed project directory.
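As an illustration of that recommendation, here is a minimal sketch of explicit argument handling. It assumes the flag names shown in the skill's documentation (`--input`, `--output`, `--threads`) and assumes the pipeline calls CD-HIT, which is listed as a dependency. The function names `parse_args` and `build_cdhit_command` and the output file name `clustered.fasta` are hypothetical, not taken from the reviewed scripts. The CD-HIT options `-i`, `-o`, and `-T` are its input, output, and thread flags.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse explicit input/output locations instead of hard-coding them."""
    parser = argparse.ArgumentParser(description="Protein sequence QC (sketch)")
    parser.add_argument("--input", required=True, type=Path, help="input FASTA file")
    parser.add_argument("--output", required=True, type=Path, help="output directory")
    parser.add_argument("--threads", type=int, default=1, help="worker threads")
    args = parser.parse_args(argv)
    if args.threads < 1:
        parser.error("--threads must be at least 1")
    return args


def build_cdhit_command(input_fasta, output_dir, threads):
    """Build an argument list; no shell is involved, so paths need no quoting."""
    return [
        "cd-hit",
        "-i", str(input_fasta),
        "-o", str(Path(output_dir) / "clustered.fasta"),  # hypothetical output name
        "-T", str(threads),
    ]


# Demonstration with an explicit argv; nothing is read from or written to disk here.
demo = parse_args(["--input", "raw_sequences.fasta", "--output", "qc_results/", "--threads", "8"])
print(build_cdhit_command(demo.input, demo.output, demo.threads))
```

A real script would check that the input file exists, create the output directory, and pass the list to `subprocess.run(cmd, check=True)`. Passing a list rather than a formatted shell string means paths containing spaces or shell metacharacters reach the tool unchanged.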
A user or agent could believe the QC results came from the requested input when they actually came from, or attempted to use, a built-in path.
The user-facing instructions present a parameterized workflow, but the supplied main script does not implement these options and instead hard-codes a different dataset path. This mismatch can cause misplaced trust in what data was processed.
python3 scripts/run_complete_qc.py \
--input raw_sequences.fasta \
--output qc_results/ \
--threads 8
Treat the documentation as unreliable until the scripts are corrected and tested. Verify that outputs are generated from the intended input files before using results in publications.
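One way to make that verification routine is to record which file was actually processed alongside the results. The sketch below is an assumption about how this could be done, not part of the reviewed skill: the helper names and the `provenance.json` file name are hypothetical. It stores the resolved input path, its SHA-256 digest, and the number of FASTA records (lines starting with `>`).

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large FASTA files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_fasta_records(path):
    """Count header lines, i.e. lines beginning with '>'."""
    with open(path, "r", encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def write_provenance(input_fasta, output_dir):
    """Write a small manifest describing the input next to the QC outputs."""
    input_fasta = Path(input_fasta).resolve()
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "input_path": str(input_fasta),
        "input_sha256": sha256_of_file(input_fasta),
        "input_records": count_fasta_records(input_fasta),
    }
    manifest = output_dir / "provenance.json"  # hypothetical manifest name
    manifest.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record
```

Comparing the recorded digest with one computed independently on the intended input shows whether the results came from the requested file or from a built-in path.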
Future package versions from public repositories could behave differently or break the workflow.
The dependency installs are expected for this bioinformatics workflow, but versions are not pinned and the registry metadata reports no install spec, so dependency provenance and reproducibility are weaker.
install:
  - id: cd-hit
    kind: conda
    package: cd-hit
    channel: bioconda
  - id: biopython
    kind: pip
    package: biopython
Install dependencies in a dedicated conda/virtual environment and prefer pinned, reviewed package versions.
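A small check can flag unpinned entries before installation. The sketch below is illustrative only and covers pip-style requirement lines, where an exact pin is written `name==version`; conda specs use a different syntax and are not handled. The function name `find_unpinned` is hypothetical, and the `X.Y` version in the demonstration is a placeholder, not a recommended release.

```python
import re

# Matches "name==version", optionally with extras such as "pkg[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+$")


def find_unpinned(requirement_lines):
    """Return requirement entries that lack an exact '==' version pin."""
    unpinned = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if line and not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# "X.Y" is a placeholder version used only to show the pinned form.
print(find_unpinned(["biopython", "biopython==X.Y"]))
```

Entries returned by the function are the ones to review and pin before the environment is built.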
