v1.0.1

Data Construction Skill

Benign: ClawScan verdict for this skill. Analyzed May 1, 2026, 3:52 AM.

Analysis

This appears to be a local markdown-to-training-data workflow, with the main caution that it can copy source document content into persistent output files.

Guidance: This skill is reasonable to install if you want to generate training data from local markdown documents. Before using it, choose the input directory carefully, exclude secrets and any documents you are not allowed to reuse, and review the generated chunk and supervision files before sharing or training on them.

Findings (2)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Abnormal behavior control

Checks for instructions or behavior that redirect the agent, misuse tools, execute unexpected code, cascade across systems, exploit user trust, or continue outside the intended task.

Tool Misuse and Exploitation
Severity: Low, Confidence: High, Status: Note
SKILL.md
scripts/build_manifest.py <input_dir> --output work/manifest.jsonl
scripts/split_markdown_book.py <input_md> --output work/chunks/<name>.chunks.jsonl --source-root <input_dir>

The workflow expects local helper scripts to read user-selected markdown inputs and write work artifacts. This is central to the stated purpose and not hidden, but users should run it only on intended paths.

User impact: If pointed at a broad directory, the workflow may process more markdown files than intended and create local output files from them.
Recommendation: Use a narrow input directory and a dedicated work output directory; review commands and generated files before sharing or using the dataset.
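One way to act on this recommendation is to preview which files an input directory contains before running the workflow on it. The sketch below is not part of the skill: `preview_markdown_inputs` and its `max_files` threshold are hypothetical, and it assumes inputs are `.md` files discovered recursively, which may not match how `scripts/build_manifest.py` actually selects files.

```python
from pathlib import Path


def preview_markdown_inputs(input_dir, max_files=50):
    """List the .md files under input_dir (recursively), sorted by path.

    Hypothetical helper, not part of the skill. Raises ValueError when
    more than max_files are found, as a guard against pointing the
    workflow at an overly broad directory.
    """
    root = Path(input_dir)
    if not root.is_dir():
        raise ValueError(f"not a directory: {root}")

    # Assumption: inputs are files with a .md suffix, found recursively.
    files = sorted(p for p in root.rglob("*.md") if p.is_file())

    if len(files) > max_files:
        raise ValueError(
            f"{len(files)} markdown files found under {root}; "
            f"expected at most {max_files}. Choose a narrower directory."
        )
    return files
```

Calling `preview_markdown_inputs("docs/book")` and reading the returned paths before invoking the skill's scripts gives a quick check that only the intended documents are in scope.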
Sensitive data protection

Checks for exposed credentials, poisoned memory or context, unclear communication boundaries, or sensitive data that could leave the user's control.

Memory and Context Poisoning
Severity: Low, Confidence: High, Status: Note
scripts/split_markdown_book.py
'text': text,

Each chunk record includes the source markdown text, so the local work directory can retain copies of input document content.

User impact: Private or proprietary markdown supplied as input may be copied into chunk files and reflected in generated training data.
Recommendation: Process only documents you intend to convert, review generated chunks and supervision records before sharing or training on them, and delete work artifacts when no longer needed.
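Reviewing generated chunks can be partly automated. The following is a minimal sketch, assuming each line of a `.chunks.jsonl` file is a JSON object whose `text` field is a string, as the excerpt above suggests. The function `flag_suspicious_chunks` and the three patterns are hypothetical illustrations, not an exhaustive secret detector and not part of the skill.

```python
import json
import re
from pathlib import Path

# Illustrative patterns only (assumptions); real reviews need broader checks.
SUSPICIOUS_PATTERNS = {
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def flag_suspicious_chunks(chunks_path):
    """Return (line_number, pattern_name) pairs for matching chunk records.

    Assumes one JSON object per line with the chunk body under "text".
    """
    hits = []
    with Path(chunks_path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            text = record.get("text", "")
            for name, pattern in SUSPICIOUS_PATTERNS.items():
                if pattern.search(text):
                    hits.append((line_number, name))
    return hits
```

An empty result does not mean the chunks are safe to share; it only means none of these few patterns matched, so a manual read of the files is still worthwhile.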