Doc-to-LoRA

Security checks across static analysis, malware telemetry, and agentic risk

Overview

The skill's behavior matches its stated document-internalization purpose, but it loads model checkpoints in an unsafe way, and its setup steps run downloaded or unreviewed code that should be audited before use.

Install only if you are comfortable auditing the surrounding doc-to-lora repository and setup helper. Use a least-privilege HuggingFace token, pin and verify model/checkpoint sources before loading them, and treat generated adapters as sensitive copies of the source document.

Static analysis

Dynamic code execution

Critical

Finding: Dynamic code execution detected.

VirusTotal

VirusTotal findings are pending for this skill version.

Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

#ASI05: Unexpected Code Execution

High

What this means

A malicious or compromised checkpoint could run code on the user's machine under the user's account.

Why it was flagged

`torch.load` with `weights_only=False` can execute Python pickle payloads during checkpoint loading, and the checkpoint path is user-selectable while the default is downloaded from an external model repository.

Skill content

parser.add_argument("--checkpoint", default="trained_d2l/gemma_demo/checkpoint-80000/pytorch_model.bin", ...)
...
state_dict = torch.load(checkpoint_path, map_location="cpu", weights_only=False)

Recommendation

Only use checkpoints from a verified, pinned source; prefer safetensors or `weights_only=True` where possible; verify hashes/revisions; and require explicit user approval before loading any non-default checkpoint.
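The recommendation above can be sketched as a small loader that checks a checkpoint's SHA-256 digest before deserializing it and then passes `weights_only=True` to `torch.load`. This is a minimal sketch, not the skill's actual code: the function names `sha256_of` and `load_verified_checkpoint` are hypothetical, and the expected digest is assumed to come from a source the user trusts. A matching hash only pins the file contents; it does not make an unsafe pickle safe, which is why the restricted loader is still used.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_verified_checkpoint(path, expected_sha256):
    """Hypothetical helper: refuse to load unless the digest matches.

    expected_sha256 is assumed to be published by a trusted source.
    """
    path = Path(path)
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"checkpoint digest mismatch for {path}: got {actual}"
        )
    # Imported lazily so the hash check works without PyTorch installed.
    import torch

    # weights_only=True restricts unpickling to tensors and plain containers.
    return torch.load(path, map_location="cpu", weights_only=True)
```

If a checkpoint fails to load with `weights_only=True`, that is a signal to inspect it or convert it to safetensors from a trusted copy, rather than to fall back to `weights_only=False`.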

#ASI04: Agentic Supply Chain Vulnerabilities

Medium

What this means

Setup could execute or install code that was not reviewed in these artifacts, and future upstream package/model changes could affect what runs locally.

Why it was flagged

The setup script executes a repo-local helper not included in the supplied manifest, installs some packages without version pins despite the pinned-version claim, and downloads external weights without a pinned revision in the command.

Skill content

# All installations use uv pip with pinned package versions.
...
bash install_mac.sh
...
uv pip install mlx mlx-lm safetensors 2>/dev/null || true
...
uv run huggingface-cli download SakanaAI/doc-to-lora --local-dir trained_d2l

Recommendation

Include or audit the external helper, pin package versions and HuggingFace revisions, verify hashes, and make the setup provenance clear before users run it.
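As an illustration of the pinning advice, the sketch below flags package specifiers that lack an exact `==` version and builds a download command that carries an explicit `--revision`. The helper names are hypothetical, the version `0.4.3` in the usage note is a placeholder rather than a recommended release, and the commit hash must be supplied by the user after checking the upstream repository. The regular expression only recognizes simple `name==version` specifiers.

```python
import re

# Matches simple "name==version" or "name[extra]==version" specifiers only.
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?==[^=\s]+$")
_COMMIT = re.compile(r"[0-9a-f]{40}")


def unpinned_packages(specs):
    """Return the specifiers that do not pin an exact version."""
    return [s for s in specs if not _PINNED.match(s.strip())]


def pinned_download_command(repo_id, revision, local_dir):
    """Build a huggingface-cli download command pinned to one commit.

    Branch names such as "main" are rejected because they can move.
    """
    if not _COMMIT.fullmatch(revision):
        raise ValueError("revision should be a full 40-character commit hash")
    return [
        "uv", "run", "huggingface-cli", "download", repo_id,
        "--revision", revision,
        "--local-dir", local_dir,
    ]
```

For example, `unpinned_packages(["mlx", "mlx-lm", "safetensors==0.4.3"])` returns the first two entries, which mirrors the unpinned `uv pip install mlx mlx-lm safetensors` line quoted above.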

#ASI03: Identity and Privilege Abuse

Low

What this means

The HuggingFace token may grant access to the user's account resources and should be handled as a secret.

Why it was flagged

The skill requires a HuggingFace token for gated model access, which is sensitive but disclosed and aligned with the stated model-download purpose.

Skill content

HF_TOKEN env var with Gemma model access ... The scripts only pass it to `huggingface-cli download` and `transformers` model loading. It is not sent anywhere else.

Recommendation

Use a least-privilege HuggingFace token, avoid sharing logs or shells that expose it, and revoke or rotate it if it may have been exposed.
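One way to reduce accidental exposure in shared logs is to redact the token value before any text is printed or saved. The sketch below assumes the token lives in the `HF_TOKEN` environment variable, as the skill content states; the helper names and the `[REDACTED]` placeholder are hypothetical. It only covers text that passes through these helpers and does not protect shell history or process listings.

```python
import os


def redact_secret(text, secret, placeholder="[REDACTED]"):
    """Replace every occurrence of secret in text; no-op if secret is empty."""
    if not secret:
        return text
    return text.replace(secret, placeholder)


def safe_log_line(text, environ=None):
    """Redact the HF_TOKEN value (if set) from a line before logging it."""
    environ = os.environ if environ is None else environ
    return redact_secret(text, environ.get("HF_TOKEN", ""))
```

Passing `environ` explicitly makes the helper easy to exercise without touching the real environment.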

#ASI06: Memory and Context Poisoning

Low

What this means

Adapters or generated model state may reveal or reproduce information from sensitive documents if shared or reused.

Why it was flagged

The skill's purpose is to encode document information into model/adapter state, which can persist beyond the original document prompt.

Skill content

Internalize any document into a small model's weights ... The model "knows" the document.

Recommendation

Treat generated adapters, checkpoints, and JSON outputs as sensitive when the source document is sensitive, and delete them when no longer needed.
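A cleanup step along these lines could be scripted as below. This is a sketch under assumptions: the report does not say where the skill writes its outputs, so the output directory and the file suffixes are hypothetical and should be adjusted to the actual layout. `Path.unlink` removes the directory entry only; it is not a secure wipe of the underlying storage.

```python
from pathlib import Path


def remove_generated_artifacts(
    output_dir, suffixes=(".safetensors", ".bin", ".json")
):
    """Delete files under output_dir whose suffix is in suffixes.

    Returns the list of removed paths so the caller can review them.
    """
    removed = []
    # sorted() materializes the listing before anything is deleted.
    for path in sorted(Path(output_dir).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            path.unlink()
            removed.append(path)
    return removed
```

Returning the removed paths, instead of deleting silently, leaves a record of which adapter or output files existed for a given sensitive document.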