PyTorch
Avoid common PyTorch mistakes — train/eval mode, gradient leaks, device mismatches, and checkpoint gotchas.
MIT-0 · Free to use, modify, and redistribute. No attribution required.
by Iván (@ivangdavila)
Security Scan
OpenClaw
Benign (high confidence)
Purpose & Capability
Name/description (PyTorch best-practices) align with the content of SKILL.md. The only declared binary dependency is python3, which is reasonable for a PyTorch-focused skill. The metadata and runtime instructions all focus on model training/inference pitfalls and do not request unrelated capabilities.
Instruction Scope
SKILL.md contains static guidance (train/eval, gradient control, device management, saving/loading, etc.). It does not instruct the agent to run system commands, read files, access environment variables, or transmit data to external endpoints. No scope creep detected.
Install Mechanism
No install spec and no code files are present (instruction-only). That is the lowest-risk model — nothing is downloaded or written to disk by the skill itself.
Credentials
The skill declares no environment variables, no credentials, and no config paths. This is proportionate to an advisory/reference skill that only provides textual recommendations.
Persistence & Privilege
The `always` flag is false and the skill is user-invocable; it does not request permanent presence or elevated privileges, and it does not modify other skills. Autonomous invocation is allowed by platform defaults, but there is no indication this skill requires special persistence.
Assessment
This skill is a read-only cheat sheet for common PyTorch mistakes and appears safe: it doesn't ask for credentials, install code, or read files. Because it is instruction-only, it returns text guidance and will not execute code by itself. Note that the skill's source/homepage is unknown; if you need higher assurance, prefer skills from a known publisher or ones that link to an official homepage. Also remember this is guidance only — the agent still needs a working PyTorch environment to run real code, and you should avoid pasting secrets or private data into prompts when asking for debugging help.

Like a lobster shell, security has layers — review code before you run it.
Current version: v1.0.0
Runtime requirements
🔥 Clawdis
OS: Linux · macOS · Windows
Bins: python3
SKILL.md
Train vs Eval Mode
- `model.train()` enables dropout and BatchNorm updates — the default after init
- `model.eval()` disables dropout and uses running stats — MUST be called for inference
- Mode is sticky — train/eval persists until explicitly changed
- `model.eval()` doesn't disable gradients — you still need `torch.no_grad()`
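A minimal sketch of the inference pattern above (the layer sizes are arbitrary): `eval()` flips the sticky mode flag, but only `torch.no_grad()` stops graph construction.

```python
import torch
import torch.nn as nn

# Tiny model with dropout; sizes are illustrative only.
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 2))

model.eval()                   # disables dropout; mode persists until changed
with torch.no_grad():          # eval() alone does NOT stop gradient tracking
    out = model(torch.randn(1, 4))
```

Without the `no_grad()` context, `out.requires_grad` would still be True even in eval mode.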
Gradient Control
- `torch.no_grad()` for inference — reduces memory, speeds up computation
- `loss.backward()` accumulates gradients — call `optimizer.zero_grad()` before backward
- `zero_grad()` placement matters — before the forward pass, not after backward
- `.detach()` to stop gradient flow — prevents memory leaks when logging
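A sketch of a training step that follows these rules (the model, data, and learning rate are placeholders): grads are zeroed before the forward/backward pass, and the logged loss is detached.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 4), torch.randn(8, 1)

history = []
for _ in range(3):
    opt.zero_grad()                       # clear grads BEFORE forward/backward
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()                       # accumulates into parameter .grad
    opt.step()
    history.append(loss.detach().item())  # detach so logging keeps no graph alive
```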
Device Management
- Model AND data must be on the same device — `model.to(device)` and `tensor.to(device)`
- `.cuda()` vs `.to('cuda')` — both work; `.to(device)` is more flexible
- CUDA tensors can't convert to numpy directly — `.cpu().numpy()` required
- `torch.device('cuda' if torch.cuda.is_available() else 'cpu')` — portable code
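The portable-device pattern in one place (a sketch; the model shape is arbitrary). On a CUDA machine everything lands on the GPU; elsewhere it falls back to CPU, and `.cpu()` is a no-op there.

```python
import torch

# Portable selection: falls back to CPU when CUDA is absent.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = torch.nn.Linear(4, 2).to(device)   # model and data on the same device
x = torch.randn(3, 4).to(device)
out = model(x)

arr = out.detach().cpu().numpy()           # CUDA tensors must hit CPU before numpy
```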
DataLoader
- `num_workers > 0` uses multiprocessing — Windows needs `if __name__ == '__main__':`
- `pin_memory=True` with CUDA — faster host-to-GPU transfer
- Workers don't share state — random seeds differ per worker; set them in `worker_init_fn`
- Large `num_workers` can cause memory issues — start with 2-4, increase if CPU-bound
Saving and Loading
- `torch.save(model.state_dict(), path)` — recommended; saves only the weights
- Loading: create the model first, then `model.load_state_dict(torch.load(path))`
- `map_location` for cross-device loading — `torch.load(path, map_location='cpu')` if saved on GPU
- Saving the whole model pickles the code path — breaks if the code changes
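The full save/restore round trip as a sketch (the temp path and layer sizes are placeholders): save the state dict, recreate the architecture, then load with `map_location` so a GPU checkpoint still opens on a CPU-only box.

```python
import os
import tempfile
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
path = os.path.join(tempfile.mkdtemp(), 'model.pt')

torch.save(model.state_dict(), path)          # weights only, no pickled code path

restored = nn.Linear(4, 2)                    # recreate the architecture first
state = torch.load(path, map_location='cpu')  # safe even if saved on a GPU
restored.load_state_dict(state)

weights_match = torch.equal(model.weight, restored.weight)
```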
In-place Operations
- In-place ops end with `_` — `tensor.add_(1)` vs `tensor.add(1)`
- In-place on a leaf variable breaks autograd — raises an error about a modified leaf
- In-place on an intermediate can corrupt gradients — avoid inside the computation graph
- `tensor.data` bypasses autograd — legacy; prefer `.detach()` for safety
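A small sketch of these in-place rules: the `_` suffix mutates in place, autograd rejects in-place edits on a leaf that requires grad, and `.detach()` gives a safe untracked view.

```python
import torch

t = torch.zeros(3)
t.add_(1)                       # trailing underscore: modifies t in place
u = t.add(1)                    # out-of-place: returns a new tensor

x = torch.ones(3, requires_grad=True)
try:
    x.add_(1)                   # in-place on a leaf that requires grad
except RuntimeError as exc:
    leaf_error = str(exc)       # autograd raises rather than corrupt the graph

y = x.detach()                  # untracked view; prefer this over x.data
```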
Memory Management
- Accumulated tensors leak memory — `.detach()` metrics before logging them
- `torch.cuda.empty_cache()` releases cached memory — but doesn't fix leaks
- Delete references and call `gc.collect()` — before `empty_cache()` if needed
- `with torch.no_grad():` prevents graph storage — crucial for validation loops
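A leak-free validation loop as a sketch (model and data are placeholders): the whole loop runs under `no_grad()` so no graphs are stored, and `.item()` converts each loss to a plain float so no tensors accumulate across iterations.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
batches = [(torch.randn(8, 4), torch.randint(0, 2, (8,))) for _ in range(3)]

model.eval()
total = 0.0
with torch.no_grad():           # no graph is stored for these forward passes
    for x, y in batches:
        loss = nn.functional.cross_entropy(model(x), y)
        total += loss.item()    # float accumulator; no tensor references kept
```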
Common Mistakes
- BatchNorm with `batch_size=1` fails in train mode — use eval mode or `track_running_stats=False`
- Loss reduction defaults to `'mean'` — you may want `'sum'` for gradient accumulation
- `cross_entropy` expects logits — not softmax output
- `.item()` to get a Python scalar — `.numpy()` or `[0]` is deprecated or errors
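A sketch of the logits-vs-softmax mistake (random inputs, so only the pattern matters): `cross_entropy` applies `log_softmax` internally, so feeding it softmax output runs without error but computes a different, wrong loss.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(5, 3)
targets = torch.randint(0, 3, (5,))

# Correct: cross_entropy applies log_softmax internally, pass raw logits.
right = F.cross_entropy(logits, targets)

# Wrong: softmax-ing first "works" silently but double-squashes the scores.
wrong = F.cross_entropy(logits.softmax(dim=1), targets)

val = right.item()              # Python scalar; not .numpy() or [0]
```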