Install
openclaw skills install docker-pilot
Safe, intelligent Docker container management: fleet status, lifecycle operations, cleanup, compose stacks, troubleshooting, and security hardening. Not just a command reference, but an operational guide that classifies risk, protects critical services, and formats output for chat.
Use when the task involves Docker, Dockerfiles, containers, images, Compose, volumes, networking, debugging, or any container lifecycle operation. This is the default Docker skill — apply it whenever Docker work appears.
This skill extends the existing ClawHub docker skill (v1.0.4 by ivangdavila). Install both for full coverage:
- clawhub install docker (Dockerfile patterns, image building, security hardening reference)
- clawhub install docker-pilot (operational management, safety rails, fleet view, troubleshooting)
Every Docker command is classified by risk level. Follow these rules without exception.
No side effects. Use freely.
docker ps # Running containers
docker ps -a # All containers (including stopped)
docker ps --format '{{json .}}' # JSON output (parseable)
docker images # All images
docker images --filter "dangling=true" # Dangling images only
docker system df # Disk usage overview
docker system df -v # Detailed disk usage
docker logs --tail 50 CONTAINER # Recent logs
docker logs --since 1h CONTAINER # Last hour of logs
docker inspect CONTAINER # Full container config (JSON)
docker stats --no-stream # Resource snapshot (not streaming)
docker network ls # List networks
docker network inspect NETWORK # Network details
docker volume ls # List volumes
docker volume inspect VOLUME # Volume details
docker history IMAGE # Image layer history
docker diff CONTAINER # Filesystem changes in container
docker port CONTAINER # Port mappings
docker top CONTAINER # Processes in container
docker events --since 1h # Recent daemon events
Parsing tip: docker ps --format '{{json .}}' prints one JSON object per line, so parse it line by line (python3 -m json.tool --json-lines on Python 3.8+, or json.loads on each line). docker inspect returns a JSON array, so always index [0].
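The two output shapes mentioned in the parsing tip can be handled with a few lines of Python. This is a minimal sketch: the helper names `parse_json_lines` and `first_inspect_entry` are hypothetical, and the sample strings stand in for real command output.

```python
import json


def parse_json_lines(output: str) -> list[dict]:
    """Parse `docker ps --format '{{json .}}'` output: one JSON object per line."""
    return [json.loads(line) for line in output.splitlines() if line.strip()]


def first_inspect_entry(output: str) -> dict:
    """`docker inspect` prints a JSON array, even for a single container."""
    return json.loads(output)[0]


# Illustrative sample text, not captured from a real daemon.
ps_output = '{"Names":"web","Status":"Up 2 hours"}\n{"Names":"db","Status":"Up 3 days"}\n'
inspect_output = '[{"Name":"/web","State":{"Status":"running"}}]'

names = [c["Names"] for c in parse_json_lines(ps_output)]
state = first_inspect_entry(inspect_output)["State"]["Status"]
print(names, state)
```

In practice the input would come from something like `subprocess.run(["docker", "ps", "--format", "{{json .}}"], capture_output=True, text=True).stdout`.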
Requires showing the user what will change before executing.
docker stop CONTAINER # Cuts service — show uptime first
docker start CONTAINER # Starts stopped container
docker restart CONTAINER # Brief outage — confirm first
docker pull IMAGE # Network + disk usage — check free space
docker tag SOURCE TARGET # Namespace change — confirm intended tag
docker network create/connect # Topology change — check port conflicts
docker volume create # Low risk; confirm the volume name before mounting it
docker update --restart=always # Changes restart behavior — good practice
docker container rename # May break scripts — check dependencies
docker compose up -d # Starts/modifies stack — show diff first
docker compose stop # Stops stack — show what's running
docker compose restart # Restarts stack — brief outage
Rule: Before any 🟡 command, show:
NEVER run without:
- No chained bulk deletes (docker rm $(docker ps -aq) is FORBIDDEN)
docker rm CONTAINER # Deletes container; check volumes and networks first
docker rmi IMAGE # Deletes image — check dependent containers
docker volume rm VOLUME # DATA LOSS — show contents, confirm twice
docker system prune # Removes stopped containers + dangling images
docker system prune -a # Removes ALL unused images — full audit required
docker system prune --volumes # Removes unused volumes — DATA LOSS
docker compose down -v # Destroys volumes — triple confirm
docker network rm NETWORK # Breaks attached containers — show list
docker rm -f CONTAINER # Force-remove running container — dangerous
docker exec CONTAINER rm -rf / # Destructive inside container — catch pattern
docker swarm leave --force # Dissolves swarm — catastrophic
Confirmation pattern:
⚠️ DESTRUCTIVE OPERATION
Will remove: [list items]
Impact: [data loss / service disruption / etc.]
Type "confirm" to proceed:
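The risk tiers and the confirmation pattern can be sketched as a small classifier. This is an illustration under stated assumptions, not the skill's actual implementation. The function names `classify_risk` and `format_confirmation` are hypothetical, the tables only mirror the commands listed above, and anything not listed is reported as "unknown" so the caller can ask for confirmation.

```python
import shlex

# Subcommand prefixes taken from the command lists above.
SAFE = {("ps",), ("images",), ("system", "df"), ("logs",), ("inspect",), ("stats",),
        ("network", "ls"), ("network", "inspect"), ("volume", "ls"),
        ("volume", "inspect"), ("history",), ("diff",), ("port",), ("top",), ("events",)}
MODERATE = {("stop",), ("start",), ("restart",), ("pull",), ("tag",),
            ("network", "create"), ("network", "connect"), ("volume", "create"),
            ("update",), ("container", "rename"),
            ("compose", "up"), ("compose", "stop"), ("compose", "restart")}
DESTRUCTIVE = {("rm",), ("rmi",), ("volume", "rm"), ("system", "prune"),
               ("network", "rm"), ("swarm", "leave")}


def classify_risk(command: str) -> str:
    """Return 'safe', 'moderate', 'destructive', or 'unknown' for a docker command."""
    # Command substitution means chaining, which the rules forbid outright.
    if "$(" in command or "`" in command:
        return "destructive"
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "unknown"
    if len(tokens) < 2 or tokens[0] != "docker":
        return "unknown"
    args = tokens[1:]
    # Special cases from the destructive list.
    if args[0] == "exec" and "rm" in args:
        return "destructive"
    if args[:2] == ["compose", "down"] and ("-v" in args or "--volumes" in args):
        return "destructive"
    # Try the two-word prefix first ("system df"), then the one-word prefix ("ps").
    for length in (2, 1):
        prefix = tuple(args[:length])
        for level, table in (("destructive", DESTRUCTIVE),
                             ("moderate", MODERATE), ("safe", SAFE)):
            if prefix in table:
                return level
    return "unknown"


def format_confirmation(items: list[str], impact: str) -> str:
    """Render the confirmation prompt shown before a destructive operation."""
    return "\n".join([
        "⚠️ DESTRUCTIVE OPERATION",
        "Will remove: " + ", ".join(items),
        "Impact: " + impact,
        'Type "confirm" to proceed:',
    ])


print(classify_risk("docker system prune -a"))
print(format_confirmation(["volume app-data"], "data loss"))
```

Matching on subcommand prefixes keeps the tables readable. A real gate would probably also need to consider flags such as `-f` and commands wrapped in shell scripts.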
Some services are critical infrastructure. Never stop, restart, or remove these without explicit override:
# Default protected services (customize per deployment)
protected_services:
- adguardhome # DNS for entire network — stopping breaks internet
- unbound # DNS resolver
- nginx # Reverse proxy — stopping breaks all web services
- traefik # Reverse proxy
- pihole # DNS/ad-blocking
Rule: Before stopping a protected service, check DNS fallback:
# Verify host has alternative DNS
cat /etc/resolv.conf | grep -v adguard | grep nameserver
# If no fallback — WARN USER: "Stopping this will break DNS resolution"
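The fallback check can also be done by parsing resolv.conf directly rather than grepping. The sketch below assumes the protected DNS container answers on a known local address (127.0.0.1 in the sample). The names `fallback_nameservers` and `stop_warning` are hypothetical.

```python
def fallback_nameservers(resolv_conf: str, local_dns: set[str]) -> list[str]:
    """Return nameserver addresses that are NOT served by the local DNS container."""
    servers = []
    for line in resolv_conf.splitlines():
        parts = line.split()
        # Only real "nameserver <addr>" lines count; comments and options are skipped.
        if len(parts) >= 2 and parts[0] == "nameserver" and parts[1] not in local_dns:
            servers.append(parts[1])
    return servers


def stop_warning(service: str, protected: set[str],
                 resolv_conf: str, local_dns: set[str]):
    """Return a warning string for protected services, or None if no warning applies."""
    if service not in protected:
        return None
    if fallback_nameservers(resolv_conf, local_dns):
        return f"{service} is protected: explicit override required"
    return (f"{service} is protected and no fallback DNS was found: "
            "stopping this will break DNS resolution")


# Illustrative resolv.conf content.
sample = "# generated file\nnameserver 127.0.0.1\nnameserver 1.1.1.1\n"
print(fallback_nameservers(sample, {"127.0.0.1"}))
print(stop_warning("adguardhome", {"adguardhome"}, "nameserver 127.0.0.1\n", {"127.0.0.1"}))
```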
The primary interface for understanding what's running. Use this format for all status reports in chat:
🐳 Docker Fleet — 5 containers
🟢 adguardhome Up 4 days 43MB DNS/ad-blocking [PROTECTED]
🟢 buck-dashboard Up 8 days 120MB System dashboard
🟢 verdaccio Up 21 days 58MB NPM registry
🟢 mockserver Up 21 days 42MB API mocking
🟢 gitbox Up 21 days 35MB Git server
📦 Images: 45 total (37 dangling, ~3GB reclaimable)
💾 Disk: 68GB/233GB used (31%)
🔧 Compose: NOT INSTALLED
# Container status and image
docker ps --format 'table {{.Names}}\t{{.Status}}\t{{.Image}}'
# Resource usage snapshot
docker stats --no-stream --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}'
# Image count (-q omits the header line) and dangling count
docker images -q | wc -l
docker images --filter "dangling=true" -q | wc -l
# Disk usage
docker system df
# Check if compose is installed
docker compose version 2>/dev/null || docker-compose version 2>/dev/null || echo "NOT INSTALLED"
Map container names to functional roles. Maintain this in a local config:
# ~/.openclaw/workspace/docker-pilot/services.yaml (create if needed)
services:
  adguardhome:
    role: "DNS/ad-blocking"
    critical: true
    protected: true
    port: 53
    network: host
  buck-dashboard:
    role: "System dashboard"
    critical: false
    port: 8080
    network: bridge
  verdaccio:
    role: "NPM registry"
    critical: false
    port: 4873
    network: bridge
  mockserver:
    role: "API mocking"
    critical: false
    port: 1080
    network: bridge
  gitbox:
    role: "Git server"
    critical: false
    port: 8081
    network: bridge
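Once the role mapping is loaded, it can drive the fleet rows shown earlier. The sketch below assumes the YAML has already been parsed into a plain dict (for example with a YAML library) and that container records carry `name`, `status`, and `memory` keys. Those key names and the function `render_rows` are hypothetical.

```python
def render_rows(containers: list[dict], services: dict) -> str:
    """Render one status row per container, tagging protected services."""
    rows = []
    for c in containers:
        meta = services.get(c["name"], {})
        # Green for running containers, red for anything else (exited, created...).
        icon = "🟢" if c["status"].startswith("Up") else "🔴"
        row = f'{icon} {c["name"]} {c["status"]} {c["memory"]} {meta.get("role", "unknown role")}'
        if meta.get("protected"):
            row += " [PROTECTED]"
        rows.append(row)
    return "\n".join(rows)


# A subset of the mapping above, as it would look after parsing.
services = {
    "adguardhome": {"role": "DNS/ad-blocking", "critical": True, "protected": True},
    "verdaccio": {"role": "NPM registry", "critical": False},
}
containers = [
    {"name": "adguardhome", "status": "Up 4 days", "memory": "43MB"},
    {"name": "verdaccio", "status": "Exited (1) 2 minutes ago", "memory": "0MB"},
]
report = render_rows(containers, services)
print(report)
```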
If docker compose is not installed, install it first:
# Check current status
docker compose version 2>/dev/null || echo "NOT INSTALLED"
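# Install compose plugin (no daemon restart needed)
# Ubuntu's own docker.io packages use docker-compose-v2; Docker's official apt repo uses docker-compose-plugin
sudo apt install docker-compose-v2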
# Verify
docker compose version
Why compose matters: Without compose, every container is a docker run command with 10+ flags that must be memorized or scripted. Compose gives you declarative, version-controlled, reproducible deployments.
Run this when disk usage is high or when docker system df shows bloat.
# Show what's reclaimable
docker system df
# Dangling images (tagged <none>)
docker images --filter "dangling=true"
# Stopped containers
docker ps --filter "status=exited" --filter "status=created"
# User-defined networks (pruning only removes those with no containers attached)
docker network ls --filter "type=custom"
# Unused volumes
docker volume ls --filter "dangling=true"
# Build cache size (the summary view has a "Build Cache" row)
docker system df | grep "Build Cache"
# Remove dangling images (no running container uses them)
docker image prune
# Remove stopped containers
docker container prune
# Remove unused networks
docker network prune
# Remove build cache
docker builder prune
# Remove ALL unused images (not just dangling)
docker image prune -a
# ⚠️ CONFIRM: "This removes every image not used by a container (running or stopped). Next pull will re-download."
# Remove unused volumes (DATA LOSS RISK)
docker volume prune
# ⚠️ CONFIRM: "This deletes volume data. Show volume contents first."
# Before: docker volume inspect VOLUME_NAME
# Show contents: docker run --rm -v VOLUME_NAME:/mnt alpine ls -la /mnt
# Nuclear option
docker system prune -a --volumes
# ⚠️ DOUBLE CONFIRM: "This removes everything not used by a running container including volumes."
docker system df
docker ps
docker images
# Check if container has a health check
docker inspect --format='{{json .Config.Healthcheck}}' CONTAINER
# Add a health check (docker update cannot change health settings; recreate the container)
docker run -d --name CONTAINER \
--health-cmd="curl -f http://localhost:8080/ || exit 1" \
--health-interval=30s \
--health-timeout=5s \
--health-retries=3 \
IMAGE
# HTTP endpoint
curl -f http://localhost:PORT/ || exit 1
# TCP port
nc -z localhost PORT || exit 1
# DNS (for AdGuard)
dig +short google.com @localhost || exit 1
# Process check
pgrep -x PROCESS_NAME || exit 1
# Set restart policy (prevents manual restart after reboot)
docker update --restart=always CONTAINER
# Check current policy
docker inspect --format='{{.HostConfig.RestartPolicy.Name}}' CONTAINER
# Policies:
# no — Never restart (default)
# on-failure — Restart only on non-zero exit
# always — Always restart, including on daemon start
# unless-stopped — Always restart except when manually stopped
# Add log limits to existing container (requires recreate)
docker run --log-opt max-size=10m --log-opt max-file=3 ...
# Global daemon config: /etc/docker/daemon.json
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
}
}
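When /etc/docker/daemon.json already exists, the log settings should be merged in rather than overwriting the file. The sketch below shows one way to do that on the JSON text. The function name `with_log_rotation` is hypothetical, and the choice to leave existing values untouched is an assumption.

```python
import json


def with_log_rotation(existing_json: str, max_size: str = "10m", max_file: str = "3") -> str:
    """Return daemon.json text with log rotation defaults added, keeping existing keys."""
    config = json.loads(existing_json) if existing_json.strip() else {}
    # setdefault never overwrites a value the administrator already chose.
    config.setdefault("log-driver", "json-file")
    opts = config.setdefault("log-opts", {})
    opts.setdefault("max-size", max_size)
    opts.setdefault("max-file", max_file)
    return json.dumps(config, indent=2)


# Illustrative existing config with an unrelated setting.
merged = json.loads(with_log_rotation('{"dns": ["1.1.1.1"]}'))
print(merged)
```

The daemon has to be restarted to pick up the file, and the new defaults apply only to containers created afterwards. Existing containers keep their logging configuration until they are recreated.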
# Last 50 lines
docker logs --tail 50 CONTAINER
# Last hour
docker logs --since 1h CONTAINER
# Follow with timeout (don't leave streaming)
docker logs -f --since 5m CONTAINER & PID=$! ; sleep 30 ; kill $PID
# Search for errors
docker logs CONTAINER 2>&1 | grep -i "error\|exception\|fail\|fatal" | tail -20
# JSON log format (if container outputs one JSON object per line; needs Python 3.8+)
docker logs --since 1h CONTAINER | python3 -m json.tool --json-lines | grep "error"
# 1. Check exit code
docker inspect --format='{{.State.ExitCode}}' CONTAINER
# Common codes: 0=graceful, 1=app error, 125=docker error, 137=SIGKILL (often OOM; confirm in step 3), 139=segfault
# 2. Check logs
docker logs --tail 50 CONTAINER
# 3. Check if OOM killed
docker inspect --format='{{.State.OOMKilled}}' CONTAINER
# 4. Check resource limits
docker inspect --format='{{.HostConfig.Memory}}' CONTAINER
# 5. Try interactive debug
docker run --rm -it --entrypoint /bin/sh IMAGE
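Exit codes above 128 conventionally mean the process was killed by signal number (code - 128), which is where 137 and 139 come from. The sketch below decodes them on a POSIX host. The function name `explain_exit_code` is hypothetical, and the `oom_killed` flag is meant to be filled from State.OOMKilled.

```python
import signal

KNOWN_CODES = {0: "graceful exit", 1: "application error", 125: "docker error"}


def explain_exit_code(code: int, oom_killed: bool = False) -> str:
    """Translate a container exit code into a short human-readable reason."""
    if code in KNOWN_CODES:
        return KNOWN_CODES[code]
    if code > 128:
        number = code - 128
        try:
            name = signal.Signals(number).name
        except ValueError:
            # Not a signal number known on this platform.
            return f"killed by signal {number}"
        suffix = " (out of memory)" if oom_killed else ""
        return f"killed by {name}{suffix}"
    return "application-specific exit code"


print(explain_exit_code(137, oom_killed=True))
print(explain_exit_code(139))
```

Exit code 137 on its own only says SIGKILL was delivered; the OOM label is added only when the inspect output confirms it.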
# Find what's using a port
ss -tlnp | grep :PORT
# or
lsof -i :PORT
# Check if it's a Docker container
docker ps --filter "publish=PORT"
# Fix: change host port mapping or stop conflicting service
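A port can also be probed before choosing a host mapping by trying to bind it. This is a minimal sketch with a hypothetical helper name, `port_is_free`. It only detects listeners visible in the host's network namespace, so treat a "free" result as a hint rather than a guarantee.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error for ports < 1024.
            return False
        return True


# Demonstration: occupy an ephemeral port, then probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
listener.listen(1)
busy_port = listener.getsockname()[1]
while_listening = port_is_free(busy_port, "127.0.0.1")
listener.close()
after_close = port_is_free(busy_port, "127.0.0.1")
print(while_listening, after_close)
```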
# 1. Check Docker disk usage
docker system df -v
# 2. Check host disk
df -h /var/lib/docker
# 3. Quick reclaim (safe)
docker image prune
docker container prune
docker builder prune
# 4. If still full (confirm first!)
docker image prune -a # Remove ALL unused images
# 1. Check network
curl -I https://registry-1.docker.io/v2/
# 2. Check auth
docker login
# 3. Check rate limits (Docker Hub)
# Historically anonymous: 100 pulls/6hr, authenticated: 200 pulls/6hr; limits have changed over time, so check Docker Hub's current policy
# 4. Try specific digest instead of tag
docker pull image@sha256:DIGEST
# 1. See restart count
docker inspect --format='{{.RestartCount}}' CONTAINER
# 2. Read crash logs
docker logs --tail 100 CONTAINER
# 3. Common causes:
# - Missing env vars: look for "required" or "must set" in logs
# - File permissions: look for "permission denied"
# - Port conflict: look for "address already in use"
# - OOM: check docker inspect State.OOMKilled
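The log patterns listed above can be checked mechanically before a human reads the full log. The sketch below is illustrative: the function name `likely_causes` and the labels are hypothetical, and the patterns are just the phrases named in the list.

```python
import re

# (label, pattern) pairs built from the common causes listed above.
CAUSES = [
    ("missing environment variable", re.compile(r"required|must set", re.IGNORECASE)),
    ("file permissions", re.compile(r"permission denied", re.IGNORECASE)),
    ("port conflict", re.compile(r"address already in use", re.IGNORECASE)),
]


def likely_causes(logs: str, oom_killed: bool = False) -> list[str]:
    """Return the labels of every cause whose pattern appears in the logs."""
    found = [label for label, pattern in CAUSES if pattern.search(logs)]
    if oom_killed:  # taken from docker inspect State.OOMKilled, not from the logs
        found.append("out of memory")
    return found


# Illustrative log text.
sample_logs = "DATABASE_URL is required\nopen /data/db: Permission denied\n"
print(likely_causes(sample_logs, oom_killed=True))
```

Substring matches like "required" are noisy, so the result is better treated as a list of leads to check than as a diagnosis.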
# Containers can't reach each other
# Default bridge has NO DNS — use custom network
docker network create mynet
docker network connect mynet CONTAINER
# Container can't reach host
# Use host.docker.internal (Docker Desktop) or the host IP
# On Linux (Docker 20.10+): map the name to the host gateway when starting the container:
# docker run --add-host=host.docker.internal:host-gateway ...
# DNS not resolving in container
docker exec CONTAINER cat /etc/resolv.conf
docker exec CONTAINER nslookup google.com
# docker-compose.yml: declarative, version-controlled, reproducible
version: "3.8"
services:
  app:
    image: myapp:1.0
    restart: unless-stopped
    ports:
      - "8080:8080"
    environment:
      - NODE_ENV=production
    volumes:
      - app-data:/data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: "0.5"
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
volumes:
  app-data:
# Start stack
docker compose up -d
# View stack status
docker compose ps
# View logs
docker compose logs -f --tail 50
# Restart single service
docker compose restart app
# Pull and recreate (update)
docker compose pull && docker compose up -d
# Stop (keep data)
docker compose down
# Stop AND remove volumes (⚠️ DATA LOSS)
docker compose down -v
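- depends_on waits for container start, NOT service readiness; use condition: service_healthy
- .env file must be next to docker-compose.yml; in the wrong directory it is silently ignored
- docker compose run starts dependencies by default (pass --no-deps to skip them) but does NOT publish the service's ports unless --service-ports is given
# Run as non-root (always prefer this)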
docker run --user 1000:1000 ...
# Drop all capabilities, add only what's needed
docker run --cap-drop ALL --cap-add NET_BIND_SERVICE ...
# Read-only root filesystem
docker run --read-only --tmpfs /tmp ...
# Resource limits (always set these)
docker run -m 512m --cpus=0.5 ...
# No new privileges
docker run --security-opt=no-new-privileges ...
# Pin versions (never use :latest in production)
docker pull nginx:1.25.3-alpine
# Scan for vulnerabilities
docker scout cves IMAGE
# Verify image integrity
docker pull image@sha256:DIGEST
- docker run --privileged (disables ALL security)
- -v /:/host (mounts entire host filesystem)
- --pid=host (can see/kill host processes)
- --network=host on non-DNS containers (unnecessary exposure)
- Secrets in ENV or ARG (visible in docker inspect and docker history)
- docker rm $(docker ps -aq) (chained destructive command)
- docker system prune -a without audit first
# One-liner fleet health
docker ps --format 'table {{.Names}}\t{{.Status}}\t{{.Ports}}'
# Resource usage
docker stats --no-stream --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}'
# Per-container disk usage
docker system df -v
# Host resources
df -h /var/lib/docker
free -h
| Metric | Warning | Critical | Action |
|---|---|---|---|
| Disk usage | >80% | >90% | Run cleanup playbook |
| Memory | >80% | >95% | Add limits or restart heavy containers |
| Container restarts | >3/hour | >10/hour | Check logs, likely crash loop |
| Dangling images | >10 | >30 | Run image prune |
| Log file size | >100MB | >1GB | Add log rotation |
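The threshold table translates directly into a lookup. In the sketch below the metric keys and the function name `severity` are hypothetical, values must be strictly above a threshold to trigger it (matching the ">" in the table), and 1GB is taken as 1024MB.

```python
# metric -> (warning threshold, critical threshold), copied from the table above.
THRESHOLDS = {
    "disk_percent": (80, 90),
    "memory_percent": (80, 95),
    "restarts_per_hour": (3, 10),
    "dangling_images": (10, 30),
    "log_size_mb": (100, 1024),  # assumption: 1GB = 1024MB
}


def severity(metric: str, value: float) -> str:
    """Return 'ok', 'warning', or 'critical' for a measured value."""
    warning, critical = THRESHOLDS[metric]
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "ok"


# Values from the sample fleet report: 31% disk, 37 dangling images.
print(severity("disk_percent", 31), severity("dangling_images", 37))
```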
# ✅ GOOD — requirements rarely change, code changes often
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
# ❌ BAD — invalidates cache on every code change
COPY . .
RUN pip install -r requirements.txt
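# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies; the build step usually needs devDependencies
RUN npm ci
COPY . .
# Build, then drop devDependencies so only runtime packages are copied onward
RUN npm run build && npm prune --omit=dev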
# Production stage
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER node
EXPOSE 3000
CMD ["node", "dist/server.js"]
- --from=builder copies from wrong stage silently
- COPY . . before RUN npm install = cache invalidated on every code change
- ADD extracts archives automatically; use COPY unless you need extraction
- rm -rf /var/lib/apt/lists in separate RUN = space not reclaimed (layers)
- .git copied = megabytes of bloat; use .dockerignore
- ARG only available during build, visible in docker history; NEVER for secrets
- ENV persists at runtime; use for configuration
- ARG with empty override uses default, not empty string
- ARG must be re-declared after each FROM in multi-stage
When reporting Docker status in Telegram, use this format:
🐳 **Docker Fleet** — 5 running
🟢 **adguardhome** — DNS/ad-blocking [PROTECTED]
Up 4 days · 43MB RAM · :53
🟢 **buck-dashboard** — Dashboard
Up 8 days · 120MB RAM · :8080
🟢 **verdaccio** — NPM registry
Up 21 days · 58MB RAM · :4873
🟡 **mockserver** — API mocking
Up 21 days · 42MB RAM · :1080
🟢 **gitbox** — Git server
Up 21 days · 35MB RAM · :8081
📦 37 dangling images (3GB reclaimable)
💾 68GB/233GB disk (31%)
⚠️ **Container Alert**
🔴 **mockserver** — Exited (1) 2min ago
Last log: `Connection refused on port 1080`
Restart? (3 restarts in last hour)
🧹 **Docker Cleanup**
Removed:
- 12 dangling images (450MB)
- 3 stopped containers
- 1 unused network
Reclaimed: **1.2GB**
Current disk: 62GB/233GB (27%)
| Task | Command |
|---|---|
| Fleet status | docker ps --format 'table {{.Names}}\t{{.Status}}' |
| Resource usage | docker stats --no-stream |
| Disk usage | docker system df |
| Container logs | docker logs --tail 50 CONTAINER |
| Inspect JSON | docker inspect CONTAINER \| python3 -m json.tool |
| Find dangling | docker images --filter "dangling=true" -q \| wc -l |
| Safe cleanup | docker image prune && docker container prune && docker builder prune |
| Health check | docker inspect --format='{{.State.Health.Status}}' CONTAINER |
| Restart policy | docker update --restart=always CONTAINER |
| Compose up | docker compose up -d |
| Compose logs | docker compose logs -f --tail 50 |
When this skill is activated for the first time on a new machine:
- docker compose version (if missing, install it)
- docker ps -a + docker system df (understand current state)
- docker update --restart=unless-stopped for all running containers
Built on top of the docker skill by ivangdavila (v1.0.4). This skill adds: