Install
openclaw skills install ramalama-cli
Run and interact with AI agents.
Use when an alternative AI agent is better suited to a task: for example, working with sensitive data, solving simple tasks with a cheap, local model, or accessing specialist models with unique capabilities.
Use this skill to execute ramalama tasks in a consistent, low-risk workflow.
Prefer local discovery (--help, local config files, existing project scripts) before making assumptions about flags or runtime defaults.
Prefer ramalama when tasks need:
- local execution, e.g. for sensitive data
- containerized, isolated model runs
- models from multiple transports (hf://, oci://, rlcr://, url://)

Run these checks before first invocation in a session:
ramalama version
podman info >/dev/null 2>&1 || docker info >/dev/null 2>&1
ramalama run --help
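The engine checks above can be folded into one helper that reports which engine is usable. This is a sketch; `detect_engine` is a hypothetical name, not part of ramalama.

```shell
# Hypothetical helper: report the first working container engine, or "none".
detect_engine() {
  if command -v podman >/dev/null 2>&1 && podman info >/dev/null 2>&1; then
    echo podman
  elif command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    echo docker
  else
    echo none   # consider --nocontainer in this case
  fi
}

ENGINE=$(detect_engine)
# Pass it explicitly, e.g.: ramalama --engine "$ENGINE" run <model> "<prompt>"
```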
If serving on the default port, verify it is available:
lsof -i :8080
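Beyond a static lsof check, a short poll loop can confirm the server is actually answering before requests are sent. This sketch assumes the endpoint exposes /v1/models, as llama.cpp's OpenAI-compatible server does; `wait_ready` is a hypothetical helper name.

```shell
# Hypothetical readiness poll: wait until the served endpoint answers, or give up.
wait_ready() {
  url=$1 tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url/v1/models" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# wait_ready http://localhost:8080 30 && echo "server is up"
```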
Core commands:
- ramalama run <model> "<prompt>" (one-shot generation)
- ramalama run <model> (interactive chat)
- ramalama serve <model> (OpenAI-compatible server)
- ramalama chat --url <url> "<prompt>" (chat with an already-running endpoint)
- ramalama rag <paths...> <destination> (build a RAG data store)
- ramalama bench <model> and ramalama perplexity <model> (model evaluation)
- inspect, pull, push, convert, list, rm (model management)

Start with top-level discovery:
ramalama --help
ramalama version
Apply global options before the subcommand when needed:
ramalama [--debug|--quiet] [--dryrun] [--engine podman|docker] [--nocontainer] [--runtime llama.cpp|vllm|mlx] [--store <path>] <subcommand> ...
Use command-level help before invoking unknown flags:
ramalama <subcommand> --help
ramalama run granite3.3:2b "Summarize this in 3 bullets: <text>"
ramalama serve -d granite3.3:2b
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"granite3.3:2b","messages":[{"role":"user","content":"Hello"}]}'
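The curl payload can be generated by a tiny helper so automation does not hand-edit JSON. `chat_body` is a hypothetical name, and it performs no JSON escaping, so it only suits prompts without quotes or backslashes.

```shell
# Hypothetical helper: build an OpenAI-style chat request body.
# Note: no JSON escaping; prompts containing " or \ need a real JSON encoder.
chat_body() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' "$1" "$2"
}

# Usage against the endpoint started by `ramalama serve -d`:
# curl -s http://localhost:8080/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d "$(chat_body granite3.3:2b "Hello")"
```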
ramalama serve hf://unsloth/gemma-3-270m-it-GGUF
ramalama rag ./docs my-rag
ramalama run --rag my-rag granite3.3:2b "What are the auth requirements?"
ramalama bench granite3.3:2b
ramalama benchmarks list
For agent automation, prefer explicit and deterministic flags:
ramalama --engine podman run -c 4096 --pull missing granite3.3:2b "<prompt>"
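For repeatability, the explicit flags can live in one wrapper so every agent invocation is identical. `ramalama_agent_run` is a hypothetical name, shown here in dry-run form: it prints the command instead of executing it.

```shell
# Hypothetical wrapper: one place for the deterministic agent flags.
# Prints the command (dry-run style); replace echo with the real call to execute.
ramalama_agent_run() {
  model=$1; shift
  echo ramalama --engine podman run -c 4096 --pull missing "$model" "$@"
}

# ramalama_agent_run granite3.3:2b "<prompt>"
```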
Recommended defaults:
- Set --engine explicitly when the environment is mixed.
- Lower -c/--ctx-size on constrained hosts.
- Use --pull missing for faster repeat runs.

Troubleshooting:
- With --engine podman, check podman machine list and start the machine if needed.
- If the model timed out during startup: check podman logs <container>, reduce the context size (-c 4096), and retry.
- If the default port is busy, choose another with -p <port>.

Notes:
- serve exposes an OpenAI-compatible endpoint for external clients.
- Prefer JSON output (list --json, inspect --json) for robust parsing in automation.
- Use ramalama chat --url <endpoint> when the model is already served elsewhere.

To install ramalama itself via Homebrew:
brew install ramalama
podman logs <container>-c 4096) and retry-p <port>serve exposes an OpenAI-compatible endpoint for external clients.list --json, inspect --json) for robust parsing in automation.ramalama chat --url <endpoint> when the model is already served elsewhere.brew install ramalama