Homelab AI

Home lab AI: turn your spare machines into a local AI home lab cluster. LLM inference, image generation, speech-to-text, and embeddings across macOS, Linux, and Windows devices. Zero-config mDNS discovery, real-time dashboard, 7-signal scoring. No cloud, no Docker, no Kubernetes. The home lab AI setup that just works. A home lab AI cluster for local inference. An AI lab for local inference at home.

Install

openclaw skills install homelab-ai

Home Lab AI — Your Spare Machines Are a Cluster

You have machines sitting around your home lab. A mini PC in the closet. A workstation on the desk. Maybe a desktop doing light work. Together, they can hold more memory and compute than a typical single cloud instance; you just need software that treats them as one home lab system. Works on macOS, Linux, and Windows.

Ollama Herd turns your home lab into a local AI cluster. One home lab endpoint, zero config, four model types.

What your home lab gets

Device 1 (32GB)    ─┐
Device 2 (64GB)     ├──→  Home Lab Router (:11435)  ←──  Your apps / agents
Device 3 (256GB)   ─┘
  • Home lab LLM inference — Llama, Qwen, DeepSeek, Phi, Mistral, Gemma
  • Home lab image generation — Stable Diffusion 3, Flux, z-image-turbo
  • Home lab speech-to-text — Qwen3-ASR transcription
  • Home lab embeddings — nomic-embed-text, mxbai-embed for RAG

All routed to the best available home lab device automatically.

Home Lab Setup (5 minutes)

On every home lab machine:

pip install ollama-herd    # Home lab AI router

Pick one home lab machine as the router:

herd    # starts the home lab router

On every other home lab machine:

herd-node    # joins the home lab fleet automatically

That's it. Home lab devices discover each other automatically on your local network. No IP addresses, no config files, no Docker, no Kubernetes.
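
Before pointing apps at the router, it can help to confirm that something is listening on the router port. The sketch below only opens a TCP connection to port 11435 (the port shown in the diagram above); it does not call any Ollama Herd API. The function name `router_is_reachable` is illustrative and not part of the project.

```python
import socket


def router_is_reachable(host: str = "localhost", port: int = 11435,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if router_is_reachable():
    print("Router is accepting connections on :11435")
else:
    print("Nothing is listening on :11435 yet; is `herd` running?")
```

Run it on the router machine as written, or pass the router's hostname as `host` when checking from another device.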

Optional: add home lab image generation

uv tool install mflux           # Flux models (fastest for home labs)
uv tool install diffusionkit    # Stable Diffusion 3/3.5

Use Your Home Lab

Home lab LLM chat

from openai import OpenAI

# Home lab inference client
homelab_client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")
homelab_response = homelab_client.chat.completions.create(
    model="llama3.3:70b",
    messages=[{"role": "user", "content": "How do I set up a home lab NAS?"}],
    stream=True,
)
for chunk in homelab_response:
    print(chunk.choices[0].delta.content or "", end="")

Home lab image generation

curl -o homelab_output.png http://localhost:11435/api/generate-image \
  -H "Content-Type: application/json" \
  -d '{"model": "z-image-turbo", "prompt": "a cozy home lab with servers and RGB lighting", "width": 1024, "height": 1024}'

Home lab transcription

curl http://localhost:11435/api/transcribe -F "file=@homelab_standup.wav" -F "model=qwen3-asr"

Home lab knowledge base

curl http://localhost:11435/api/embed \
  -H "Content-Type: application/json" \
  -d '{"model": "nomic-embed-text", "input": "home lab networking and AI inference best practices"}'
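
To turn embeddings into a small knowledge-base lookup, one common approach is to rank documents by cosine similarity to the query vector. The sketch below is a minimal illustration, not the project's own retrieval code. It assumes the router returns an Ollama-style body of the form `{"embeddings": [[...]]}`; the helper names `embed`, `cosine_similarity`, and `rank_documents` are hypothetical.

```python
import json
import math
import urllib.request

EMBED_URL = "http://localhost:11435/api/embed"  # endpoint from the curl example above


def embed(text, model="nomic-embed-text"):
    """POST one string to the router and return its embedding vector.

    Assumption: the response looks like {"embeddings": [[...]]} (Ollama-style).
    """
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    request = urllib.request.Request(
        EMBED_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["embeddings"][0]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, vec) for vec in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

With the router running, `rank_documents(embed(question), [embed(d) for d in docs])` gives the order in which to pass documents to the LLM as context.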

How the Home Lab Routes Requests

The home lab router scores each device on 7 signals and picks the best one:

| Home Lab Signal | What it measures |
|---|---|
| Thermal state | Is the home lab model already loaded (hot), or does it need cold-loading? |
| Memory fit | Does the home lab device have enough RAM for this model? |
| Queue depth | Is the home lab device already busy with other requests? |
| Wait time | How long has the home lab request been waiting? |
| Role affinity | Big models prefer big home lab machines; small models prefer small ones |
| Availability trend | Is this home lab device reliably available at this time of day? |
| Context fit | Does the loaded context window fit the home lab request? |

You don't manage any of this. The home lab router handles it.
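
To make the idea of multi-signal scoring concrete, here is a toy weighted-sum sketch. It is not Ollama Herd's actual algorithm: the weights, the 0-to-1 normalization of each signal, and the rule that a device with no memory fit is excluded are all assumptions made for illustration, and the names `WEIGHTS`, `score_device`, and `pick_device` are hypothetical.

```python
# The seven signals named in the table above.
SIGNALS = (
    "thermal_state", "memory_fit", "queue_depth", "wait_time",
    "role_affinity", "availability_trend", "context_fit",
)

# Hypothetical weights; the real router's weighting is not documented here.
WEIGHTS = {
    "thermal_state": 3.0,
    "memory_fit": 2.0,
    "queue_depth": 2.0,
    "wait_time": 1.0,
    "role_affinity": 1.0,
    "availability_trend": 1.0,
    "context_fit": 1.0,
}


def score_device(signals):
    """Weighted sum of signals, each assumed normalized to [0, 1] with higher = better."""
    return sum(WEIGHTS[name] * signals.get(name, 0.0) for name in SIGNALS)


def pick_device(candidates):
    """Return the best-scoring device name, skipping devices the model cannot fit on."""
    eligible = {
        name: signals for name, signals in candidates.items()
        if signals.get("memory_fit", 0.0) > 0.0
    }
    if not eligible:
        raise ValueError("no device has enough memory for this model")
    return max(eligible, key=lambda name: score_device(eligible[name]))


fleet = {
    # Too little RAM for the requested model: excluded outright.
    "device-1": {"thermal_state": 1.0, "memory_fit": 0.0, "queue_depth": 1.0},
    # Model already loaded, moderately busy.
    "device-2": {"thermal_state": 1.0, "memory_fit": 0.6, "queue_depth": 0.5,
                 "wait_time": 0.5, "role_affinity": 0.5,
                 "availability_trend": 1.0, "context_fit": 1.0},
    # Idle and roomy, but the model would need a cold load.
    "device-3": {"thermal_state": 0.0, "memory_fit": 1.0, "queue_depth": 1.0,
                 "wait_time": 0.5, "role_affinity": 1.0,
                 "availability_trend": 1.0, "context_fit": 1.0},
}

print(pick_device(fleet))
```

With these made-up numbers the already-loaded device wins over the idle one, which shows how a heavy weight on thermal state can outweigh a shorter queue.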

The Home Lab Dashboard

Open http://localhost:11435/dashboard in your browser — your home lab command center:

  • Home Lab Fleet Overview — see every device, loaded models, queue depths, health
  • Trends — home lab requests per hour, latency, token throughput over 24h-7d
  • Health — 15 automated home lab checks with recommendations
  • Recommendations — optimal home lab model mix per device based on your hardware

Recommended Home Lab Models by Device

Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.

| Home Lab Device | RAM | Start with |
|---|---|---|
| MacBook Air (8GB) | 8GB | phi4-mini, gemma3:1b |
| Mac Mini (16GB) | 16GB | phi4, gemma3:4b, nomic-embed-text |
| Mac Mini (32GB) | 32GB | qwen3:14b, deepseek-r1:14b |
| MacBook Pro (64GB) | 64GB | qwen3:32b, codestral, z-image-turbo |
| Mac Studio (128GB) | 128GB | llama3.3:70b, qwen3:72b |
| Mac Studio (256GB) | 256GB | gpt-oss:120b, sd3.5-large |
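
For scripting a first setup, the table can be expressed as a simple lookup keyed on RAM. This is only a convenience sketch that copies the rows above; the function name `starter_models` is hypothetical, and picking the largest tier that fits is an assumption about how to read the table for in-between RAM sizes.

```python
# (minimum RAM in GB, starter models) copied from the table above, smallest tier first.
STARTER_MODELS = [
    (8, ["phi4-mini", "gemma3:1b"]),
    (16, ["phi4", "gemma3:4b", "nomic-embed-text"]),
    (32, ["qwen3:14b", "deepseek-r1:14b"]),
    (64, ["qwen3:32b", "codestral", "z-image-turbo"]),
    (128, ["llama3.3:70b", "qwen3:72b"]),
    (256, ["gpt-oss:120b", "sd3.5-large"]),
]


def starter_models(ram_gb):
    """Return the models for the largest tier that fits in `ram_gb` (empty below 8GB)."""
    best = []
    for min_ram_gb, models in STARTER_MODELS:
        if ram_gb >= min_ram_gb:
            best = models
    return best


print(starter_models(48))
```

A 48GB machine falls between two rows, so the sketch falls back to the 32GB suggestions.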

The home lab router's model recommender suggests the optimal mix: GET /dashboard/api/recommendations.
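
The recommendations endpoint can also be read from a script. The sketch below assumes the endpoint returns JSON; the response schema is not documented here, so the code pretty-prints whatever comes back and does not pick out specific fields. The helper names are hypothetical.

```python
import json
import urllib.request


def recommendations_url(router="http://localhost:11435"):
    """Build the recommendations URL for a given router base URL."""
    return router.rstrip("/") + "/dashboard/api/recommendations"


def fetch_recommendations(router="http://localhost:11435"):
    """GET the recommendations and parse the body as JSON (assumed format)."""
    with urllib.request.urlopen(recommendations_url(router), timeout=10) as response:
        return json.load(response)


def show_recommendations(router="http://localhost:11435"):
    """Pretty-print the raw response so its structure can be inspected."""
    print(json.dumps(fetch_recommendations(router), indent=2))
```

Call `show_recommendations()` on the router machine, or pass a base URL such as `http://homelab-router:11435` from another device.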

Works with Every Home Lab Tool

The home lab fleet exposes an OpenAI-compatible API. Any tool that works with OpenAI works with your home lab:

| Tool | Home Lab Connection |
|---|---|
| Open WebUI | Set Ollama URL to http://homelab-router:11435 |
| Aider | aider --openai-api-base http://homelab-router:11435/v1 |
| Continue.dev | Base URL: http://homelab-router:11435/v1 |
| LangChain | ChatOpenAI(base_url="http://homelab-router:11435/v1") |
| CrewAI | Set OPENAI_API_BASE=http://homelab-router:11435/v1 |
| Any OpenAI SDK | Base URL: http://homelab-router:11435/v1, API key: any string |

Full documentation

Contribute

Ollama Herd is open source (MIT) and built by home lab enthusiasts for home lab enthusiasts:

  • Star on GitHub — help other home lab builders find us
  • Open an issue — share your home lab setup, report bugs
  • PRs welcome — from humans and AI agents. CLAUDE.md gives full context.
  • Built by twin brothers in Alaska who run their own home lab fleet.

Home Lab Guardrails

  • No automatic downloads — home lab model pulls require explicit user confirmation. Some models are 70GB+.
  • Home lab model deletion requires explicit user confirmation.
  • All home lab requests stay local — no data leaves your home network.
  • Never delete or modify files in ~/.fleet-manager/ (home lab routing data and logs).
  • No cloud dependencies — your home lab works offline after initial model downloads.