Linux Ai Server

v1.0.0

Linux AI Server — turn Linux servers into a local AI inference cluster. Headless Linux AI with systemd, NVIDIA CUDA, and zero GUI overhead. Linux AI server f...

⭐ 0· 98·2 current·2 all-time

byTwin Geeks@twinsgeeks

Security Scan

VirusTotal

Pending

View report →

OpenClaw

Benign

high confidence

✓

Purpose & Capability

Name/description (Linux AI Server) match the instructions: installing Ollama, pip installing an 'ollama-herd' router/node component, configuring systemd, firewall, and using GPU tooling. Required binaries (curl/wget, optional python3/pip/systemctl/nvidia-smi) are appropriate for this purpose.

ℹ

Instruction Scope

SKILL.md stays within the stated purpose (install, run, and monitor a local inference fleet). It does instruct editing systemd, enabling services, opening a network port, and querying local APIs. No steps ask for unrelated files, cross-skill configs, or unrelated credentials. One scope note: it instructs running an install script fetched from the network (curl | sh), which executes remote code — this is within scope of installation but is an operational risk to inspect before running.

ℹ

Install Mechanism

There is no registry install spec, but the runtime instructions recommend network installs: `curl -fsSL https://ollama.ai/install.sh | sh` and `pip install ollama-herd`. These are common for this kind of project (ollama.ai is the expected upstream), but piping a remote shell script and installing Python packages from PyPI are moderate-risk operations and should be reviewed before execution.

✓

Credentials

The skill does not request environment variables, credentials, or config paths beyond local fleet logs/db under the user's home. The example SDK usage does not request secret keys. Suggested systemd environment variables (OLLAMA_*) are configuration for Ollama and are proportional to the task.

✓

Persistence & Privilege

The skill does not request always:true and does not modify other skills or system-wide agent settings. It instructs enabling systemd services (router/node) so the software runs at boot — this is expected for server software and is within scope.

Assessment

This skill appears coherent for running an Ollama-based inference fleet, but you should be cautious before running the network install steps. Before installing: (1) review the contents of https://ollama.ai/install.sh rather than piping it blindly to sh; (2) verify the 'ollama-herd' PyPI package and its maintainers; (3) run initial installs in an isolated VM or test server; (4) prefer least-privileged service users (the systemd examples use 'ollama'); (5) restrict the exposed port (11435) to your internal network or put an authentication/reverse-proxy in front of it; and (6) keep an eye on logs and resource usage. If you want a lower-risk approach, obtain the installers from official release artifacts, verify checksums/signatures, or use distro-packaged alternatives where available.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

server Clawdis

OSLinux

Any bincurl, wget

latestvk97145sj0g25r5hqh0bvdmrzxx845f69

98downloads

0stars

1versions

Updated 2w ago

v1.0.0

MIT-0

Linux

Linux AI Server — Headless AI Inference Cluster

Turn your Linux servers into a distributed AI inference cluster. No GUI, no Docker, no Kubernetes — just Linux + pip install. Your rack-mounted servers, cloud VMs, and spare Linux boxes all serve AI through one endpoint.

Why Linux AI server

Zero GUI overhead — headless Linux AI uses all resources for inference, not desktops
systemd native — Linux AI server starts on boot, restarts on failure, logs to journald
SSH management — manage your Linux AI server cluster entirely over SSH
Any Linux distro — Ubuntu, Debian, RHEL, Fedora, Arch, Alpine — if it runs Ollama, it joins the fleet
NVIDIA CUDA — Linux AI server uses NVIDIA GPUs natively. No compatibility issues.
Fleet routing — multiple Linux AI servers share the load. 7-signal scoring picks the best one.

Linux AI server setup

Quick install on each Linux server

# Install Ollama on Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Install the Linux AI router
pip install ollama-herd

Linux AI server router (pick one server)

herd          # start Linux AI server router on port 11435
herd-node     # register this Linux AI server

Linux AI server nodes (all other servers)

herd-node     # auto-discovers the Linux AI server router
# Or explicit: herd-node --router-url http://router-ip:11435

Linux AI server systemd services

# /etc/systemd/system/herd-router.service
[Unit]
Description=Linux AI Server Router
After=network.target ollama.service

[Service]
Type=simple
ExecStart=/usr/local/bin/herd
Restart=always
RestartSec=5
User=ollama

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/herd-node.service
[Unit]
Description=Linux AI Server Node
After=network.target ollama.service

[Service]
Type=simple
ExecStart=/usr/local/bin/herd-node
Restart=always
RestartSec=5
User=ollama

[Install]
WantedBy=multi-user.target

sudo systemctl enable --now herd-router    # on the Linux AI router
sudo systemctl enable --now herd-node      # on all Linux AI nodes

Linux AI server hardware guide

Linux AI Server	GPU	RAM	Best Linux AI models
Rack server (NVIDIA A100)	80GB	256GB	`deepseek-v3`, `qwen3.5:72b` — frontier
Rack server (NVIDIA L40S)	48GB	128GB	`llama3.3:70b`, `qwen3.5:32b`
Desktop server (RTX 4090)	24GB	64GB	`llama3.3:70b` (Q4), `deepseek-r1:32b`
Mini PC / NUC (no GPU)	CPU	32GB	`phi4`, `gemma3:12b` — CPU inference
Cloud VM (no GPU)	CPU	16GB	`phi4-mini`, `gemma3:4b`
Raspberry Pi 5	CPU	8GB	`gemma3:1b`, `phi4-mini` — edge AI

Linux AI server works with NVIDIA CUDA GPUs, AMD ROCm (experimental), and CPU-only inference.

Use your Linux AI server

OpenAI SDK

from openai import OpenAI

# Your Linux AI server endpoint
client = OpenAI(base_url="http://linux-ai-server:11435/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama3.3:70b",
    messages=[{"role": "user", "content": "Write a Terraform module for AWS ECS"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")

curl from any machine

# Hit your Linux AI server from anywhere on the network
curl http://linux-ai-server:11435/api/chat -d '{
  "model": "codestral",
  "messages": [{"role": "user", "content": "Write a Dockerfile for a FastAPI app"}],
  "stream": false
}'

Linux AI server environment

# Optimize Linux AI server Ollama
sudo systemctl edit ollama
# Add under [Service]:
#   Environment="OLLAMA_KEEP_ALIVE=-1"
#   Environment="OLLAMA_MAX_LOADED_MODELS=-1"
#   Environment="OLLAMA_NUM_PARALLEL=2"
sudo systemctl restart ollama

Linux AI server firewall

# UFW (Ubuntu/Debian)
sudo ufw allow 11435/tcp

# firewalld (RHEL/Fedora)
sudo firewall-cmd --add-port=11435/tcp --permanent && sudo firewall-cmd --reload

Linux AI server monitoring

# Linux AI server fleet status
curl -s http://localhost:11435/fleet/status | python3 -m json.tool

# Linux AI server health — 15 automated checks
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

# Linux AI server traces — recent requests
curl -s "http://localhost:11435/dashboard/api/traces?limit=10" | python3 -m json.tool

# Linux AI server logs
journalctl -u herd-router -f
tail -f ~/.fleet-manager/logs/herd.jsonl.$(date +%Y-%m-%d)

Dashboard at http://linux-ai-server:11435/dashboard — access from any browser on the network.

Also available on Linux AI server

Image generation

curl http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "server rack visualization", "width": 1024, "height": 1024}'

Embeddings

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Linux AI server headless inference"}'

Full documentation

Contribute

Ollama Herd is open source (MIT). Linux server admins welcome:

Guardrails

Linux AI server model downloads require explicit user confirmation.
Linux AI server model deletion requires explicit user confirmation.
Never delete or modify files in ~/.fleet-manager/.
No models are downloaded automatically — all pulls are user-initiated or require opt-in.

Comments

Loading comments...