Gemma 3

v1.0.1

Gemma 3 by Google — run Gemma 3 (4B, 12B, 27B) across your local device fleet. Google's most capable open model with 128K context, strong coding, and multilingual support.

by Twin Geeks (@twinsgeeks)
Security Scan

VirusTotal: Pending
OpenClaw: Benign (medium confidence)
Purpose & Capability
The name/description claim (run Gemma models locally across a fleet via an Ollama Herd router) matches the instructions: pip-install an 'ollama-herd' package and run 'herd' and 'herd-node' to provide a local endpoint. Required binaries (curl/wget) and optional python/pip are reasonable for this functionality.
Instruction Scope
SKILL.md stays on-topic: it tells the agent to install/run the herd/router, how to call the local API (localhost:11435), how to check status, and documents model choices and guardrails (downloads require user confirmation). It does not instruct reading unrelated system files or exfiltrating secrets.
Install Mechanism
There is no built-in install spec; the instructions tell the user to 'pip install ollama-herd' from PyPI. Installing a third-party package and running a network service is expected for this use case, but it is a higher-risk action because the package code executes locally and is not vetted by this scanner.
Credentials
The skill declares no required environment variables or credentials. Metadata references a couple of config paths (~/.fleet-manager/...), which are plausible for a fleet manager and are mentioned in the guardrails (do not modify). There are no unexplained secret requests.
Persistence & Privilege
The skill is not always-enabled and does not request elevated platform privileges. It instructs running a local service (herd) and per-node agents (herd-node), which is appropriate for a fleet router and does not modify other skill configurations.
Assessment
This skill is internally consistent with its purpose, but before installing you should:

1. Verify the upstream project and PyPI package (https://github.com/geeks-accelerator/ollama-herd and the PyPI package 'ollama-herd') to ensure they are official and trustworthy, and inspect the code if possible.
2. Pin a known-good package version rather than installing an unpinned latest.
3. Run installation and testing in an isolated environment (VM/container) first.
4. Be aware that running 'herd'/'herd-node' opens a local network service (port 11435) and may pull multi-gigabyte model files; restrict network/firewall access to trusted hosts and confirm that model downloads truly require explicit confirmation.
5. Review ~/.fleet-manager/* logs/configs for sensitive data and follow the documented guardrails rather than blindly deleting or modifying files.

If you cannot verify the package source or code, treat the installation as higher risk.

Like a lobster shell, security has layers — review code before you run it.

Tags: 128k-context · apple-silicon · codegemma · fleet-routing · gemma · gemma-3 · google-gemma · latest · local-llm · mac-studio · multilingual · ollama · open-source

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

Runtime requirements

Platform: Clawdis
OS: macOS · Linux · Windows (any)
Binaries: curl, wget

SKILL.md

Gemma 3 — Run Google's Open Models Across Your Fleet

Gemma 3 is Google's most capable open-source LLM family. 128K context window, strong coding performance, multilingual support across 140+ languages. The fleet router picks the best device for every request — no manual load balancing.

Supported Gemma models

| Model | Parameters | Ollama name | Best for |
|---|---|---|---|
| Gemma 3 27B | 27B | gemma3:27b | Highest quality — rivals much larger models |
| Gemma 3 12B | 12B | gemma3:12b | Balanced quality and speed |
| Gemma 3 4B | 4B | gemma3:4b | Fast, runs on low-RAM devices |
| Gemma 3 1B | 1B | gemma3:1b | Ultra-light, instant responses |
| CodeGemma 7B | 7B | codegemma | Code-focused variant |

Quick start

pip install ollama-herd    # PyPI: https://pypi.org/project/ollama-herd/
herd                       # start the router (port 11435)
herd-node                  # run on each device — finds the router automatically

No models are downloaded during installation. Models are pulled on demand when a request arrives, or manually via the dashboard. All pulls require user confirmation.
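If you need to trigger a pull programmatically rather than through the dashboard, here is a minimal sketch. It assumes the router mirrors Ollama's /api/pull endpoint and request schema, which is not confirmed for ollama-herd; the router should still apply its own confirmation guardrail before downloading.

```python
import json
import urllib.request

ROUTER = "http://localhost:11435"  # default herd router port


def pull_request_body(model: str, stream: bool = False) -> bytes:
    """Build the JSON body for a model pull, mirroring Ollama's /api/pull schema."""
    return json.dumps({"model": model, "stream": stream}).encode()


def pull_model(model: str) -> dict:
    """Ask the router to pull a model (requires a running herd router)."""
    req = urllib.request.Request(
        f"{ROUTER}/api/pull",
        data=pull_request_body(model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Usage (only with a running router):
#   pull_model("gemma3:4b")
```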

Use Gemma through the fleet

OpenAI SDK (drop-in replacement)

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

# Gemma 3 27B for complex reasoning
response = client.chat.completions.create(
    model="gemma3:27b",
    messages=[{"role": "user", "content": "Explain quantum entanglement to a 10-year-old"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")

Code generation with CodeGemma

response = client.chat.completions.create(
    model="codegemma",
    messages=[{"role": "user", "content": "Write a binary search tree in Rust with insert, delete, and search"}],
)
print(response.choices[0].message.content)

curl (Ollama format)

# Gemma 3 27B
curl http://localhost:11435/api/chat -d '{
  "model": "gemma3:27b",
  "messages": [{"role": "user", "content": "Translate to Japanese: The weather is beautiful today"}],
  "stream": false
}'

curl (OpenAI format)

curl http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma3:4b", "messages": [{"role": "user", "content": "Hello"}]}'

Which Gemma for your hardware

Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.

| Device | RAM | Best Gemma model |
|---|---|---|
| MacBook Air (8GB) | 8GB | gemma3:1b — instant responses |
| Mac Mini (16GB) | 16GB | gemma3:4b — strong for its size |
| Mac Mini (24GB) | 24GB | gemma3:12b — great balance |
| MacBook Pro (36GB) | 36GB | gemma3:27b — full power |
| Mac Studio (64GB+) | 64GB+ | gemma3:27b + codegemma simultaneously |
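The table above can be collapsed into a small helper that picks a model from available RAM. The thresholds come from the example configurations, not measured requirements, and choose_gemma is a hypothetical name:

```python
# Model picks from the hardware table, keyed by minimum RAM in GB.
# Thresholds are illustrative, not measured requirements.
SIZE_LADDER = [
    (64, "gemma3:27b + codegemma"),
    (36, "gemma3:27b"),
    (24, "gemma3:12b"),
    (16, "gemma3:4b"),
    (8, "gemma3:1b"),
]


def choose_gemma(ram_gb: int) -> str:
    """Return the largest Gemma tier that comfortably fits the given RAM."""
    for min_ram, model in SIZE_LADDER:
        if ram_gb >= min_ram:
            return model
    return "gemma3:1b"  # fall back to the smallest model
```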

Why Gemma locally

  • 128K context — process entire codebases and long documents
  • 140+ languages — multilingual without switching models
  • Google quality, zero cost — no per-token charges after hardware
  • Privacy — all data stays on your network
  • Fleet routing — multiple machines share the load

Check what's running

# Models loaded in memory
curl -s http://localhost:11435/api/ps | python3 -m json.tool

# Fleet health
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

Web dashboard at http://localhost:11435/dashboard — live monitoring.
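To use /api/ps output from a script, a small sketch for extracting the loaded model names. It assumes the response follows Ollama's shape, a "models" list of objects each carrying a "name" field; the sample payload is illustrative:

```python
def loaded_models(ps_response: dict) -> list[str]:
    """Extract model names from an /api/ps response.

    Assumes the Ollama-style shape: {"models": [{"name": ...}, ...]}.
    """
    return [m.get("name", "") for m in ps_response.get("models", [])]


# Illustrative payload in the assumed shape:
sample = {"models": [{"name": "gemma3:27b", "size_vram": 17_000_000_000}]}
```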

Also available on this fleet

Other LLMs

Llama 3.3, Qwen 3.5, DeepSeek-V3, DeepSeek-R1, Phi 4, Mistral, Codestral — same endpoint.

Image generation

curl -o image.png http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "a gemstone catching light", "width": 1024, "height": 1024}'

Speech-to-text

curl http://localhost:11435/api/transcribe -F "file=@meeting.wav" -F "model=qwen3-asr"

Embeddings

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Google Gemma open source language model"}'
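Embedding vectors returned by the endpoint can be compared with cosine similarity. The exact response shape of /api/embed is not documented here, so this self-contained sketch covers only the math:

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```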


Contribute

Ollama Herd is open source (MIT). Stars, issues, and PRs welcome — from humans and AI agents alike:

  • GitHub — 444 tests, fully async, CLAUDE.md makes AI agents productive instantly
  • Found a bug? Open an issue
  • Want to add a feature? Fork, branch, PR — the test suite runs in under 40 seconds

Guardrails

  • Model downloads require explicit user confirmation — Gemma models range from 1GB (1B) to 16GB (27B).
  • Model deletion requires explicit user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • No models are downloaded automatically — all pulls are user-initiated or require opt-in via auto_pull.
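The confirmation guardrail above can be modeled as a tiny decision function. pull_allowed is a hypothetical name, with auto_pull standing in for the documented opt-in:

```python
def pull_allowed(auto_pull: bool, user_answer: str) -> bool:
    """A pull proceeds only via the auto_pull opt-in or an explicit 'y' answer."""
    return auto_pull or user_answer.strip().lower() == "y"
```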

