Phi 4

v1.0.1

Phi 4 by Microsoft — small but powerful LLMs that run on minimal hardware. Run Phi-4 (14B), Phi-4-mini (3.8B), and Phi-3.5 across your device fleet. Perfect for...

2 · 39 · 1 current · 1 all-time
by Twin Geeks (@twinsgeeks)
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Download zip
Security Scan
VirusTotal: Pending · View report →
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description, required binaries (curl|wget, optional python/pip), and referenced config paths (~/.fleet-manager/...) align with a local fleet/router tool (ollama-herd). The assets and commands shown (pip install ollama-herd, herd, herd-node, curl to localhost:11435) are coherent with the described purpose.
Instruction Scope
SKILL.md instructs the agent/user to install and run a local router and to use curl/python to call localhost:11435; the examples only target local endpoints and reference the fleet config directory. This stays within scope, but the router is a network service: verify that it binds only to localhost and does not expose ports to external networks.
Install Mechanism
The skill is instruction-only (no automated install spec). It tells the user to pip install 'ollama-herd' from PyPI, a reasonable delivery mechanism for this purpose, but one that involves downloading third-party code at install time; users should verify the package origin and integrity before installing.
Credentials
No environment variables, credentials, or unrelated config paths are requested. The listed config paths (~/.fleet-manager/...) are reasonable for a fleet manager and are explicitly referenced in the metadata and instructions.
Persistence & Privilege
The skill does not request always:true or elevated/ongoing system privileges; it only documents running a local service and does not instruct modifying other skills or system-wide agent settings.
Assessment
This skill appears to do what it says (run a local herd/router for Phi models), but before installing or running anything:

1. Verify that the PyPI package 'ollama-herd' and the GitHub repo owners are legitimate, and inspect the package source if possible.
2. Install in an isolated environment (virtualenv/container) and check what files it writes.
3. Confirm the router binds to localhost (not 0.0.0.0), or protect the port with firewall rules.
4. Be aware that models will consume GBs of disk/RAM even if downloads require confirmation.
5. Review model licensing and privacy implications for any data routed through the fleet.
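The bind-address point above can be checked by inspecting the router's listening socket (e.g. `lsof -i :11435`) and classifying the address it reports. A minimal sketch of that classification, with illustrative addresses rather than real router output:

```python
import ipaddress

def is_local_only(bind_addr: str) -> bool:
    """Return True if a bind address only accepts loopback traffic."""
    if bind_addr in ("0.0.0.0", "::"):  # wildcard binds are reachable from the network
        return False
    return ipaddress.ip_address(bind_addr).is_loopback

# A router bound to 127.0.0.1 is unreachable from other machines;
# one bound to 0.0.0.0 is exposed unless a firewall blocks the port.
print(is_local_only("127.0.0.1"), is_local_only("0.0.0.0"))
```

Note that a fleet router may legitimately need a non-loopback bind so other nodes can reach it; in that case firewall rules are the fallback.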

Like a lobster shell, security has layers — review code before you run it.

Tags: apple-silicon · efficient · fleet-routing · latest · local-llm · low-ram · mac-mini · macbook-air · microsoft-phi · ollama · phi · phi-4 · phi4 · small-llm

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

Runtime requirements

Clawdis
OS: macOS · Linux · Windows
Binaries (any): curl, wget

SKILL.md

Phi 4 — Microsoft's Small Models, Big Results

Phi models prove you don't need 70B parameters for great results. Phi-4 matches much larger models on reasoning benchmarks while running on hardware as modest as an 8GB MacBook Air. Route them across your fleet for even better throughput.

Supported Phi models

| Model | Parameters | Ollama name | RAM needed | Best for |
|---|---|---|---|---|
| Phi-4 | 14B | phi4 | 10GB | Reasoning, math, code — punches way above its weight |
| Phi-4-mini | 3.8B | phi4-mini | 4GB | Ultra-fast on any device, even 8GB Macs |
| Phi-3.5-mini | 3.8B | phi3.5 | 4GB | Proven lightweight model |
| Phi-3-medium | 14B | phi3:14b | 10GB | Balanced quality and speed |
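The RAM figures above are roughly what a 4-bit quantized model needs plus runtime overhead. A back-of-envelope sketch of that estimate — the 0.5 bytes/parameter and 1.2× overhead factors are assumptions for illustration, not ollama-herd internals:

```python
def fits_in_ram(params_billions: float, ram_gb: float,
                bytes_per_param: float = 0.5, overhead: float = 1.2) -> bool:
    """Estimate whether a quantized model fits in RAM:
    parameters * bytes-per-parameter * overhead factor."""
    needed_gb = params_billions * bytes_per_param * overhead
    return needed_gb <= ram_gb

# Phi-4 (14B) needs roughly 14 * 0.5 * 1.2 = 8.4GB, so 10GB is enough
# but a bare 8GB machine is not.
print(fits_in_ram(14, 10), fits_in_ram(14, 8))
```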

Quick start

pip install ollama-herd    # PyPI: https://pypi.org/project/ollama-herd/
herd                       # start the router (port 11435)
herd-node                  # run on each device — finds the router automatically

No models are downloaded during installation. All pulls require user confirmation.

Why Phi for small devices

A Mac Mini with 16GB RAM can run Phi-4 (14B) with room to spare. A MacBook Air with 8GB runs Phi-4-mini comfortably. These models start in seconds and respond fast — ideal for devices that can't load a 70B model.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

# Phi-4 for reasoning
response = client.chat.completions.create(
    model="phi4",
    messages=[{"role": "user", "content": "Solve: if 3x + 7 = 22, what is x?"}],
)
print(response.choices[0].message.content)

Phi-4-mini — fastest response times

curl http://localhost:11435/api/chat -d '{
  "model": "phi4-mini",
  "messages": [{"role": "user", "content": "Summarize this in 3 bullet points: ..."}],
  "stream": false
}'

OpenAI-compatible API

curl http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "phi4", "messages": [{"role": "user", "content": "Write a unit test for a login function"}]}'

Ideal hardware pairings

Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.

| Your device | RAM | Best Phi model | Why |
|---|---|---|---|
| MacBook Air (8GB) | 8GB | phi4-mini | Fits with room for other apps |
| Mac Mini (16GB) | 16GB | phi4 | Full Phi-4 with headroom |
| Mac Mini (24GB) | 24GB | phi4 | Can run Phi-4 + an embedding model simultaneously |
| MacBook Pro (36GB) | 36GB | phi4 + phi4-mini | Both loaded, router picks based on task |
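The router's actual selection logic isn't documented here, but the pairings above reduce to simple RAM thresholds. A hypothetical selector mirroring the table:

```python
def best_phi_model(ram_gb: int) -> str:
    """Pick a Phi model for a device by RAM, following the pairing table."""
    if ram_gb >= 36:
        return "phi4 + phi4-mini"  # both loaded; router picks per task
    if ram_gb >= 16:
        return "phi4"              # full Phi-4 with headroom
    return "phi4-mini"             # fits comfortably on 8GB devices

print(best_phi_model(8), best_phi_model(24))
```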

Monitor your fleet

# What's loaded and where
curl -s http://localhost:11435/api/ps | python3 -m json.tool

# Fleet health overview
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

# Model recommendations based on your hardware
curl -s http://localhost:11435/dashboard/api/recommendations | python3 -m json.tool
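If you would rather summarize `/api/ps` in Python than pipe it through `json.tool`, something like the following works against an Ollama-style response; the payload shape and sample values here are assumptions for illustration, not captured router output:

```python
import json

def summarize_ps(payload: str) -> list:
    """List loaded models and their approximate sizes from an /api/ps-style response."""
    data = json.loads(payload)
    return [f"{m['name']}: {m['size'] / 1e9:.1f}GB" for m in data.get("models", [])]

sample = '{"models": [{"name": "phi4", "size": 9100000000}]}'
print(summarize_ps(sample))
```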

Web dashboard at http://localhost:11435/dashboard — live view of nodes, queues, and performance.

Also available on this fleet

Larger LLMs (when you need more power)

Llama 3.3 (70B), Qwen 3.5, DeepSeek-R1, Mistral Large — route to a bigger machine in the fleet.

Image generation

curl http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "minimalist circuit board art", "width": 512, "height": 512}'

Speech-to-text

curl http://localhost:11435/api/transcribe -F "file=@meeting.wav" -F "model=qwen3-asr"

Embeddings

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Microsoft Phi small language model"}'

Full documentation

Guardrails

  • Model downloads require explicit user confirmation, even though Phi models are small (2–8GB).
  • Model deletion requires explicit user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • No models are downloaded automatically — all pulls are user-initiated or require opt-in.
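In scripts, the download guardrail can be honored by gating any pull on an explicit answer. A sketch of that gate — the `pull` callable stands in for whatever actually fetches the model and is hypothetical:

```python
def confirmed_pull(model: str, answer: str, pull) -> bool:
    """Invoke pull(model) only when the user explicitly opted in."""
    if answer.strip().lower() in {"y", "yes"}:
        pull(model)
        return True
    return False

pulled = []
confirmed_pull("phi4-mini", "yes", pulled.append)  # appended
confirmed_pull("phi4", "no", pulled.append)        # skipped
print(pulled)
```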

Files

1 total
