# Phi 4 — Microsoft's Small Models, Big Results
Phi models prove you don't need 70B parameters for great results. Phi-4 matches much larger models on reasoning benchmarks while running on hardware as modest as an 8GB MacBook Air. Route them across your fleet for even better throughput.
## Supported Phi models
| Model | Parameters | Ollama name | RAM needed | Best for |
|---|---|---|---|---|
| Phi-4 | 14B | phi4 | 10GB | Reasoning, math, code — punches way above its weight |
| Phi-4-mini | 3.8B | phi4-mini | 4GB | Ultra-fast on any device, even 8GB Macs |
| Phi-3.5-mini | 3.8B | phi3.5 | 4GB | Proven lightweight model |
| Phi-3-medium | 14B | phi3:14b | 10GB | Balanced quality and speed |
## Quick start
```bash
pip install ollama-herd   # PyPI: https://pypi.org/project/ollama-herd/
herd        # start the router (port 11435)
herd-node   # run on each device — finds the router automatically
```
No models are downloaded during installation. All pulls require user confirmation.
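Once the router and at least one node are running, you can sanity-check the setup from Python before sending any work. A minimal sketch using only the standard library and the `/api/ps` endpoint documented in the monitoring section below:

```python
import json
import urllib.request

# List the models currently loaded anywhere on the fleet.
# Assumes the default router port (11435) from the quick start above.
with urllib.request.urlopen("http://localhost:11435/api/ps") as resp:
    print(json.dumps(json.load(resp), indent=2))
```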
## Why Phi for small devices
A Mac Mini with 16GB RAM can run Phi-4 (14B) with room to spare. A MacBook Air with 8GB runs Phi-4-mini comfortably. These models start in seconds and respond fast — ideal for devices that can't load a 70B model. For example, ask Phi-4 a quick math question through the router's OpenAI-compatible endpoint:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

# Phi-4 for reasoning
response = client.chat.completions.create(
    model="phi4",
    messages=[{"role": "user", "content": "Solve: if 3x + 7 = 22, what is x?"}],
)
print(response.choices[0].message.content)
```
## Phi-4-mini — fastest response times
```bash
curl http://localhost:11435/api/chat -d '{
  "model": "phi4-mini",
  "messages": [{"role": "user", "content": "Summarize this in 3 bullet points: ..."}],
  "stream": false
}'
```
## OpenAI-compatible API
```bash
curl http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "phi4", "messages": [{"role": "user", "content": "Write a unit test for a login function"}]}'
```
## Ideal hardware pairings
Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.
| Your device | RAM | Best Phi model | Why |
|---|---|---|---|
| MacBook Air (8GB) | 8GB | phi4-mini | Fits with room for other apps |
| Mac Mini (16GB) | 16GB | phi4 | Full Phi-4 with headroom |
| Mac Mini (24GB) | 24GB | phi4 | Can run Phi-4 + an embedding model simultaneously |
| MacBook Pro (36GB) | 36GB | phi4 + phi4-mini | Both loaded, router picks based on task |
## Monitor your fleet
```bash
# What's loaded and where
curl -s http://localhost:11435/api/ps | python3 -m json.tool

# Fleet health overview
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

# Model recommendations based on your hardware
curl -s http://localhost:11435/dashboard/api/recommendations | python3 -m json.tool
```
Web dashboard at http://localhost:11435/dashboard — live view of nodes, queues, and performance.
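The same endpoints are easy to poll from a script. Here is a minimal sketch that fetches fleet health on a timer; this section doesn't specify the payload's field names, so the script just prints whatever JSON the router returns:

```python
import json
import time
import urllib.request

HEALTH_URL = "http://localhost:11435/dashboard/api/health"

# Poll fleet health every 30 seconds and print the raw JSON.
# Field names inside the payload are router-specific, so none are assumed.
while True:
    with urllib.request.urlopen(HEALTH_URL) as resp:
        print(json.dumps(json.load(resp), indent=2))
    time.sleep(30)
```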
## Also available on this fleet
### Larger LLMs (when you need more power)
Llama 3.3 (70B), Qwen 3.5, DeepSeek-R1, Mistral Large — route to a bigger machine in the fleet.
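Switching to one of these is just a model-name change in the same client. A sketch; `llama3.3` is the usual Ollama tag for Llama 3.3 70B, but verify which tags your fleet has actually pulled:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

# Same API, bigger model: the router sends this to a node with enough RAM.
response = client.chat.completions.create(
    model="llama3.3",  # assumed Ollama tag; check what your fleet has pulled
    messages=[{"role": "user", "content": "Explain the CAP theorem in depth."}],
)
print(response.choices[0].message.content)
```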
### Image generation
```bash
curl http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "minimalist circuit board art", "width": 512, "height": 512}'
```
### Speech-to-text
```bash
curl http://localhost:11435/api/transcribe -F "file=@meeting.wav" -F "model=qwen3-asr"
```
### Embeddings
```bash
curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Microsoft Phi small language model"}'
```
## Full documentation
- Agent Setup Guide — all 4 model types
- API Reference — complete endpoint docs
## Guardrails
- Model downloads require explicit user confirmation — Phi models are small (2-8GB), but every pull still needs approval.
- Model deletion requires explicit user confirmation.
- Never delete or modify files in `~/.fleet-manager/`.
- No models are downloaded automatically — all pulls are user-initiated or require opt-in.