Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Qwen Qwen3

v1.0.2

Qwen Qwen3 — run Qwen3.5, Qwen3, Qwen3-Coder, Qwen2.5-Coder, and Qwen3-ASR across your local fleet. LLM inference, code generation, and speech-to-text from A...

by Twin Geeks (@twinsgeeks)
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan

VirusTotal: Pending
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The SKILL.md's purpose (run Qwen models via an Ollama-based fleet router) matches the commands and examples. However, the declared required binaries (curl or wget, optional python/pip) omit other tools the instructions actually rely on: 'ollama' is used for pulling models and calling the Ollama API, and 'uv' (used for mlx-qwen3-asr install) is referenced but not declared. This mismatch is a coherence issue: either the skill assumes these will be preinstalled or the metadata is incomplete.
Instruction Scope
Instructions direct the agent to pip install 'ollama-herd', run 'herd' and 'herd-node', pull large models via 'ollama pull', and enable transcription via local API calls to localhost:11435. These actions are within the described scope (setting up a local model fleet). The instructions will create and use files under ~/.fleet-manager (latency.db, logs/herd.jsonl) and start services that listen on a local port; they do not ask for unrelated system files or external credentials.
Install Mechanism
This is instruction-only (no install spec). The SKILL.md tells users to 'pip install ollama-herd' and to run 'ollama pull' (and 'uv tool install' for ASR). Installing from PyPI and pulling models via the ollama CLI are normal, but pip installs execute arbitrary code and present supply-chain risk. No raw download URLs or archive extracts appear in the skill itself, but the skill relies on external package managers & the ollama toolchain whose provenance the user should verify.
Credentials
No environment variables or secrets are requested. The only declared config paths (~/.fleet-manager/latency.db and logs/herd.jsonl) are consistent with a local fleet manager. There is no request for unrelated credentials or wide-scoped secrets.
Persistence & Privilege
The skill is not always-enabled and does not request elevated platform privileges. It instructs installing and running local services and writing to its own config paths, which is expected for this functionality. It does not modify other skills or global agent settings in the provided instructions.
What to consider before installing
Before installing:

1. Verify the provenance of the pip package (check the PyPI page and the linked GitHub repo, and confirm the publisher), because 'pip install' runs code on your machine.
2. Ensure the ollama CLI is installed from its official source — the SKILL.md uses 'ollama' but the metadata doesn't declare it.
3. The instructions also reference 'uv' for ASR installs; confirm that tool or the intended installer.
4. Expect large disk, memory, and network usage when pulling models; run in an isolated VM/container to reduce risk.
5. Check the ~/.fleet-manager files the service will create, and ensure firewall/network exposure is acceptable (services listen on localhost:11435 by default).

If you are unsure about the pip package or binary provenance, do not install it on sensitive systems; review the package source code or run it in an isolated environment instead.
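One concrete way to do the provenance check: PyPI exposes package metadata at a public JSON endpoint (https://pypi.org/pypi/&lt;name&gt;/json). A minimal sketch for eyeballing the publisher fields before installing; the fields read here are standard PyPI metadata, but treat the output as a starting point, not proof of provenance:

```python
import json
from urllib.request import urlopen


def fetch_pypi_metadata(package: str) -> dict:
    """Fetch the public metadata record for a package from PyPI's JSON API."""
    with urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
        return json.load(resp)


def provenance_summary(metadata: dict) -> dict:
    """Pull out the fields worth checking before running pip install."""
    info = metadata.get("info", {})
    urls = info.get("project_urls") or {}
    return {
        "name": info.get("name"),
        "author": info.get("author") or info.get("author_email"),
        "home_page": info.get("home_page") or urls.get("Homepage"),
        "version": info.get("version"),
    }


# Usage: provenance_summary(fetch_pypi_metadata("ollama-herd"))
# then confirm the repo URL and author match the listing before installing.
```

This only shows what the publisher self-reported to PyPI; cross-check it against the linked GitHub repo rather than trusting it alone.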

Like a lobster shell, security has layers — review code before you run it.

Tags: apple-silicon · code-generation · coding · fleet-routing · latest · local-llm · ollama · qwen · qwen-asr · qwen2.5 · qwen2.5-coder · qwen3 · qwen3-coder · qwen3.5 · speech-to-text

Runtime requirements

Clawdis
OS: macOS · Linux · Windows
Bin (any of): curl, wget

SKILL.md

Qwen — Run Qwen Models Across Your Local Fleet

Run Qwen3.5, Qwen3, Qwen3-Coder, and Qwen ASR on your own hardware. The fleet router picks the best device for every request — chat, code generation, and speech-to-text from one endpoint.

Supported Qwen models

LLM (Chat & Reasoning)

| Model | Parameters | Ollama name | Best for |
|---|---|---|---|
| Qwen3.5 | 0.8B–397B MoE | qwen3.5 | Latest — multimodal, best reasoning |
| Qwen3 | 0.6B–235B MoE | qwen3 | Competitive with GPT-4o |
| Qwen2.5 | 0.5B–72B | qwen2.5 | Proven, stable, multilingual |

Code Generation

| Model | Parameters | Ollama name | Best for |
|---|---|---|---|
| Qwen3-Coder | 30B MoE (3.3B active) | qwen3-coder | Agentic coding workflows |
| Qwen2.5-Coder | 0.5B–32B | qwen2.5-coder | Code — matches GPT-4o at 32B |

Speech-to-Text

| Model | Parameters | Tool | Best for |
|---|---|---|---|
| Qwen3-ASR | 0.6B–1.7B | mlx-qwen3-asr | State-of-the-art local transcription |

Setup

pip install ollama-herd
herd              # start the router (port 11435)
herd-node         # run on each machine

# Pull Qwen models
ollama pull qwen3.5:32b
ollama pull qwen3-coder

For speech-to-text:

uv tool install "mlx-qwen3-asr[serve]" --python 3.14
curl -X POST http://localhost:11435/dashboard/api/settings \
  -H "Content-Type: application/json" -d '{"transcription": true}'

Package: ollama-herd | Repo: github.com/geeks-accelerator/ollama-herd

Use Qwen through the fleet

OpenAI SDK

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

# Qwen3.5 for general chat
response = client.chat.completions.create(
    model="qwen3.5:32b",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")

Qwen3-Coder for code

response = client.chat.completions.create(
    model="qwen3-coder",
    messages=[{"role": "user", "content": "Write a FastAPI CRUD app with SQLAlchemy"}],
)
print(response.choices[0].message.content)

Qwen ASR for transcription

curl http://localhost:11435/api/transcribe -F "audio=@meeting.wav"

Or from Python:

import httpx

def transcribe(audio_path):
    with open(audio_path, "rb") as f:
        resp = httpx.post(
            "http://localhost:11435/api/transcribe",
            files={"audio": (audio_path, f)},
            timeout=300.0,
        )
    resp.raise_for_status()
    return resp.json()["text"]

Ollama API

# Qwen3.5 chat
curl http://localhost:11435/api/chat -d '{
  "model": "qwen3.5:32b",
  "messages": [{"role": "user", "content": "Explain transformers"}],
  "stream": false
}'

# Qwen2.5-Coder
curl http://localhost:11435/api/chat -d '{
  "model": "qwen2.5-coder:32b",
  "messages": [{"role": "user", "content": "Optimize this SQL query: ..."}],
  "stream": false
}'

Hardware recommendations

Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.

| Model | Min RAM | Recommended hardware |
|---|---|---|
| qwen3.5:0.8b | 2GB | Any Mac |
| qwen3.5:9b | 8GB | Mac Mini M4 (16GB) |
| qwen3.5:32b | 24GB | Mac Mini M4 Pro (48GB) |
| qwen3.5:122b-a10b | 64GB | Mac Studio M4 Max (128GB) |
| qwen3.5:397b-a17b | 256GB+ | Mac Studio M3 Ultra (512GB) |
| qwen3-coder | 24GB | Mac Mini M4 Pro (48GB) |
| qwen2.5-coder:32b | 24GB | Mac Mini M4 Pro (48GB) |
| Qwen3-ASR (0.6B) | 1.2GB | Any Mac |
| Qwen3-ASR (1.7B) | 3.4GB | Any Mac (8GB+) |

Why run Qwen locally

  • Zero cost — no per-token charges for Qwen API
  • Privacy — Chinese and English content stays on your devices
  • Full Qwen family — chat, code, reasoning, and speech-to-text from one fleet
  • No rate limits — Alibaba Cloud throttles API access; local inference is unlimited
  • Fleet routing — multiple machines share the load. The router picks the fastest available

The Qwen advantage on this fleet

Qwen models are uniquely suited for fleet routing:

  • MoE architecture — Qwen3.5 (397B total, 17B active) and Qwen3-Coder (30B total, 3.3B active) use Mixture of Experts. Only a fraction of parameters activate per request, making them fast despite large total size.
  • Size variety — from 0.6B to 397B, there's a Qwen model for every device in your fleet. Small Macs run the small models, big Macs run the big ones.
  • Code + Chat + STT — Qwen covers three modalities. One vendor, one fleet, three capabilities.
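The MoE arithmetic in the first bullet is easy to make concrete: all expert weights must fit in memory, but per-token compute tracks only the active parameters. A rough sketch; the 0.5 bytes/parameter figure assumes 4-bit quantization and is an illustration, not a sizing guarantee:

```python
def moe_footprint_gb(total_params_b: float, bytes_per_param: float = 0.5) -> float:
    """Approximate weight footprint in GB.

    The default 0.5 bytes/param assumes 4-bit quantization; use 2.0 for fp16.
    (Billions of params x bytes/param comes out directly in GB.)
    """
    return total_params_b * bytes_per_param


def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Fraction of parameters doing work per token -- the MoE speedup lever."""
    return active_params_b / total_params_b


# Figures from the text: Qwen3-Coder is 30B total with 3.3B active,
# so only about 11% of the weights run per request.
```

By the same arithmetic, Qwen3.5's 397B/17B split activates under 5% of its weights per token, which is why it can respond quickly despite its total size.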

Also available on this fleet

Other LLM models

Llama 3.3, DeepSeek-V3, DeepSeek-R1, Phi 4, Mistral, Gemma 3 — any Ollama model routes through the same endpoint.

Image generation

curl -o image.png http://localhost:11435/api/generate-image \
  -H "Content-Type: application/json" \
  -d '{"model":"z-image-turbo","prompt":"a sunset","width":1024,"height":1024,"steps":4}'

Embeddings

curl http://localhost:11435/api/embeddings -d '{"model":"nomic-embed-text","prompt":"query"}'

Dashboard

http://localhost:11435/dashboard — monitor Qwen requests alongside all other models. Per-model latency, token throughput, error rates, health checks.

Full documentation

Agent Setup Guide

Guardrails

  • Never pull or delete Qwen models without user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • If a Qwen model is too large for available memory, suggest a smaller variant or MoE version.
