iaworker
v1.0.0
Intelligent Automation Worker: analyzes video/image streams and generates structured, real-time operating steps for physical tasks (debug, repair, assembly,...
iaworker — Intelligent Automation Worker
Analyze video/image streams, diagnose physical problems, and generate structured step-by-step operating guidance. Deliver instructions both visually (displayed markdown) and audibly (TTS spoken aloud).
Core Workflow
┌─────────────────────────────────────────────────────────────────────┐
│ iaworker PROCESS │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ [1] RECEIVE INPUT │
│ Video file path, image path, or live camera frame │
│ ↓ │
│ [2] ANALYZE (video_analyzer.py) │
│ - Extract key frames │
│ - Identify objects, damage, components │
│ - Detect anomaly patterns (cracks, loose parts, fluid leaks) │
│ - Classify task type (repair / assembly / inspection / debug) │
│ ↓ │
│ [3] GENERATE STEPS (step_engine.py) │
│ - Build ordered, numbered action steps │
│ - Include tool requirements, safety warnings │
│ - Flag prerequisite steps (disconnect power, etc.) │
│ - Estimate difficulty/time for each step │
│ ↓ │
│ [4] DELIVER (speaker.py + display) │
│ - Display formatted markdown step guide │
│ - Speak each step aloud via TTS │
│ - Step-by-step progression (not all at once) │
│ - Wait for user confirmation before advancing (configurable) │
│ │
└─────────────────────────────────────────────────────────────────────┘
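The four stages above can be read as a plain function pipeline. The sketch below is illustrative only: `Analysis`, `analyze`, `generate_steps`, and `deliver` are hypothetical stand-ins, not the actual interfaces of video_analyzer.py, step_engine.py, or speaker.py, and the analysis itself is stubbed out.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Analysis:
    """Hypothetical container for the output of stage [2]."""
    task_type: str
    objects: list[str] = field(default_factory=list)
    anomalies: list[str] = field(default_factory=list)


def analyze(source: str, task: str = "auto") -> Analysis:
    # Stub: a real analyzer would extract key frames and detect anomalies.
    # Falling back to "inspection" for "auto" is an arbitrary placeholder.
    return Analysis(task_type="inspection" if task == "auto" else task)


def generate_steps(analysis: Analysis) -> list[dict]:
    # Stub: one numbered assessment step per detected anomaly.
    return [
        {"number": i, "title": f"Inspect: {anomaly}"}
        for i, anomaly in enumerate(analysis.anomalies, start=1)
    ]


def deliver(steps: list[dict],
            confirm: Callable[[dict], bool] = lambda step: True) -> list[int]:
    """Show steps one at a time; stop when the user declines to advance."""
    delivered = []
    for step in steps:
        print(f"Step {step['number']}: {step['title']}")
        delivered.append(step["number"])
        if not confirm(step):
            break
    return delivered
```

Passing `confirm` as a callable keeps the confirmation gate from stage [4] testable without blocking on keyboard input.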
Quick Start
Analyze an image and get spoken steps
python scripts/video_analyzer.py \
--input /path/to/image.jpg \
--task repair \
--lang en \
--speak
Analyze a video and get per-segment steps
python scripts/video_analyzer.py \
--input /path/to/video.mp4 \
--task debug \
--lang en \
--speak \
--step-by-step
Analyze from camera feed (live)
python scripts/video_analyzer.py \
--input camera \
--task inspection \
--lang en \
--speak \
--live
Scripts
video_analyzer.py
Entry point. Analyzes visual input and triggers step generation.
python scripts/video_analyzer.py [options]
Options:
| Flag | Description | Default |
|---|---|---|
| --input PATH | Image path, video path, or camera for live | Required |
| --task TYPE | repair, debug, assembly, inspection, auto | auto |
| --lang CODE | en or zh | en |
| --speak | Enable TTS for step output | Disabled |
| --step-by-step | Speak and display one step at a time, wait for confirmation | Sequential mode |
| --live | Live camera mode with continuous analysis | Off |
| --output PATH | Write steps to markdown file | None (console only) |
| --frame-skip N | Skip every N frames in video (speeds up analysis) | 10 |
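As a rough illustration of how the flags in this table could map onto a parser, here is a minimal argparse sketch. It is an assumption about structure, not the actual contents of video_analyzer.py; `build_parser` is a hypothetical name, and only the flags and defaults listed in the table are included.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="video_analyzer.py")
    parser.add_argument("--input", required=True,
                        help="Image path, video path, or 'camera' for live input")
    parser.add_argument("--task", default="auto",
                        choices=["repair", "debug", "assembly", "inspection", "auto"])
    parser.add_argument("--lang", default="en", choices=["en", "zh"])
    parser.add_argument("--speak", action="store_true",
                        help="Enable TTS for step output")
    parser.add_argument("--step-by-step", action="store_true",
                        help="Deliver one step at a time and wait for confirmation")
    parser.add_argument("--live", action="store_true",
                        help="Live camera mode with continuous analysis")
    parser.add_argument("--output", default=None,
                        help="Write steps to a markdown file")
    parser.add_argument("--frame-skip", type=int, default=10)
    return parser
```

For example, `build_parser().parse_args(["--input", "camera", "--live"])` yields `task="auto"`, `lang="en"`, and `frame_skip=10`.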
Task auto-detection:
- repair: Something is broken; find damage, suggest fixes
- debug: Something isn't working; trace fault to cause
- assembly: Something needs to be built/put together
- inspection: Check condition, report findings
step_engine.py
Generates structured steps from analysis results.
from step_engine import StepEngine

engine = StepEngine(lang="en")
steps = engine.generate(
    task_type="repair",
    objects=["wheel", "chain", "brake caliper"],
    anomalies=["chain loose", "brake pad worn"],
    context={"bike_type": "mountain"},
)

# Print each step with its tools, time estimate, and any safety warning
for step in steps:
    print(step["number"], step["title"])
    print(step["description"])
    print(f"[Tools: {', '.join(step['tools'])}] [Time: {step['time_estimate']}]")
    if step["safety_warning"]:
        print(f"⚠️ {step['safety_warning']}")
Step object schema:
{
    "number": int,                 # 1-based step number
    "title": str,                  # Short action title
    "description": str,            # Detailed description
    "tools": list[str],            # Required tools
    "time_estimate": str,          # e.g. "5-10 min"
    "difficulty": str,             # "easy" | "medium" | "hard" | "expert"
    "safety_warning": str | None,  # Warning text, or None if there is none
    "prerequisite": bool,          # Must be done before others proceed
    "common_mistakes": list[str],  # What to avoid
}
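The schema above can be written down as a typed structure with a small validity check. This is a sketch under assumptions: `Step`, `DIFFICULTIES`, and `validate_step` are hypothetical names and are not stated to exist in step_engine.py.

```python
from typing import Optional, TypedDict


class Step(TypedDict):
    number: int
    title: str
    description: str
    tools: list[str]
    time_estimate: str
    difficulty: str
    safety_warning: Optional[str]
    prerequisite: bool
    common_mistakes: list[str]


DIFFICULTIES = {"easy", "medium", "hard", "expert"}


def validate_step(step: dict) -> list[str]:
    """Return a list of problems; an empty list means the step looks well-formed."""
    problems = []
    missing = set(Step.__annotations__) - set(step)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if not isinstance(step.get("number"), int) or step.get("number", 0) < 1:
        problems.append("number must be an integer >= 1")
    if step.get("difficulty") not in DIFFICULTIES:
        problems.append("difficulty must be one of easy/medium/hard/expert")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a generated step at once.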
Difficulty classification:
| Level | Indicator |
|---|---|
| easy | No special tools, minimal risk |
| medium | Basic tools, some disassembly |
| hard | Specialty tools, significant disassembly |
| expert | Professional tools, structural risk |
speaker.py
Handles TTS output and markdown display.
from speaker import Speaker

speaker = Speaker(lang="en", tts_enabled=True)
speaker.display_and_speak("Step 1: Inspect the chain tensioner")
speaker.display_steps(steps)  # steps: list of step dicts from StepEngine.generate()
speaker.speak_only("Make sure to wear safety glasses.")
speaker.wait_for_user("Press Enter when ready to continue")
Features:
- gtts (Google TTS): default, works out of the box
- pyttsx3: offline fallback
- Markdown rendering in terminal with the rich library
- Per-step speak with configurable pacing
- Confirmation gating between steps (for --step-by-step mode)
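The gtts-first, pyttsx3-fallback behaviour in the feature list could be implemented by checking which packages are importable. The sketch below shows one way to do that; `ENGINE_ORDER`, `is_installed`, and `pick_tts_engine` are hypothetical names, and how speaker.py actually selects an engine is not documented here.

```python
import importlib.util

# Preference order taken from the feature list: gtts first, pyttsx3 as offline fallback.
ENGINE_ORDER = ("gtts", "pyttsx3")


def is_installed(module_name: str) -> bool:
    """True if the module can be imported in the current environment."""
    return importlib.util.find_spec(module_name) is not None


def pick_tts_engine(preferred: str = "gtts", installed=is_installed):
    """Return the first usable engine name, or None if no TTS engine is available."""
    candidates = [preferred] + [name for name in ENGINE_ORDER if name != preferred]
    for name in candidates:
        if installed(name):
            return name
    return None
```

The `installed` parameter is injectable so the selection logic can be exercised without either TTS package present.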
Step Generation Guidelines
Steps must follow this structure:
- Prerequisites — Things that must be done first (disconnect power, secure object, etc.)
- Assessment — Inspect and confirm the problem
- Preparation — Gather tools, clear workspace
- Main actions — Numbered, one clear action per step
- Verification — Test that the fix/assembly worked
- Cleanup — Put back together, tidy tools
Rules:
- Each step = one action. If it has "and", it's two steps.
- Always include a safety check step after anything involving power, hot parts, or fluids.
- Difficulty and time estimate must be realistic.
- Flag the most common mistakes for each step.
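Two of the rules above lend themselves to a mechanical check: the "and" rule and the requirement that prerequisites come first. The following is a heuristic sketch, not part of the shipped step engine; `lint_steps` is a hypothetical helper that reads only the `number`, `title`, and `prerequisite` fields.

```python
import re


def lint_steps(steps: list[dict]) -> list[str]:
    """Flag steps that appear to break the guidelines. Heuristic only."""
    warnings = []
    seen_non_prerequisite = False
    for step in steps:
        n, title = step["number"], step["title"]
        # Rule: one action per step; a standalone "and" suggests two actions.
        if re.search(r"\band\b", title, flags=re.IGNORECASE):
            warnings.append(f"step {n}: title contains 'and'; consider splitting")
        # Structure: prerequisite steps should come before everything else.
        if step.get("prerequisite"):
            if seen_non_prerequisite:
                warnings.append(f"step {n}: prerequisite appears after other steps")
        else:
            seen_non_prerequisite = True
    return warnings
```

The "and" check is deliberately crude: it would also flag a title such as "Rest bike on seat and handlebars", where "and" joins two nouns and not two actions, so its output is best treated as a prompt for review.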
Configuration
Config file: scripts/config.yaml
tts:
  engine: "gtts"          # "gtts" or "pyttsx3"
  lang: "en"
  speed: 1.0              # 0.5 = slow, 2.0 = fast
  volume: 1.0             # 0.0 to 1.0
display:
  use_rich: true          # Pretty terminal output
  color: "cyan"           # Step highlight color
  show_icons: true        # Show ✅ ⚠️ 🔧 icons
analysis:
  default_task: "auto"
  frame_skip: 10
  confidence_threshold: 0.6
step_delivery:
  auto_speak: true
  wait_confirmation: false
  speak_difficulty: true
  speak_time_estimate: true
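A partial config file is easiest to handle by overlaying it on built-in defaults. The sketch below shows a recursive merge; `DEFAULTS` (a subset of the keys above) and `merge_config` are hypothetical, and how the scripts actually load config.yaml is not documented here.

```python
import copy

# Hypothetical defaults, copied from a subset of the config keys above.
DEFAULTS = {
    "tts": {"engine": "gtts", "lang": "en", "speed": 1.0, "volume": 1.0},
    "analysis": {"default_task": "auto", "frame_skip": 10, "confidence_threshold": 0.6},
    "step_delivery": {"auto_speak": True, "wait_confirmation": False},
}


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay overrides on defaults without mutating either."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

If the project reads the file with PyYAML (an assumption), the `overrides` argument would be the dict returned by `yaml.safe_load` on scripts/config.yaml.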
Task Reference
Bike Repair — Chain Adjustment
🔧 Tools: Hex keys (4mm, 5mm), chain tool, lubricant
⏱ Time: 15-25 min
⚠️ Safety: Flip bike first — chain tension releases can snap
- Flip bike, rest on seat and handlebars
- Inspect chain for stiff links, rust, kinks
- Loosen rear axle bolts (5mm hex)
- Adjust chain tension via horizontal dropouts
- Check tension: 10-15mm deflection at midpoint
- Re-tighten axle bolts
- Lubricate if needed, wipe excess
- Test ride
Car Debug — Engine Won't Start
🔧 Tools: OBD2 scanner, multimeter, basic socket set
⏱ Time: 20-40 min (diagnosis first)
⚠️ Safety: Disable ignition, disconnect battery negative first
- Check if fuel pump primes (turn key to ON, listen)
- Test battery voltage (>12.4V at rest with the engine off; >13.5V if the engine runs)
- Connect OBD2 scanner, read fault codes
- Inspect spark plugs for gap/damage
- Check for crank/cam position sensor signals
- Verify immobilizer status
- Narrow to most likely cause, then address
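The battery thresholds in the checklist above reduce to a one-line comparison. This is a minimal sketch using only the two figures given there; `battery_ok` is a hypothetical helper, not part of iaworker.

```python
def battery_ok(voltage: float, engine_running: bool) -> bool:
    """Compare a multimeter reading against the checklist thresholds."""
    # 13.5 V applies while the engine is running, 12.4 V with the engine off.
    threshold = 13.5 if engine_running else 12.4
    return voltage > threshold
```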
Generic Assembly — IKEA-style
🔧 Tools: Hex key (included), Phillips screwdriver, hammer
⏱ Time: varies
⚠️ Safety: Enlist a second person for large panels
- Unpack and sort all hardware (count screws, dowels)
- Lay out all panels, identify front/back
- Pre-assemble sub-groups before final join
- Hand-tighten all screws first
- Use cardboard to protect floors
- Final torque pass after 24h
Troubleshooting
"No audio output"
- Check if gtts is installed: pip install gtts
- Fallback: set engine: pyttsx3 in config (offline)
- On headless servers: set DISPLAY env var or use pyttsx3
"Analysis is slow on video"
- Increase --frame-skip (e.g., --frame-skip 30)
- Use --input camera --live for real-time with throttled analysis
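To see why a larger --frame-skip helps, it is useful to count how many frames are left to analyze. The sketch below assumes the flag means "analyze one frame out of every N"; the exact semantics in video_analyzer.py are not specified here, and `sampled_frame_indices` is a hypothetical helper.

```python
def sampled_frame_indices(total_frames: int, frame_skip: int) -> list[int]:
    """Indices of frames to analyze when keeping one frame out of every frame_skip."""
    if frame_skip < 1:
        raise ValueError("frame_skip must be >= 1")
    return list(range(0, total_frames, frame_skip))
```

Under that assumption, a 60-second clip at 30 fps (1800 frames) leaves 180 frames to analyze with the default of 10 and 60 frames with --frame-skip 30.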
"Steps are too generic"
- Provide more context in the initial prompt
- Use --task repair explicitly if auto-detect fails
- For specialized equipment, the LLM analysis quality depends on prompt specificity
"OpenCV camera not found"
- Check camera index: python scripts/video_analyzer.py --input camera --list-devices
- Try --input camera --camera-index 1 if default is wrong
Extending for Specific Domains
iaworker ships with general-purpose analysis. To add domain-specific knowledge:
- Create references/domains/MYDOMAIN.md with known failure modes and tool lists
- In step_engine.py, add a DOMAIN_HANDLERS map that loads these
- The step engine will then reference domain files when generating steps
Example domain file:
# Domain: electric_bike
## Common Failures
- Motor controller overheating → reduce load, check ventilation
- Battery BMS cutout → reset via unplugging 30s
- Torque sensor miscalibration → re-zero via display menu
## Safety
- Never open motor housing — high voltage capacitors retain charge
- Battery must be removed before any repair
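A domain file in this shape can be loaded with a few lines of parsing. The sketch below is one possible approach, assuming the `# Domain:` title line and `##` section headings shown in the example; `parse_domain_file` is a hypothetical helper, and the real structure of DOMAIN_HANDLERS is left to the implementer.

```python
def parse_domain_file(text: str) -> dict:
    """Split a domain markdown file into its name and per-section bullet lists."""
    result = {"domain": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# Domain:"):
            result["domain"] = line[len("# Domain:"):].strip()
        elif line.startswith("## "):
            # Start a new section; subsequent bullets are collected under it.
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("- ") and current is not None:
            result["sections"][current].append(line[2:].strip())
    return result
```

A DOMAIN_HANDLERS map could then be built by calling this on each file under references/domains/ and keying the results by the parsed `domain` value.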
