OpenClaw VLN Planner

v1.0.0

Plan the next high-level navigation step for a robot from a user navigation instruction, one current image, and a sequence of historical images. Use when the...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for tiktokdad/openclaw-vln-planner.

Prompt preview: Install & Setup
Install the skill "OpenClaw VLN Planner" (tiktokdad/openclaw-vln-planner) from ClawHub.
Skill page: https://clawhub.ai/tiktokdad/openclaw-vln-planner
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install openclaw-vln-planner

ClawHub CLI


npx clawhub@latest install openclaw-vln-planner
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description, SKILL.md, config, and vln_bridge.py all align: the planner builds a prompt from historical+current images and an instruction, queries a multimodal model, parses JSON, validates bounds, and forwards a mid-level action. Network access to a model and reading image files are expected for this purpose.
Instruction Scope
Runtime instructions and the Python bridge explicitly read image files, load a YAML config containing model base_url/api_key/model_id, and send base64-encoded images to the configured OpenAI-compatible gateway. This behavior is necessary for a multimodal planner but means camera frames (potentially sensitive) are transmitted to an external service. SKILL.md and code limit outputs to pure JSON and define safety fallbacks.
Install Mechanism
This is an instruction-only skill with a small example Python bridge; there is no install spec, no external downloads, and only a minimal requirements.txt (requests, PyYAML). No extraction from arbitrary URLs or package installation is present.
Credentials
The package does not declare required env vars or primary credentials, but the bridge requires a model base_url and api_key in a YAML config file (config/vln-config.yaml). Expect to provide credentials for the model gateway; that is proportional to the task, but the manifest's omission of a required config/credential declaration is a small inconsistency to be aware of.
Persistence & Privilege
The skill does not request persistent or system privileges, does not set always:true, and has no install actions that modify other skills or system-wide settings. The bridge runs as a standalone script and defaults to dry-run mode, printing planned actions instead of executing them.
Assessment
This skill appears to do what it says: it reads local camera frame files, base64-encodes them, and POSTs them (with the model API key from a YAML config) to whatever OpenAI-compatible gateway you configure. Before installing:

  1. Confirm you trust the gateway endpoint and operator, because camera images and any scene data will be transmitted.
  2. Store the API key securely (the example uses a config file rather than env vars) and update the manifest if you need policy or audit visibility.
  3. Keep the executor in dry_run while testing, and review or replace the placeholder execute_* functions so the planner cannot command hardware until you have integrated a vetted execution bridge.
  4. If you require stricter telemetry controls, inspect or modify image_to_data_url and build_messages to avoid sending raw images or to anonymize them.

The small manifest omission (no declared required config path or credential) is not malicious, but it is worth correcting for clarity and safety.

Like a lobster shell, security has layers — review code before you run it.

latest: vk97e0e2ba2s5jh60cpk3b42ksx83j9as
123 downloads · 0 stars · 1 version
Updated 1 mo ago · v1.0.0 · MIT-0

OpenClaw VLN Planner

Use this skill when the user wants a robot to follow a natural-language navigation instruction from visual observations.

This skill is a high-level navigation planner. It does not produce motor, joint, torque, or trajectory control. It only produces one structured mid-level navigation action at a time.

When this skill triggers

Trigger this skill when the task includes one or more of the following:

  • Vision-language navigation (VLN)
  • Robot next-step planning from camera images
  • Closed-loop navigation with replanning after each observation
  • Converting a current frame plus historical frames into a single next navigation action
  • Sending current + history images to an OpenAI-compatible multimodal gateway for action prediction

Required inputs

The planner expects:

  • user_instruction: natural-language navigation instruction
  • current_frame: exactly one current image
  • history_frames: zero or more previous images in temporal order

Optional inputs:

  • robot_state: heading, speed, pose estimate, execution feedback, etc.
  • safety_flags: blocked, collision_risk, lost, target_reached, low_visibility, etc.
  • config_path: path to the runtime config file
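
For illustration only, a minimal sketch of how these inputs might be bundled in Python; the field names mirror the lists above, and the bundled vln_bridge.py may structure them differently.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PlannerInput:
    user_instruction: str
    current_frame: str                                   # path to exactly one current image
    history_frames: list = field(default_factory=list)   # previous image paths, oldest first
    robot_state: Optional[dict] = None                   # heading, speed, pose estimate, feedback
    safety_flags: Optional[dict] = None                  # blocked, collision_risk, lost, ...
    config_path: str = "config/vln-config.yaml"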

Output contract

Output must be pure JSON only. Do not prepend or append prose.

Allowed action types only:

  • MOVE_FORWARD
  • TURN_LEFT
  • TURN_RIGHT
  • STOP

Expected JSON shape:

{
  "next_action": {
    "type": "MOVE_FORWARD",
    "value": 75,
    "unit": "cm"
  },
  "task_status": "in_progress",
  "confidence": 0.87,
  "notes": "continue along the hallway"
}

Completion example:

{
  "next_action": {
    "type": "STOP"
  },
  "task_status": "completed",
  "confidence": 0.93,
  "notes": "goal reached"
}

Core rules

  1. Plan only the next action.
  2. Never output a full route.
  3. Replan after each execution step.
  4. If uncertain, unsafe, blocked, unable to parse, or visually ambiguous, output STOP.
  5. Enforce action bounds:
    • MOVE_FORWARD: 10-150 cm
    • TURN_LEFT: 5-90 deg
    • TURN_RIGHT: 5-90 deg
    • STOP: no value/unit required
  6. If safety_flags.target_reached == true, output STOP with task_status = completed.
  7. If blocked, collision_risk, lost, or severe uncertainty is present, prefer STOP.

Runtime configuration

Before running, load a YAML config file such as config/vln-config.yaml.

The config should define:

  • subscribed or logical input topics / channels for current frame and history frame collection
  • optional robot state and safety flag sources
  • OpenAI-compatible multimodal gateway settings: base_url, api_key, model_id
  • planner behavior such as confidence threshold and safety fallback
  • executor bridge mode (default: Python function bridge)

Read references/navigation-schema.md for the expected config structure.
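
As an illustration, a minimal Python sketch of loading such a config; the key names used here (gateway, base_url, api_key, model_id) are assumptions drawn from the list above, and references/navigation-schema.md remains the authoritative contract.

import yaml

def load_config(config_path="config/vln-config.yaml"):
    with open(config_path, "r", encoding="utf-8") as f:
        cfg = yaml.safe_load(f)
    gateway = cfg["gateway"]                      # assumed top-level key
    for key in ("base_url", "api_key", "model_id"):
        if not gateway.get(key):
            raise ValueError(f"missing gateway setting: {key}")
    return cfg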

Internal module design

1) context builder

Build a model input payload from:

  • user instruction
  • historical observations
  • current observation
  • optional robot state
  • optional safety flags

The prompt must explicitly separate:

  • historical observations
  • current observation
  • user instruction
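
A sketch of what such a builder might look like, assuming an OpenAI-style messages payload with base64 data URLs. The helper names (image_to_data_url, build_messages) follow the security review above, but the signatures in the bundled script may differ.

import base64
import mimetypes

SYSTEM_PROMPT = "You are a robot navigation planner."    # see the prompt template below

def image_to_data_url(path):
    mime = mimetypes.guess_type(path)[0] or "image/jpeg"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"

def build_messages(instruction, current_frame, history_frames,
                   robot_state=None, safety_flags=None):
    content = [{"type": "text", "text": "Historical observations (oldest first):"}]
    for frame in history_frames:
        content.append({"type": "image_url", "image_url": {"url": image_to_data_url(frame)}})
    content.append({"type": "text", "text": "Current observation:"})
    content.append({"type": "image_url", "image_url": {"url": image_to_data_url(current_frame)}})
    content.append({"type": "text", "text": f"User instruction: {instruction}"})
    if robot_state:
        content.append({"type": "text", "text": f"Robot state: {robot_state}"})
    if safety_flags:
        content.append({"type": "text", "text": f"Safety flags: {safety_flags}"})
    return [{"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": content}]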

2) action planner

Call an OpenAI-compatible multimodal gateway with:

  • one current image
  • historical images
  • planner prompt
  • optional structured context

The model should be asked to return pure JSON for exactly one next action.
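
A sketch of the call using requests (the only HTTP dependency in scripts/requirements.txt). The /chat/completions path and response shape assume a standard OpenAI-compatible gateway, and the config keys come from the config sketch above.

import requests

def call_planner(cfg, messages):
    gateway = cfg["gateway"]                      # assumed key, see the config sketch
    resp = requests.post(
        gateway["base_url"].rstrip("/") + "/chat/completions",
        headers={"Authorization": f"Bearer {gateway['api_key']}"},
        json={"model": gateway["model_id"], "messages": messages, "temperature": 0},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]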

3) action parser

Parse the model result as JSON.

If parsing fails:

  • try safe extraction of the first JSON object
  • if still invalid, fall back to STOP
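
A minimal sketch of that tolerant parsing path; the regex-based extraction is one possible way to pull out the first JSON object, and the fallback matches the Failure handling section.

import json
import re

FALLBACK = {"next_action": {"type": "STOP"}, "task_status": "failed",
            "confidence": 0.0, "notes": "fallback_stop"}

def parse_action(raw):
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}", raw, re.DOTALL)  # widest {...} span in the reply
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    return dict(FALLBACK)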

4) action validator

Validate:

  • action type is one of the four allowed values
  • distance and angle ranges are legal
  • unit matches action type
  • confidence is numeric if present
  • task_status is one of in_progress, completed, failed

Any invalid output falls back to STOP.
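
A sketch of these checks, with the bounds copied from the Core rules section.

BOUNDS = {
    "MOVE_FORWARD": ("cm", 10, 150),
    "TURN_LEFT": ("deg", 5, 90),
    "TURN_RIGHT": ("deg", 5, 90),
}

def is_valid(plan):
    action = plan.get("next_action") or {}
    a_type = action.get("type")
    if plan.get("task_status") not in ("in_progress", "completed", "failed"):
        return False
    if "confidence" in plan and not isinstance(plan["confidence"], (int, float)):
        return False
    if a_type == "STOP":
        return True                               # no value/unit required
    if a_type not in BOUNDS:
        return False
    unit, lo, hi = BOUNDS[a_type]
    value = action.get("value")
    return (action.get("unit") == unit
            and isinstance(value, (int, float))
            and lo <= value <= hi)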

5) executor bridge

Forward the validated mid-level action to a separate execution layer.

Reserved Python bridge interface:

  • execute_move_forward(distance_cm)
  • execute_turn_left(angle_deg)
  • execute_turn_right(angle_deg)
  • execute_stop()
  • get_robot_state()
  • get_safety_flags()

Do not hardcode a robot SDK into the planner logic.
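
A dry-run sketch of that interface, consistent with the note above that the bundled bridge prints actions by default rather than driving hardware; replace the bodies with calls into your own execution layer.

DRY_RUN = True

def execute_move_forward(distance_cm):
    if DRY_RUN:
        print(f"[dry-run] MOVE_FORWARD {distance_cm} cm")
    else:
        raise NotImplementedError("wire this to your motion interface")

def execute_turn_left(angle_deg):
    if DRY_RUN:
        print(f"[dry-run] TURN_LEFT {angle_deg} deg")
    else:
        raise NotImplementedError("wire this to your motion interface")

def execute_turn_right(angle_deg):
    if DRY_RUN:
        print(f"[dry-run] TURN_RIGHT {angle_deg} deg")
    else:
        raise NotImplementedError("wire this to your motion interface")

def execute_stop():
    print("[dry-run] STOP" if DRY_RUN else "STOP")

def get_robot_state():
    return {}                                     # e.g. heading, speed, pose estimate

def get_safety_flags():
    return {}                                     # e.g. blocked, collision_risk, lost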

6) replanning loop

Use the planner in a closed loop:

  1. gather current frame + history frames
  2. gather optional robot state / safety flags
  3. call multimodal planner
  4. parse and validate JSON action
  5. execute through bridge
  6. observe again
  7. repeat until task_status = completed or forced stop
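
Putting the steps together, a sketch of the loop assuming the helpers from the earlier sketches (load_config, build_messages, call_planner, parse_action, is_valid, and the execute_* bridge) are in scope; frame acquisition is a user-supplied callable because it depends on your camera setup.

def run(instruction, get_current_frame, config_path="config/vln-config.yaml",
        max_steps=100, confidence_threshold=0.5):
    cfg = load_config(config_path)
    history = []
    for _ in range(max_steps):
        frame = get_current_frame()               # returns a path to the latest image
        messages = build_messages(instruction, frame, history,
                                  get_robot_state(), get_safety_flags())
        plan = parse_action(call_planner(cfg, messages))
        if not is_valid(plan) or plan.get("confidence", 1.0) < confidence_threshold:
            execute_stop()
            return "failed"
        action = plan["next_action"]
        if action["type"] == "MOVE_FORWARD":
            execute_move_forward(action["value"])
        elif action["type"] == "TURN_LEFT":
            execute_turn_left(action["value"])
        elif action["type"] == "TURN_RIGHT":
            execute_turn_right(action["value"])
        else:
            execute_stop()
        if plan["task_status"] != "in_progress" or action["type"] == "STOP":
            return plan["task_status"]
        history.append(frame)                     # keep for the next observation round
    execute_stop()
    return "failed"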

7) safety fallback

Always stop on:

  • parse failure
  • invalid action
  • confidence below threshold
  • blocked / collision risk / lost / target reached
  • missing visual evidence for safe motion
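
For example, a small guard over the flag-based conditions above; the flag names are assumptions taken from the Optional inputs list.

def must_stop(safety_flags):
    flags = safety_flags or {}
    return any(flags.get(name) for name in
               ("blocked", "collision_risk", "lost", "target_reached", "low_visibility"))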

Prompt template

Use this prompt pattern:

You are a robot navigation planner.
You will receive:
1. historical observations
2. current observation
3. a user instruction
4. optional robot state and safety flags

Your job is to decide the robot's next single mid-level navigation action.
You may output only one of these actions:
- MOVE_FORWARD with distance in cm
- TURN_LEFT with angle in deg
- TURN_RIGHT with angle in deg
- STOP

Rules:
- Plan only the next step, not the whole route.
- If the goal has been reached, output STOP.
- If you are uncertain, the scene is unclear, or there is any safety risk, output STOP.
- MOVE_FORWARD must be 10-150 cm.
- TURN_LEFT and TURN_RIGHT must be 5-90 deg.
- Output pure JSON only, with no extra explanation.

Example user requests

  • "Go down the hallway and stop at the blue door."
  • "Move to the kitchen entrance."
  • "Find the end of the corridor and stop."
  • "Turn right at the next intersection and continue."

Failure handling

If anything is wrong with the output, return:

{
  "next_action": {
    "type": "STOP"
  },
  "task_status": "failed",
  "confidence": 0.0,
  "notes": "fallback_stop"
}

Bundled resources

  • references/navigation-schema.md: schema, bounds, safety fallback, examples, config contract
  • scripts/vln_bridge.py: example OpenAI-compatible multimodal planner + Python executor bridge
  • scripts/requirements.txt: Python dependencies
  • config/vln-config.yaml: runtime config template
