Nm Leyline Utility

v1.0.0

Score candidate agent actions by expected gain, cost, uncertainty, and redundancy to guide dispatch and termination decisions

Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (utility for scoring action candidates) match the instructions: all files describe how to build state and compute Gain/Cost/Uncertainty/Redundancy. No unrelated binaries, credentials, or install steps are requested.
Instruction Scope
SKILL.md and modules instruct the LLM to read conversation history, TaskList, session token counters, files read, and dispatched-agent metadata to construct state — these are expected for an orchestration utility, but they do assume the consumer skill/agent grants access to that contextual data. The prescriptive mode (utility_gated: true) can force consumers to follow the selected action, which is a behavioral enforcement worth noting before enabling.
Install Mechanism
Instruction-only skill with no install spec and no code files; nothing is written to disk or downloaded. This is the lowest-risk install profile.
Credentials
No environment variables, credentials, or config paths are required. The skill relies on in-agent data (conversation history, TaskList, token counters), which is appropriate for its function.
Persistence & Privilege
always:false and no install means it does not persist or auto-enable itself. However, if a consumer sets frontmatter utility_gated:true, the consumer skill must follow the selected action (prescriptive enforcement) — this gives the utility influence over run-time decisions but is opt-in by the consumer.
Assessment
This skill is internally coherent and low-risk: it only provides heuristics and requires no credentials or external installs. Before enabling it in production, confirm (1) which consuming skills will be allowed to set utility_gated:true (prescriptive mode can force actions), (2) that you are comfortable granting consumer skills access to conversation history/TaskList/token counters, and (3) that consumers correctly log/justify any overrides. Absence of regex findings is expected for an instruction-only skill — review integrations and frontmatter of consumer skills that adopt it to ensure they don't inadvertently grant excessive privileges.


Runtime requirements

🦞 Clawdis
MIT-0

Night Market Skill — ported from claude-night-market/leyline. For the full experience with agents, hooks, and commands, install the Claude Code plugin.

Utility Skill

Overview

A decision framework for agent orchestration based on Liu et al., "Utility-Guided Agent Orchestration for Efficient LLM Tool Use" (arXiv:2603.19896). Each candidate action is scored by subtracting weighted costs from expected gain, producing a single utility value that guides action selection. The framework prevents over-calling tools and premature stopping by making both errors costly. Utility range is [-2.3, 1.0].

When To Use

  • Deciding whether to dispatch another agent or tool call
  • Gating expensive tool calls (search, code execution, delegation)
  • Selecting the right model tier for a sub-task
  • Continuation decisions after receiving partial results
  • Verification gating before writing or committing output

When NOT to Use

  • Single-step operations with one obvious action
  • Trivial tasks where cost of scoring exceeds benefit
  • Already-committed actions that cannot be undone

Action Space

A = {respond, retrieve, tool_call, verify, delegate, stop}

| Action | Description |
| --- | --- |
| respond | Emit a final answer from current context |
| retrieve | Fetch additional information (search, read, lookup) |
| tool_call | Execute a tool (code runner, API, file write) |
| verify | Check a prior result for correctness or completeness |
| delegate | Spawn a sub-agent or hand off to a specialist |
| stop | Terminate the loop and return current state |

Utility Function

U(a | s_t) = Gain(a | s_t)
           - λ₁ · StepCost(a | s_t)
           - λ₂ · Uncertainty(a | s_t)
           - λ₃ · Redundancy(a | s_t)

| Parameter | Default | Rationale |
| --- | --- | --- |
| λ₁ | 1.0 | Cost baseline; all other weights relative to this |
| λ₂ | 0.5 | Weak empirical correlation with outcome (r=0.0131) |
| λ₃ | 0.8 | Redundancy pruning yields ~10% token savings |

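Utility range: [-2.3, 1.0]. These bounds follow from the default weights, assuming each of the four components is normalized to [0, 1]: the maximum is Gain = 1 with no penalties, and the minimum is 0 - 1.0 - 0.5 - 0.8 = -2.3. Positive values indicate the action is worth taking. Values below the floor (-0.5 default) indicate the action should be skipped.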
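
As a rough illustration of the formula and default weights above, the following Python sketch computes U(a | s_t) for a single action. The `Components` class, the `utility` function, and the assumption that every component lies in [0, 1] are illustrative and not part of the skill itself.

```python
from dataclasses import dataclass

# Default weights from the parameter table.
LAMBDA_COST = 1.0         # λ₁
LAMBDA_UNCERTAINTY = 0.5  # λ₂
LAMBDA_REDUNDANCY = 0.8   # λ₃


@dataclass(frozen=True)
class Components:
    """Per-action estimates (hypothetical container).

    Assumption: each field is normalized to [0, 1].
    """
    gain: float
    step_cost: float
    uncertainty: float
    redundancy: float


def utility(c: Components,
            l1: float = LAMBDA_COST,
            l2: float = LAMBDA_UNCERTAINTY,
            l3: float = LAMBDA_REDUNDANCY) -> float:
    """U = Gain - l1*StepCost - l2*Uncertainty - l3*Redundancy."""
    return c.gain - l1 * c.step_cost - l2 * c.uncertainty - l3 * c.redundancy


# Extremes of the range under the [0, 1] assumption.
best = utility(Components(gain=1.0, step_cost=0.0, uncertainty=0.0, redundancy=0.0))
worst = utility(Components(gain=0.0, step_cost=1.0, uncertainty=1.0, redundancy=1.0))
print(round(best, 2), round(worst, 2))
```

Under those assumptions the two extremes reproduce the stated range, up to floating-point rounding.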

Termination Conditions

Stop the loop when any of the following is true:

  • (a) Selected action is stop
  • (b) Step budget exhausted (default: 10 steps)
  • (c) All non-stop actions score below the floor (default: -0.5)

High-gain override: If Gain >= 0.7 for any action, condition (c) may be overridden. Document the override and the gain value in your reasoning trace.
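
One possible way to encode conditions (a) through (c) and the high-gain override is sketched below. The function name `check_termination`, its arguments, the order in which conditions are checked, and the reading of "budget exhausted" as `steps_taken >= STEP_BUDGET` are all assumptions made for illustration.

```python
from typing import Mapping, Tuple

STEP_BUDGET = 10  # default step budget
FLOOR = -0.5      # default utility floor
HIGH_GAIN = 0.7   # gain threshold for the high-gain override


def check_termination(
    selected: str,
    steps_taken: int,
    utilities: Mapping[str, float],
    gains: Mapping[str, float],
    allow_override: bool = True,
) -> Tuple[bool, str]:
    """Return (stop?, reason); the reason is meant for the reasoning trace."""
    if selected == "stop":
        return True, "(a) selected action is stop"
    if steps_taken >= STEP_BUDGET:
        return True, "(b) step budget exhausted"
    non_stop = [u for a, u in utilities.items() if a != "stop"]
    if non_stop and all(u < FLOOR for u in non_stop):
        top_gain = max(gains.values(), default=0.0)
        if allow_override and top_gain >= HIGH_GAIN:
            # The override is optional ("may be overridden"); record the gain.
            return False, f"(c) overridden: gain={top_gain:.2f} >= {HIGH_GAIN}"
        return True, "(c) all non-stop actions below floor"
    return False, "continue"


# Every non-stop action is below the floor, but retrieve has high gain.
utilities = {"retrieve": -0.6, "verify": -0.9, "stop": 0.0}
gains = {"retrieve": 0.75, "verify": 0.2}
print(check_termination("retrieve", 3, utilities, gains))
```

Returning the reason string alongside the decision keeps the override and its gain value available for logging.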

Quick Start

Minimal 4-step advisory pattern:

  1. Construct state -- gather task context per modules/state-builder.md
  2. Score candidates -- evaluate each action in A per modules/action-selector.md
  3. Prefer highest utility -- select the action with the maximum U(a | s_t), subject to termination conditions
  4. Log score and decision -- record the winning action, its utility value, and step count before executing
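
The four steps above might be sketched as follows. State construction is replaced by a hand-written `estimates` dictionary, since the actual procedure lives in modules/state-builder.md. The names `score` and `advise`, the example numbers, and the tie-breaking behavior (first action in list order wins) are hypothetical; the real tie-breaking rules are in modules/action-selector.md. Termination checks are omitted here.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("utility")

ACTIONS = ("respond", "retrieve", "tool_call", "verify", "delegate", "stop")
WEIGHTS = (1.0, 0.5, 0.8)  # default λ₁, λ₂, λ₃


def score(estimate):
    """Utility for one (gain, step_cost, uncertainty, redundancy) tuple."""
    gain, cost, uncertainty, redundancy = estimate
    l1, l2, l3 = WEIGHTS
    return gain - l1 * cost - l2 * uncertainty - l3 * redundancy


def advise(estimates, step):
    """Score all six actions, pick the highest utility, and log the decision."""
    missing = [a for a in ACTIONS if a not in estimates]
    if missing:
        raise ValueError(f"unscored actions: {missing}")
    scores = {a: score(estimates[a]) for a in ACTIONS}
    # Assumption: on ties, max() keeps the first action in ACTIONS order.
    best = max(ACTIONS, key=scores.__getitem__)
    log.info("step=%d action=%s utility=%.3f", step, best, scores[best])
    return best, scores


# Step 1 stand-in: made-up estimates in place of a real state builder.
estimates = {
    "respond":   (0.4, 0.1, 0.3, 0.0),
    "retrieve":  (0.6, 0.2, 0.2, 0.1),
    "tool_call": (0.5, 0.4, 0.3, 0.0),
    "verify":    (0.3, 0.2, 0.1, 0.5),
    "delegate":  (0.7, 0.6, 0.4, 0.0),
    "stop":      (0.0, 0.0, 0.0, 0.0),
}
best, scores = advise(estimates, step=1)
```

With these made-up numbers, `retrieve` scores highest at about 0.22, so in advisory mode it would be the preferred action.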

Detailed Resources

  • State Builder: modules/state-builder.md -- how to populate s_t from task context
  • Gain: modules/gain.md -- estimating expected information or progress gain
  • Step Cost: modules/step-cost.md -- token, latency, and monetary cost tables
  • Uncertainty: modules/uncertainty.md -- confidence estimation and calibration
  • Redundancy: modules/redundancy.md -- detecting duplicate or low-delta actions
  • Action Selector: modules/action-selector.md -- scoring loop and tie-breaking rules
  • Integration: modules/integration.md -- wiring utility scoring into existing orchestration loops

Exit Criteria

  • State constructed with task goal and prior steps
  • All six actions scored before selecting one
  • Termination condition checked after each step
  • Score and decision logged for each step taken
  • High-gain overrides documented with gain value
