workload-balancing

v0.1.0

Optimize workload distribution across workers, processes, or nodes for efficient parallel execution. Use when asked to balance work distribution, improve par...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for lnj22/parallel-tfidf-search-workload-balancing.

Prompt preview: Install & Setup
Install the skill "workload-balancing" (lnj22/parallel-tfidf-search-workload-balancing) from ClawHub.
Skill page: https://clawhub.ai/lnj22/parallel-tfidf-search-workload-balancing
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install parallel-tfidf-search-workload-balancing

ClawHub CLI


npx clawhub@latest install parallel-tfidf-search-workload-balancing
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description align with the content: the SKILL.md and references provide standard load‑balancing strategies and helper algorithms. No unrelated libs, credentials, or system access are requested.
Instruction Scope
Runtime instructions and code snippets are limited to partitioning, scheduling, and monitoring logic. They do not direct the agent to read system files, access credentials, or exfiltrate data. Example snippets reference a fetch() in an I/O example but do not instruct external data submission beyond normal network I/O patterns expected for I/O-bound tasks.
Install Mechanism
Instruction-only skill with no install spec and no code files to be executed by the platform; nothing is downloaded or written to disk by an installer.
Credentials
No required environment variables, credentials, or config paths are declared. The strategies shown do not require secrets and the declared requirements are minimal and proportionate.
Persistence & Privilege
The skill's "always" flag is false, and the skill does not request persistent or elevated platform privileges, nor does it modify other skills or system-wide configuration.
Assessment
This skill appears coherent and safe: it contains example algorithms and patterns for balancing workloads and asks for no credentials or installs. Before running any provided snippets in your environment, review and test them (they are illustrative and may need adaptation), ensure any referenced network calls (e.g., fetch in the I/O example) point to trusted endpoints, and avoid copying snippets directly into production without standard safety checks (timeouts, input validation, resource limits).

Like a lobster shell, security has layers — review code before you run it.

Latest: vk975ap6c4fva8yrtccczawwb7584xy9y
72 downloads
0 stars
1 version
Updated 1w ago
v0.1.0
MIT-0

Workload Balancing Skill

Distribute work efficiently across parallel workers to maximize throughput and minimize completion time.

Workflow

  1. Characterize the workload (uniform vs. variable task times)
  2. Identify bottlenecks (stragglers, uneven distribution)
  3. Select balancing strategy based on workload characteristics
  4. Implement partitioning and scheduling logic
  5. Monitor and adapt to runtime conditions

Load Balancing Decision Tree

What's the workload characteristic?

Uniform task times:
├── Known count → Static partitioning (equal chunks)
├── Streaming input → Round-robin distribution
└── Large items → Size-aware partitioning

Variable task times:
├── Predictable variance → Weighted distribution
├── Unpredictable → Dynamic scheduling / work stealing
└── Long-tail distribution → Work stealing + time limits

Resource constraints:
├── Memory-bound workers → Memory-aware assignment
├── Heterogeneous workers → Capability-based routing
└── Network costs → Locality-aware placement
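As a rough sketch, the branches of the tree above can be encoded as a small selector. The function and strategy names here are illustrative, not part of the skill:

```python
def choose_strategy(uniform, streaming=False, predictable=False):
    """Map the decision tree above to a strategy name (illustrative labels)."""
    if uniform:
        return "round_robin" if streaming else "static_chunking"
    return "weighted_distribution" if predictable else "work_stealing"
```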

Balancing Strategies

Strategy 1: Static Chunking (Uniform Workloads)

Best for: predictable, similar-sized tasks

from concurrent.futures import ProcessPoolExecutor
import numpy as np

def static_balanced_process(items, num_workers=4):
    """Divide work into equal chunks upfront."""
    # process_chunk is your user-supplied function that handles one chunk
    chunks = np.array_split(items, num_workers)

    with ProcessPoolExecutor(max_workers=num_workers) as executor:
        results = list(executor.map(process_chunk, chunks))

    return [item for chunk_result in results for item in chunk_result]

Strategy 2: Dynamic Task Queue (Variable Workloads)

Best for: unpredictable task durations

from concurrent.futures import ProcessPoolExecutor, wait, FIRST_COMPLETED

def dynamic_balanced_process(items, num_workers=4):
    """Workers pull tasks dynamically as they complete."""
    results = []

    with ProcessPoolExecutor(max_workers=num_workers) as executor:
        # Submit one task per worker initially
        futures = {executor.submit(process_item, item): item
                   for item in items[:num_workers]}
        pending = list(items[num_workers:])

        while futures:
            done, _ = wait(futures, return_when=FIRST_COMPLETED)

            for future in done:
                results.append(future.result())
                del futures[future]

                # Submit next task if available
                if pending:
                    next_item = pending.pop(0)
                    futures[executor.submit(process_item, next_item)] = next_item

    return results

Strategy 3: Work Stealing (Long-Tail Tasks)

Best for: when some tasks take much longer than others

import asyncio
from collections import deque

class WorkStealingPool:
    """Work-stealing pool for a single asyncio event loop (deques here are not thread-safe)."""

    def __init__(self, num_workers):
        self.queues = [deque() for _ in range(num_workers)]
        self.num_workers = num_workers

    def distribute(self, items):
        """Initial round-robin distribution."""
        for i, item in enumerate(items):
            self.queues[i % self.num_workers].append(item)

    async def worker(self, worker_id, process_fn):
        """Process own queue, steal from others when empty."""
        while True:
            # Try own queue first
            if self.queues[worker_id]:
                item = self.queues[worker_id].popleft()
            else:
                # Steal from busiest queue
                item = self._steal_work(worker_id)
                if item is None:
                    break

            await process_fn(item)

    def _steal_work(self, worker_id):
        """Steal from the queue with most items."""
        busiest = max(range(self.num_workers),
                      key=lambda i: len(self.queues[i]) if i != worker_id else 0)
        if self.queues[busiest]:
            return self.queues[busiest].pop()  # Steal from end
        return None

Strategy 4: Weighted Distribution

Best for: when task costs are known or estimable

def weighted_partition(items, weights, num_workers):
    """Partition items to balance total weight per worker."""
    # Sort by weight descending (largest first fit)
    sorted_items = sorted(zip(items, weights), key=lambda x: -x[1])

    worker_loads = [0] * num_workers
    worker_items = [[] for _ in range(num_workers)]

    for item, weight in sorted_items:
        # Assign to least loaded worker
        min_worker = min(range(num_workers), key=lambda i: worker_loads[i])
        worker_items[min_worker].append(item)
        worker_loads[min_worker] += weight

    return worker_items

Strategy 5: Async Semaphore Balancing (I/O Workloads)

Best for: limiting concurrent I/O operations

import asyncio

async def semaphore_balanced_fetch(urls, max_concurrent=10):
    """Limit concurrent operations while processing queue."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded_fetch(url):
        async with semaphore:
            return await fetch(url)  # fetch() is a placeholder for your async HTTP client

    return await asyncio.gather(*[bounded_fetch(url) for url in urls])
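Since fetch() above is only a placeholder, a self-contained variant that bounds any batch of coroutines might look like this (bounded_gather is an illustrative name, not part of the skill):

```python
import asyncio

async def bounded_gather(coros, max_concurrent=10):
    """Run coroutines with at most max_concurrent in flight, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(coro):
        async with semaphore:
            return await coro

    return await asyncio.gather(*(bounded(c) for c in coros))
```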

Partitioning Strategies

| Strategy      | Best For            | Implementation           |
|---------------|---------------------|--------------------------|
| Equal chunks  | Uniform tasks       | np.array_split(items, n) |
| Round-robin   | Streaming           | items[i::n_workers]      |
| Size-weighted | Known sizes         | Bin packing algorithm    |
| Hash-based    | Consistent routing  | hash(key) % n_workers    |
| Range-based   | Sorted/ordered data | Contiguous ranges        |
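The slicing-based rows in the table can be sketched directly (values here are illustrative):

```python
items = list(range(10))
n_workers = 3

# Round-robin: worker i takes every n_workers-th item starting at i
round_robin = [items[i::n_workers] for i in range(n_workers)]
# round_robin[0] == [0, 3, 6, 9]

# Hash-based: the same key always routes to the same worker within a run
def route(key, n_workers):
    return hash(key) % n_workers
```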

Handling Stragglers

Techniques to mitigate slow workers:

# 1. Timeout with fallback
from concurrent.futures import TimeoutError

try:
    result = future.result(timeout=30)
except TimeoutError:
    result = fallback_value

# 2. Speculative execution (backup tasks)
async def speculative_execute(task, timeout=10):
    primary = asyncio.create_task(execute(task))
    try:
        return await asyncio.wait_for(primary, timeout)
    except asyncio.TimeoutError:
        backup = asyncio.create_task(execute(task))  # Retry
        done, pending = await asyncio.wait(
            [primary, backup], return_when=asyncio.FIRST_COMPLETED
        )
        for p in pending:
            p.cancel()
        return done.pop().result()

# 3. Dynamic rebalancing (sketch: completion_times, elapsed, and the
#    cancel/redistribute helpers are placeholders for your own tracking)
import statistics

def rebalance_on_straggler(futures, threshold_ratio=2.0):
    """Redistribute work if one worker falls behind."""
    avg_completion = statistics.mean(completion_times)
    for future, worker_id in futures.items():
        if future.running() and elapsed(future) > threshold_ratio * avg_completion:
            # Cancel and redistribute
            remaining_work = cancel_and_get_remaining(future)
            redistribute(remaining_work, fast_workers)

Monitoring Metrics

Track these for balanced execution:

| Metric               | Calculation              | Target |
|----------------------|--------------------------|--------|
| Load imbalance       | max(load) / avg(load)    | < 1.2  |
| Straggler ratio      | max(time) / median(time) | < 2.0  |
| Worker utilization   | busy_time / total_time   | > 90%  |
| Queue depth variance | std(queue_lengths)       | Low    |
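The first two metrics are a few lines each; a minimal sketch (function names are illustrative):

```python
import statistics

def load_imbalance(loads):
    """max(load) / avg(load); 1.0 means perfectly even distribution."""
    return max(loads) / (sum(loads) / len(loads))

def straggler_ratio(times):
    """max(time) / median(time); large values flag stragglers."""
    return max(times) / statistics.median(times)
```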

Anti-Patterns

| Problem           | Cause                         | Fix                  |
|-------------------|-------------------------------|----------------------|
| Starvation        | Large tasks block queue       | Break into subtasks  |
| Thundering herd   | All workers wake at once      | Jittered scheduling  |
| Hot spots         | Uneven key distribution       | Better hash function |
| Convoy effect     | Workers wait on same resource | Fine-grained locking |
| Over-partitioning | Too many small tasks          | Batch small items    |
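For the over-partitioning row, batching small items into fixed-size groups is a one-liner (a sketch; the helper name is illustrative):

```python
def batch_items(items, batch_size):
    """Group small items so each task amortizes per-task scheduling overhead."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```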

Verification Checklist

Before finalizing balanced code:

  • Work distribution is roughly even (measure completion times)
  • No starvation (all workers stay busy)
  • Stragglers are handled (timeout/retry logic)
  • Overhead is acceptable (partitioning cost vs. task cost)
  • Results are complete and correct
  • Resource utilization is high across workers
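One way to check the first two boxes empirically is to record per-task wall times and inspect their spread. This is a sketch only; ThreadPoolExecutor stands in for whatever pool you actually use:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def measure_task_times(tasks, fn, num_workers=4):
    """Run tasks in a pool and return per-task wall times for imbalance checks."""
    def timed(task):
        start = time.perf_counter()
        fn(task)
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        return list(executor.map(timed, tasks))
```

Feed the resulting list into the load-imbalance and straggler-ratio formulas from the monitoring table to quantify the distribution.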
