mhc-algorithm

v0.1.0

Implement mHC (Manifold-Constrained Hyper-Connections) for stabilizing deep network training. Use when implementing residual connection improvements with dou...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for lnj22/mhc-layer-impl-mhc-algorithm.

Prompt Preview: Install & Setup
Install the skill "mhc-algorithm" (lnj22/mhc-layer-impl-mhc-algorithm) from ClawHub.
Skill page: https://clawhub.ai/lnj22/mhc-layer-impl-mhc-algorithm
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install mhc-layer-impl-mhc-algorithm

ClawHub CLI


npx clawhub@latest install mhc-layer-impl-mhc-algorithm
Security Scan
VirusTotal
Benign
OpenClaw
Benign
high confidence
Purpose & Capability
The name/description (mHC for stabilizing deep nets) aligns with the contents: PyTorch code snippets, Sinkhorn projection, and GPT integration patterns. The only external dependencies suggested (torch, einops, numpy) are appropriate for the stated goal.
Instruction Scope
SKILL.md contains concrete implementation guidance, example code, and algorithm notes. Instructions are confined to model-code concerns (tensor shapes, Sinkhorn iterations, wrapping layers) and do not instruct reading arbitrary files, accessing environment variables, or contacting external endpoints beyond citing arXiv links.
Install Mechanism
No install spec is embedded; the doc recommends 'pip install torch einops numpy'. This is expected for a PyTorch implementation but be aware 'pip install torch' can be large and platform-specific (CUDA variants). There are no downloads from untrusted URLs or archive/extract steps.
Credentials
The skill requests no environment variables, credentials, or config paths. All required resources are typical Python packages needed to run the examples.
Persistence & Privilege
The skill is instruction-only, does not request 'always: true', and does not instruct changing agent-wide configuration or storing credentials. It does not grant persistent or elevated privileges.
Assessment
This skill appears internally consistent and focused on implementing mHC in PyTorch. Before using: (1) Review the code snippets and references to ensure they fit your model and framework versions; (2) install PyTorch via the official channel appropriate for your OS/GPU (avoid arbitrary wheel URLs); (3) run the examples in an isolated environment (virtualenv/container) because mHC multiplies memory usage by num_streams; (4) confirm the referenced paper(s) if you need research provenance. If you need broader audits (license, benchmark results, or GPU/CUDA compatibility), request the author's complete implementation or test on a small toy model first.

Like a lobster shell, security has layers — review code before you run it.

Latest: vk975qnc3jz6sfc7wxwvwqyx95x84v8wq
71 downloads · 0 stars · 1 version
Updated 2w ago
v0.1.0 · MIT-0

mHC: Manifold-Constrained Hyper-Connections

Overview

mHC (Manifold-Constrained Hyper-Connections) stabilizes deep network training by constraining residual mixing matrices to be doubly stochastic. It provides:

  • Stable Training: Lower gradient norm variance via doubly stochastic constraints
  • Multiple Streams: Hyper-Connections with learnable mixing across residual streams
  • Sinkhorn Projection: Log-space Sinkhorn-Knopp algorithm for doubly stochastic projection
  • GPT Integration: Pattern for wrapping attention and MLP layers

Two components:

  • HyperConnections Module: Core PyTorch module with H_res, H_pre, H_post matrices
  • Sinkhorn-Knopp: Log-space projection to doubly stochastic manifold
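The defining property of the projection, every row and every column of the result sums to 1, can be checked directly. The sketch below reuses the same log-space routine that appears in the Minimal Example; the random logits and the choice of a 4×4 matrix are illustrative only.

```python
import torch

def sinkhorn_knopp(logits, num_iters=20, tau=0.05):
    # Log-space Sinkhorn-Knopp: alternately normalize rows and columns.
    log_alpha = logits / tau
    for _ in range(num_iters):
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-1, keepdim=True)
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-2, keepdim=True)
    return torch.exp(log_alpha)

torch.manual_seed(0)
H = sinkhorn_knopp(torch.randn(4, 4))
row_sums = H.sum(dim=-1)  # all close to 1
col_sums = H.sum(dim=-2)  # all close to 1 (columns were normalized last)

# tau controls sharpness: small tau pushes the result toward a permutation
# matrix, large tau toward the uniform matrix (all entries near 1/n).
soft = sinkhorn_knopp(torch.eye(4) * 2.0, tau=5.0)
```

Note that because the loop ends with a column normalization, column sums are exact up to floating point while row sums are only approximately 1; more iterations tighten the gap.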

Quick Reference

  • Core Concepts & Math: Core Concepts
  • Sinkhorn Algorithm: Sinkhorn-Knopp
  • HyperConnections Module: Module Implementation
  • GPT Integration: GPT Integration
  • Common Pitfalls: Pitfalls

Installation

# Required packages
pip install torch einops numpy

Minimal Example

import torch
import torch.nn as nn
from einops import rearrange, einsum

def sinkhorn_knopp(logits, num_iters=20, tau=0.05):
    # Project logits onto the doubly stochastic manifold via log-space
    # Sinkhorn-Knopp: alternately normalize rows and columns; tau sharpens
    # or smooths the result.
    log_alpha = logits / tau
    for _ in range(num_iters):
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-1, keepdim=True)
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-2, keepdim=True)
    return torch.exp(log_alpha)

class HyperConnections(nn.Module):
    def __init__(self, num_streams, dim, branch=None, layer_idx=0):
        super().__init__()
        self.num_streams = num_streams
        self.branch = branch

        # Initialize H_res near identity (use small negative for gradient flow)
        init_h_res = torch.full((num_streams, num_streams), -0.1)
        init_h_res.fill_diagonal_(0.0)
        self.H_res_logits = nn.Parameter(init_h_res)

        # H_pre/H_post for depth connections
        init_h_pre = torch.full((1, num_streams), -0.1)
        init_h_pre[0, layer_idx % num_streams] = 0.0
        self.H_pre_logits = nn.Parameter(init_h_pre)
        self.H_post_logits = nn.Parameter(torch.zeros(1, num_streams))

    def forward(self, x):
        s = self.num_streams
        # Unfold streams out of the batch dim:
        # (batch*streams, tokens, dim) -> (batch, tokens, streams, dim)
        x = rearrange(x, "(b s) t d -> b t s d", s=s)

        # Width connection: mix streams with the doubly stochastic H_res
        h_res = sinkhorn_knopp(self.H_res_logits)              # (streams, streams)
        x_mixed = einsum(h_res, x, "s t, b n s d -> b n t d")  # n = tokens; s, t = streams

        # Depth connection (in): weighted read across streams for the branch input
        h_pre = self.H_pre_logits.softmax(dim=-1)              # (1, streams)
        branch_in = einsum(h_pre, x, "v s, b n s d -> b n v d").squeeze(-2)

        # Wrapped layer (e.g. attention or MLP); identity if no branch is given
        branch_out = self.branch(branch_in) if self.branch is not None else branch_in

        # Depth connection (out): broadcast the branch output back to all streams
        h_post = self.H_post_logits.softmax(dim=-1)            # (1, streams)
        depth_out = einsum(branch_out, h_post, "b t d, v s -> b t s d")

        output = x_mixed + depth_out
        # Fold streams back into the batch dim
        return rearrange(output, "b t s d -> (b s) t d")

Common Imports

import torch
import torch.nn as nn
import torch.nn.functional as F
from einops import rearrange, einsum, repeat, reduce

When to Use What

  • Standard residual connection: no mHC needed
  • Deep networks (>12 layers) with stability issues: use mHC with num_streams=4
  • GPT/Transformer training: wrap both attention and MLP with HyperConnections
  • Custom Sinkhorn iterations: adjust num_iters (default 20) and tau (default 0.05)
  • Memory-constrained training: reduce num_streams or batch size
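Because streams are folded into the batch dimension, activation memory scales roughly linearly with num_streams. A back-of-envelope estimate, with all sizes hypothetical and bf16 activations assumed:

```python
# Rough per-tensor activation footprint, before vs. after adding streams.
batch, tokens, dim = 8, 1024, 768
num_streams = 4
bytes_per_elem = 2  # assuming bf16 activations

base_mb = batch * tokens * dim * bytes_per_elem / 2**20  # megabytes for one tensor
mhc_mb = base_mb * num_streams                           # streams multiply the footprint
```

This ignores optimizer state and attention buffers, but it explains the table's advice: if memory is tight, lowering num_streams or batch size both shrink the same product.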

External Resources
