Kubernetes Agent Swarm

Kubernetes & OpenShift Platform Agent Swarm — A coordinated multi-agent system for cluster operations. Includes Orchestrator (Jarvis), Cluster Ops (Atlas), GitOps (Flow), Security (Shield), Observability (Pulse), Artifacts (Cache), and Developer Experience (Desk). Pure instruction-based skill — no executable scripts.

Install

openclaw skills install kubernetes

Kubernetes Agent Swarm — Platform Operations

A multi-agent system for Kubernetes and OpenShift platform operations. Seven specialized agents work together as a coordinated swarm.

Runtime Requirements

| Requirement | Required | Description |
|-------------|----------|-------------|
| kubectl | ✅ Yes | Kubernetes CLI — must be in PATH |
| oc | Optional | OpenShift CLI — needed for OCP/ROSA/ARO |
| helm | Optional | For GitOps agent Helm operations |
| jq | Optional | For JSON output parsing |
| KUBECONFIG | ✅ Yes | Cluster access via env var or ~/.kube/config |

Optional cloud CLIs (aws, az, gcloud, rosa) — only needed for managed cluster operations.
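
The requirements above can be checked before a session starts. The following is a minimal sketch, assuming a POSIX shell. The function names (have_tool, report_tool, kubeconfig_present, preflight) are hypothetical and not part of the skill; only the tool names come from the requirements table.

```shell
#!/bin/sh
# Preflight sketch: report which of the listed CLIs are on PATH.
# All function names here are illustrative, not part of the skill.

# have_tool NAME: succeeds if NAME resolves to a command on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# report_tool NAME LEVEL: print one status line. LEVEL is "required" or
# "optional". Returns non-zero only when a required tool is missing.
report_tool() {
    if have_tool "$1"; then
        echo "ok: $1"
    elif [ "$2" = "required" ]; then
        echo "MISSING (required): $1"
        return 1
    else
        echo "missing (optional): $1"
    fi
}

# kubeconfig_present: succeeds if KUBECONFIG is set (its contents are not
# validated here) or the default ~/.kube/config file exists.
kubeconfig_present() {
    [ -n "${KUBECONFIG:-}" ] || [ -f "${HOME}/.kube/config" ]
}

# preflight: run every check and return non-zero if anything required is missing.
preflight() {
    missing=0
    report_tool kubectl required || missing=1
    for tool in oc helm jq; do
        report_tool "$tool" optional
    done
    if kubeconfig_present; then
        echo "ok: kubeconfig"
    else
        echo "MISSING (required): kubeconfig"
        missing=1
    fi
    return "$missing"
}
```

Running `preflight || exit 1` at the top of a session script would stop early when kubectl or a kubeconfig is missing, while only reporting the optional tools.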

Installation

clawhub install kubernetes

Or install individual agents:

clawhub install orchestrator
clawhub install cluster-ops
clawhub install gitops
clawhub install security
clawhub install observability
clawhub install artifacts
clawhub install developer-experience

The Swarm — Agent Roster

| Agent | Code Name | Domain |
|-------|-----------|--------|
| Orchestrator | Jarvis | Task routing, coordination, standups |
| Cluster Ops | Atlas | Cluster lifecycle, nodes, upgrades |
| GitOps | Flow | ArgoCD, Helm, Kustomize, deploys |
| Security | Shield | RBAC, policies, secrets, scanning |
| Observability | Pulse | Metrics, logs, alerts, incidents |
| Artifacts | Cache | Registries, SBOM, promotion, CVEs |
| Developer Experience | Desk | Namespaces, onboarding, support |

How It Works

This is an instruction-only skill. Agents receive markdown instructions describing what commands to run and how to interpret output. No executable scripts are included — the agent translates instructions into actions using the host's installed CLI tools.

Session Setup

Before using the swarm, establish cluster context:

# Verify access
kubectl cluster-info
kubectl get nodes

# For OpenShift
oc status

Agent Communication

Agents communicate via @mentions in shared task comments:

@Shield Please review the RBAC for payment-service v3.2 before I sync.
@Pulse Is the CPU spike related to the deployment or external traffic?
@Atlas The staging cluster needs 2 more worker nodes.
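
Routing an @mention amounts to extracting the names from a comment and looking them up in the roster. The following is a rough sketch, assuming comments arrive as plain text on stdin. Both function names are hypothetical, and the skill does not specify how mentions are actually parsed; the code names and skill directories are taken from the roster and file structure.

```shell
#!/bin/sh
# extract_mentions: read comment text on stdin and print each @mention once,
# in order of first appearance. The pattern is an assumption; it would also
# match the domain part of an email address, so real input may need filtering.
extract_mentions() {
    grep -oE '@[A-Za-z][A-Za-z0-9_-]*' | awk '!seen[$0]++'
}

# skill_for_mention MENTION: map a code name to its skill directory.
# Returns non-zero for names that are not in the roster.
skill_for_mention() {
    case "$1" in
        @Jarvis) echo "skills/orchestrator" ;;
        @Atlas)  echo "skills/cluster-ops" ;;
        @Flow)   echo "skills/gitops" ;;
        @Shield) echo "skills/security" ;;
        @Pulse)  echo "skills/observability" ;;
        @Cache)  echo "skills/artifacts" ;;
        @Desk)   echo "skills/developer-experience" ;;
        *)       echo "unknown"; return 1 ;;
    esac
}
```

For example, piping the first comment above through `extract_mentions` yields `@Shield`, which `skill_for_mention` resolves to `skills/security`.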

Escalation Path

  1. Agent detects issue
  2. Agent attempts resolution within guardrails
  3. If blocked → @mention another agent or escalate to human
  4. P1 incidents → all relevant agents auto-notified
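
The escalation steps can be read as a small decision function. The sketch below is illustrative only: the function name, the `resolved`/`blocked` outcome values, and the action labels it prints are all assumptions, since the skill describes the path in prose rather than as code.

```shell
#!/bin/sh
# escalation_action SEVERITY OUTCOME
#   SEVERITY: P1, P2, ... (only P1 gets special handling in the steps above)
#   OUTCOME:  resolved | blocked (result of attempting a fix within guardrails)
# Prints one hypothetical action label per line.
escalation_action() {
    severity=$1
    outcome=$2
    # Step 4: P1 incidents notify all relevant agents regardless of outcome.
    if [ "$severity" = "P1" ]; then
        echo "notify-all-relevant-agents"
    fi
    case "$outcome" in
        # Step 2 succeeded: record the action in the audit trail.
        resolved) echo "log-resolution" ;;
        # Step 3: blocked, so hand off to another agent or a human.
        blocked)  echo "mention-agent-or-escalate-to-human" ;;
        *)        echo "unknown-outcome"; return 1 ;;
    esac
}
```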

Heartbeat Schedule

*/5  * * * *  Atlas, Pulse, Shield     (fast response: incidents, alerts, CVEs)
*/10 * * * *  Flow, Cache              (scheduled: deploys, promotions)
*/15 * * * *  Desk, Jarvis             (batch: onboarding, standups)
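
In cron syntax, a minute field of `*/N` fires whenever the minute of the hour is divisible by N. The sketch below shows which agents the schedule wakes at a given minute. The function name agents_due is hypothetical, and Jarvis is used as the Orchestrator's code name, per the roster.

```shell
#!/bin/sh
# agents_due MINUTE: print the agents whose heartbeat fires at MINUTE (0-59).
agents_due() {
    # Strip one leading zero from two-digit input ("08" -> "8") so shell
    # arithmetic does not try to read the value as octal.
    case "$1" in
        0?) minute=${1#0} ;;
        *)  minute=$1 ;;
    esac
    due=""
    [ $((minute % 5)) -eq 0 ]  && due="$due Atlas Pulse Shield"
    [ $((minute % 10)) -eq 0 ] && due="$due Flow Cache"
    [ $((minute % 15)) -eq 0 ] && due="$due Desk Jarvis"
    # Drop the leading space left by the concatenation above.
    echo "${due# }"
}
```

Calling `agents_due "$(date +%M)"` lists the agents due in the current minute. All seven coincide only at minute 0 and minute 30, which is worth keeping in mind when sizing the host.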

Agent Capabilities

What Agents CAN Do

  • Read cluster state (kubectl get, kubectl describe, oc get)
  • Deploy via GitOps (argocd app sync, Flux reconciliation)
  • Create documentation and reports
  • Investigate and triage incidents
  • Provision standard resources (namespaces, quotas, RBAC)
  • Run health checks and audits
  • Query metrics and logs

What Agents CANNOT Do (Human-in-the-Loop Required)

  • Delete production resources
  • Modify cluster-wide policies
  • Make direct changes to secrets without going through the rotation workflow
  • Perform irreversible cluster upgrades
  • Approve production deployments (can prepare, human approves)
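
One way to make these two lists concrete is an allowlist: treat only the read verbs named above as safe, and send everything else to review. The following is a deliberately conservative sketch. The function name and the `read-only`/`needs-review` labels are hypothetical, and this is not how the skill itself enforces its guardrails.

```shell
#!/bin/sh
# classify_command TOOL VERB [ARGS...]: print "read-only" for the read verbs
# named in the capabilities list (kubectl/oc get and describe) and
# "needs-review" for anything else.
# Only the second word is inspected, so a command with flags before the verb
# (for example "kubectl -n prod get pods") falls through to "needs-review".
classify_command() {
    tool=$1
    verb=${2:-}
    case "$tool" in
        kubectl|oc)
            case "$verb" in
                get|describe) echo "read-only" ;;
                *)            echo "needs-review" ;;
            esac
            ;;
        *)
            echo "needs-review"
            ;;
    esac
}
```

Defaulting to `needs-review` means an unrecognized command is held for a human rather than run, which matches the human-in-the-loop intent. A read of a Secret still passes as `read-only` here, so a real check would likely need to look at the resource type as well.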

Key Principles

  • Roles over genericism — Each agent has a defined domain
  • Files over mental notes — Only files persist between sessions
  • Human-in-the-loop — Critical actions require approval
  • Guardrails over freedom — Define what agents can and cannot do
  • Audit everything — Every action logged

File Structure

kubernetes/
├── SKILL.md                    # This file — combined swarm
├── AGENTS.md                   # Swarm configuration and protocols
├── skills/
│   ├── orchestrator/SKILL.md   # Jarvis — task routing
│   ├── cluster-ops/SKILL.md    # Atlas — cluster operations
│   ├── gitops/SKILL.md         # Flow — GitOps
│   ├── security/SKILL.md       # Shield — security
│   ├── observability/SKILL.md  # Pulse — monitoring
│   ├── artifacts/SKILL.md      # Cache — artifacts
│   └── developer-experience/SKILL.md  # Desk — DevEx
├── memory/MEMORY.md            # Long-term agent memory
├── working/WORKING.md          # Session progress
└── logs/LOGS.md                # Action audit trail
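
Since only files persist between sessions, each action needs to land in logs/LOGS.md. The following is a minimal sketch of an append-only logger. The function name log_action, the LOG_FILE override, and the line format are all assumptions; the skill does not define a log format.

```shell
#!/bin/sh
# log_action AGENT MESSAGE...: append one timestamped line to the audit trail.
# LOG_FILE defaults to logs/LOGS.md from the tree above; set it to redirect
# output elsewhere (for example in tests).
log_action() {
    agent=$1
    shift
    log_file=${LOG_FILE:-logs/LOGS.md}
    # Create the logs/ directory on first use.
    mkdir -p "$(dirname "$log_file")"
    # UTC timestamp, agent code name, then the free-form message.
    printf '%s [%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$agent" "$*" >> "$log_file"
}
```

A call such as `log_action Atlas "kubectl get nodes"` appends a line of the form `<UTC timestamp> [Atlas] kubectl get nodes`.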

Detailed Agent Documentation

See individual SKILL.md files for each agent's full capabilities, personality, and workflow instructions.