Install
openclaw skills install sonic-kvm-testbedDeploy and manage a SONiC sonic-mgmt KVM virtual testbed with cEOS neighbors for running pytest-based network tests. Use when setting up a local KVM testbed, deploying T0/T1/T1-LAG topologies with converged cEOS peers, troubleshooting testbed-cli.sh or deploy-mg issues, tearing down and redeploying testbeds, or running sonic-mgmt test cases against a virtual SONiC switch (sonic-vs).
openclaw skills install sonic-kvm-testbedDeploy a local sonic-mgmt KVM testbed with cEOS neighbors on a single machine.
Host Machine (KVM + Docker)
├── vlab-XX (KVM VM running sonic-vs) — DUT
├── ceos_vmsX-Y_VMZZ (Docker) — cEOS neighbor(s)
├── ptf_vmsX-Y (Docker) — PTF test traffic generator
└── sonic-mgmt (Docker) — Ansible + pytest framework
Management network: br1 bridge, 10.250.0.0/24 (host at .1).
| Testbed Name | Topo | DUT | VM Base | Neighbors (raw → converged) |
|---|---|---|---|---|
vms-kvm-t0 | t0 | vlab-01 | VM0100 | 4 → 1 cEOS |
vms-kvm-t1-lag | t1-lag | vlab-03 | VM0104 | 24 → 2 cEOS |
Use use_converged_peers: true in vtestbed.yaml to reduce cEOS containers via multi-VRF convergence (requires PR #22399 in master branch).
kvm-ok)docker, kvm, libvirt groupssonic-vs.img.gz from sonic-buildimagecEOS64-lab-4.32.5M.tar.xz)sshpass installed on host# Clone repo
git clone https://github.com/sonic-net/sonic-mgmt.git ~/sonic-mgmt
cd ~/sonic-mgmt && git checkout master # PR #22399 needed for auto-convergence
# Prepare images
mkdir -p ~/veos-vm/images ~/sonic-vm/images
gunzip -k sonic-vs.img.gz
cp sonic-vs.img ~/veos-vm/images/ && cp sonic-vs.img ~/sonic-vm/images/
# Import cEOS (docker import, NOT docker load)
xz -d cEOS64-lab-4.32.5M.tar.xz
docker import cEOS64-lab-4.32.5M.tar ceosimage:4.32.5M
# Management bridge
cd ~/sonic-mgmt/ansible && sudo ./setup-management-network.sh
# debian:jessie dependency
docker pull publicmirror.azurecr.io/debian:jessie
docker tag publicmirror.azurecr.io/debian:jessie debian:jessie
# sonic-mgmt container
./setup-container.sh -n sonic-mgmt -d /data
# Create vault password file
echo "abc" > ~/sonic-mgmt/ansible/password.txt
See references/credentials.md for all config files.
Critical files (these reset on git operations — automate fixes in a script):
| File | Key Setting | Why |
|---|---|---|
group_vars/vm_host/creds.yml | vm_host_user: <your_user> | Host SSH access |
group_vars/all/creds.yml | sonic_login: "<dut_user>" | DUT SSH user (matches sonic-vs build user) |
group_vars/all/ceos.yml | skip_ceos_image_downloading: true | Use local cEOS image |
group_vars/vm_host/main.yml | max_fp_num: 127 | Default 4 is too low for T0/T1 |
veos_vtb | ansible_user: <your_user> | Inventory host user |
veos | Comment out STR-ACS-SERV-01 | Avoid dual-host conflict |
vars/docker_registry.yml | Remove :443 from host | :443 causes docker pull to hang |
vtestbed.yaml | use_converged_peers: true | Enable multi-VRF convergence |
Create a fix script to re-apply all settings. Run it before EVERY testbed operation.
# Fix configs + remove stale .bak
bash fix-configs.sh
rm -f vars/topo_<TOPO>.yml.bak
# Inside sonic-mgmt container:
./testbed-cli.sh -t vtestbed.yaml add-topo <TESTBED_NAME> password.txt
Duration: ~15-20 minutes (VM boot + cEOS startup).
After add-topo, the DUT boots with the build user. The multi_passwd_ssh plugin expects admin:
# SSH to DUT as build user
ssh <build_user>@<DUT_IP>
# Create admin user
sudo useradd -m -s /bin/bash -G sudo,docker admin
echo 'admin:password' | sudo chpasswd
sudo bash -c "echo 'admin ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/admin"
# Fix docker socket
sudo chmod 666 /var/run/docker.sock
# Fix configs + remove .bak AGAIN (they revert!)
bash fix-configs.sh
rm -f vars/topo_<TOPO>.yml.bak
./testbed-cli.sh -t vtestbed.yaml deploy-mg <TESTBED_NAME> veos_vtb password.txt
Duration: ~5-10 minutes.
# Check containers
docker ps | grep -E "ceos|ptf"
# Check BGP (use admin after deploy-mg)
sshpass -p password ssh admin@<DUT_IP> "show ip bgp summary"
Expected BGP state with converged peers:
cd /data/sonic-mgmt/tests
./run_tests.sh -n <TESTBED_NAME> -d <DUT_NAME> -c <test_path> \
-f vtestbed.yaml -i ../ansible/veos_vtb
bash fix-configs.sh
rm -f vars/topo_<TOPO>.yml.bak
./testbed-cli.sh -t vtestbed.yaml remove-topo <TESTBED_NAME> password.txt
Duration: ~12-15 minutes.
.bak files before add-topo — stale backups cause KeyError in convergerdocker import for cEOS (not docker load):443 in docker_registry_host silently hangs docker pullsmax_fp_num: 4 is too low — set to 127br1 bridge is not persistent across reboots — add netplan configadminuse_converged_peers: true requires master branch (PR #22399) for auto-convergenceSee references/troubleshooting.md for detailed diagnosis of common failures.