Swmm Calibration

Other

Calibration and validation scaffold for EPA SWMM. Use when an agent needs to (1) compare simulated vs observed flow, (2) evaluate candidate parameter sets, (3) rank explicit candidates by an objective, (4) run a bounded random / LHS / adaptive search for the best-fitting parameters, (5) run a publication-grade SCE-UA calibration with KGE as the primary objective and (r, alpha, beta) decomposition reported, or (6) run a DREAM-ZS Bayesian calibration producing a posterior over parameters with Gelman-Rubin convergence checks. Dedicated sensitivity-analysis methods (OAT, Morris, Sobol') now live on the `swmm-uncertainty` skill.

Install

openclaw skills install swmm-calibration

SWMM Calibration / Validation

Part of Agentic SWMM — install the project first for the executable toolchain (aiswmm CLI, SWMM solver, MCP servers).

What this skill provides

  • A practical calibration scaffold around the existing SWMM runner workflow.
  • A strict calibration boundary: calibration and validation require observed data. Without observed flow, depth, soil-moisture, or volume data, use swmm-uncertainty for prior uncertainty propagation instead of calling the run calibrated.
  • Observed-flow ingestion from delimited text files (.csv, .tsv, .dat, whitespace-separated text).
  • Metric calculation for simulated vs observed hydrographs:
    • KGE (Kling-Gupta Efficiency) + (r, alpha, beta) decomposition — primary metric for publication-grade calibration.
    • NSE
    • RMSE
    • Bias / PBIAS%
    • Peak flow error
    • Peak timing error
  • Simple INP text patching using an explicit mapping from parameter names to line selectors.
  • Batch evaluation of candidate parameter sets for:
    • sensitivity
    • calibrate
    • validate
  • Bounded internal search for calibration candidate generation:
    • search --strategy random — uniform random sampling (fast prototyping).
    • search --strategy lhs — Latin Hypercube Sampling (fast prototyping).
    • search --strategy adaptive — multi-round LHS refinement around elite trials (fast prototyping).
    • search --strategy sceua — Shuffled Complex Evolution (SCE-UA); recommended for publication-grade point-estimate calibration. Minimises (1 - KGE) via spotpy.algorithms.sceua and emits a calibration_summary.json with KGE decomposition + secondary metrics.
    • search --strategy dream-zs — DREAM-ZS Bayesian calibration with a KGE-based likelihood exp(-0.5 * (1 - KGE) / sigma^2). Produces a posterior over parameters via spotpy.algorithms.dream, writes 5 audit artefacts (posterior_samples.csv, best_params.json, chain_convergence.json, posterior_<param>.png, posterior_correlation.png) plus a Slice 1 -compatible calibration_summary.json with a posterior_summary block (Gelman-Rubin Rhat per parameter + per-parameter quantiles).
  • Dedicated sensitivity-analysis methods (OAT, Morris elementary-effects, Sobol' indices) have moved to the swmm-uncertainty skill — see skills/swmm-uncertainty/scripts/sensitivity.py and the swmm_sensitivity_oat / swmm_sensitivity_morris / swmm_sensitivity_sobol MCP tools.
  • MCP wrapper so the agent runtime can call the workflow as tools.

Strategy guidance

StrategyWhen to useCostReports
randomFirst-pass prototyping, smoke-testing the patch mapVery lowRanking table
lhsQuick coverage of a small search spaceVery lowRanking table
adaptiveLHS with multi-round refinement around elite trialsLowRanking table per round
sceuaPublication-grade point-estimate calibration on a fixed search spaceMediumcalibration_summary.json with KGE primary + decomposition + secondary metrics + convergence.csv
dream-zsBayesian posterior calibration with Gelman-Rubin convergence checksHighcalibration_summary.json + posterior_samples.csv + chain_convergence.json + per-parameter marginal PNGs + correlation PNG

MCP tools

mcp/swmm-calibration/server.js exposes six tools, all thin wrappers around scripts/swmm_calibrate.py.

  1. swmm_sensitivity_scan — evaluate a list of explicit candidate parameter sets against an observed series and rank them by an objective (KGE / NSE / RMSE / Bias / peak-flow / peak-timing). Use to score a curated candidate list. (This is not a screening method; for OAT / Morris / Sobol' screening use the swmm_sensitivity_* tools on the swmm-uncertainty MCP server.)

  2. swmm_calibrate — same evaluation as above, but report the single best-scoring set and write a best_params.json. Use when you already have a curated candidate list.

  3. swmm_calibrate_search — generate bounded candidate sets internally and score them. Strategies: random, lhs, adaptive (multi-round LHS refinement around elite trials). Use when you have a search-space JSON instead of an explicit candidate list.

  4. swmm_calibrate_sceua — global SCE-UA calibration with KGE as the primary objective. Emits a calibration_summary.json containing primary_objective, primary_value, kge_decomposition (r / alpha / beta), secondary_metrics (NSE, PBIAS%, RMSE, peak-flow error, peak-timing error), and a convergence.csv trace. Use for publication-grade point-estimate calibration. Requires the optional spotpy dependency.

  5. swmm_calibrate_dream_zs — DREAM-ZS Bayesian posterior calibration with a KGE-based likelihood exp(-0.5 * (1 - KGE) / sigma^2). Writes 5 posterior artefacts to the chosen audit directory (defaults to the parent of summaryJson): posterior_samples.csv (post-burn-in MCMC samples), best_params.json (MAP estimate), chain_convergence.json (Gelman-Rubin Rhat per parameter), posterior_<param>.png (marginal histogram per parameter), posterior_correlation.png (parameter correlation matrix). The calibration_summary.json keeps the Slice 1 shape (primary_objective=kge, primary_value, kge_decomposition, secondary_metrics) plus a posterior_summary block with chain count, Rhat values, and per-parameter quantiles. Use for Bayesian uncertainty quantification on top of (or instead of) the SCE-UA point estimate. Requires the optional spotpy dependency.

  6. swmm_validate — apply one chosen parameter set to a second event (validation) and score it.

Sensitivity analysis (OAT / Morris / Sobol') is owned by swmm-uncertainty. See mcp/swmm-uncertainty/server.js for swmm_sensitivity_oat, swmm_sensitivity_morris, and swmm_sensitivity_sobol.

Scripts (Python implementations behind the MCP tools)

  • scripts/swmm_calibrate.py — backs swmm_sensitivity_scan, swmm_calibrate, swmm_calibrate_search, swmm_validate. Subcommands: sensitivity, calibrate, search, validate.
  • scripts/obs_reader.py — heuristically reads timestamp + flow series from text tables.
  • scripts/metrics.py — computes hydrograph comparison metrics after time alignment.
  • scripts/inp_patch.py — patches selected numeric tokens in an .inp file using a simple JSON mapping.

Expected workflow

  1. Prepare a base SWMM INP for the event.
  2. Prepare an observed flow file with at least one timestamp column and one flow column.
  3. Define a patch map JSON that explains where each calibration parameter lives in the INP.
  4. Prepare either:
    • a parameter sets JSON (explicit candidate sets), or
    • a search-space JSON (min/max/type/precision) for internal bounded search.
  5. Run one of:
    • sensitivity
    • calibrate
    • validate
  6. Inspect the output summary JSON and generated trial directories.

Relationship to uncertainty analysis

swmm-calibration and swmm-uncertainty share parameter patching but answer different questions.

Calibration asks:

Given observed data, which parameter set best reproduces the observed hydrograph?

Uncertainty / sensitivity analysis asks:

Given uncertain parameters, how much does the SWMM output ensemble spread,
and which parameters drive that spread?

Use this skill only when observed data are available and the workflow can compute metrics such as NSE, RMSE, bias, peak-flow error, or peak-timing error. If no observed data are available, use swmm-uncertainty for prior Monte Carlo, fuzzy, entropy, or sensitivity analysis.

Per issue #49 the OAT / Morris / Sobol' sensitivity-analysis path lives on swmm-uncertainty (skills/swmm-uncertainty/scripts/sensitivity.py). The calibration scaffold consumes its output via:

  • runs/<case>/09_audit/sensitivity_indices.json — per-parameter ranking with mu_star/sigma (Morris) or S_i/S_T_i (Sobol'). Use this to pre-screen which parameters to feed into SCE-UA or LHS search.

A calibration run can feed uncertainty / sensitivity analysis back by exporting:

  • best_params.json for a baseline parameter set
  • ranking.json for candidate performance
  • narrowed or acceptable parameter ranges for calibration-informed Monte Carlo

MVP assumptions / limitations

  • This is intentionally a transparent scaffold, not a black-box optimizer.
  • Internal search supports bounded random, LHS-like sampling, simple adaptive LHS refinement, SCE-UA (Shuffled Complex Evolution) for global optimisation against KGE, and DREAM-ZS (DiffeRential Evolution Adaptive Metropolis) for KGE-likelihood posterior sampling. SCE-UA produces a point estimate; DREAM-ZS produces a posterior plus a MAP point estimate.
  • INP patching is line-oriented and works best for one-line table records with stable object names.
  • Observed-flow parsing uses heuristics. If your file is messy, give explicit column names and time format whenever possible.
  • Simulated flow is read either from:
    • SWMM .out (preferred, via swmmtoolbox), or
    • a delimited simulation series file.
  • The validation command assumes you already chose a parameter set (via JSON object or file).
  • The swmm_sensitivity_scan tool here scores explicit candidate sets against an observed series; it is not parameter screening. Use swmm-uncertainty's swmm_sensitivity_oat / swmm_sensitivity_morris / swmm_sensitivity_sobol tools for OAT / Morris / Sobol' screening.

Patch-map idea

A patch-map JSON connects friendly parameter names to concrete INP edits.

Example:

{
  "pct_imperv_s1": {
    "section": "[SUBCATCHMENTS]",
    "object": "S1",
    "field_index": 4
  },
  "n_imperv_s1": {
    "section": "[SUBAREAS]",
    "object": "S1",
    "field_index": 1
  }
}

Interpretation:

  • section = INP section header to search within
  • object = first token on the target row
  • field_index = zero-based token index within the data row

Candidate parameter-set JSON idea

[
  {"name": "trial_001", "params": {"pct_imperv_s1": 42.0, "n_imperv_s1": 0.015}},
  {"name": "trial_002", "params": {"pct_imperv_s1": 47.0, "n_imperv_s1": 0.018}}
]

Search-space JSON idea

{
  "pct_imperv_s1": {"min": 15.0, "max": 40.0, "type": "float", "precision": 3},
  "n_imperv_s1": {"min": 0.01, "max": 0.03, "type": "float", "precision": 4}
}

CLI examples

Sensitivity scan

python3 skills/swmm-calibration/scripts/swmm_calibrate.py sensitivity \
  --base-inp <your case>/event.inp \
  --patch-map path/to/patch_map.json \
  --parameter-sets path/to/parameter_sets.json \
  --observed path/to/observed_flow.csv \
  --run-root runs/calibration \
  --swmm-node O1 \
  --objective nse

Calibration (pick best candidate set)

python3 skills/swmm-calibration/scripts/swmm_calibrate.py calibrate \
  --base-inp <your case>/event.inp \
  --patch-map path/to/patch_map.json \
  --parameter-sets path/to/parameter_sets.json \
  --observed path/to/observed_flow.csv \
  --run-root runs/calibration \
  --swmm-node O1 \
  --objective nse

Validation on a second event

python3 skills/swmm-calibration/scripts/swmm_calibrate.py validate \
  --base-inp path/to/validation_event.inp \
  --patch-map path/to/patch_map.json \
  --best-params path/to/best_params.json \
  --observed path/to/validation_observed.csv \
  --run-root runs/validation \
  --swmm-node O1

Internal bounded search (LHS)

python3 skills/swmm-calibration/scripts/swmm_calibrate.py search \
  --base-inp <your case>/event.inp \
  --patch-map <your case>/calibration/patch_map.json \
  --search-space <your case>/calibration/search_space.json \
  --observed <your case>/calibration/observed_flow.csv \
  --run-root runs/calibration-search \
  --summary-json runs/calibration-search/summary.json \
  --ranking-json runs/calibration-search/ranking.json \
  --strategy lhs \
  --iterations 12 \
  --seed 42

Internal bounded search (adaptive multi-round)

python3 skills/swmm-calibration/scripts/swmm_calibrate.py search \
  --base-inp <your case>/event.inp \
  --patch-map <your case>/calibration/patch_map.json \
  --search-space <your case>/calibration/search_space.json \
  --observed <your case>/calibration/observed_flow.csv \
  --run-root runs/calibration-search-adaptive \
  --summary-json runs/calibration-search-adaptive/summary.json \
  --strategy adaptive \
  --iterations 8 \
  --rounds 3 \
  --seed 42

SCE-UA calibration (publication-grade, KGE primary)

Requires spotpy to be installed (it ships as a runtime dependency in pyproject.toml).

python3 skills/swmm-calibration/scripts/swmm_calibrate.py search \
  --base-inp <your case>/event.inp \
  --patch-map <your case>/calibration/patch_map.json \
  --search-space <your case>/calibration/search_space.json \
  --observed <your case>/calibration/observed_flow.csv \
  --run-root runs/calibration-sceua \
  --summary-json runs/calibration-sceua/calibration_summary.json \
  --best-params-out runs/calibration-sceua/best_params.json \
  --convergence-csv runs/calibration-sceua/convergence.csv \
  --strategy sceua \
  --objective kge \
  --iterations 200 \
  --seed 42

calibration_summary.json shape:

{
  "primary_objective": "kge",
  "primary_value": 0.78,
  "kge_decomposition": {"r": 0.92, "alpha": 1.05, "beta": 0.97},
  "secondary_metrics": {
    "nse": 0.74, "pbias_pct": -3.2, "rmse": 0.043,
    "peak_error_rel": 0.08, "peak_timing_min": 12
  },
  "strategy": "sceua",
  "iterations": 200,
  "convergence_trace_ref": "convergence.csv"
}

DREAM-ZS Bayesian calibration (posterior over parameters)

Requires spotpy (already a runtime dependency). Likelihood is exp(-0.5 * (1 - KGE) / sigma^2).

python3 skills/swmm-calibration/scripts/swmm_calibrate.py search \
  --base-inp <your case>/event.inp \
  --patch-map <your case>/calibration/patch_map.json \
  --search-space <your case>/calibration/search_space.json \
  --observed <your case>/calibration/observed_flow.csv \
  --run-root runs/calibration-dream-zs/trials \
  --summary-json runs/calibration-dream-zs/09_audit/calibration_summary.json \
  --dream-output-dir runs/calibration-dream-zs/09_audit \
  --best-params-out runs/calibration-dream-zs/09_audit/best_params.json \
  --strategy dream-zs \
  --objective kge \
  --iterations 2000 \
  --dream-chains 4 \
  --dream-sigma 0.1 \
  --dream-rhat-threshold 1.2 \
  --seed 42

The 09_audit/ folder will contain five DREAM-ZS artefacts plus calibration_summary.json:

  • posterior_samples.csv — post-burn-in MCMC samples (chain, iteration_in_chain, likelihood, one column per parameter).
  • best_params.json — MAP estimate (highest-likelihood row from the chains).
  • chain_convergence.json — Gelman-Rubin Rhat per parameter, threshold, and a converged flag.
  • posterior_<param>.png — marginal histogram per parameter.
  • posterior_correlation.png — posterior parameter correlation matrix.

calibration_summary.json keeps the same shape as SCE-UA (so downstream tooling stays compatible) and adds a posterior_summary block:

{
  "primary_objective": "kge",
  "primary_value": 0.83,
  "kge_decomposition": {"r": 0.94, "alpha": 1.02, "beta": 0.99},
  "secondary_metrics": {"nse": 0.79, "pbias_pct": -1.4, "rmse": 0.038, "peak_error_rel": 0.05, "peak_timing_min": 8},
  "strategy": "dream-zs",
  "iterations": 2000,
  "convergence_trace_ref": "chain_convergence.json",
  "posterior_summary": {
    "n_chains": 4,
    "n_chains_requested": 4,
    "n_samples_post_burnin": 1996,
    "converged": true,
    "rhat_threshold": 1.2,
    "rhat": {"pct_imperv_s1": 1.07, "n_imperv_s1": 1.04, "...": "..."},
    "per_parameter": {
      "pct_imperv_s1": {"mean": 29.7, "median": 29.8, "std": 1.2, "q05": 27.6, "q95": 31.5}
    }
  }
}

Candidate handover contract (issue #54)

Calibration runs never patch the canonical INP. Every strategy (random / lhs / adaptive / SCE-UA / DREAM-ZS) emits three artefacts to <run_dir>/09_audit/ when invoked with --candidate-run-dir <run_dir>:

ArtefactPurpose
candidate_calibration.jsonBest params + KGE + decomposition + secondary metrics + evidence_boundary: "candidate_not_accepted_yet" + SHA256 of the patch file + (DREAM only) posterior_samples_ref.
candidate_inp_patch.jsonOne row per parameter (section, object, field_index, old_value, new_value) — the diff to apply when the human accepts.
calibration_report.mdHuman-readable summary: KGE decomposition table, secondary metrics, best parameters, convergence trace reference (SCE-UA) and posterior block (DREAM-ZS).

The canonical INP file SHA256 is unchanged before and after calibration — the scaffold only reads it, to extract the old_value for each diff row.

Promotion is gated behind the expert-only CLI:

aiswmm calibration accept <run_dir>

aiswmm calibration accept:

  1. Reads candidate_calibration.json; refuses if missing.
  2. Reads candidate_inp_patch.json; refuses if missing.
  3. Recomputes the SHA256 of the patch payload and compares against the SHA recorded inside the candidate; refuses on mismatch (tamper detection).
  4. Applies the patch to the canonical INP via the same inp_patch machinery the agent uses.
  5. Records a human_decisions row on the run's 09_audit/experiment_provenance.json with action == "calibration_accept", by == $USER, evidence_ref == "09_audit/candidate_calibration.json", and decision_text containing the applied patch SHA.

The agent has no path to step 4. Only the human can promote the candidate.

Example: SCE-UA with candidate handover

python3 skills/swmm-calibration/scripts/swmm_calibrate.py search \
  --base-inp runs/<case>/model.inp \
  --patch-map runs/<case>/calibration/patch_map.json \
  --search-space runs/<case>/calibration/search_space.json \
  --observed runs/<case>/calibration/observed_flow.csv \
  --run-root runs/<case>/calibration-sceua/trials \
  --summary-json runs/<case>/09_audit/calibration_summary.json \
  --strategy sceua --objective kge --iterations 200 --seed 42 \
  --candidate-run-dir runs/<case>

# After review:
aiswmm calibration accept runs/<case>

Recommended near-term extensions

  • Add multi-event calibration/validation.
  • Add observed-vs-simulated overlay plots to the calibration script.
  • Extend patch-map selectors beyond simple one-line object rows.
  • Wire the DREAM-ZS posterior into source-decomposition uncertainty propagation — issue #55.