Install
openclaw skills install @wu-uk/flood-risk-analysis-usgs-data-download

Download water level data from USGS using the dataretrieval package. Use when accessing real-time or historical streamflow data, downloading gage height or discharge measurements, or working with USGS station IDs.

This guide covers downloading water level data from USGS using the dataretrieval Python package. USGS maintains thousands of stream gages across the United States that record water levels at 15-minute intervals.
pip install dataretrieval
The NWIS module is reliable and straightforward for accessing gage height data.
```python
from dataretrieval import nwis

# Get instantaneous values (15-min intervals)
df, meta = nwis.get_iv(
    sites='<station_id>',
    start='<start_date>',
    end='<end_date>',
    parameterCd='00065'  # gage height
)

# Get daily values
df, meta = nwis.get_dv(
    sites='<station_id>',
    start='<start_date>',
    end='<end_date>',
    parameterCd='00060'  # discharge
)

# Get site information
info, meta = nwis.get_info(sites='<station_id>')
```
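The placeholders above do not show the expected input formats. As a minimal sketch: station IDs are best passed as strings so that leading zeros survive (USGS site numbers are generally 8 to 15 digits), and dates are written as `YYYY-MM-DD`. The helper `validate_request` below is hypothetical and not part of dataretrieval; it only checks inputs before a request is made.

```python
import re
from datetime import date

# Assumption: USGS site numbers are strings of 8 to 15 digits.
SITE_ID_PATTERN = re.compile(r"\d{8,15}")

def validate_request(site_id, start, end):
    """Hypothetical pre-flight check; raises ValueError on malformed input."""
    if not isinstance(site_id, str) or not SITE_ID_PATTERN.fullmatch(site_id):
        raise ValueError(f"Station ID must be a string of 8-15 digits, got {site_id!r}")
    # date.fromisoformat raises ValueError if the string is not a valid ISO date
    start_date = date.fromisoformat(start)
    end_date = date.fromisoformat(end)
    if start_date > end_date:
        raise ValueError("start must not be after end")
    return site_id, start_date.isoformat(), end_date.isoformat()

# Illustrative call with an 8-digit ID kept as a string
site, start, end = validate_request("01646500", "2024-01-01", "2024-01-31")
```

The returned values can then be passed as `sites`, `start`, and `end` to `nwis.get_iv()` or `nwis.get_dv()`. Passing the ID as an integer would drop the leading zero, which is why the check rejects non-string input.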
| Code | Parameter | Unit | Description |
|---|---|---|---|
| 00065 | Gage height | feet | Water level above datum |
| 00060 | Discharge | cfs | Streamflow volume |
| Function | Description | Data Frequency |
|---|---|---|
| nwis.get_iv() | Instantaneous values | ~15 minutes |
| nwis.get_dv() | Daily values | Daily |
| nwis.get_info() | Site information | N/A |
| nwis.get_stats() | Statistical summaries | N/A |
| nwis.get_discharge_peaks() | Annual peak discharge | Annual |
The DataFrame has a datetime index and these columns:
| Column | Description |
|---|---|
| site_no | Station ID |
| 00065 | Water level value |
| 00065_cd | Quality code (can ignore) |
```python
from dataretrieval import nwis

station_ids = ['<id_1>', '<id_2>', '<id_3>']
all_data = {}

for site_id in station_ids:
    try:
        df, meta = nwis.get_iv(
            sites=site_id,
            start='<start_date>',
            end='<end_date>',
            parameterCd='00065'
        )
        # Keep only stations that returned data
        if len(df) > 0:
            all_data[site_id] = df
    except Exception as e:
        print(f"Failed to download {site_id}: {e}")

print(f"Successfully downloaded: {len(all_data)} stations")
```
```python
# Find the gage height column (excludes quality code column)
gage_col = [c for c in df.columns if '00065' in str(c) and '_cd' not in str(c)]

if gage_col:
    water_levels = df[gage_col[0]]
    print(water_levels.head())
```
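When several stations have been downloaded into a dictionary such as `all_data`, the same column lookup can be applied per station to build one table of water levels. The sketch below assumes each frame follows the column layout described earlier; `gage_height_series` and `combine_stations` are hypothetical helper names, and the demo frames are synthetic stand-ins with made-up values rather than real USGS data.

```python
import pandas as pd

def gage_height_series(df):
    """Return the first gage height column (parameter 00065), skipping '_cd' columns."""
    cols = [c for c in df.columns if "00065" in str(c) and "_cd" not in str(c)]
    return df[cols[0]] if cols else None

def combine_stations(all_data):
    """Build one DataFrame with a column of water levels per station ID."""
    series = {}
    for site_id, df in all_data.items():
        levels = gage_height_series(df)
        if levels is not None:
            series[site_id] = levels
    # pandas aligns the series on the union of their datetime indexes
    return pd.DataFrame(series)

# Synthetic stand-in for two downloaded stations (values are made up)
idx = pd.date_range("2024-01-01 00:00", periods=3, freq="15min")
demo = {
    "<id_1>": pd.DataFrame({"site_no": "<id_1>", "00065": [1.0, 1.1, 1.2], "00065_cd": "A"}, index=idx),
    "<id_2>": pd.DataFrame({"site_no": "<id_2>", "00065": [2.0, 2.1, 2.2], "00065_cd": "A"}, index=idx),
}
combined = combine_stations(demo)
print(combined)
```

Stations without a gage height column are skipped instead of raising, which mirrors the `if gage_col:` guard above.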
| Issue | Cause | Solution |
|---|---|---|
| Empty DataFrame | Station has no data for date range | Try different dates or use get_iv() |
| get_dv() returns empty | No daily gage height data | Use get_iv() and aggregate |
| Connection error | Network issue | Wrap in try/except, retry |
| Rate limited | Too many requests | Add delays between requests |
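The retry and delay advice in the table can be sketched as a small wrapper. This is one possible approach, not something dataretrieval provides: `fetch_with_retry` is a hypothetical helper that takes any zero-argument callable, and the retry count and delays are arbitrary defaults.

```python
import time

def fetch_with_retry(fetch, retries=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    fetch is any zero-argument callable, for example:
        lambda: nwis.get_iv(sites='<station_id>', start='<start_date>',
                            end='<end_date>', parameterCd='00065')
    """
    wait = delay
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except Exception as exc:
            if attempt == retries:
                raise  # out of attempts: surface the last error to the caller
            print(f"Attempt {attempt} failed ({exc}); retrying in {wait:.1f}s")
            sleep(wait)
            wait *= backoff  # wait longer before each further attempt
```

Calling `time.sleep()` between stations in a download loop addresses the rate-limit case in the same way. The `sleep` parameter exists only so the wait can be replaced in tests.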
- Check len(df) > 0 before processing
- Use get_iv() for gage height, as daily data is often unavailable
- Exclude quality code columns (_cd)
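Where `get_dv()` returns no gage height data, daily values can be derived from the instantaneous series instead. The following is a minimal sketch using pandas resampling on a synthetic series; `daily_summary` is a hypothetical helper, and the choice of mean, max, and count as summary statistics is an assumption about what a flood analysis would need.

```python
import pandas as pd

def daily_summary(levels):
    """Aggregate an instantaneous water level Series to daily mean, max, and count."""
    # Coerce to numeric and drop gaps so they do not distort the statistics
    levels = pd.to_numeric(levels, errors="coerce").dropna()
    return levels.resample("D").agg(["mean", "max", "count"])

# Synthetic 15-minute series covering two full days (values are made up)
idx = pd.date_range("2024-01-01", periods=192, freq="15min")
levels = pd.Series([1.0] * 96 + [3.0] * 96, index=idx)

daily = daily_summary(levels)
print(daily)
```

The `count` column shows how many readings went into each day, which helps spot days with partial coverage. If the index is timezone-aware, days are split in that timezone, so convert it first if local calendar days are needed.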