
foehn

Access Swiss meteorological open data from MeteoSwiss, powered by foehn


Quick Install

uvx foehn

foehn

MeteoSwiss Open Data → Python API, CLI, MCP server, Parquet & Delta tables



foehn downloads every MeteoSwiss OGD collection via the STAC API, converts CSV/TXT to Parquet with Polars, and optionally ingests everything into Databricks Unity Catalog Delta tables on a daily schedule. It also ships an MCP server so LLMs can query Swiss weather data directly.
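
To see what foehn automates, you can query the STAC API directly; the sketch below lists the MeteoSwiss collection IDs (the ch.meteoschweiz.ogd- prefix matches the collection_id values that foehn.list_datasets() returns; foehn handles this discovery for you):

import requests

# List MeteoSwiss collections straight from the STAC API that foehn wraps.
# First page only; a full client would follow the "next" links for pagination.
resp = requests.get("https://data.geo.admin.ch/api/stac/v1/collections", timeout=30)
resp.raise_for_status()
for coll in resp.json()["collections"]:
    if coll["id"].startswith("ch.meteoschweiz.ogd-"):
        print(coll["id"])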

Why foehn?

  • 20+ collections in one command — weather stations, radar, hail maps, forecasts, climate scenarios, and more
  • MCP server for LLMs — give your favorite LLM live access to MeteoSwiss data with the MCP server
  • Significantly smaller on disk — columnar Parquet with Zstandard compression vs. raw CSVs
  • Incremental by default — only re-downloads files that changed since your last run, tracked via _last_run.json
  • No Spark required locally — download + conversion uses Polars only; Spark is optional for Delta ingestion
  • Ships a Databricks Asset Bundle — ready-to-deploy daily job and historical backfill, no pipeline config needed

Quick start

pip install foehn
foehn download

Recent data (Jan 1 → yesterday) is downloaded and converted to Parquet under ./data/meteoswiss/.
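
The converted files are plain Parquet, so any Parquet-aware tool can read them. A minimal sketch with Polars, assuming the default <data-dir>/parquet/<collection>/ layout described in the CLI reference below:

import polars as pl

# Lazily scan the converted output and preview a few rows. The exact
# subdirectory name ("smn") is an assumption; list the parquet/ directory
# if the glob matches nothing.
df = pl.scan_parquet("./data/meteoswiss/parquet/smn/*.parquet")
print(df.head(5).collect())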

foehn CLI demo

Collections

MeteoSwiss organises its open data into five categories. Category B (atmosphere measurements — radio soundings, ceilometer, ozone, etc.) is not yet released (B1 radio soundings expected first half of 2026).

A — Ground-based measurements

Station-level time series in CSV, split into time slices (historical, recent, now). Converted to Parquet.

| Dataset | ID | Description | Frequencies | Stations | Parameters |
|---|---|---|---|---|---|
| smn | A1 | Automatic weather stations — the core SwissMetNet network. ~160 stations across Switzerland measuring temperature, humidity, pressure, precipitation, wind, radiation, sunshine, soil temperature, and dew point. | 10-min, hourly, daily, monthly, yearly | 158 | 181 |
| smn_precip | A2 | Automatic precipitation stations — rain-gauge-only network. Reports precipitation totals at multiple granularities. | 10-min, hourly, daily, monthly, yearly | 141 | 6 |
| smn_tower | A3 | Tower stations — tall mast measurements for temperature, humidity, wind (scalar + gusts), radiation, and sunshine at tower height. | 10-min, hourly, daily, monthly, yearly | 4 | 46 |
| nime | A5 | Manual precipitation stations — observer-read gauges reporting daily precipitation, plus fresh snow depth and snow cover. | daily, monthly, yearly | 273 | 17 |
| tot | A6 | Totaliser precipitation — remote alpine rain gauges read once per year, reporting precipitation reduced to the hydrological year (Oct 1 – Sep 30). | yearly | 57 | 1 |
| pollen | A7 | Pollen stations — airborne pollen concentrations for 7 taxa: alder, birch, hazel, beech, ash, oak, and grasses (Poaceae). | hourly, daily, yearly | 16 | 28 |
| obs | A8 | Visual / meteorological observations — human-observed daily cloud cover, counts of days with rain, snowfall, hail, fog, and snow coverage. | daily, monthly, yearly | 20 | 27 |
| phenology | A9 | Phenological observations — day-of-year for lifecycle events (leaf unfolding, flowering, fruit maturity, leaf colouring, leaf drop) across 26 plant species including horse chestnut, beech, cherry, apple, grape vine, and larch. | yearly | 175 | 71 |

C — Climate data

| Key | ID | Description | Format |
|---|---|---|---|
| nbcn | C1 | Homogeneous climate stations — break-adjusted series for temperature, pressure, precipitation, sunshine, and cloud cover (29 stations). Used for long-term trend analysis. | CSV → Parquet |
| nbcn_precip | C2 | Homogeneous precipitation — break-adjusted precipitation series (46 stations). | CSV → Parquet |
| surface_derived_grid | C3 | Ground-based spatial analyses — gridded fields of precipitation, temperature, and sunshine duration derived from station interpolation. | NetCDF (opt-in) |
| satellite_derived_grid | C4 | Satellite-based spatial analyses — gridded radiation, cloud cover, and land surface temperature derived from satellite. | NetCDF (opt-in) |
| climate_normals | C6 | Station normals — 30-year reference averages for 1961–1990 and 1991–2020. Monthly values per station. | TXT → Parquet |
| climate_normals_* | C7 | Spatial normals — gridded 30-year reference maps for precipitation, sunshine, and temperature (both reference periods). | NetCDF / GeoTIFF (opt-in) |
| climate_scenarios | C8 | CH2025 local scenarios — station-level climate projections. | CSV → Parquet |
| climate_scenarios_grid | C9 | CH2025 gridded scenarios — spatially gridded climate projections. | NetCDF (opt-in) |

D — Radar data

| Key | ID | Description | Format |
|---|---|---|---|
| radar_precip | D1 | Precipitation radar — composite precipitation grids at 5–10 min intervals. | HDF5 (opt-in) |
| radar_hail | D3 | Hail radar — probability-of-hail grids at 5 min intervals. | HDF5 (opt-in) |

Radar collections are large and require --grids to download.

E — Forecast data

| Key | ID | Description | Format |
|---|---|---|---|
| forecast_icon_ch1 | E2 | ICON-CH1-EPS — 1 km ensemble forecast model over Switzerland. | GRIB2 (opt-in) |
| forecast_icon_ch2 | E3 | ICON-CH2-EPS — 2.1 km ensemble forecast model. | GRIB2 (opt-in) |
| forecast_local | E4 | Local point forecasts — forecasts for ~5,600 points (stations + postal codes) covering temperature, precipitation, wind, radiation, and more (32 parameters). | CSV → Parquet |

GRIB2 forecast collections are large and require --grids to download.

Hail hazard maps

Static spatial reference grids showing expected hail grain size (cm) at different return periods; a 50-year return period, for example, means hail of that size is expected at a given location about once every 50 years (a 2% chance in any year). These are not categorised under A–E because they are static hazard assessments, not measured or forecasted time series — they represent probabilistic climatological analyses published as fixed reference maps.

| Key | Description | Format |
|---|---|---|
| hail_hazard_10y | Hail grain size — 10-year return period | NetCDF / GeoTIFF (opt-in) |
| hail_hazard_20y | Hail grain size — 20-year return period | NetCDF / GeoTIFF (opt-in) |
| hail_hazard_50y | Hail grain size — 50-year return period | NetCDF / GeoTIFF (opt-in) |
| hail_hazard_100y | Hail grain size — 100-year return period | NetCDF / GeoTIFF (opt-in) |

Time slices

MeteoSwiss splits CSV data into three time slices, encoded in the filename:

| Slice | Range | Update frequency | Frequencies |
|---|---|---|---|
| recent | Jan 1 this year → yesterday | Daily at 12:00 UTC | 10-min, hourly, daily, monthly |
| historical | Start of measurement → Dec 31 last year | Once per year (early January) | 10-min, hourly, daily, monthly |
| now | Yesterday 12:00 UTC → now | Every 10 minutes | 10-min, hourly only |

Some collections (phenology, totaliser, yearly aggregates) don't use time slices — they publish a single file per station.

All timestamps are UTC. For 10-min and hourly data the timestamp marks the end of the interval (16:00 = 15:50:01–16:00:00). For daily, monthly, and yearly data the timestamp marks the start (a monthly value stamped 2023-06-01 covers the whole of June 2023).
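
If interval-start timestamps are more convenient for the sub-daily data, the shift is a one-liner in Polars. A sketch, where the column name reference_timestamp is an assumption (check foehn.parameters("smn") for the real one):

import foehn
import polars as pl

# 10-min data stamps the *end* of each interval; derive the interval start:
# 16:00:00 (covering 15:50:01–16:00:00) becomes 15:50:00.
df = foehn.load("smn", station="BER", frequency="t")
df = df.with_columns(
    (pl.col("reference_timestamp") - pl.duration(minutes=10)).alias("interval_start")
)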


Installation

From PyPI:

pip install foehn

From source:

git clone https://github.com/kayhendriksen/foehn
cd foehn
pip install -e .

With Databricks extras (PySpark + Delta):

pip install "foehn[databricks]"

Requires Python ≥ 3.10.


Python API

Use foehn directly from notebooks or scripts:

import foehn

# List all available datasets
foehn.list_datasets()
# [{'dataset': 'smn', 'collection_id': 'ch.meteoschweiz.ogd-smn', 'category': 'A',
#   'subcategory': 'A1', 'description': 'Automatic weather stations',
#   'format': 'CSV', 'frequencies': ['t', 'h', 'd', 'm'],
#   'time_slices': ['historical', 'recent', 'now']}, ...]

# Load data directly into a Polars DataFrame (nothing written to disk)
df = foehn.load("smn", station="BER", frequency="d")

# Filter by multiple stations and frequencies
df = foehn.load("smn", station=["BER", "ZUR"], frequency=["d", "h"])

# Include historical data
df = foehn.load("smn", station="BER", frequency="d", time_slice=["historical", "recent"])

# Explore dataset metadata (fetched live from the API)
foehn.parameters("smn")   # column name mappings: shortname, description, unit, type, ...
foehn.stations("smn")     # station info: abbr, name, canton, altitude, lat, lon, ...
foehn.inventory("smn")    # what each station measures and since when

# Download a single dataset to disk
foehn.download("smn", data_dir="./data/meteoswiss")

# Download with specific time slices
foehn.download("smn", time_slice=["historical", "recent"])

# Convert downloaded CSVs to Parquet
foehn.to_parquet("smn", data_dir="./data/meteoswiss")
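
As a slightly larger example, the loaded DataFrames compose naturally with Polars. The column names below (station_abbr, and tre200d0 for daily mean temperature) are assumptions based on MeteoSwiss parameter shortnames; verify them with foehn.parameters("smn"):

import foehn
import polars as pl

# Rank stations by mean daily temperature over the recent slice.
df = foehn.load("smn", frequency="d")
ranking = (
    df.group_by("station_abbr")
      .agg(pl.col("tre200d0").mean().alias("mean_temp_c"))
      .sort("mean_temp_c", descending=True)
)
print(ranking.head(10))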

MCP server

foehn includes an MCP server that gives LLMs live access to MeteoSwiss data. It's published on the MCP Registry.

pip install "foehn[mcp]"

Add it to your MCP client config:

{
  "mcpServers": {
    "foehn": {
      "command": "foehn",
      "args": ["mcp"]
    }
  }
}

Tools available:

| Tool | Description |
|---|---|
| list_datasets | Browse all MeteoSwiss datasets with metadata |
| load_data | Fetch weather measurements as rows |
| get_parameters | Look up what each column measures |
| get_stations | Find station abbreviations and locations |
| get_inventory | Check data availability per station |
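
Desktop clients aside, the server can also be driven programmatically with the official MCP Python SDK (the mcp package on PyPI). A minimal sketch that spawns foehn mcp over stdio and lists the tools above:

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the foehn MCP server as a subprocess and connect over stdio.
    params = StdioServerParameters(command="foehn", args=["mcp"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                print(tool.name, "-", tool.description)

asyncio.run(main())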

CLI reference

The CLI uses subcommands that mirror the Python API:

foehn list

List all available datasets.

foehn list

foehn download [DATASET...]

Download datasets. Without arguments, downloads all CSV collections. Specify one or more datasets to download specific ones.

foehn download              # all CSV collections
foehn download smn pollen   # specific datasets only

| Flag | Description |
|---|---|
| --historical | Include historical time slice |
| --now | Include realtime 'now' time slice |
| --all | Include all time slices (historical + recent + now) |
| --full-refresh | Ignore incremental tracking, re-download everything |
| --grids | Include grid/binary datasets (GRIB2, NetCDF) |
| --no-parquet | Skip CSV → Parquet conversion |
| --data-dir PATH | Output root (default: ./data/meteoswiss) |

foehn to-parquet [DATASET...]

Convert downloaded CSVs to Parquet. Without arguments, converts all collections.

foehn to-parquet            # all collections
foehn to-parquet smn        # single dataset

| Flag | Description |
|---|---|
| --data-dir PATH | Root data directory |

foehn metadata KIND DATASET

Show dataset metadata fetched live from the API. KIND is one of parameters, stations, or inventory.

foehn metadata parameters smn   # what each column name means
foehn metadata stations smn     # station locations and info
foehn metadata inventory smn    # which station has which parameter

foehn load DATASET

Load a dataset from the API and print a preview (no files written to disk).

foehn load smn --station BER --frequency d
foehn load smn --station BER ZUR --frequency d h -n 50

| Flag | Description |
|---|---|
| --station | Filter by station(s) |
| --frequency | Filter by frequency (t, h, d, m, y) |
| --time-slice | Time slices to include (default: recent) |
| -n | Number of rows to show (default: 20) |

Parquet files land in <data-dir>/parquet/<collection>/.


Environment variables

Settings can also be configured via environment variables. CLI flags always take precedence.

| Variable | Equivalent | Description |
|---|---|---|
| FOEHN_DATA_DIR | --data-dir | Root data directory |
| FOEHN_FULL_REFRESH | --full-refresh | Set to 1, true, or yes to ignore incremental tracking |
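
A sketch of the same settings applied from a Python script, assuming the variables are read when the foehn process starts (paths are hypothetical):

import os
import subprocess

# Equivalent to `foehn download --full-refresh --data-dir /tmp/meteoswiss`.
env = os.environ | {
    "FOEHN_DATA_DIR": "/tmp/meteoswiss",
    "FOEHN_FULL_REFRESH": "1",
}
subprocess.run(["foehn", "download"], env=env, check=True)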

Databricks pipeline

The recommended setup uses Databricks Asset Bundles.

1. Set variables:

export BUNDLE_VAR_host=https://adb-xxx.azuredatabricks.net
export BUNDLE_VAR_alert_email=you@example.com

2. Deploy:

# requires the Databricks CLI v0.205+ (the legacy `pip install databricks-cli` package has no `bundle` command)
databricks bundle validate
databricks bundle deploy -t prod

This deploys two jobs:

  • foehn_daily — runs at 13:30 UTC every day; downloads recent data and refreshes Delta tables
  • foehn_historical — paused by default; trigger manually for first run or on Jan 1 for the annual archive slice

Data sources

  • STAC API: https://data.geo.admin.ch/api/stac/v1
  • Documentation: https://opendatadocs.meteoswiss.ch
  • MeteoSwiss OGD: https://github.com/MeteoSwiss/opendata

License

MIT
