RIOMAR — HEALPix (DGGS) regridding

This repository provides an end-to-end workflow to regrid ocean model data (RiOMar/GAMAR) from curvilinear grids to HEALPix (DGGS) format using xarray, Dask, and Kerchunk. It runs both locally (HTTPS mode) and on HPC infrastructure.

  1. Define a Region Of Interest (ROI) from a lon/lat bounding box

  2. Prepare a temporary small Zarr dataset (for fast iteration and reproducible testing)

  3. Regrid variables to HEALPix using healpix_regrid (via xarray.apply_ufunc)

  4. Scale the same workflow to the full dataset on HPC and publish the resulting Zarr

The workflow assumes geographic coordinates in EPSG:4326 and HEALPix (WGS84) nested indexing.


Installation

# 1. Create the conda environment
conda env create -f notebook/environment.yml
conda activate riomar

# 2. Install the healpix_regrid package in editable mode
pip install -e ".[test]"

Running tests

python -m pytest tests/ -v

Repository structure

healpix_regrid/         Reusable Python package (masking, kerchunk, dask, regridding)
tests/                  Pytest test suite for healpix_regrid
notebook/               Jupyter notebooks (interactive workflow & exploration)
bin/                    Python scripts for HPC batch runs
singularity_images/     Singularity container definitions

A. Create ROI from lon/lat bbox

Notebook: Create_ROI_from_bbox.ipynb

Purpose - Convert a lon/lat bounding box (EPSG:4326) into a HEALPix (nested) ROI

Input - Bounding box (min_lon, min_lat, max_lon, max_lat) in EPSG:4326

Output - HEALPix ROI cells at parent level: parent_ids.npz

Notes - The notebook computes child-level cells covering the bbox, then maps them to the parent level and builds polygons.
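The child-to-parent step in the note above can be sketched in plain Python. This is a minimal illustration, not the notebook's code: it relies only on the nested-scheme property that the four children of cell p at the next level are 4p to 4p+3, so an ancestor is found by dropping two bits per level. The function names `to_parent` and `children_of` and the example levels are hypothetical.

```python
def to_parent(child_ids, child_level, parent_level):
    """Map nested HEALPix cell ids at child_level to unique ancestors at parent_level."""
    if parent_level > child_level:
        raise ValueError("parent_level must not exceed child_level")
    # Each refinement level appends 2 bits to a nested id, so shift them away.
    shift = 2 * (child_level - parent_level)
    return sorted({cid >> shift for cid in child_ids})


def children_of(parent_id, parent_level, child_level):
    """Return the contiguous range of child ids covered by one parent cell."""
    if parent_level > child_level:
        raise ValueError("parent_level must not exceed child_level")
    shift = 2 * (child_level - parent_level)
    return range(parent_id << shift, (parent_id + 1) << shift)


# Hypothetical example: level-3 cells (768 in total) mapped to level 1 (48 in total).
child_cells = [0, 5, 15, 16, 17, 767]
parent_cells = to_parent(child_cells, child_level=3, parent_level=1)
print(parent_cells)
```

Deduplicating at the parent level keeps the ROI compact: many child cells inside the bbox collapse into a small number of parent cells, which is the set saved as parent_ids.npz.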


B. Prepare a temporary Zarr (fast iteration dataset)

Notebook: Prep_regrid.ipynb

Purpose - Open RiOMar data via a Kerchunk catalog (HPC filesystem or HTTPS export)

Output - A temporary Zarr dataset (e.g. small.zarr) used as input for the regridding notebook

Tip - Set OUT_ZARR to an existing path on your machine or HPC scratch.
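A quick pre-flight check for OUT_ZARR might look like the sketch below. It assumes that "existing path" means the directory that will contain the Zarr store already exists and is writable; the helper name `check_out_zarr` and the example path are hypothetical and not part of healpix_regrid.

```python
import os
from pathlib import Path


def check_out_zarr(out_zarr):
    """Return the resolved OUT_ZARR path, failing early if its parent directory is unusable."""
    path = Path(out_zarr).expanduser().resolve()
    parent = path.parent
    # Assumption: the store itself may not exist yet, but its parent directory must.
    if not parent.is_dir():
        raise FileNotFoundError(f"Parent directory does not exist: {parent}")
    if not os.access(parent, os.W_OK):
        raise PermissionError(f"Parent directory is not writable: {parent}")
    return path


# Hypothetical usage: a store named small.zarr in the current working directory.
OUT_ZARR = check_out_zarr("small.zarr")
```

Failing before any data is read avoids discovering a bad output path only after the subset has been computed.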


C. Regrid to HEALPix using apply_ufunc

Notebook: regrid_apply_ufunc.ipynb

Purpose - Load the temporary Zarr created in B and regrid its variables to HEALPix with healpix_regrid (via xarray.apply_ufunc)

Output - A HEALPix-indexed dataset with a cell_ids coordinate (nested indexing)


Scaling to HPC

On HPC (Datarmor), the bin/ scripts run the same pipeline as the notebooks. Scripts auto-detect the environment by checking whether the HPC filesystem (/scale/project/lops-oh-fair2adapt/) exists.
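The environment check described above could be written roughly as follows. This is a sketch of the idea, not the code in bin/: the function name `detect_mode` and the returned labels "hpc" and "https" are assumptions; only the filesystem path comes from this README.

```python
from pathlib import Path

# Path quoted in this README; its presence is taken as the sign of running on Datarmor.
HPC_ROOT = "/scale/project/lops-oh-fair2adapt/"


def detect_mode(hpc_root=HPC_ROOT):
    """Return "hpc" when the HPC filesystem is mounted, otherwise "https" (local mode)."""
    return "hpc" if Path(hpc_root).is_dir() else "https"


mode = detect_mode()
print(f"Running in {mode} mode")
```

Keeping the root path as a parameter makes the check easy to exercise in tests with a temporary directory instead of the real filesystem.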

# Submit a PBS job
qsub bin/submit.sh

Singularity container definitions are in singularity_images/f2a_riomar/ (layered build: hardened Debian base -> conda scientific stack -> JupyterHub).


Conventions / Expected Variables


Troubleshooting


Extra notebooks


Contributing

See Contributing to RIOMAR / healpix_regrid for guidelines on how to contribute.

License

This project is licensed under the Apache License 2.0.