Transform GEBCO Bathymetry data to DGGS#

Context#

Purpose#

The goal is to transform the GEBCO Gridded Bathymetry Dataset into a DGGS (HEALPix grid) so that it can be used as a bathymetry dataset for other DGGS applications.

Description#

In this notebook, we will:

  • Download the GEBCO Gridded Bathymetry Data

  • Transform the downloaded GEBCO dataset into the HEALPix DGGS

  • Save the HEALPix version of the GEBCO data in Zarr

Contributions#

Notebook#

  • Jean-Marc Delouis, IFREMER (France), @jmdelouis

  • Anne Fouilloux, Simula Research Laboratory (Norway) (reviewer)

Bibliography and other interesting resources#

Import libraries#

import healpy as hp
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

Open GEBCO Data with Xarray#

We assume the GEBCO 2024 grid (GEBCO_2024.nc) has already been downloaded into the current directory.

url = './GEBCO_2024.nc'

ds = xr.open_dataset(url)
ds

Check the size of the latitude dimension, which the conversion loop below iterates over in blocks.

ds.elevation.sizes['lat']
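As a quick sanity check, we can also print both dimensions. This is an illustrative sketch assuming the GEBCO 2024 15 arc-second release, which has 43200 × 86400 points; the names n_lat and n_lon are introduced here only for the check.

# Quick sanity check on the grid shape. GEBCO 2024 is a global 15 arc-second
# grid, so we expect 43200 latitude x 86400 longitude points (adjust if you
# use another GEBCO release).
n_lat = ds.elevation.sizes["lat"]
n_lon = ds.elevation.sizes["lon"]
print(f"lat: {n_lat}, lon: {n_lon}, total points: {n_lat * n_lon:,}")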

Transform into Healpix#

  • Set up the parameters for the HEALPix grid

nside = 2048
chunk = 1000                       # number of latitude rows processed per iteration
nest = True                        # use the NESTED HEALPix ordering
level = int(np.log2(nside))        # HEALPix refinement level (nside = 2**level)
cell_id_name = "cell_ids"
healpix_ids = np.arange(12 * nside**2)            # one id per HEALPix cell
chunk_late = 12                                   # number of Zarr chunks along the cell dimension
chunk_size = int((12 * (4**level)) / chunk_late)  # cells per Zarr chunk
chunk_size, 12 * nside**2
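For reference, nside = 2048 corresponds to HEALPix level 11 and 12 × nside² ≈ 50 million cells. As an illustrative check (not required for the conversion), healpy can report the approximate angular resolution of this grid:

# Approximate angular resolution of the chosen HEALPix grid (illustrative check).
res_arcmin = hp.nside2resol(nside, arcmin=True)
print(f"level = {level}, cells = {12 * nside**2:,}, resolution ~ {res_arcmin:.2f} arcmin")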
im = np.zeros([12 * nside**2])   # sum of GEBCO elevations per HEALPix cell
him = np.zeros([12 * nside**2])  # number of GEBCO points per HEALPix cell
for k in range(0, ds.elevation.sizes['lat'], chunk):
    print(k)
    ed = k + chunk
    if ed > ds.elevation.sizes['lat']:
        ed = ds.elevation.sizes['lat']
    # HEALPix cell index of every (lon, lat) point in this block of latitude rows
    hidx = hp.ang2pix(
        nside,
        np.tile(ds.elevation.lon, ed - k),
        np.repeat(ds.elevation.lat[k:ed], ds.elevation.sizes['lon']),
        lonlat=True,
        nest=nest,
    )
    # Accumulate the sum of elevations and the sample count per cell
    im += np.bincount(hidx, weights=ds.elevation[k:ed, :].values.flatten(), minlength=12*nside**2)
    him += np.bincount(hidx, minlength=12*nside**2)
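Since im accumulates the sum of elevations in each HEALPix cell and him the number of samples, im/him is the per-cell mean elevation. A minimal sketch, in case some cells received no samples, that computes the mean without a divide-by-zero warning (the rest of the notebook keeps using im/him directly; mean_elev is an illustrative name):

# Per-cell mean elevation; cells with no GEBCO samples are set to NaN
# instead of triggering a divide-by-zero warning (illustrative only).
mean_elev = np.where(him > 0, im / np.maximum(him, 1), np.nan)
print(f"empty cells: {int((him == 0).sum()):,} out of {him.size:,}")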

Visualize#

outdir = './'

# Save the per-cell mean elevation as a FITS map and plot it on a Mollweide projection
hp.write_map(outdir + 'GEBCO_%d.fits' % (nside), im/him, nest=True, overwrite=True)
hp.mollview(im/him, nest=True, flip='geo')
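The averaged map is also written to a FITS file above; a small illustrative sketch to read it back with healpy and confirm the contents (m is a throwaway name introduced here):

# Re-read the FITS map written above to confirm the contents (illustrative check).
m = hp.read_map(outdir + 'GEBCO_%d.fits' % (nside), nest=True)
print(m.shape, np.nanmin(m), np.nanmax(m))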

Save the output in Zarr#

  • Set the HEALPix cell_ids and their attributes

  • Save into Zarr with appropriate chunks

from numcodecs import Zstd

var_cell_ids = xr.DataArray(
    healpix_ids,
    dims="cells",
    attrs={"grid_name": "healpix", "nside": nside, "nest": nest},
)

ds = xr.DataArray(
    im / him,
    dims=("cells",),
    coords={cell_id_name: var_cell_ids},
    name='gebco',
).to_dataset().chunk({"cells": chunk_size})

ds[cell_id_name].attrs.update({
    "grid_name": "healpix",
    "level": level,
    "nest": True,
})
 
url = f'/home/lops-oh-fair2adapt/gebco/healpix_level_{level}.zarr'
zstd_compressor = Zstd(level=3)

# Define a common compression setting for all variables
compression_settings = {"compressor": zstd_compressor}

# Create the encoding dictionary for all variables
encoding = {var: compression_settings for var in ds.data_vars}
ds.to_zarr(url, mode="w", encoding=encoding)
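Finally, a minimal sketch (assuming the path above is writable and the store was just created) to re-open the Zarr store and check the HEALPix metadata and chunking; check is an illustrative name:

# Re-open the Zarr store and inspect the HEALPix metadata and chunks (illustrative check).
check = xr.open_zarr(url)
print(check)
print(check["gebco"].encoding.get("chunks"), check[cell_id_name].attrs)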