Performance of xesmf vs xarray-regrid

Compare the conservative regridding methods of the two libraries using a moderately sized synthetic dask dataset of about 4 GB.

import dask.array as da
import xarray as xr
import xesmf

import xarray_regrid

bounds = dict(south=-90, north=90, west=-180, east=180)

source = xarray_regrid.Grid(
    resolution_lat=0.25,
    resolution_lon=0.25,
    **bounds,
).create_regridding_dataset()

target = xarray_regrid.Grid(
    resolution_lat=1,
    resolution_lon=1,
    **bounds,
).create_regridding_dataset()


def source_data(source, chunks, n_times=1000):
    data = da.random.random(
        size=(n_times, source.latitude.size, source.longitude.size),
        chunks=chunks,
    ).astype("float32")

    data = xr.DataArray(
        data,
        dims=["time", "latitude", "longitude"],
        coords={
            "time": xr.date_range("2000-01-01", periods=n_times, freq="D"),
            "latitude": source.latitude,
            "longitude": source.longitude,
        }
    )

    return data

Chunking

Test “pancake” chunks (split along time, whole in space) and “churro” chunks (split in space, whole in time) at two sizes each. The “small” chunks are about 4 MB each, and the “large” chunks are about 100 MB each.

chunk_schemes = {
    "pancake_small": (1, -1, -1),
    "pancake_large": (25, -1, -1),
    "churro_small": (-1, 32, 32),
    "churro_large": (-1, 160, 160),
}
# For larger grids, generating weights is quite expensive, so build the
# regridder once here, outside the timed calls
xesmf_regridder = xesmf.Regridder(source, target, "conservative")
/home/slevang/miniconda3/envs/xarray-regrid/lib/python3.12/site-packages/xesmf/backend.py:56: UserWarning: Latitude is outside of [-90, 90]
  warnings.warn('Latitude is outside of [-90, 90]')
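
As a rough check of the chunk sizes quoted above, the arithmetic can be sketched in plain Python. The source grid shape of 721 x 1441 points is an assumption here (0.25 degree spacing with both end points included); the `chunk_megabytes` helper is hypothetical and not part of either library.

```python
from math import prod

# Assumed source grid shape: 0.25 degree spacing with both end points
# included gives 721 latitudes and 1441 longitudes.
N_TIMES, N_LAT, N_LON = 1000, 721, 1441
ITEMSIZE = 4  # bytes per float32 value

# Same chunk shapes as the schemes tested in this section.
schemes = {
    "pancake_small": (1, -1, -1),
    "pancake_large": (25, -1, -1),
    "churro_small": (-1, 32, 32),
    "churro_large": (-1, 160, 160),
}


def chunk_megabytes(chunks, shape=(N_TIMES, N_LAT, N_LON), itemsize=ITEMSIZE):
    """Size in MB of one full chunk; -1 means 'span the whole dimension'."""
    sizes = [full if c == -1 else min(c, full) for c, full in zip(chunks, shape)]
    return prod(sizes) * itemsize / 1e6


total_gb = N_TIMES * N_LAT * N_LON * ITEMSIZE / 1e9
print(f"total: {total_gb:.1f} GB")
for name, chunks in schemes.items():
    print(f"{name}: {chunk_megabytes(chunks):.1f} MB")
```

Under that assumption, the small schemes come out close to 4 MB per chunk and the large ones close to 100 MB, in line with the figures above.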

Timings

Time both libraries across the chunking schemes, with NaN skipping disabled and enabled. The ratio of xesmf time to xarray-regrid time gives the speedup factor from using this library: values above 1 mean xarray-regrid is faster.

import time

import pandas as pd

pd.options.display.precision = 1


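def do_regrid(data, target, skipna):
    # xarray-regrid: the target grid is passed on every call
    data.regrid.conservative(target, skipna=skipna).compute()


def do_xesmf(data, target, skipna):
    # xesmf: `target` is unused here because it is already baked into the
    # precomputed `xesmf_regridder`; it is kept so both functions share a signature
    xesmf_regridder(data, skipna=skipna).compute()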


def timing_grid(func, repeats=2):
    times = pd.DataFrame(
        index=chunk_schemes.keys(),
        columns=["skipna=False", "skipna=True"],
    )
    for name, chunks in chunk_schemes.items():
        data = source_data(source, chunks)
        for skipna in [False, True]:
            execution_times = []
            for _ in range(repeats):
                start = time.perf_counter()
                func(data, target, skipna)
                end = time.perf_counter()
                execution_times.append(end - start)
            # The first execution is sometimes a little slower, so keep the fastest run
            times.loc[name, f"skipna={skipna}"] = min(execution_times)

    return times


regrid_times = timing_grid(do_regrid)
xesmf_times = timing_grid(do_xesmf)
ratio = xesmf_times / regrid_times
/home/slevang/miniconda3/envs/xarray-regrid/lib/python3.12/site-packages/xarray/core/computation.py:320: PerformanceWarning: Regridding is increasing the number of chunks by a factor of 72.0, you might want to specify sizes in `output_chunks` in the regridder call. Default behaviour is to preserve the chunk sizes from the input (32, 32).
  result_var = func(*data_vars)
/home/slevang/miniconda3/envs/xarray-regrid/lib/python3.12/site-packages/xarray/core/computation.py:320: PerformanceWarning: Regridding is increasing the number of chunks by a factor of 6.0, you might want to specify sizes in `output_chunks` in the regridder call. Default behaviour is to preserve the chunk sizes from the input (160, 160).
  result_var = func(*data_vars)
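
The factors of 72 and 6 in the warnings above are consistent with simply counting how many chunks of the input's spatial chunk size are needed to tile the target grid. The sketch below reproduces them under two assumptions that are inferred from the numbers rather than taken from the xesmf source: the 1 degree target grid has 181 x 361 points, and the factor is the number of spatial output chunks. The `n_spatial_chunks` helper is hypothetical.

```python
from math import ceil

# Assumed target grid shape: 1 degree spacing with both end points included.
TARGET_SHAPE = (181, 361)


def n_spatial_chunks(shape, chunks):
    """Number of chunks needed to tile a 2D grid with the given chunk shape."""
    return ceil(shape[0] / chunks[0]) * ceil(shape[1] / chunks[1])


for chunks in [(32, 32), (160, 160)]:
    print(chunks, n_spatial_chunks(TARGET_SHAPE, chunks))
```

As the warning text suggests, the output chunking can be set explicitly via `output_chunks` in the regridder call; the timings in this comparison use the default behaviour.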

Results

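With the current implementations, xesmf is faster for large pancake-style chunks when skipna=False (ratio 0.6) and roughly on par when skipna=True (ratio 1.1). xarray-regrid is much faster for small chunks, especially churro-style ones, where it is about 14 to 17 times faster.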

These tests were run on an 8-core Intel i7 Ubuntu desktop:

ratio
               skipna=False  skipna=True
pancake_small           3.7          7.2
pancake_large           0.6          1.1
churro_small           14.2         16.9
churro_large            1.8          2.4
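
To make the reading of the table explicit, the short sketch below restates the reported ratios and labels which library was faster in each case. Because the ratio is xesmf time divided by xarray-regrid time, a value above 1 favours xarray-regrid. The numbers are copied from the table above, and the `faster_library` helper is hypothetical.

```python
# Ratios (xesmf time / xarray-regrid time) copied from the results table.
ratios = {
    "pancake_small": {"skipna=False": 3.7, "skipna=True": 7.2},
    "pancake_large": {"skipna=False": 0.6, "skipna=True": 1.1},
    "churro_small": {"skipna=False": 14.2, "skipna=True": 16.9},
    "churro_large": {"skipna=False": 1.8, "skipna=True": 2.4},
}


def faster_library(ratio):
    """Name the faster library for a given xesmf / xarray-regrid time ratio."""
    return "xarray-regrid" if ratio > 1 else "xesmf"


for scheme, row in ratios.items():
    for setting, ratio in row.items():
        print(f"{scheme:14s} {setting:13s} {ratio:5.1f}  {faster_library(ratio)}")
```

Only the pancake_large case with skipna=False falls below 1 in these runs.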