lilio.traintest

lilio train/test splitting methods.

Wrapper around sklearn splitters for working with (multiple) xarray dataarrays.

Module Contents

lilio.traintest.CVtype
exception lilio.traintest.CoordinateMismatchError

Custom exception for mismatched coordinates.

class lilio.traintest.TrainTestSplit(splitter: type[CVtype])

Split (multiple) xr.DataArrays across a given dimension.

Calling split() on this object returns an iterator that allows passing in multiple input arrays at once. They need to have matching coordinates along the given dimension.

For an overview of the sklearn Splitter Classes see: https://scikit-learn.org/stable/modules/classes.html#module-sklearn.model_selection

Parameters:

splitter (SplitterClass) – Initialized splitter; it must have a split(X) method that divides X into multiple folds of train/test data.
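
As a minimal sketch of what such a splitter provides, the snippet below uses sklearn's KFold (picked here only for illustration) and shows that its split(X) method yields one pair of train and test index arrays per fold. The names `years` and `folds` are hypothetical and are not part of lilio.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-in for the coordinate values along the split dimension.
years = np.arange(2000, 2006)

# An initialized splitter. Without shuffling, KFold assigns consecutive test folds.
splitter = KFold(n_splits=3)

# Each fold is a pair of integer index arrays: (train indices, test indices).
folds = list(splitter.split(years))

for train_idx, test_idx in folds:
    print("train:", years[train_idx], "test:", years[test_idx])
```

Any other initialized sklearn splitter with the same interface could presumably be substituted for KFold in this sketch.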

split(x_args: xarray.DataArray, y: xarray.DataArray | None = None, dim: str = 'anchor_year') → collections.abc.Iterable[tuple[xarray.DataArray, xarray.DataArray, xarray.DataArray, xarray.DataArray]]
split(x_args: collections.abc.Iterable[xarray.DataArray], y: xarray.DataArray | None = None, dim: str = 'anchor_year') → collections.abc.Iterable[tuple[collections.abc.Iterable[xarray.DataArray], collections.abc.Iterable[xarray.DataArray], xarray.DataArray, xarray.DataArray]]

Iterate over splits.

Parameters:
  • x_args – one or more xr.DataArrays that share the same coordinates along the given dimension.

  • y – (optional) xr.DataArray that shares the same coordinate along the given dimension.

  • dim – name of the dimension along which to split the data.

Returns:

Iterator over the splits