sdf_xarray.open_mfdatatree

sdf_xarray.open_mfdatatree#

sdf_xarray.open_mfdatatree(paths, *, separate_times=False, keep_particles=False, probe_names=None, data_vars=None, deck_path=None)[source]#

Open a set of EPOCH SDF files as one xarray.DataTree. Variables related to boundaries, cpu and output file are excluded as they are problematic. If you wish to load these variables in see Loading raw files.

EPOCH can output variables at different periods, so each individal SDF file from one EPOCH run may have different variables in it. In order to combine all files into one xarray.Dataset, we need to concatenate variables across their time dimension.

We have two choices:

  1. One time dimension where some variables may not be defined at all time points, and so will be filled with NaNs at missing points; or

  2. Multiple time dimensions, one for each output frequency

The second option is better for memory consumption, as the missing data with the first option still takes up space. However, proper lazy-loading may mitigate this.

The separate_times argument can be used to switch between these choices.

An xarray.DataTree is constructed utilising the original names in the SDF file. This is due to the fact that these names include slashes which xarray can use to automatically build up a datatree. We do additionally replace spaces with underscores to be more pythonic. You can find the xarray.Dataset name under the attrs["flat_structure_name"] for referencing.

This function combines multiple SDF files into a single xarray.DataTree with a unified time dimension and hierarchical organization of variables.

In some cases the user may output the always + species dumpmask which means that SDF variable will have species data plus a general one. When defining a xarray.DataTree you cannot have a node of that tree contain both variable information and have leaves with variables so we move the node information to a leaf named node/All (see example of Dervied/Number_Density/All in below table)

Below are some examples of how variable names are translated from the regular xarray.open_dataset result into their more traditional names.

Dataset variable name

DataTree variable name

Derived_Number_Density

Derived/Number_Density/All

Derived_Number_Density_Electron

Derived/Number_Density/Electron

Derived_Number_Density_Ion

Derived/Number_Density/Ion

Derived_Number_Density_Photon

Derived/Number_Density/Photon

Derived_Average_Particle_Energy

Derived/Average_Particle_Energy

Parameters:
  • paths (Iterable | str | Path | Callable[..., Iterable[Path]]) – List of filenames or string glob pattern

  • separate_times (bool) – If True, create separate time dimensions for variables defined at different output frequencies

  • keep_particles (bool) – If True, also load particle data (this may use a lot of memory!)

  • probe_names (list[str] | None) – List of EPOCH probe names

  • data_vars (list[str] | None) – List of data vars to load in (If not specified loads in all variables)

  • deck_path (str | PathLike | None) – If None, attempt to load the "input.deck" from the same directory as the SDF files and silently fail if it does not exist. If a path is given, load the specified deck from a relative or absolute file path. See Loading the input.deck for details.

Return type:

DataTree

Examples

>>> dt = open_mfdatatree("*.sdf")
>>> dt["Electric_Field"]["Ex"].values  # Access all Electric_Field_Ex data
>>> dt.coords["time"].values  # Access combined time dimension