Getting Started
Installation
Install sdf-xarray from PyPI with:
pip install sdf-xarray
or from a local checkout:
git clone https://github.com/epochpic/sdf-xarray.git
cd sdf-xarray
pip install .
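To confirm that the install succeeded, one option is to query the installed package metadata from Python. The following is a minimal sketch using only the standard library; the helper name installed_version is illustrative and is not part of sdf-xarray.

```python
from importlib import metadata


def installed_version(dist_name="sdf-xarray"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string after a successful install, otherwise None
print(installed_version())
```

A result of None suggests that pip installed the package into a different environment from the one running the interpreter.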
Usage
sdf-xarray is a backend for xarray, and so is usable directly from
xarray. There are several ways to load SDF files:
To load a single file, use xarray.open_dataset().
To load multiple files, use xarray.open_mfdataset() or sdf_xarray.open_mfdataset() (recommended).
To access the raw contents of a single SDF file, use sdf_xarray.sdf_interface.SDFFile().
Note
When loading *.sdf files, variables related to boundaries, cpu and output file are excluded as they are problematic.
Single file loading
Basic usage:
In [1]: import xarray as xr
In [2]: import sdf_xarray as sdfxr
In [3]: with xr.open_dataset("tutorial_dataset_1d/0010.sdf") as df:
   ...:     print(df["Electric_Field_Ex"])
   ...:
<xarray.DataArray 'Electric_Field_Ex' (X_Grid_mid: 1536)> Size: 12kB
[1536 values with dtype=float64]
Coordinates:
* X_Grid_mid (X_Grid_mid) float64 12kB -9.99e-06 -9.971e-06 ... 1.999e-05
Attributes:
units: V/m
point_data: False
full_name: Electric Field/Ex
long_name: Electric Field $E_x$
Multi file loading
To open all of a simulation’s files at once, use the sdf_xarray.open_mfdataset() function:
In [4]: sdfxr.open_mfdataset("tutorial_dataset_1d/*.sdf")
Out[4]:
<xarray.Dataset> Size: 10MB
Dimensions: (X_Grid: 1537,
X_Grid_mid: 1536, time: 41,
dim_laser_x_min_phase_0: 1,
dim_Random States_0: 384)
Coordinates:
* X_Grid (X_Grid) float64 12kB -1e-0...
* X_Grid_mid (X_Grid_mid) float64 12kB -...
* time (time) float64 328B 2.606e-...
Dimensions without coordinates: dim_laser_x_min_phase_0, dim_Random States_0
Data variables: (12/40)
Wall_time (time) float64 328B 0.4287 ...
Electric_Field_Ex (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
Electric_Field_Ey (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
Electric_Field_Ez (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
Magnetic_Field_Bx (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
Magnetic_Field_By (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
... ...
Particles_Particles_Per_Cell_Electron (time) float64 328B nan ......
Particles_Particles_Per_Cell_Ion (time) float64 328B nan ......
Particles_Particles_Per_Cell_Photon (time) float64 328B nan ......
Particles_Particles_Per_Cell_Positron (time) float64 328B nan ......
Random_States (time, dim_Random States_0) float64 126kB dask.array<chunksize=(1, 384), meta=np.ndarray>
Current_Jz (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
Attributes: (12/21)
filename: /home/docs/checkouts/readthedocs.org/user_builds/sdf-xa...
file_version: 1
file_revision: 4
code_name: Epoch1d
step: 0
time: 2.6059694937355635e-17
... ...
compile_machine: login1.viking2.yor.alces.network
compile_flags: unknown
defines: 50364608
compile_date: Fri Oct 11 15:12:01 2024
run_date: Fri Oct 25 11:34:55 2024
io_date: Fri Oct 25 11:34:57 2024
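Since the glob pattern determines which files are combined, it can be worth checking what the pattern expands to before loading anything. The sketch below uses only the standard library; the helper name matching_files is hypothetical, and it makes no claim about how sdf_xarray.open_mfdataset() expands the pattern internally.

```python
from glob import glob


def matching_files(pattern):
    """Return the paths matching a glob pattern, sorted by name."""
    return sorted(glob(pattern))


files = matching_files("tutorial_dataset_1d/*.sdf")
print(f"{len(files)} file(s) matched")
for path in files[:3]:
    # Show the first few matches as a sanity check
    print(path)
```

An empty list usually means the pattern is relative to a different working directory than expected.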
Alternatively, open the dataset with xarray’s own xarray.open_mfdataset(),
passing preprocess=sdfxr.SDFPreprocess():
In [5]: xr.open_mfdataset(
   ...:     "tutorial_dataset_1d/*.sdf",
   ...:     join="outer",
   ...:     compat="no_conflicts",
   ...:     preprocess=sdfxr.SDFPreprocess()
   ...: )
   ...:
Out[5]:
<xarray.Dataset> Size: 10MB
Dimensions: (X_Grid: 1537,
X_Grid_mid: 1536, time: 41,
dim_laser_x_min_phase_0: 1,
dim_Random States_0: 384)
Coordinates:
* X_Grid (X_Grid) float64 12kB -1e-0...
* X_Grid_mid (X_Grid_mid) float64 12kB -...
* time (time) float64 328B 2.606e-...
Dimensions without coordinates: dim_laser_x_min_phase_0, dim_Random States_0
Data variables: (12/40)
Wall_time (time) float64 328B 0.4287 ...
Electric_Field_Ex (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
Electric_Field_Ey (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
Electric_Field_Ez (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
Magnetic_Field_Bx (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
Magnetic_Field_By (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
... ...
Particles_Particles_Per_Cell_Electron (time) float64 328B nan ......
Particles_Particles_Per_Cell_Ion (time) float64 328B nan ......
Particles_Particles_Per_Cell_Photon (time) float64 328B nan ......
Particles_Particles_Per_Cell_Positron (time) float64 328B nan ......
Random_States (time, dim_Random States_0) float64 126kB dask.array<chunksize=(1, 384), meta=np.ndarray>
Current_Jz (time, X_Grid_mid) float64 504kB dask.array<chunksize=(1, 1536), meta=np.ndarray>
Attributes: (12/21)
filename: /home/docs/checkouts/readthedocs.org/user_builds/sdf-xa...
file_version: 1
file_revision: 4
code_name: Epoch1d
step: 0
time: 2.6059694937355635e-17
... ...
compile_machine: login1.viking2.yor.alces.network
compile_flags: unknown
defines: 50364608
compile_date: Fri Oct 11 15:12:01 2024
run_date: Fri Oct 25 11:34:55 2024
io_date: Fri Oct 25 11:34:57 2024
sdf_xarray.SDFPreprocess checks that all the files are from the same simulation, and
ensures there’s a time dimension so the files are correctly concatenated.
If your simulation has multiple output blocks so that not all variables are
output at every time step, then those variables will have NaN values at the
corresponding time points.
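The NaN padding can be reproduced without any SDF files. The sketch below builds two small synthetic datasets that stand in for two output blocks (the variable names field and diag are invented for illustration) and combines them on a shared time coordinate with an outer join.

```python
import numpy as np
import xarray as xr

# "field" is written at every output step
fields = xr.Dataset(
    {"field": ("time", np.array([1.0, 2.0, 3.0]))},
    coords={"time": [0.0, 1.0, 2.0]},
)
# "diag" is written only at the final step
diagnostic = xr.Dataset(
    {"diag": ("time", np.array([64.0]))},
    coords={"time": [2.0]},
)

# An outer join keeps every time point, padding missing values with NaN
combined = xr.merge([fields, diagnostic], join="outer", compat="no_conflicts")
print(combined["diag"].values)

# Keep only the time points at which "diag" was actually written
valid = combined["diag"].dropna("time")
print(valid["time"].values)
```

Dropping the padded entries with dropna("time") is one way to recover just the time points where a sparsely written variable has data.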
Alternatively, we can create a separate time dimension for each output
block using sdf_xarray.open_mfdataset() with separate_times=True:
In [6]: sdfxr.open_mfdataset("tutorial_dataset_1d/*.sdf", separate_times=True)
Out[6]:
<xarray.Dataset> Size: 9MB
Dimensions: (X_Grid: 1537,
X_Grid_mid: 1536, time0: 41,
time1: 1,
dim_laser_x_min_phase_0: 1,
dim_Random States_0: 384)
Coordinates:
* X_Grid (X_Grid) float64 12kB -1e-0...
* X_Grid_mid (X_Grid_mid) float64 12kB -...
* time0 (time0) float64 328B 2.606e...
* time1 (time1) float64 8B 2e-13
Dimensions without coordinates: dim_laser_x_min_phase_0, dim_Random States_0
Data variables: (12/40)
Wall_time (time0) float64 328B 0.4287...
Electric_Field_Ex (time0, X_Grid_mid) float64 504kB ...
Electric_Field_Ey (time0, X_Grid_mid) float64 504kB ...
Electric_Field_Ez (time0, X_Grid_mid) float64 504kB ...
Magnetic_Field_Bx (time0, X_Grid_mid) float64 504kB ...
Magnetic_Field_By (time0, X_Grid_mid) float64 504kB ...
... ...
Particles_Particles_Per_Cell_Electron (time1) float64 8B 64.0
Particles_Particles_Per_Cell_Ion (time1) float64 8B 64.0
Particles_Particles_Per_Cell_Photon (time1) float64 8B -1.0
Particles_Particles_Per_Cell_Positron (time1) float64 8B -1.0
Random_States (time1, dim_Random States_0) int32 2kB ...
Current_Jz (time1, X_Grid_mid) float64 12kB ...
Attributes: (12/20)
file_version: 1
file_revision: 4
code_name: Epoch1d
jobid1: 1729856095
jobid2: 720
code_io_version: 1
... ...
compile_date: Fri Oct 11 15:12:01 2024
run_date: Fri Oct 25 11:34:55 2024
filename: /home/docs/checkouts/readthedocs.org/user_builds/sdf-xa...
step: 3838
time: 2.0003421833910338e-13
io_date: Fri Oct 25 11:40:14 2024
This reduces memory consumption, but comparing variables that sit on different time coordinates is slightly less convenient.
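One way to compare variables on different time coordinates is a nearest-neighbour lookup from one time axis onto the other. The sketch below uses a small synthetic dataset shaped like the separate_times=True output; the variable names field and diag and all values are invented for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in: "field" lives on time0, "diag" on time1
ds = xr.Dataset(
    {
        "field": ("time0", np.array([10.0, 20.0, 30.0, 40.0])),
        "diag": ("time1", np.array([64.0])),
    },
    coords={"time0": [0.0, 1.0, 2.0, 3.0], "time1": [2.9]},
)

# For each time1 value, pick the "field" entry at the closest time0 value.
# The result is indexed by time1, so it lines up with "diag".
field_at_diag_times = ds["field"].sel(time0=ds["time1"], method="nearest")

difference = field_at_diag_times - ds["diag"]
print(difference.values)
```

Nearest-neighbour selection is only one option; interpolating with interp() may be more appropriate when the two time axes do not share output times.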
Reading particle data
By default, particle data isn’t kept as it takes up a lot of space. Pass
keep_particles=True as a keyword argument to open_dataset (for single files)
or open_mfdataset (for multiple files):
In [7]: xr.open_dataset("tutorial_dataset_1d/0010.sdf", keep_particles=True)
Out[7]:
<xarray.Dataset> Size: 246kB
Dimensions: (X_Grid_mid: 1536,
X_Grid: 1537)
Coordinates:
* X_Grid_mid (X_Grid_mid) float64 12kB -...
* X_Grid (X_Grid) float64 12kB -1e-0...
Data variables: (12/27)
Wall_time float64 8B ...
Electric_Field_Ex (X_Grid_mid) float64 12kB ...
Electric_Field_Ey (X_Grid_mid) float64 12kB ...
Electric_Field_Ez (X_Grid_mid) float64 12kB ...
Magnetic_Field_Bx (X_Grid_mid) float64 12kB ...
Magnetic_Field_By (X_Grid_mid) float64 12kB ...
... ...
Derived_Number_Density_Electron (X_Grid_mid) float64 12kB ...
Derived_Number_Density_Ion (X_Grid_mid) float64 12kB ...
Derived_Number_Density_Photon (X_Grid_mid) float64 12kB ...
Derived_Number_Density_Positron (X_Grid_mid) float64 12kB ...
Absorption_Total_Laser_Energy_Injected float64 8B ...
Absorption_Fraction_of_Laser_Energy_Absorbed float64 8B ...
Attributes: (12/21)
filename: tutorial_dataset_1d/0010.sdf
file_version: 1
file_revision: 4
code_name: Epoch1d
step: 960
time: 5.003461427972353e-14
... ...
compile_machine: login1.viking2.yor.alces.network
compile_flags: unknown
defines: 50364608
compile_date: Fri Oct 11 15:12:01 2024
run_date: Fri Oct 25 11:34:55 2024
io_date: Fri Oct 25 11:35:51 2024
Loading SDF files directly
For debugging, it is sometimes useful to inspect the raw contents of an SDF file:
In [8]: from sdf_xarray import SDFFile
In [9]: with SDFFile("tutorial_dataset_1d/0010.sdf") as sdf_file:
   ...:     print(sdf_file.variables["Electric Field/Ex"])
   ...:
Variable(_id='ex', name='Electric Field/Ex', dtype=dtype('float64'), shape=(1536,), is_point_data=False, sdffile=<sdf_xarray.sdf_interface.SDFFile object at 0x7c654bd4e980>, units='V/m', mult=1.0, grid='grid', grid_mid='grid_mid')