Unit Conversion#

The sdf-xarray package automatically attempts to extract the units for each dataset from an SDF file and stores them as an xarray.Dataset attribute called "units".

While this is sufficient for most use cases, we can enhance this functionality using the pint. This library allows us to specify the units of a given array and convert them to another array which is incredibly handy. We can however take this a step further and utilise the pint-xarray library which allows us to infer units from an xarray.Dataset.attrs directly while retaining all the information about xarray.Dataset.

Installing pint and pint-xarray#

To install the pint libraries you can simply run the following optional dependency pip command which will install both the pint and pint-xarray libraries. You can install these optional dependencies via pip:

$ pip install "sdf_xarray[pint]"

Loading Libraries#

First we need to load all the necessary libraries. It’s important to import the pint_xarray library explicitly, even if it appears unused. Without this import, the xarray.Dataset.pint accessor will not be initialised.

In [1]: from sdf_xarray import open_mfdataset

In [2]: import pint_xarray

In the following example we will extract the time-resolved total particle energy of electrons which is measured in Joules and convert it to electron volts.

Quantifying Arrays with Pint#

Warning

Be aware that unless you’re using dask this will load the data into memory. To avoid that, consider converting to dask first (e.g. using chunk).

When using pint-xarray, the library attempts to infer units from the "units" attribute on each xarray.DataArray. Alternatively, you can also specify the units yourself by passing a string into the xarray.Dataset .pint.quantify() function call i.e. xarray.Dataset.pint.quantify("J"). Once the type is inferred the original xarray.DataArray will be converted to a pint.Quantity and the "units" attribute will be removed.

Note

Quantification does not alter the underlying data and can be reversed at any time using .pint.dequantify().

Warning

Unit conversion is not supported on coordinates, this is due to an underlying issue with how xarray implements indexes

In [3]: with open_mfdataset("tutorial_dataset_1d/*.sdf") as ds:
   ...:     total_particle_energy = ds["Total_Particle_Energy_Electron"]
   ...: 

In [4]: total_particle_energy
Out[4]: 
<xarray.DataArray 'Total_Particle_Energy_Electron' (time: 41)> Size: 328B
array([0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 6.16297956e-49, 7.88767274e+08,
       1.46399149e+10, 1.85356925e+10, 8.20911178e+10, 1.46399689e+11,
       2.13329905e+11, 3.12548212e+11, 4.36718243e+11, 6.24956976e+11,
       7.19987930e+11, 9.92646565e+11, 1.11894822e+12, 1.29952607e+12,
       1.40298758e+12, 1.57296105e+12, 1.62340949e+12, 1.70344443e+12,
       1.79842990e+12, 1.80376945e+12, 1.95646327e+12, 2.05185757e+12,
       1.92225359e+12, 2.13258118e+12, 2.01732090e+12, 2.13525146e+12,
       2.21021605e+12, 2.17954627e+12, 2.22069602e+12, 2.28332716e+12,
       2.26454688e+12, 2.40372045e+12, 2.39052381e+12, 2.33472047e+12,
       2.42629115e+12])
Coordinates:
  * time     (time) float64 328B 2.606e-17 5.003e-15 ... 1.95e-13 2e-13
Attributes:
    full_name:  Total Particle Energy/Electron
    long_name:  Total Particle Energy Electron
    units:      J

In [5]: total_particle_energy = ds["Total_Particle_Energy_Electron"].pint.quantify()

In [6]: total_particle_energy
Out[6]: 
<xarray.DataArray 'Total_Particle_Energy_Electron' (time: 41)> Size: 328B
<Quantity([0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 6.16297956e-49 7.88767274e+08
 1.46399149e+10 1.85356925e+10 8.20911178e+10 1.46399689e+11
 2.13329905e+11 3.12548212e+11 4.36718243e+11 6.24956976e+11
 7.19987930e+11 9.92646565e+11 1.11894822e+12 1.29952607e+12
 1.40298758e+12 1.57296105e+12 1.62340949e+12 1.70344443e+12
 1.79842990e+12 1.80376945e+12 1.95646327e+12 2.05185757e+12
 1.92225359e+12 2.13258118e+12 2.01732090e+12 2.13525146e+12
 2.21021605e+12 2.17954627e+12 2.22069602e+12 2.28332716e+12
 2.26454688e+12 2.40372045e+12 2.39052381e+12 2.33472047e+12
 2.42629115e+12], 'joule')>
Coordinates:
  * time     (time) float64 328B [s] 2.606e-17 5.003e-15 ... 1.95e-13 2e-13
Indexes:
    time     PintIndex(PandasIndex, units={'time': 's'})
Attributes:
    full_name:  Total Particle Energy/Electron
    long_name:  Total Particle Energy Electron

Now that this dataset has been converted a pint.Quantity, we can check it’s units and dimensionality

In [7]: total_particle_energy.pint.units
Out[7]: <Unit('joule')>

In [8]: total_particle_energy.pint.dimensionality
Out[8]: <UnitsContainer({'[length]': 2, '[mass]': 1, '[time]': -2})>

Converting Units (e.g. Joules to eV)#

We can now convert it to electron volts utilising the pint.Quantity.to function

In [9]: total_particle_energy_ev = total_particle_energy.pint.to("eV")

Unit Propagation#

Suppose instead of converting to "eV", we want to convert to "W" (watts). To do this, we divide the total particle energy by time. However, since coordinates in xarray.Dataset cannot be directly converted to pint.Quantity, we must first extract the coordinate values manually and create a new Pint quantity for time.

Once both arrays are quantified, Pint will automatically handle the unit propagation when we perform arithmetic operations like division.

Note

Pint does not automatically simplify "J/s" to "W", so we use pint.Quantity.to to convert the unit string. Since these units are the same it will not change the underlying data, only the units. This is only a small formatting choice and is not required.

In [10]: import pint

In [11]: time_values = total_particle_energy.coords["time"].data

In [12]: time = pint.Quantity(time_values, "s")

In [13]: total_particle_energy_w = total_particle_energy / time

In [14]: total_particle_energy_w.pint.units
Out[14]: <Unit('joule / second')>

In [15]: total_particle_energy_w = total_particle_energy_w.pint.to("W")

In [16]: total_particle_energy_w.pint.units
Out[16]: <Unit('watt')>

Dequantifying and Restoring Units#

Note

If this function is not called prior to plotting then the units will be inferred from the pint.Quantity array which will return the long name of the units. i.e. instead of returning "eV" it will return "electron_volt".

The xarray.Dataset.pint.dequantify function converts the data from pint.Quantity back to the original xarray.DataArray and adds the "units" attribute back in. It also has an optional format parameter that allows you to specify the formatting type of "units" attribute. We have used the format="~P" option as it shortens the unit to its “short pretty” format ("eV"). For more options, see the Pint formatting documentation.

In [17]: total_particle_energy_ev = total_particle_energy_ev.pint.dequantify(format="~P")

In [18]: total_particle_energy_w = total_particle_energy_w.pint.dequantify(format="~P")

In [19]: total_particle_energy_ev
Out[19]: 
<xarray.DataArray 'Total_Particle_Energy_Electron' (time: 41)> Size: 328B
array([0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 3.84662929e-30, 4.92309810e+27,
       9.13751620e+28, 1.15690693e+29, 5.12372457e+29, 9.13754989e+29,
       1.33150054e+30, 1.95077250e+30, 2.72578088e+30, 3.90067464e+30,
       4.49381120e+30, 6.19561254e+30, 6.98392546e+30, 8.11100375e+30,
       8.75675970e+30, 9.81765064e+30, 1.01325251e+31, 1.06320639e+31,
       1.12249166e+31, 1.12582434e+31, 1.22112832e+31, 1.28066877e+31,
       1.19977633e+31, 1.33105248e+31, 1.25911267e+31, 1.33271914e+31,
       1.37950836e+31, 1.36036578e+31, 1.38604944e+31, 1.42514072e+31,
       1.41341899e+31, 1.50028430e+31, 1.49204760e+31, 1.45721790e+31,
       1.51437182e+31])
Coordinates:
  * time     (time) float64 328B 2.606e-17 5.003e-15 ... 1.95e-13 2e-13
Attributes:
    full_name:  Total Particle Energy/Electron
    long_name:  Total Particle Energy Electron
    units:      eV

Visualising the Converted Data#

To confirm the conversion has worked correctly, we can plot the original and converted xarray.Dataset side by side:

In [20]: import matplotlib.pyplot as plt

In [21]: plt.rcParams.update({
   ....:     "axes.labelsize": 16,
   ....:     "xtick.labelsize": 14,
   ....:     "ytick.labelsize": 14
   ....: })
   ....: 

In [22]: fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(16,8))

In [23]: ds["Total_Particle_Energy_Electron"].plot(ax=ax1)
Out[23]: [<matplotlib.lines.Line2D at 0x7b2f4c193770>]

In [24]: total_particle_energy_ev.plot(ax=ax2)
Out[24]: [<matplotlib.lines.Line2D at 0x7b2f4c1935c0>]

In [25]: total_particle_energy_w.plot(ax=ax3)
Out[25]: [<matplotlib.lines.Line2D at 0x7b2f4c193350>]

In [26]: ax4.set_visible(False)

In [27]: fig.suptitle("Comparison of conversion from Joules to electron volts and watts", fontsize="18")
Out[27]: Text(0.5, 0.98, 'Comparison of conversion from Joules to electron volts and watts')

In [28]: fig.tight_layout()
_images/unit_conversion.png