Aligning a DICOM to a NIFTI, pixel-by-pixel

random

I have struggled at work to find how to align pixels. These pixels are particuarly vexing: they came from a DICOM file or a NIFTI file, which are both formats with awful documentation. If this helps a single person deal with this issue, then this post as reached its goal.

Published

April 23, 2024

1 The problem

We encountered the following situation at work. We had two files:

a DICOM file with a MRI scan of the subjects brain.
a NIFTI file with the segmentation of the brain into regions.

These are essentially 3D arrays of numbers, with a little bit of meta-data.

The issue is that the DICOM file and the NIFTI file can have arbitrary orientations with respect to one-another:

the axes might not be in the same order. For example, maybe we have:
```
dicom.shape == 400, 300, 200
nifti.shape == 300, 400, 200
```
the axes might not be in the same order: maybe the first pixel in one format is the last pixel of the other format.

The problem is thus quite simple to state: there are 48 possibilities¹. Let’s just open the docs, read the correct header keys, figure out how the arrays are organized, code the correct re-ordering and call it a day. Right? Right?

Well, that’s certainly a plan. However, No plan survives first contact with the enemy², and I can state with absolute certainty that the “documentation” of the DICOM and NIFTI standards (and ancillary software) is definitely hostile. Our initial hopes for a quick adventure were thus immediately blown to bits.

Still, I pushed through and, after many bloopers, I have found the solution. I hope it can help somebody else in the future to tackle these f..antastic file formats.

2 The solution

2.1 A bit of geometry

To understand the solution, it is important to understand projective geometry. Do not sweat, we don’t need to understand everything³. We just need to know that, if we want to transform coordinates between two reference frames, then that can be reframed as a matrix multiplication.

Note

A change of reference frame from $R_1$ to $R_2$ can be represented by a $4,4$ (or sometimes $3,3$) matrix $M_{R_2 \leftarrow R_1}$.
Computing the inverse of the matrix gives the matrix for the opposite change of reference frame:

\[ \left(M_{R_2 \leftarrow R_1}\right)^{-1} = M_{R_1 \leftarrow R_2} \]
Computing the matrix product $M_{R_3 \leftarrow R_2} M_{R_2 \leftarrow R_1}$ gives the matrix for change of reference frame from $R_1$ to $R_3$.

That’s all we need to know, so feel free to skip to the next section: Section 2.2, or read my detailed explanations below.

For example:

let $x,y,z$ denote the coordinates in the MRI room, with respect to the earth: $x$ is the south-north axis, $y$ is the west-east axis, $z$ is the down-up axis. Let the origin be the middle of the door into the room.
let $a,b,c$ denote the coordinates in the patient space. $a$ is the left-right axis of the patient, $b$ is the back-to-front axis, $c$ is the feet-to-head axis. The origin point is the middle of the head of the patient.

Then, there exists a matrix $M$ of shape $4, 4$ which can be used to translate from $x,y,z$ coordinates to $a,b,c$:

\[ \begin{pmatrix} a \\ b \\ c \\ 1 \end{pmatrix} = M \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \]

Wait, why do we have constant coordinates $1$ here? Why are the vectors 4D instead of 3? It’s needed so that we can also represent the change of origin using $M$. If we are representing transformations between reference frames with the same origin, we can work with a $3, 3$ matrix instead. However, I wanted to present the 4D case, because it is what is described in the DICOM and NIFTI docs.

Inverting the matrix reverses the direction of the change of variables:

\[\begin{align} \begin{pmatrix} a \\ b \\ c \\ 1 \end{pmatrix} = M \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \\ M^{-1} \begin{pmatrix} a \\ b \\ c \\ 1 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \end{align}\]

Similary, if we had three frames of reference, then applying sequentially a change of reference from $R_1$ to $R_2$ then a second from $R_2$ to $R_3$ would give an overall change from $R_1$ to $R_3$. The same property holds for the associated matrices:

\[ M_{R_3 \leftarrow R_2} M_{R_2 \leftarrow R_1} = M_{R_3 \leftarrow R_1} \]

2.2 Finding the matrices

So now we know that we need to go looking for matrices.

For the NIFTI file format, this is immediate: the matrix is encoded as a header key⁴. If using nibabel, it is immediately accessible.

Carefully reading the docs specifies that this affine matrix specifies the transformation between the voxel indices $i,j,k$ and the RAS patient-space (the acronym gives the order and direction of the axes: Right then Anterior (i.e. back-to-front) then Superior (foot-to-head)).

Surely, the dicom format must be similarly simple. Nope. However, since I know have digested the docs, here are the steps:

First, there is a transposition between what the DICOM format calls voxels and how the array is organized on disk⁵. This is explained here.
Then, the ImageOrientationPatient header specifies the first two columns of the matrix.
```
a = np.array(dicom.ImageOrientationPatient[:3])
b = np.array(dicom.ImageOrientationPatient[3:])

1matrix = np.zeros((4, 4))
matrix[3, 3] = 1

2matrix[:3, 0] = b
matrix[:3, 0] = a
```
1

Initializing the matrix and specifying the fourth column.

2

Note the change of order with respect to ImageOrientationPatient. This is due to the transposition.

Finally, by considering the change of position between two different dicom slices, we can find the third column.

slice_diff = (np.array(dicom2.ImagePositionPatient) - np.array(dicom.ImagePositionPatient)) / (
    dicom2.InstanceNumber - dicom.InstanceNumber
)
1c = slice_diff / np.sum(slice_diff**2) ** 0.5

1: Normalizing the vector so that it has norm 1.

This gives us the matrix to transform from the DICOM array coordinates $i,j,k$ to the LPS patient-space. This is almost the same as the RAS space used by the NIFTI format: the first two axes are just pointing in the opposite direction.

Overall, we are now able to combine:

The matrix from DICOM coordinates to LPS,
The matrix from LPS to RAS,
The matrix from NIFTI coordinates to RAS,

in order to find the matrix corresponding to the change of variable we want:

\[ M_{\text{DICOM} \leftarrow \text{NIFTI}} = \left[ M_{\text{RAS} \leftarrow \text{LPS}} M_{\text{LPS} \leftarrow \text{DICOM}} \right]^{-1} M_{\text{RAS} \leftarrow \text{NIFTI}} \]

And that’s it. We can now analyze the matrix to find how it swaps axes around and reorders them, and apply that transformation to the NIFTI array:

matrix_dicom_from_nifti = ...
ornt = nib.orientations.io_orientation(matrix_dicom_from_nifti)
reoriented_nifti = nib.orientations.apply_orientation(nifti, ornt)

Honestly, the only way you’ve made it this far is if you are yourself trying to deal with this exact problem. If so, then best of luck and bon courage. You will need both.

3 Bloopers

I can’t resist but tell you about all of the hilarious moments along the way were the DICOM spec blew up in my face.

After a little bit of (unsuccesfully) poking around and trying to read the docs, I asked a colleague about the issue. We quickly went from: “Oh, but all dicoms have the same orientation.” to “No wait, there are two.” to “Well technically it’s maybe three.”. We then found a fourth one later in our tests.
The nibabel documentation, and the NIFTI docs I have found all repeat: “the NIFTI format uses a RAS reference frame” throughout. Imagine my surprise when I discovered the existence of the affine header.
In my first calculation of the shift between two DICOM slices, I initially naively thought that the slices would be in order and that the first file (with name foo00000) would actually be the first slice. Ha. Ha. Ha. They are in a random order instead.

4 References

Footnotes

$3!=6$ possibilities for the order of the axes; $2^3=8$ possibilities for the order of each axis.↩︎
History of this quote ↩︎
Seriously, don’t worry if you don’t quite get it: this projective geometry stuff is a bit crazy.↩︎
Technically three but nibabel automatically chooses the most appropriate one.↩︎
Possibly due to the differences between Fortran and C array layouts on disk?↩︎

--- title: "Aligning a DICOM to a NIFTI, pixel-by-pixel" description: | I have struggled at work to find how to align pixels. These pixels are particuarly vexing: they came from a DICOM file or a NIFTI file, which are both formats with *awful* documentation. If this helps a single person deal with this issue, then this post as reached its goal. date: "04/23/2024" categories: - random --- # The problem We encountered the following situation at work. We had two files: - a [DICOM file](https://en.wikipedia.org/wiki/DICOM) with a MRI scan of the subjects brain. - a [NIFTI file](https://en.wikipedia.org/wiki/Neuroimaging_Informatics_Technology_Initiative) with the segmentation of the brain into regions. These are essentially 3D arrays of numbers, with a little bit of meta-data. The issue is that the DICOM file and the NIFTI file can have arbitrary orientations with respect to one-another: - the axes might not be in the same order. For example, maybe we have: ```python dicom.shape == 400, 300, 200 nifti.shape == 300, 400, 200 ``` - the axes might not be in the same order: maybe the first pixel in one format is the last pixel of the other format. The problem is thus quite simple to state: there are 48 possibilities^[$3!=6$ possibilities for the order of the axes; $2^3=8$ possibilities for the order of each axis.]. Let's just open the docs, read the correct header keys, figure out how the arrays are organized, code the correct re-ordering and call it a day. Right? *Right?* Well, that's certainly *a* plan. However, *No plan survives first contact with the enemy*^[[History of this quote](https://quoteinvestigator.com/2021/05/04/no-plan/)], and I can state with absolute certainty that the "documentation" of the DICOM and NIFTI standards (and ancillary software) is definitely hostile. Our initial hopes for a quick adventure were thus immediately blown to bits. Still, I pushed through and, after many bloopers, I have found the solution. I hope it can help somebody else in the future to tackle these f..antastic file formats. # The solution ## A bit of geometry To understand the solution, it is important to understand *projective geometry*. Do not sweat, we don't need to understand everything^[Seriously, don't worry if you don't quite get it: this projective geometry stuff is a bit crazy.]. We just need to know that, if we want to transform coordinates between two reference frames, then that can be reframed as a matrix multiplication. ::: {.callout-note} 1. A change of reference frame from $R_1$ to $R_2$ can be represented by a $4,4$ (or sometimes $3,3$) matrix $M_{R_2 \leftarrow R_1}$. 1. Computing the inverse of the matrix gives the matrix for the opposite change of reference frame: $$ \left(M_{R_2 \leftarrow R_1}\right)^{-1} = M_{R_1 \leftarrow R_2} $$ 1. Computing the matrix product $M_{R_3 \leftarrow R_2} M_{R_2 \leftarrow R_1}$ gives the matrix for change of reference frame from $R_1$ to $R_3$. ::: That's all we need to know, so feel free to skip to the next section: @sec-finding-matrices, or read my detailed explanations below. For example: - let $x,y,z$ denote the coordinates in the MRI room, with respect to the earth: $x$ is the south-north axis, $y$ is the west-east axis, $z$ is the down-up axis. Let the origin be the middle of the door into the room. - let $a,b,c$ denote the coordinates in the patient space. $a$ is the left-right axis of the patient, $b$ is the back-to-front axis, $c$ is the feet-to-head axis. The origin point is the middle of the head of the patient. Then, there exists a matrix $M$ of shape $4, 4$ which can be used to translate from $x,y,z$ coordinates to $a,b,c$: $$ \begin{pmatrix} a \\ b \\ c \\ 1 \end{pmatrix} = M \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} $$ Wait, why do we have constant coordinates $1$ here? Why are the vectors 4D instead of 3? It's needed so that we can also represent the change of origin using $M$. If we are representing transformations between reference frames with the same origin, we can work with a $3, 3$ matrix instead. However, I wanted to present the 4D case, because it is what is described in the DICOM and NIFTI docs. Inverting the matrix reverses the direction of the change of variables: \begin{align} \begin{pmatrix} a \\ b \\ c \\ 1 \end{pmatrix} = M \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \\ M^{-1} \begin{pmatrix} a \\ b \\ c \\ 1 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \end{align} Similary, if we had three frames of reference, then applying sequentially a change of reference from $R_1$ to $R_2$ then a second from $R_2$ to $R_3$ would give an overall change from $R_1$ to $R_3$. The same property holds for the associated matrices: $$ M_{R_3 \leftarrow R_2} M_{R_2 \leftarrow R_1} = M_{R_3 \leftarrow R_1} $$ ## Finding the matrices {#sec-finding-matrices} So now we know that we need to go looking for matrices. For the NIFTI file format, this is immediate: the matrix is encoded as a header key^[Technically three but nibabel automatically chooses the most appropriate one.]. [If using nibabel, it is immediately accessible](https://nipy.org/nibabel/nifti_images.html#the-nifti-affines). Carefully reading the docs specifies that this affine matrix specifies the transformation between the voxel indices $i,j,k$ and the RAS patient-space (the acronym gives the order and direction of the axes: Right then Anterior (i.e. back-to-front) then Superior (foot-to-head)). Surely, the dicom format must be similarly simple. *Nope*. However, since I know have digested the docs, here are the steps: 1. First, there is a transposition between what the DICOM format calls voxels and how the array is organized on disk^[Possibly due to the differences between Fortran and C array layouts on disk?]. [This is explained here](https://nipy.org/nibabel/dicom/dicom_orientation.html#i-j-columns-rows-in-dicom). 1. Then, the `ImageOrientationPatient` header specifies the first two columns of the matrix. ```python a = np.array(dicom.ImageOrientationPatient[:3]) b = np.array(dicom.ImageOrientationPatient[3:]) matrix = np.zeros((4, 4)) #<1> matrix[3, 3] = 1 #<1> matrix[:3, 0] = b #<2> matrix[:3, 0] = a #<2> ``` 1. Initializing the matrix and specifying the fourth column. 2. Note the change of order with respect to `ImageOrientationPatient`. This is due to the transposition. 1. Finally, by considering the change of position between two different dicom slices, we can find the third column. ```python slice_diff = (np.array(dicom2.ImagePositionPatient) - np.array(dicom.ImagePositionPatient)) / ( dicom2.InstanceNumber - dicom.InstanceNumber ) c = slice_diff / np.sum(slice_diff**2) ** 0.5 #<1> ``` 1. Normalizing the vector so that it has norm 1. This gives us the matrix to transform from the DICOM array coordinates $i,j,k$ to the LPS patient-space. This is almost the same as the RAS space used by the NIFTI format: the first two axes are just pointing in the opposite direction. Overall, we are now able to combine: 1. The matrix from DICOM coordinates to LPS, 1. The matrix from LPS to RAS, 1. The matrix from NIFTI coordinates to RAS, in order to find the matrix corresponding to the change of variable we want: $$ M_{\text{DICOM} \leftarrow \text{NIFTI}} = \left[ M_{\text{RAS} \leftarrow \text{LPS}} M_{\text{LPS} \leftarrow \text{DICOM}} \right]^{-1} M_{\text{RAS} \leftarrow \text{NIFTI}} $$ And that's it. We can now analyze the matrix to find how it swaps axes around and reorders them, and apply that transformation to the NIFTI array: ```python matrix_dicom_from_nifti = ... ornt = nib.orientations.io_orientation(matrix_dicom_from_nifti) reoriented_nifti = nib.orientations.apply_orientation(nifti, ornt) ``` Honestly, the only way you've made it this far is if you are yourself trying to deal with this exact problem. If so, then best of luck and *bon courage*. You will need both. # Bloopers I can't resist but tell you about all of the *hilarious* moments along the way were the DICOM spec blew up in my face. 1. After a little bit of (unsuccesfully) poking around and trying to read the docs, I asked a colleague about the issue. We quickly went from: "Oh, but all dicoms have the same orientation." to "No wait, there are two." to "Well technically it's maybe three.". We then found a fourth one later in our tests. 1. The nibabel documentation, and the NIFTI docs I have found all repeat: "the NIFTI format uses a RAS reference frame" throughout. Imagine my surprise when I discovered the existence of the `affine` header. 1. In my first calculation of the shift between two DICOM slices, I initially naively thought that the slices would be in order and that the first file (with name `foo00000`) would actually be the first slice. *Ha*. *Ha*. *Ha*. They are in a random order instead. # References 1. [Nibabel documentation on the orientation of dicom files](https://nipy.org/nibabel/dicom/dicom_orientation.html). 1. [Nibabel documentation on the transposition between voxels and the array coordinates](https://nipy.org/nibabel/dicom/dicom_orientation.html#i-j-columns-rows-in-dicom). 1. [Nibabel documentation: full formula for the DICOM affine matrix](https://nipy.org/nibabel/dicom/dicom_orientation.html#d-affine-formulae). 1. [Nibabel documentation: the nifti affines](https://nipy.org/nibabel/nifti_images.html#the-nifti-affines). 1. [Innolitics DICOM documentation: Image Orientation Patient](https://dicom.innolitics.com/ciods/rt-dose/image-plane/00200037). 1. [Innolitics DICOM documentation: Image Position Patient](https://dicom.innolitics.com/ciods/rt-dose/image-plane/00200032).