Maximum-likelihood joint image reconstruction and motion estimation with misaligned attenuation in TOF-PET/CT

Alexandre Bousse; Ottavia Bertolli; David Atkinson; Simon Arridge; Sébastien Ourselin; Brian F Hutton; Kris Thielemans

doi:10.1088/0031-9155/61/3/L11

1. Introduction

Patient motion in positron emission tomography (PET) is a source of error due to possible mismatches between the PET data and the computed tomography (CT) attenuation map (μ-map) (Nyflot et al 2015). In Bousse et al (2016) we developed a motion compensated reconstruction scheme for gated PET data, namely joint reconstruction/motion estimation (JRM), to jointly estimate both the activity distribution and the motion field, by penalized likelihood maximization. Unlike previous works in this field (Jacobson and Fessler 2003, Blume et al 2010), our model assumed that both the activity distribution and the μ-map are affected by motion. This model led to the following result: the JRM-reconstructed PET gates are the same for any input μ-maps derived from deformations of a common μ-map. This is because the estimated motion automatically accounts for PET/μ-map misalignment. However, in case of large mismatches, our results in Bousse et al (2016) showed that JRM needs a high number of iterations due to the significant cross-talk in the joint-likelihood.

Recent work Rezaei et al (2012), (2014) demonstrated the potential of jointly reconstructing the activity distribution and the μ-map from time-of-flight (TOF) PET data. Since JRM with warped μ-map presents some similarities with joint activity/attenuation reconstruction from emission data, in the sense that the warped μ-map must match the PET projections at each gate, TOF PET can increase the JRM convergence rate—especially in situations where the input μ-map is misaligned with the PET data.

In this work we investigate the ability of JRM with TOF PET to deal with a misaligned μ-map. Section 2 presents the JRM optimization problem for TOF PET and non-TOF PET, and summarizes the algorithm. In section 3 we compare the convergence rate of JRM with TOF PET and non-TOF PET on a simulated end-expiration PET gate and a simulated end-inhalation μ-map. Results are discussed in section 4.

2. Method

2.1. TOF maximum-likelihood for joint image reconstruction and motion estimation

In this section we use notations similar to Bousse et al (2016, sections II and III).

2.1.1. Motion-free model.

The activity distribution and the attenuation map respectively take the form of 2 functions $f\in {{\mathcal{C}}^{+}}$ and $\mu \in {{\mathcal{C}}^{+}}$ , where ${{\mathcal{C}}^{+}}$ denotes the set of non-negative continuous functions defined on ${{\mathbb{R}}^{3}}$ . The attenuation map μ is reconstructed independently from separate measurements such as x-ray CT or segmented MRI. TOF measured counts are represented by a collection $\boldsymbol{g}=\left({{g}_{i,t}}\right)_{i,t=1}^{{{n}_{\text{b}}},{{n}_{\text{t}}}}\in {{\mathbb{N}}^{{{n}_{\text{b}}}\times {{n}_{\text{t}}}}}$ . The subscripts $i\in \left\{1,\ldots,{{n}_{\text{b}}}\right\}$ and $t\in \left\{1,\ldots,{{n}_{\text{t}}}\right\}$ are respectively the detector and the time bin indices. In absence of motion, g_i,t follows a Poisson distribution of expectation ${{\bar{g}}_{i,t}}(\,f,\mu )$ :

$\begin{eqnarray}&&{{\bar{g}}_{i,t}}(\,f,\mu )=\tau {{a}_{i}}(\mu ){{\mathcal{H}}_{i,t}}\,f+{{s}_{i,t}},\end{eqnarray}$

with

$\begin{eqnarray}&&{{\mathcal{H}}_{i,t}}\,f\triangleq {{\mathop{\int}^{}}_{\Omega }}f\left(\boldsymbol{r}\right){{h}_{i,t}}\left(\boldsymbol{r}\right)\,\text{d}\boldsymbol{r}\quad \text{and}\quad {{a}_{i}}(\mu )\triangleq \exp \left(-{{\mathop{\int}^{}}_{{{L}_{i}}}}\mu \left(\boldsymbol{r}\right)\,\text{d}\boldsymbol{r}\right)\end{eqnarray} \tag{ 1 }$

where ${{h}_{i,t}}:{{\mathbb{R}}^{3}}\to {{\mathbb{R}}^{+}}$ is the TOF PET system response function at detector/time bin (i, t), τ is the scanning time, L_i is the segment connecting the detectors of bin i, s_{i, t} is the expected background events (random/scatter) at bin (i, t) and $\Omega \subset {{\mathbb{R}}^{3}}$ is a compact set representing the field of view.

2.1.2. Model with motion.

In practice, f and μ are affected by patient motion. For quasi-cyclic motion (respiratory, cardiac), acquired emission data are regrouped into ${{n}_{\text{g}}}$ gates, each of which corresponds to a patient state. Each TOF-PET data vector $\boldsymbol{g}_{{l}}=\left({{g}_{i,t,l}}\right)_{i,t=1}^{{{n}_{\text{b}}},{{n}_{\text{t}}}}\in {{\mathbb{N}}^{{{n}_{\text{b}}}\times {{n}_{\text{t}}}}}$ , $l=1,\ldots,{{n}_{\text{g}}}$ , corresponds to a single gate in which the patient is assumed to be static. The motion at gate l is represented by a diffeomorphism ${{\varphi}_{l}}:{{\mathbb{R}}^{3}}\to {{\mathbb{R}}^{3}}$ that deforms both the activity distribution f and the attenuation map μ. The deformed activities and μ-maps are ${{\mathcal{W}}_{{{\varphi}_{l}}}}\,f$ and ${{\mathcal{W}}_{{{\varphi}_{l}}}}\mu$ , where ${{\mathcal{W}}_{\varphi}}:{{\mathcal{C}}^{+}}\to {{\mathcal{C}}^{+}}$ is the warping operator defined as ${{\mathcal{W}}_{\varphi}}h=h\circ \varphi$ for all $h\in {{\mathcal{C}}^{+}}$ . The number of detected counts g_{i, t, l} at bin (i, t), gate l is a Poisson random variable of expectation

$\begin{eqnarray}&&{{\bar{g}}_{i,t,l}}\left(\,f,{{\varphi}_{l}},\mu \right)\triangleq {{\tau}_{l}}{{a}_{i}}\left({{\mathcal{W}}_{{{\varphi}_{l}}}}\mu \right){{\mathcal{H}}_{i,t}}{{\mathcal{W}}_{{{\varphi}_{l}}}}\,f+{{s}_{i,t,l}},\end{eqnarray} \tag{ 2 }$

where s_{i, t, l} is the expected number of background events at bin (i, t), gate l, ${{\tau}_{l}}$ is the duration of gate l and a_i and ${{\mathcal{H}}_{i,t}}$ are defined in (1).

2.1.3. Optimization problem.

The $\log$ -likelihood of the TOF-PET data is

$\begin{eqnarray}&&{{L}^{\text{TOF}}}\left(\,f,\boldsymbol{\varphi },\mu \right)=\underset{l=1}{\overset{{{n}_{\text{g}}}}{\mathop \sum}}\,\underset{i=1}{\overset{{{n}_{\text{b}}}}{\mathop \sum}}\,\underset{t=1}{\overset{{{n}_{\text{t}}}}{\mathop \sum}}\,{{g}_{i,t,l}}\log {{\bar{g}}_{i,t,l}}\left(\,f,{{\varphi}_{l}},\mu \right)-{{\bar{g}}_{i,t,l}}\left(\,f,{{\varphi}_{l}},\mu \right),\end{eqnarray} \tag{ 3 }$

where $\boldsymbol{\varphi }\triangleq \left({{\varphi}_{l}}\right)_{l=1}^{{{n}_{\text{g}}}}$ . Similarly, the non-TOF PET $\log$ -likelihood is

$\begin{eqnarray}&&{{L}^{\text{PET}}}\left(\,f,\boldsymbol{\varphi },\mu \right)=\underset{l=1}{\overset{{{n}_{\text{g}}}}{\mathop \sum}}\,\underset{i=1}{\overset{{{n}_{\text{b}}}}{\mathop \sum}}\,{{g}_{i,l}}\log {{\bar{g}}_{i,l}}\left(\,f,{{\varphi}_{l}},\mu \right)-{{\bar{g}}_{i,l}}\left(\,f,{{\varphi}_{l}},\mu \right),\end{eqnarray}$

with ${{g}_{i,l}}=\sum_{t=1}^{{{n}_{\text{t}}}}{{g}_{i,t,l}}$ and ${{\bar{g}}_{i,l}}\left(\,f,{{\varphi}_{l}},\mu \right)=\sum_{t=1}^{{{n}_{\text{t}}}}{{\bar{g}}_{i,t,l}}\left(\,f,{{\varphi}_{l}},\mu \right)$ .

Maximum-likelihood joint image reconstruction/motion estimation (JRM) in TOF PET and non-TOP PET consists of solving the following optimization problem

$\begin{eqnarray}&&\left(\,\hat{f},\hat{\boldsymbol{\varphi }}\right)=\underset{f\in {{\mathcal{C}}^{+}},\boldsymbol{\varphi }\in \mathcal{D}}{\mathop{\text{arg}\,\text{max}}}\,L\left(\,f,\boldsymbol{\varphi },\mu \right)\end{eqnarray} \tag{ 4 }$

where L is either ${{L}^{\text{TOF}}}$ or ${{L}^{\text{PET}}}$ , $\mathcal{D}={{\mathcal{D}}^{{{n}_{\text{g}}}}}$ and $\mathcal{D}$ denotes the set of diffeorphism on ${{\mathbb{R}}^{3}}$ . Solving (4) returns a single image f reconstructed from all gates $\boldsymbol{g}_{{l}}$ , $l=1,\ldots,{{n}_{\text{g}}}$ . The reconstructed activity image at gate l is ${{\mathcal{W}}_{{{{\hat{\varphi}}}_{l}}}}\,\hat{f}=\hat{f}\circ {{\hat{\varphi}}_{l}}$ .

It should be noted that (2) and therefore (3) depend on f and μ only via ${{\mathcal{W}}_{{{\varphi}_{l}}}}\,f$ and ${{\mathcal{W}}_{{{\varphi}_{l}}}}\mu$ . The activity distribution f corresponds to an unobserved state consistent with μ (but not necessarily with any $\boldsymbol{g}_{{l}}$ ). This led to a theoretical result for JRM in non-TOF PET in Bousse et al (2016, proposition 1), that can be naturally extended to TOF PET: if ${{\mu}_{1}}$ and ${{\mu}_{2}}$ are 2 different attenuation maps such that ${{\mu}_{2}}={{\mu}_{1}}\circ \psi$ for some $\psi \in \mathcal{D}$ , then there exists a bijection between the maximizers (when they exist) of $\left(\,f,\boldsymbol{\varphi }\right)\mapsto L\left(\,f,\boldsymbol{\varphi },{{\mu}_{1}}\right)$ and the maximizers of $\left(\,f,\boldsymbol{\varphi }\right)\mapsto L\left(\,f,\boldsymbol{\varphi },{{\mu}_{2}}\right)$ such that each pair of maximizers $(\,{{\hat{f}}_{1}},{{\hat{\boldsymbol{\varphi }}}^{1}})$ – $(\,{{\hat{{{f}_{2}}}}_{{}}},{{\hat{\boldsymbol{\varphi }}}^{2}})$ , defined by this bijection, satisfies ${{\mathcal{W}}_{\hat{\varphi}_{l}^{1}}}\,{{\hat{f}}_{1}}={{\mathcal{W}}_{\hat{\varphi}_{l}^{2}}}\,{{\hat{f}}_{2}}$ and ${{\mathcal{W}}_{\hat{\varphi}_{l}^{1}}}{{\mu}_{1}}={{\mathcal{W}}_{\hat{\varphi}_{l}^{2}}}{{\mu}_{2}}$ .

Proposition 1 in Bousse et al (2016) does not claim the existence and uniqueness of a maximizer for $\left(\,f,\boldsymbol{\varphi }\right)\mapsto L\left(\,f,\boldsymbol{\varphi },\mu \right)$ . However, if $(\,{{\hat{f}}_{1}},{{\hat{\boldsymbol{\varphi }}}^{1}})$ is a likely candidate to maximize $\left(\,f,\boldsymbol{\varphi }\right)\mapsto L\left(\,f,\boldsymbol{\varphi },{{\mu}_{1}}\right)$ , then there exists $(\,{{\hat{f}}_{2}},{{\hat{\boldsymbol{\varphi }}}^{2}})\in {{\mathcal{C}}^{+}}\times \mathcal{D}$ that is an equally likely candidate to maximize $\left(\,f,\boldsymbol{\varphi }\right)\mapsto L\left(\,f,\boldsymbol{\varphi },{{\mu}_{2}}\right)$ such that ${{\mathcal{W}}_{\hat{\varphi}_{l}^{1}}}\,{{\hat{f}}_{1}}={{\mathcal{W}}_{\hat{\varphi}_{l}^{2}}}\,{{\hat{f}}_{2}}$ and ${{\mathcal{W}}_{\hat{\varphi}_{l}^{1}}}{{\mu}_{1}}={{\mathcal{W}}_{\hat{\varphi}_{l}^{2}}}{{\mu}_{2}}$ . Results from Bousse et al (2016) demonstrated that JRM with ${{\mu}_{1}}$ and ${{\mu}_{2}}={{\mu}_{1}}\circ \psi$ returns similar reconstructed gates, and more specifically, JRM can be performed with a μ-map misaligned with each gate, provided this misaligned results from some diffeomorphism. Moreover, the estimated motion realigns the μ-maps to the data at each gate l.

When μ is aligned with one gate $\boldsymbol{g}_{{{{l}_{0}}}}$ , f can be initialized by maximizing the $\log$ -likelihood at gate l₀ with ${{\varphi}_{{{l}_{0}}}}=\text{Id}$ . If μ is misaligned with every gate, there is no consistent data to obtain a good initial estimate of f, thus rendering the optimization problem (4) more difficult. For example, in non-TOF PET (see (Bousse et al (2016, section IV.B)), it takes 50 to 100 iterations to solve (4) with a misaligned μ-map. The objective of this work is to see if TOF PET can facilitate the optimization.

2.2. Joint reconstruction scheme

2.2.1. Discretization.

We used the same discretization scheme as in Bousse et al (2016), initially introduced in Jacobson and Fessler (2003) and Jacobson (2006). ${{M}_{n,m}}\left(\mathbb{R}\right)$ denotes the space of real $n\times m$ matrices. The activity distribution and attenuation map are respectively represented by $\boldsymbol{f}\in \mathbb{R}_{+}^{{{n}_{\text{v}}}}$ and $\boldsymbol{\mu }\in \mathbb{R}_{+}^{{{n}_{\text{v}}}}$ , where ${{n}_{\text{v}}}$ denotes the total number of voxels. A deformation φ is parametrized by a B-spline coefficient vector $\boldsymbol{\alpha }=\left(\boldsymbol{\alpha }^{{X}},\boldsymbol{\alpha }^{{Y}},\boldsymbol{\alpha }^{{Z}}\right)\in {{\mathbb{R}}^{3\times {{n}_{\text{c}}}}}$ , where ${{n}_{\text{c}}}$ denotes the number of B-spline control points. The warping operator ${{\mathcal{W}}_{\varphi}}$ becomes a square matrix $\boldsymbol{W}_{{\boldsymbol{\alpha }}}\in {{M}_{{{n}_{\text{v}}},{{n}_{\text{v}}}}}\left(\mathbb{R}\right)$ , defined by the voxel interpolating functions, the B-spline coefficients $\boldsymbol{\alpha }$ and the B-spline basis. The B-spline coefficient corresponding to the motion ${{\varphi}_{l}}$ at gate l is denoted $\boldsymbol{\alpha }_{{l}}$ . The entire collection of B-spline coefficients is denoted $\boldsymbol{\theta }=\left(\boldsymbol{\alpha }_{{l}}\right)_{l=1}^{{{n}_{\text{g}}}}$ .

The TOF operator ${{\mathcal{H}}_{i,t}}$ takes the form of a system matrix $\boldsymbol{H}\in {{M}_{{{n}_{\text{b}}}\times {{n}_{\text{t}}},{{n}_{\text{v}}}}}\left(\mathbb{R}\right)$ where ${{\left[\boldsymbol{H}\right]}_{(i-1){{n}_{\text{t}}}+t,j}}\triangleq {{p}_{i,j,t}}$ is the probability that an annihilation occurring at voxel j is detected in bin (i, t). The expected number of counts is redefined as

$\begin{eqnarray}&&{{\bar{g}}_{i,t,l}}\left(\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)\triangleq \tau {{a}_{i}}\left(\boldsymbol{W}_{{\boldsymbol{\alpha }_{{l}}}}\boldsymbol{\mu }\right){{\left[\boldsymbol{H}\boldsymbol{W}_{{\boldsymbol{\alpha }_{{l}}}}\,\boldsymbol{f}\right]}_{(i-1){{n}_{\text{t}}}+t}}+{{s}_{i,t,l}},\end{eqnarray} \tag{ 5 }$

where ${{a}_{i}}\left(\boldsymbol{\mu }\right)\triangleq \exp \left(-{{\left[\boldsymbol{L}\boldsymbol{\mu }\right]}_{i}}\right)$ and the line integral matrix $\boldsymbol{L}\in {{M}_{{{n}_{\text{b}}},{{n}_{\text{v}}}}}\left(\mathbb{R}\right)$ is defined by ${{\left[\boldsymbol{L}\right]}_{i,j}}={{\ell}_{i,j}}$ where ${{\ell}_{i,j}}$ is the length of the intersection of L_i with voxel j.

2.2.2. Optimization method.

The discrete TOF $\log$ -likelihood is derived from (3):

$\begin{eqnarray}&&{{L}^{\text{TOF}}}\left(\,\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)=\underset{l=1}{\overset{{{n}_{\text{g}}}}{\mathop \sum}}\,\underset{i=1}{\overset{{{n}_{\text{b}}}}{\mathop \sum}}\,\underset{t=1}{\overset{{{n}_{\text{t}}}}{\mathop \sum}}\,{{g}_{i,t,l}}\log {{\bar{g}}_{i,t,l}}\left(\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)-{{\bar{g}}_{i,t,l}}\left(\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right).\end{eqnarray} \tag{ 6 }$

In comparison, the non-TOF PET discrete $\log$ -likelihood is

$\begin{eqnarray}&&{{L}^{\text{PET}}}\left(\,\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)\triangleq \underset{l=1}{\overset{{{n}_{\text{g}}}}{\mathop \sum}}\,\underset{i=1}{\overset{{{n}_{\text{b}}}}{\mathop \sum}}\,{{g}_{i,l}}\log {{\bar{g}}_{i,l}}\left(\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)-{{\bar{g}}_{i,l}}\left(\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)\end{eqnarray} \tag{ 7 }$

with ${{g}_{i,l}}=\sum_{t=1}^{{{n}_{\text{t}}}}{{g}_{i,t,l}}$ and ${{\bar{g}}_{i,l}}\left(\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)=\sum_{t=1}^{{{n}_{\text{t}}}}{{\bar{g}}_{i,t,l}}\left(\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)$ .

To enforce image and motion smoothness, 2 penalty terms $U\left(\,\boldsymbol{f}\right)$ and $V\left(\boldsymbol{\theta }\right)$ are added to ${{L}^{\text{TOF}}}$ and ${{L}^{\text{PET}}}$ (quadratic penalties, see Bousse et al (2016)). The discrete JRM formulation is formulated as

$\begin{eqnarray}&&\,\left({{\hat{\boldsymbol{f}}}^{\text{TOF}}},{{\hat{\boldsymbol{\theta }}}^{\text{TOF}}}\right)=\underset{\boldsymbol{f}\geqslant \mathbf{0},\boldsymbol{\theta }}{\mathop{\text{arg}\,\text{max}}}\,{{\Phi}^{\text{TOF}}}\left(\,\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)\end{eqnarray} \tag{ 8 }$

$\begin{eqnarray}&&\left(\,{{\hat{\boldsymbol{f}}}^{\text{PET}}},{{\hat{\boldsymbol{\theta }}}^{\text{PET}}}\right)=\underset{\boldsymbol{f}\geqslant \mathbf{0},\boldsymbol{\theta }}{\mathop{\text{arg}\,\text{max}}}\,{{\Phi}^{\text{PET}}}\left(\,\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)\end{eqnarray} \tag{ 9 }$

with ${{\Phi}^{\text{TOF}}}\left(\,\,\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)={{L}^{\text{TOF}}}\left(\,\,\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)-\beta U\left(\,\,\boldsymbol{f}\right)-\gamma V\left(\boldsymbol{\theta }\right)$ , ${{\Phi}^{\text{PET}}}\left(\,\,\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)={{L}^{\text{PET}}}\left(\,\,\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)-\beta U\left(\,\,\boldsymbol{f}\,\right)-$ $\gamma V\left(\boldsymbol{\theta }\right)$ . The JRM-reconstructed activity images corresponding to gate l are $\boldsymbol{W}_{{\hat{\boldsymbol{\alpha }}_{l}^{\text{TOF}}}}{{\hat{\boldsymbol{f}}}^{\text{TOF}}}$ for TOF PET and $\boldsymbol{W}_{{\hat{\boldsymbol{\alpha }}_{l}^{\text{PET}}}}{{\hat{\boldsymbol{f}}}^{\text{PET}}}$ for non-TOF PET.

We solve (8) and (9) by alternating maximization in $\boldsymbol{f}$ and $\boldsymbol{\theta }$ . We proceed as in Bousse et al (2016): a complete iteration consists of a maximization in $\boldsymbol{\theta }$ —performed with a limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) quasi-Newton (QN) line-search algorithm (Nocedal and Wright 2006, chapter 7)—and a maximization in $\boldsymbol{f}$ —performed with a motion compensated (MC) modified maximum-likelihood expectation-maximization (MMLEM) (De Pierro 1995) reconstruction from the gated data $\left(\boldsymbol{g}_{{l}}\right)_{l=1}^{{{n}_{\text{g}}}}$ using the current estimated motion parameter $\boldsymbol{\theta }$ .

The gradient of ${{L}^{\text{TOF}}}$ in $\boldsymbol{\alpha }_{{l}}$ is similar to the PET case described in Bousse et al (2016) with a summation over the time index t:

$\begin{eqnarray}&&{{\nabla}_{\boldsymbol{\alpha }_{{l}}}}L\left(\,\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)=\underset{t=1}{\overset{{{n}_{\text{t}}}}{\mathop \sum}}\,\boldsymbol{J}_{{\boldsymbol{\alpha }_{{l}}}}{{\left(\,{{\overline{\boldsymbol{g}}}_{:,t,l}}\left(\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)\right)}^{\top}}\left(\frac{\boldsymbol{g}_{{:,t,l}}}{{{\overline{\boldsymbol{g}}}_{:,t,l}}\left(\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)}-\mathbf{1}\right)\end{eqnarray}$

with $\boldsymbol{A}\left(\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)=\text{diag}\left\{\left({{a}_{i}}\left(\boldsymbol{W}_{{\boldsymbol{\alpha }_{{l}}}}\boldsymbol{\mu }\right)\right)_{i=1}^{{{n}_{\text{b}}}}\right\}$ , $\boldsymbol{g}_{{:,t,l}}=\left({{g}_{i,t,l}}\right)_{i=1}^{{{n}_{\text{b}}}}$ , ${{\overline{\boldsymbol{g}}}_{:,t,l}}\left(\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)=\left({{\bar{g}}_{i,t,l}}\left(\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)\right)_{i=1}^{{{n}_{\text{b}}}}$ , $\boldsymbol{H}_{{t}}\in {{M}_{{{n}_{\text{b}}},{{n}_{\text{v}}}}}\left(\mathbb{R}\right)$ defined by ${{\left[\boldsymbol{H}_{{t}}\right]}_{i,j}}={{p}_{i,j,t}}$ , and

$\begin{eqnarray}&&\boldsymbol{J}_{{\boldsymbol{\alpha }_{{l}}}}\left({{\overline{\boldsymbol{g}}}_{:,t,l}}\left(\,\,\boldsymbol{f},\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)\right)=\tau \boldsymbol{A}\left(\boldsymbol{\alpha }_{{l}},\boldsymbol{\mu }\right)\left(-\text{diag}\left\{\boldsymbol{H}_{{t}}\boldsymbol{W}_{{\boldsymbol{\alpha }_{{l}}}}\,\boldsymbol{f}\right\}\boldsymbol{L}\boldsymbol{J}_{{\boldsymbol{\alpha }_{{l}}}}\left(\boldsymbol{W}_{{\boldsymbol{\alpha }_{{l}}}}\boldsymbol{\mu }\right)+\;\boldsymbol{H}_{{t}}\boldsymbol{J}_{{\boldsymbol{\alpha }_{{l}}}}\left(\boldsymbol{W}_{{\boldsymbol{\alpha }_{{l}}\,}}\,\boldsymbol{f}\right)\right).\end{eqnarray} \tag{ 10 }$

The Jacobian matrices $\boldsymbol{J}_{{\boldsymbol{\alpha }}}\left(\boldsymbol{W}_{{\boldsymbol{\alpha }}}\,\boldsymbol{f}\right)$ and $\boldsymbol{J}_{{\boldsymbol{\alpha }}}\left(\boldsymbol{W}_{{\boldsymbol{\alpha }}}\boldsymbol{\mu }\right)$ are derived in Bousse et al (2016).

The overall scheme is briefly summarized in algorithm 1, for $\Phi={{\Phi}^{\text{TOF}}}$ or $\Phi={{\Phi}^{\text{PET}}}$ . Each sub-maximization algorithm (MC-MMLEM and L-LBFGS) is run until convergence. A detailed version of the scheme can be found in Bousse et al (2016).

Algorithm 1:. Summary of JRM

Input: TOF PET gated data $\left(\boldsymbol{g}_{{l}}\right)_{l=1}^{{{n}_{\text{g}}}}$ $\left(\boldsymbol{g}_{{l}}\right)_{l=1}^{{{n}_{\text{g}}}}$ , attenuation map $\boldsymbol{\mu }$ $\boldsymbol{\mu }$ (also image and motion smoothing priors weights β and γ).

Output: Activity image $\boldsymbol{f}$ $\boldsymbol{f}$ , B-spline motion parameter $\boldsymbol{\theta }$ $\boldsymbol{\theta }$

$\boldsymbol{\theta }\leftarrow \mathbf{0}$ $\boldsymbol{\theta }\leftarrow \mathbf{0}$ ;

$\boldsymbol{f}\leftarrow \text{MMLEM}\left(\boldsymbol{g}_{{1}}\right)$ $\boldsymbol{f}\leftarrow \text{MMLEM}\left(\boldsymbol{g}_{{1}}\right)$ (reconstruction from gate 1 w/o motion compensation) ;

for $n=1,\ldots,{{n}_{\text{max}}}$ $n=1,\ldots,{{n}_{\text{max}}}$ do

$\boldsymbol{\theta }\leftarrow \text{arg}\,\text{ma}{{\text{x}}_{\boldsymbol{\theta }}}\Phi\left(\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)$ $\boldsymbol{\theta }\leftarrow \text{arg}\,\text{ma}{{\text{x}}_{\boldsymbol{\theta }}}\Phi\left(\boldsymbol{f},\boldsymbol{\theta },\boldsymbol{\mu }\right)$ (with L-BFGS QN line-search) ;

$\boldsymbol{f}\leftarrow \text{MC}-\text{MMLEM}\left(\boldsymbol{g},\boldsymbol{\theta }\right)$ $\boldsymbol{f}\leftarrow \text{MC}-\text{MMLEM}\left(\boldsymbol{g},\boldsymbol{\theta }\right)$ ;

end

2.2.3. Alternative approach for the single gate case.

When the task is to reconstruct from a single gate l₀ only, the projection model (5) can be simplified by ignoring the motion on the activity and warp the μ-map only, i.e. using a Poisson model with expectation

$\begin{eqnarray}&&\check{\boldsymbol{g}}_{i,t,l_0}(\boldsymbol{f},\boldsymbol{\alpha}_{l_0},\boldsymbol{\mu}) \triangleq \tau a_i(\boldsymbol{W}_{\boldsymbol{\alpha}_{l_0}}\boldsymbol{\mu}) [\boldsymbol{H} \boldsymbol{f}]_{(i-1){n_{{\rm t}}} + t} + s_{i,t,_{l_0}} \end{eqnarray}$

so that the Jacobian (10) expression is simplified. The reconstructed activity $\hat{\boldsymbol{f}}$ matches the data $\boldsymbol{g}_{{{{l}_{0}}}}$ without the need of being warped. This problem was addressed in Rezaei and Nuyts (2013). However, the generalization of this model to the multi gate case requires to estimate one activity image $\hat{\boldsymbol{f}}$ per gate, as opposed to JRM which only reconstruct one image $\boldsymbol{f}$ , warped to each gate.

3. Experiments on simulated data

For this experiment we used only one gate, i.e. ${{n}_{\text{g}}}=1$ . The sums over l in the $\log$ -likelihood (6) and (7) are dropped. We only reconstruct from a single TOF PET dataset $\boldsymbol{g}\in {{\mathbb{N}}^{{{n}_{\text{b}}}\times {{n}_{\text{t}}}}}$ , ${{\left[\boldsymbol{g}\right]}_{i,t}}={{g}_{i,t}}$ . The motion parameter $\boldsymbol{\theta }$ reduces to a single $\boldsymbol{\alpha }$ . The aim is to reconstruct a single activity image with a misaligned μ-map. The estimated deformation corresponds to the misalignment between the input μ-map and $\boldsymbol{g}$ .

Each MC-MMLEM sub-optimimization was performed with 100 iterations which in our experiments was sufficient for convergence (no stopping rule was used), using the current $\boldsymbol{\alpha }$ estimate. Prior to each MC-MMLEM sub-optimimization, the image $\boldsymbol{f}$ was reinitialized to a blank image to avoid attenuation artifacts due to incomplete motion estimation in the previous iteration. We used the Fortran implementation proposed in Zhu et al (1997) for each L-BFGS sub-optimimization, with 80 iterations (sufficient for convergence in our experiments).

3.1. Simulation set-up

Our simulation set-up is similar to Bousse et al (2016, section IV.B) with Poisson noise. We used the XCAT phantom to generate ground truth end-expiration activity $\boldsymbol{f}^{{\star}}$ and attenuation map $\boldsymbol{\mu }^{{\star}}$ ( $160\times 160\times 42$ volumes with 3.125 mm edge cubic voxels, corresponding to a 500 mm field of view, see figures 1(a) and (b)). We used a ${{n}_{\text{c}}}=53\times 53\times 16$ B-spline grid to parametrize the motion. A misaligned end-inhalation μ-map, $\widetilde{\boldsymbol{\mu }}$ (figure 1(c)), was also generated.

**Figure 1.** (a) $\boldsymbol{f}^{{\star}}$ : true activity ; (b) $\boldsymbol{\mu }^{{\star}}$ : true μ-map ; $\widetilde{\boldsymbol{\mu }}$ : deep breath-in misaligned μ-map.
Download figure:
Standard image High-resolution image

**Figure 1.** (a) $\boldsymbol{f}^{{\star}}$ : true activity ; (b) $\boldsymbol{\mu }^{{\star}}$ : true μ-map ; $\widetilde{\boldsymbol{\mu }}$ : deep breath-in misaligned μ-map.
Download figure:
Standard image High-resolution image

The spatial resolution of the PET was set to 6 mm FWHM and the temporal TOF resolution set to 500 ps FWHM. We used 10 TOF bins (332 ps width). The line integral operator $\boldsymbol{L}$ was adjusted to match the spatial resolution of the PET. We generated TOF and non-TOF data Poisson random vectors from $\boldsymbol{f}^{{\star}}$ and $\boldsymbol{\mu }^{{\star}}$ as:

$\begin{eqnarray}&&{{g}_{i,t}}\sim \text{Poisson}\left(g_{i,t}^{\star}\right)\quad \text{and}\quad {{g}_{i}}=\underset{t=1}{\overset{{{n}_{\text{t}}}}{\mathop \sum}}\,{{g}_{i,t}},\end{eqnarray}$

with

$\begin{eqnarray}&&g_{i,t}^{\star}=\tau {{a}_{i}}\left(\boldsymbol{\mu }^{{\star}}\right){{\left[\boldsymbol{H}\boldsymbol{f}^{{\star}}\right]}_{(i-1){{n}_{\text{t}}}+t}}+{{s}_{i,t}}.\end{eqnarray}$

$\left({{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}},{{\hat{\boldsymbol{f}}}^{\text{TOF}}}\right)$ and $\left({{\hat{\boldsymbol{\alpha }}}^{\text{PET}}},{{\hat{\boldsymbol{f}}}^{\text{PET}}}\right)$ were obtained with JRM using the misaligned attenuation map $\widetilde{\boldsymbol{\mu }}$ , i.e. by maximizing $\left(\,\boldsymbol{f},\boldsymbol{\alpha }\right)\mapsto {{\Phi}^{\text{TOF}}}\left(\,\boldsymbol{f},\boldsymbol{\alpha },\tilde{\mathop{\boldsymbol{\mu }}}\,\right)$ and $\left(\,\boldsymbol{f},\boldsymbol{\alpha }\right)\mapsto {{\Phi}^{\text{PET}}}\left(\,\boldsymbol{f},\boldsymbol{\alpha },\tilde{\mathop{\boldsymbol{\mu }}}\,\right)$ using algorithm 1.

JRM reconstructs the activity distribution $\boldsymbol{f}$ in the $\widetilde{\boldsymbol{\mu }}$ -space, which does not correspond to the observed PET gate, and $\boldsymbol{W}_{{\boldsymbol{\alpha }}}$ warps it to the observed gate. Therefore, we displayed the warped images $\boldsymbol{W}_{{\hat{\boldsymbol{\alpha }}}}\,\hat{\boldsymbol{f}}$ , in order to be compared with the activity phantom $\boldsymbol{f}^{{\star}}$ —used to generate the data.

We chose two different values of the image smoothness parameter β for ${{\Phi}^{\text{TOF}}}$ and ${{\Phi}^{\text{PET}}}$ . The image variance was estimated by reconstructing several noise replicates for different values of β, as in Bousse et al (2016), and the 2 values of β were chosen such that the variance in the output images $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{PET}}}$ and $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{TOF}}}$ were the same.

3.2. Results

Figures 2 and 3 show the JRM reconstructed activities $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{PET}}}$ and $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{TOF}}}$ at several iterations (iteration 0 corresponds to the first MMLEM reconstruction with no motion compensation). It can be observed that PET JRM requires 50 to 100 iterations to reduce μ-map misalignment artifacts significantly whereas TOF JRM requires only 5. The relative differences $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}}}}\widetilde{\boldsymbol{\mu }}-\boldsymbol{\mu }^{{\star}}$ and $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\widetilde{\boldsymbol{\mu }}-\boldsymbol{\mu }^{{\star}}$ are shown in figures 4 and 5⁶. Results show that the warped μ-map using TOF JRM estimated motion, i.e., $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\widetilde{\boldsymbol{\mu }}$ , becomes similar to $\boldsymbol{\mu }^{{\star}}$ in 5 iterations, whereas it takes more than 50 iterations for the warped μ-map using non-TOF JRM estimated motion. Note that since TOF PET forward and back projections are computationally more demanding, these observations should be put into perspective. Nevertheless, the reduction in number of iterations shows that using TOF PET significantly reduces the cross-talk in JRM as compared to non-TOF PET.

**Figure 2.** From top left to bottom right: $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{PET}}}$ at iterations 0 (no motion compensation), 1, 10, 30, 50 and 100.
Download figure:
Standard image High-resolution image

**Figure 3.** From left to right: $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{TOF}}}$ at iterations 0 (no motion compensation), 1 and 5.
Download figure:
Standard image High-resolution image

**Figure 3.** From left to right: $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{TOF}}}$ at iterations 0 (no motion compensation), 1 and 5.
Download figure:
Standard image High-resolution image

**Figure 4.** From top left to bottom right: relative difference $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}}}}\widetilde{\boldsymbol{\mu }}-\boldsymbol{\mu }^{{\star}}$ at iterations 0 ( $\hat{\boldsymbol{\alpha }}=\mathbf{0}$ ), 1, 10, 30, 50 and 100.
Download figure:
Standard image High-resolution image

**Figure 4.** From top left to bottom right: relative difference $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}}}}\widetilde{\boldsymbol{\mu }}-\boldsymbol{\mu }^{{\star}}$ at iterations 0 ( $\hat{\boldsymbol{\alpha }}=\mathbf{0}$ ), 1, 10, 30, 50 and 100.
Download figure:
Standard image High-resolution image

**Figure 5.** From left to right: relative difference $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\widetilde{\boldsymbol{\mu }}-\boldsymbol{\mu }^{{\star}}$ at iterations 0 ( $\hat{\boldsymbol{\alpha }}=\mathbf{0}$ ), 1 and 5.
Download figure:
Standard image High-resolution image

**Figure 5.** From left to right: relative difference $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\widetilde{\boldsymbol{\mu }}-\boldsymbol{\mu }^{{\star}}$ at iterations 0 ( $\hat{\boldsymbol{\alpha }}=\mathbf{0}$ ), 1 and 5.
Download figure:
Standard image High-resolution image

Quantitative assessment of the convergence rate of TOF JRM and non-TOF JRM is challenging because the objective functions to maximize are different, i.e., plotting the values of ${{\Phi}^{\text{TOF}}}\left(\,{{\hat{\boldsymbol{f}}}^{\text{TOF}}},{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}},\tilde{\mathop{\boldsymbol{\mu }}}\,\right)$ and ${{\Phi}^{\text{PET}}}\left(\,{{\hat{\boldsymbol{f}}}^{\text{PET}}},{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}},\tilde{\mathop{\boldsymbol{\mu }}}\,\right)$ is not informative. We therefore relied on the mean square error (MSE), between the estimated activities, $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{TOF}}}$ and $\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{PET}}}$ , and the ground truth $\boldsymbol{f}^{{\star}}$ . The MSE is defined for all $\boldsymbol{u}$ , $\boldsymbol{v}\in {{\mathbb{R}}^{{{n}_{\text{v}}}}}$ as

$\begin{eqnarray}&&\text{MSE}\left(\boldsymbol{u},\boldsymbol{v}\right)=\frac{1}{{{n}_{\text{v}}}}\parallel \boldsymbol{u}-\boldsymbol{v}\parallel _{2}^{2}.\end{eqnarray}$

The plots of $\text{MSE}\left(\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{TOF}}},\,\boldsymbol{f}^{{\star}}\right)$ and $\text{MSE}\left(\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{PET}}},\,\boldsymbol{f}^{{\star}}\right)$ are shown in figure 6. MSE results are consistent with the observations from figures 2 and 3: TOF JRM reaches a quasi-minimal MSE after 3 iterations whereas non-TOF JRM needs more than 30. The difference in MSE is due to the better conditioning of the TOF PET system matrix compared the non-TOF PET one.

**Figure 6.** $\text{MSE}\left(\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{PET}}},\,\boldsymbol{f}^{{\star}}\right)$ and $\text{MSE}\left(\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{TOF}}},\,\boldsymbol{f}^{{\star}}\right)$ VS iteration number.
Download figure:
Standard image High-resolution image

**Figure 6.** $\text{MSE}\left(\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{PET}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{PET}}},\,\boldsymbol{f}^{{\star}}\right)$ and $\text{MSE}\left(\boldsymbol{W}_{{{{\hat{\boldsymbol{\alpha }}}^{\text{TOF}}}}}\,{{\hat{\boldsymbol{f}}}^{\text{TOF}}},\,\boldsymbol{f}^{{\star}}\right)$ VS iteration number.
Download figure:
Standard image High-resolution image

4. Discussion and conclusion

This paper is an extension of our recent work Bousse et al (2016), and presents preliminary results on JRM with TOF PET data. Results from Rezaei et al (2012) and Rezaei et al (2014) demonstrated that TOF PET can significantly reduce the cross-talk in the joint-likelihood. Our experiments led to a similar observation: when the reconstructed attenuation is misaligned with the PET data, TOF JRM requires 10 to 20 times less iterations than non-TOF JRM. This demonstrates that TOF JRM significantly reduces the cross-talk compared to using non-TOF PET data, and extend its practicality to situations where the μ-map is severely misaligned with the PET.

Acknowledgments

This work was supported by UK EPSRC (EP/K005278/1), the NIHR-funded UCH Biomedical Research Centre and EPSRC CASE Award 13220093 with GE Healthcare.

Maximum-likelihood joint image reconstruction and motion estimation with misaligned attenuation in TOF-PET/CT

Article metrics

Author e-mails

Author affiliations

Dates

Abstract

1. Introduction

2. Method

2.1. TOF maximum-likelihood for joint image reconstruction and motion estimation

2.1.1. Motion-free model.

2.1.2. Model with motion.

2.1.3. Optimization problem.

2.2. Joint reconstruction scheme

2.2.1. Discretization.

2.2.2. Optimization method.

2.2.3. Alternative approach for the single gate case.

3. Experiments on simulated data

3.1. Simulation set-up

3.2. Results

4. Discussion and conclusion

Acknowledgments

Footnotes

Maximum-likelihood joint image reconstruction and motion estimation with misaligned attenuation in TOF-PET/CT

Article metrics

Share this article

Author e-mails

Author affiliations

Dates

Abstract

1. Introduction

2. Method

2.1. TOF maximum-likelihood for joint image reconstruction and motion estimation

2.1.1. Motion-free model.

2.1.2. Model with motion.

2.1.3. Optimization problem.

2.2. Joint reconstruction scheme

2.2.1. Discretization.

2.2.2. Optimization method.

2.2.3. Alternative approach for the single gate case.

3. Experiments on simulated data

3.1. Simulation set-up

3.2. Results

4. Discussion and conclusion

Acknowledgments

Footnotes