Three dimensional photoacoustic tomography in Bayesian framework

framework Jenni Tick, Aki Pulkkinen, Felix Lucka, Robert Ellwood, Ben T. Cox, Jari P. Kaipio, Simon R. Arridge, and Tanja Tarvainen Department of Applied Physics, University of Eastern Finland, P.O. Box 1627, 70211 Kuopio, Finland Centrum Wiskunde and Informatica, P.O. Box 94079, 1090 GB Amsterdam, Netherlands Department of Medical Physics and Biomedical Engineering, University College London, Gower Street, London, WC1E 6BT, United Kingdom Dodd-Walls Centre, Department of Mathematics, University of Auckland, Private Bag 92019, Auckland Mail Centre, Auckland 1142, New Zealand Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom


I. INTRODUCTION
Photoacoustic tomography (PAT) is a hybrid imaging modality that combines optical excitation with ultrasonic detection. [1][2][3][4][5][6][7] This allows for both high contrast and high resolution to be achieved simultaneously. PAT is a non-ionizing and non-invasive imaging technique, and it can provide structural, functional, and molecular information. 2,3 These features make PAT an attractive imaging modality, and it has shown potential in a variety of biomedical applications. 3,[5][6][7] In PAT, a short (nanosecond scale) pulse of visible or near-infrared light is used to illuminate the tissue region of interest. As light propagates within the object, it is absorbed leading to localized (weak) increases in pressure and generation of a pressure wave. The propagated pressure waves are measured on the surface of the object by ultrasound sensors.
The inverse problem of PAT is widely studied and a variety of reconstruction methods for the estimation of the initial pressure distribution have been developed. 8 A widely utilized method for photoacoustic image formation is the backprojection algorithm. [9][10][11][12] This algorithm is based on analytical inversion formulas for an approximate problem. In the approach, the initial pressure is reconstructed by summing up the backprojected measured pressure signals with appropriate time delays. The eigenfunction expansion method 13,14 is another approximate problem based method and it aims to solve the image reconstruction problem analytically as well. In this method, the initial pressure is obtained as the series solution and series coefficients are calculated from measured pressure signals. However, both of these methods are limited to specific geometries such as spherical, cylindrical, and planar acoustic detection surfaces.
On the other hand, time reversal, [15][16][17][18] penalized least squares, [19][20][21][22][23][24][25][26][27][28] and Bayesian approaches 29 utilize the numerical solution of the problem. These approaches are computationally more intensive, as the wavefield within the entire domain needs to be computed. On the other hand, they allow performing image reconstruction in more general imaging scenarios than the backprojection and eigenfunction expansion algorithms. Furthermore, they can incorporate acoustic heterogeneities such as variations in speed of sound and acoustic attenuation. 17,18,27,[30][31][32] Time reversal algorithms perform image reconstruction by simulating the propagation of the time-reversed measured signals back into the volume. Approaches based on penalized least squares perform image reconstruction by minimizing the sum of the misfit between measured signals and signals simulated by a photoacoustic forward model and a regularizing penalty functional.
Recently, a Bayesian approach to PAT was suggested. 29 In this approach, all parameters are modeled as random variables and the formal solution of the inverse problem consists of a probability density for the initial pressure in each voxel of the reconstruction domain. It combines the information obtained through the measurements, the forward model, and the prior model for unknown parameters. In addition, the Bayesian approach facilitates representing and taking into account the uncertainties in parameters, models, and geometries. [33][34][35][36][37] In Ref. 29, the Bayesian approach to PAT was tested with two dimensional (2D) simulations. The results showed that the Bayesian approach can be used to provide accurate estimates of the initial pressure distribution as well as information about the uncertainty of the estimates.
In this paper, photoacoustic image reconstruction with uncertainty quantification in the Bayesian framework is extended to three dimensions (three dimensional, 3D). Due to the large dimension of the problem, the closed form matrix presentation that was used in the 2D case can no longer be applied. Therefore, a matrix-free method is used to compute point estimates of the posterior distribution. The method utilizes the adjoint of the forward operator 27 implemented with the k-space time domain method. 38 The point estimates for the image reconstruction and credibility evaluation are computed iteratively using a biconjugate gradient stabilized 39 method.
The rest of the paper is organized as follows. After a short introduction to PAT and Bayesian inversion in Sec. II, the implementation of 3D matrix-free image reconstruction and uncertainty quantification are described in Sec. III. Then, the approach is tested with numerical simulations in Sec. IV and experimental studies in Sec. V. Finally, a discussion of the results and drawn conclusions are given in Sec. VI.

A. Photoacoustic model
In a linear and homogeneous medium, the acoustic part of the photoacoustic signal generation can be described by a wave equation where c is the speed of sound and p 0 is the initial pressure distribution. 1 In photoacoustics, the acoustic pressure wave p is measured only in some locations r L 2 @X, for some time period t 2 [0,T], with X being a spatial subset of the modeling domain, and T being the time duration that the photoacoustic time series is captured for. In practice, the measured pressure waves are polluted with noise, which is commonly assumed to be additive. To perform PAT image reconstruction, we can use the following discrete observation model: where p t 2 R m is a vector composed of the acoustic pressure waves sampled at the sensors at a set of discrete time points, p 0 2 R n is the discrete initial pressure distribution, K 2 R mÂn is the linear operator, which maps the initial pressure distribution to the measurable data by discretizing the forward model (1), and e 2 R m denotes the noise.

B. Image reconstruction with uncertainty quantification
In the inverse problem of PAT, the initial pressure distribution p 0 is estimated from the measured pressure waves p t based on Eq. (2). Here, a Bayesian approach 33,40 is taken.
In the Bayesian approach, the parameters p t , p 0 , and e of the observation model [Eq. (2)] are treated as random variables. The solution of the inverse problem is the posterior density, which can be written in the form where p(p 0 ) is the prior probability density and pðp t jp 0 Þ is the likelihood density. The posterior density reflects the uncertainty of the unknown initial pressure distribution p 0 given the acoustic pressure measurements p t . Noise and prior distribution are assumed to be mutually independent and normally distributed, i.e., e $ N ðg e ; C e Þ and p 0 $ N ðg p 0 ; C p 0 Þ where g e and C e are the mean vector and covariance matrix of the noise, respectively, and g p 0 and C p 0 are the mean vector and covariance matrix of the prior model, respectively. These assumptions are typical for many imaging modalities, including PAT. With these assumptions and observation model [Eq. (2)], the posterior density [Eq.
(3)] becomes 29 where L T e L e ¼ C À1 e and L T p 0 L p 0 ¼ C À1 p 0 are matrix square roots such as Cholesky decompositions of the inverse covariance matrices of the noise and prior, respectively.
In the case of a linear observation model and Gaussian noise and prior, the posterior density [Eq. (4)] is also a Gaussian distribution N ðg p 0 jp t ; C p 0 jp t Þ. The mean g p 0 jp t and covariance C p 0 jp t can formally be written in the form where Instead of solving the whole posterior distribution directly using Eqs. (5)-(8) it can be evaluated by computing point estimates. In this paper, a maximum a posteriori (MAP) estimate is considered. In a purely Gaussian case, the MAP estimate coincides with the (conditional) mean of the posterior distribution p 0;MAP ¼ g p 0 jp t . Furthermore, in the Bayesian approach, also the reliability of the reconstructed image can be assessed by computing uncertainty measures of the estimates. Here, the marginal densities of the posterior distribution in some individual voxel are computed. Since the joint density is a Gaussian, all marginal densities are Gaussian p 0;k jp t $ N ðg p 0 jp t ;k ; C p 0 jp t ;kk Þ; where g p 0 jp t ;k is the value of g p 0 jp t in the kth voxel and C p 0 jp t ;kk is the value of the kth diagonal element of C p 0 jp t .

III. IMPLEMENTATION
A. Numerical method for wave propagation In this paper, the k-space time domain method implemented in the k-Wave MATLAB Toolbox is used to solve the initial value problem [Eq. (1)]. 38,41 In the k-space method, the spectral calculation of spatial derivatives is combined with a temporal propagator expressed in the spatial frequency domain or k-space. This allows field gradients to be calculated efficiently using the fast Fourier transform. Therefore, the k-space method enables a computationally efficient way to solve the initial value problem.

B. Matrix-free implementation of the image reconstruction
In this paper, the reconstructed image is obtained by computing the MAP estimate, Eq. (5). This is equivalent to solving a linear system where and I is an identity matrix. The advantage of expressing Eq. (5) in the form of Eq. (10) is that the inversion of the covariance matrix of the prior C p 0 can be avoided, and that efficient iterative linear equation solvers can be utilized. Due to the large dimension of the problem, a matrix-free method is used to solve the linear system. Here, a biconjugate gradient stabilized (l) method builtin MATLAB (2015b; The MathWorks, Inc., Natick, MA) is used for the solution of the system of Eq. (10). During the conjugate gradient iteration, the matrix-vector product on the left-hand side of Eq. (10) is computed by evaluation of sequential linear operators. First, the product Kg p 0 jp t is provided with the k-Wave MATLAB toolbox. 41 Second, the multiplication with C À1 e is trivial since it is a diagonal matrix (see Sec. III E). Third, the multiplication by the transpose of the forward model, K T , can be computed by solving an adjoint wave equation, which, again, can be implemented using a k-Wave as described in Ref. 27. Finally, the prior density is evaluated as described in Sec. III D, and the product with an identity matrix results in just a vector addition. The vector d in Eq. (10) is formed similarly.
Here, the biconjugate gradient solver is started with an initial guess chosen as where p 0,TR is a time reversal solution of the initial pressure distribution andâ is the solution of a minimization problem Although the optimality of this initial choice was not studied in this paper, it was verified with 2D simulations to converge to the correct minimum in a reasonable time.

C. Matrix-free implementation of the reliability estimation
The uncertainty of the reconstructed initial pressure is given by the posterior density. For the individual voxel, this is given by Eq. (9). Thus, for the kth voxel at r k the value of the kth diagonal element of C p 0 jp t needs to be determined. Due to the large dimension of the problem, the posterior covariance matrix cannot be explicitly constructed, but its kth column can be computed by solving the linear system where C is as in Eq. (11) and e k is a unit vector with value one at the kth element and zero elsewhere. The linear system [Eq. (15)] is solved using a biconjugate gradient stabilized (l) method built-in MATLAB. Again, as in the computation of the MAP estimate in Sec. III B, the matrix-vector product on the left-hand side of Eq. (15) is replaced by the evaluation of sequential linear operators during the biconjugate gradient iteration. Furthermore, the iterations are started from an initial guess where r p 0 is the standard deviation of the prior, r e is the standard deviation of the noise, N is as in Eq. (18), and e k is a unit vector. This choice was verified to converge to the correct minimum using 2D simulations.

D. Prior
In this paper, the prior model for the unknown initial pressure p 0 was chosen to be based on an Ornstein-Uhlenbeck 42 process. The Ornstein-Uhlenbeck prior supports the correlation between neighborhood voxels promoting distributions, which can be locally close to homogeneous. This prior can be assumed to be a good presumption for PAT where some spatial correlation in parameter values between the voxels can be expected and where it is possible that the target is composed of heterogeneities separated by sharper edges such as blood vessels. The Ornstein-Uhlenbeck prior is a Gaussian prior distributed with mean g p 0 and covariance matrix C p 0 , which is defined to be with where i and j are the voxel indices, r i and r j are the corresponding voxel locations, respectively, r 2 p 0 is the variance, and ' is the characteristic length scale, which controls the spatial range of correlation. Previously, the Ornstein-Uhlenbeck prior has been used in PAT and quantitative PAT in Refs. 43-47. Evaluation of a matrix-vector product set by Eqs. (17) and (18) in Eqs. (10)-(12) and (15) corresponds to computing a 3D convolution. On regular grids, such as those used in this work to discretize the pressure fields, convolutions can be efficiently evaluated by the fast Fourier transform. To simulate zero boundary conditions the vector that C p 0 is multiplied with is transformed into a 3D array and zero-padded at all ends of the coordinate axes before taking the 3D Fourier transform. Next, the array is pointwise multiplied by the corresponding zero-padded discrete Fourier transformed origin centered covariance function [Eq. (18)], and the inverse Fourier transform is taken. This allows for efficient evaluation of a covariance matrix-vector product. The length of the zero-padding is chosen such that the covariance function [Eq. (18)] will fall beneath a threshold level (10 À6 in this paper) within the padded distance. In voxels, this can be computed as where N is the padding distance, Dh is the discretization length of a voxel, and is the threshold value.

E. Determination of noise statistics
In the Bayesian approach with Gaussian assumptions, information about the noise statistics can be incorporated into the solution of the inverse problem in the form of the mean g e and covariance C e of the noise term e in Eq. (2). Typically, measurement setups are such that g e ¼ 0 and the covariance can be approximated as C e ¼ r 2 e I where I is an identity matrix. This model is also used in this paper.
The noise of the experimental data is characterized in each sensor by determining the mean and standard deviation from a time frame of the measured signal that is supposed to contain only noise. An alternative would be to perform additional calibration measurements by measuring pressure time courses on all sensors with the complete experimental setup in place but without firing the excitation laser. The mean g e,k and standard deviation r e,k for each sensor are estimated as where p k,i is a pressure signal in the kth sensor at the ith time point and N t is the number of time points in the time frame windowed for the noise characterization. Thus, the mean of the noise is a vector where n s is the number of sensors, and the covariance matrix of the noise is a diagonal matrix with the values of variance r 2 k on the diagonal C e ¼ diagfr 2 1 …r 2 n s g: IV. SIMULATIONS

A. Geometry and discretization
In simulations, a three dimensional cubic domain with the side length of 10 mm was considered. Discretizations of the domain in data simulation and image reconstruction are given in Table I. Different discretizations in data simulation and image reconstruction were used in order to avoid socalled inverse crime.
In this paper, a full view sensor geometry and two limited view sensor geometries were considered. The sensor geometries are illustrated in Fig. 1. In the full view (6-side) sensor geometry, 62 119 sensors were located on all faces of the cube. In the first limited view sensor geometry (L-shape), 20 808 sensors were located on 2 adjacent faces (z ¼ 5 mm and x ¼ 5 mm) of the cube. In the second limited view setup (1side), 10 404 sensors were located only on 1 side (z ¼ 5 mm) of the cube. The sensor pitch was 98 lm. This type of measurement setup simulates a Fabry-P erot based sensor head; see, e.g., Refs. 48 and 49.

B. Data simulation
In the data simulation, a non-attenuating medium with a constant speed of sound c ¼ 1500 m/s was considered. The true simulated initial pressure distribution contained nine spheres of 1.43 mm radius on a homogenous background. Eight spheres were located close to the corners of the cube and one sphere was located in the center. The ambient initial pressure was set to zero and the initial pressures within inhomogeneities were Gaussian functions with a peak value of 5 and a full-width at half-maximum of 0.4 mm. Figure 2 shows the simulated initial pressure distribution. This kind of positioning and values of initial pressure distribution could approximately correspond to a photoacoustic phantom colored with ink and emerged in water. Due to its symmetric structure, it can be beneficial, for example, in examining limited-view artefacts.
The data were simulated using the k-space time domain method implemented with the k-Wave MATLAB toolbox 41 as described in Sec. III A. The pressure signals were sampled at 60 MHz and discretized into 849 temporal points at each of the acoustic sensor locations. This corresponds to a recorded temporal pressure time series duration of 14.1 ls, corresponding to an acoustical propagation distance of 21 mm. Normally distributed zero-mean noise with a standard deviation equal to 1% of the peak amplitude of the simulated pressure signal was added to the data to simulate measurement noise. This noise statistics correspond to the noise achievable with a Fabry-P erot based sensor head used in the experiments. 48

C. Image reconstruction and posterior uncertainty
The MAP estimates were computed iteratively as described in Sec. III B by solving the system of equations (10)- (12). The initial guess for the iterations was chosen as in Eq. (13). The measurement noise was considered to be uncorrelated with the standard deviation set to 1% of the peak positive amplitude of the noisy simulated data. The Ornstein-Uhlenbeck prior described in Sec. III D was used as prior information. The values of the noise and prior parameters used in the reconstructions are given in Table II both for the simulation result (Sec. IV D) and the two experimental cases (Sec. V). For comparison, a time reversal solution was computed. 18,41 To make the comparison easier, the time reversal solution was scaled with factorâ in Eq. (14). The computations were performed on a graphics processing unit (GPU).
Accuracy of the estimates was evaluated by computing the relative errors of the reconstructions with respect to the true initial pressure distribution where p 0 is the simulated initial pressure distribution interpolated to the reconstruction space andp 0 is the estimated value.
The reliability of the estimates was assessed by calculating the marginal densities inside the domain using Eq. (9). Variance C p 0 jp t ;kk needed for the marginal density was computed iteratively as described in Sec. III C by solving the linear system [Eq. (15)]. The initial guess for iterations was chosen as in Eq. (16). The computations were performed on a GPU. Figure 3 shows reconstructions obtained using the Bayesian approach, whereas reconstructions using time reversal are shown in Fig. 4. It can be seen that both Bayesian and time reversal reconstructions obtained using the full view sensor geometry match qualitatively to the true initial pressure distribution. As the number of the detection surfaces is reduced, the estimates of the initial pressure become more distorted in the areas far from any sensor. The distortion of the estimates increases the further the inclusions are from the sensors. In addition, quantitative values in these distorted areas are reduced. This is evident especially in the bottom rows of Figs. 3 and 4. The relative errors listed in Table III support these observations. The relative errors of the estimates obtained using the full view sensor geometry have the smallest values, and the errors increase as the number of detection surfaces decreases. Based on the reconstructions and relative errors, it can be seen that the Bayesian approach tolerates limited view artefacts better than the time reversal. However, it should be noted that the quality of the reconstructions obtained using time reversal can be improved by using an enhanced version of time reversal such as iterative time reversal. 27  Values of the noise and prior parameters: mean g, standard deviation r, and characteristic length scale ' (mm). The subindex e refers to noise, whereas the subindex p 0 refers to prior. In the case of real data measurements, the two values represent the ranges of the noise parameters at different measurement channels.

Spheres
Leaf Mouse In the Bayesian approach, uncertainties of the reconstructed images can also be assessed. Figure 5 shows the marginal densities at the point inside of the domain that is indicated in Fig. 2 with an asterisk. In the full view sensor geometry, the maximum of the marginal density is located close to true value. In the limited view sensor geometries, the maximum of marginal density is further from the true value. However, the marginal density is wider in the limited view sensor geometries than in the full view sensor geometry. Therefore, true value is also supported by marginal densities when the limited view sensor geometries are used. The wider marginal density in the case of the limited view sensor geometry indicates that the uncertainty of the estimate obtained using the full view sensor geometry is smaller than the uncertainty of the estimate obtained using the limited view, as should be expected due to the fact that a lesser number of limited view sensors carry less information.

A. Measurement setup
The approach was tested with experimental data obtained from a phantom and a mouse. The phantom was a skeletal leaf that was submerged in India ink for contrast enhancement. A photograph of the phantom is shown in Fig. 6.
In addition, photoacoustic data from a mouse head were acquired.
The data were acquired using a photoacoustic measurement system 48,49 developed in the Photoacoustic Imaging Group of University College London. In the phantom measurement, a neodymium-doped yttrium aluminum garnet (Nd:YAG) laser (8-ns pulse length) operating at 1064 nm was used to illuminate the imaged object. In the mouse imaging, the illumination was done by two Nd-YAG pumped optical parametric oscillators (pulse widths of 8 ns and 6 ns) that were configured to a wavelength of 755 nm. The emitted photoacoustic signals were recorded using a planar Fabry-P erot sensor (a nominal À3 dB bandwidth of 39 MHz). In addition, the leaf phantom was also imaged using an orthogonal Fabry-P erot sensor. In the measurements, an area of approximately 10 mm Â 10 mm on the sensors was scanned with a step size of 100 lm. The phantom and the mouse head were coupled to the sensor using FIG. 3. (Color online) The reconstructed initial pressure distribution obtained using the Bayesian approach. From top to bottom: the reconstructed image obtained using full view sensor geometry (first row), the reconstructed image obtained using L-shape sensor geometry (second row), and the reconstructed image obtained using 1-side sensor geometry (third row). The left image shows the contour surface that indicates the areas where the parameter has value 1 or more. The three images on the right represent maximum intensity projections along axis directions x, y, and z.

FIG. 4. (Color online)
The reconstructed initial pressure distribution obtained using the time reversal. From top to bottom: the reconstructed image obtained using full view sensor geometry (first row), the reconstructed image obtained using L-shape sensor geometry (second row), and the reconstructed image obtained using 1-side sensor geometry (third row). The left image shows the contour surface that indicates the areas where the parameter has value 1 or more. The three images on the right represent maximum intensity projections along axis directions x, y, and z. deionized water. More details on the phantom, experimental setup, and measurements can be found in Ref. 49.

B. Image reconstruction
Before reconstructions, the measured pressure signals were filtered using a bandpass filter with cutoff frequencies between 0.5 and 20 MHz to remove nuisance signal components such as a rising trend of the measured pressure signals. Bandpass filtering was also taken into account in the forward model. Image reconstruction was performed using the Bayesian approach as described in Sec. III B by solving the system of equations (10)- (12). The initial guess for the iterations was chosen as in Eq. (13). The reconstruction domains were rectangular volumes whose discretizations are listed in Table IV. In the reconstructions, the sound speed of the medium was set to 1488 m/s. The noise statistics were determined from the measured noise signal as described in Sec. III E using 80 time points. Obtained mean and standard deviation of the noise, as well as the chosen values of the prior parameters, are listed in Table II. Since the measurement system was not calibrated to measure absolute pressure values and quantitative prior information on the phantom was not available, units of the noise and prior parameters were considered as arbitrary units. Furthermore, due to these same reasons only MAP estimates without uncertainty evaluation were considered. Again, a time reversal solution was computed for comparison.

C. Results
The contour surfaces of the reconstructed images obtained from the leaf phantom data using the Bayesian and time reversal approaches are presented in the first row of Fig. 7 for the planar sensor and in the second row for the orthogonal sensor. Correspondingly, the maximum intensity projections of the reconstructions are shown in Figs. 8 and 9. As it can be seen, only the veins that run parallel to the sensor are recovered well when the planar sensor is used. The  orthogonal sensor gives a more complete reconstruction of the vein-like structure of the leaf, since the majority of the veins running in both orientations are recovered well. In particular, the veins that are close to the sensor appear sharp with both sensors. As it can be seen in these visualizations, the reconstructions look visually equally good for both the Bayesian approach and time reversal. The contour surfaces of the reconstructed images obtained from the mouse head data using the Bayesian and time reversal approaches are presented in Fig. 10, and the maximum intensity projections of these reconstructions are shown in Fig. 11. It can be seen that some of the vascular and anatomical structures can be identified from the reconstructions. When comparing the reconstruction obtained using the Bayesian approach to the reconstruction obtained using time reversal, it can be seen that the main features of the reconstructed images are the same.

VI. DISCUSSION AND CONCLUSIONS
In this paper, the recently proposed Bayesian approach to PAT was extended to three dimensions and a matrix-free method for the solution of this approach was described. Image reconstruction and uncertainty quantification were performed iteratively using a biconjugate gradient stabilized method equipped with the adjoint of the acoustic forward operator. The approach was tested using both simulated and experimental data with different sensor geometries. The reconstructions were compared to time reversal solutions.  The simulations show that the reconstructed images computed using the proposed approach can provide both qualitative and quantitative information about the targets in terms of their location, size, shape, and initial pressure values if the full view sensor geometry is used. In the limited view sensor geometry, distortion of the target size and shape can be noted. Furthermore, the quantitative accuracy is reduced. In addition to the reconstructed images, the uncertainty of these images can be assessed in the Bayesian approach. The uncertainty of the estimates obtained using the full view sensor geometry is small. The uncertainties of the estimates increase as the number of detection edges decreases.
Image reconstruction was also studied using experimental data. The reconstructed images represent the features of the imaged object, and the results compare well with time reversal reconstructions. It seems that in the case of the leaf phantom, the Bayesian approach is able to detect structures deeper than the time reversal. On the other hand, in the case of the mouse head, some differences between the reconstructions obtained with the Bayesian approach and time reversal can be seen. In this case, both reconstructions include stronger modeling errors since the homogeneous wave equation does not model wave propagation correctly in the heterogeneous mouse head. In addition, bones of the mouse head should be modeled as elastic media. These modeling errors can cause various artefacts in the reconstructed images. For example, any reverberations present in the data caused by acoustic heterogeneities can be projected deep in the tissue as an incorrect initial pressure distribution. For more information on sound propagation, simulation, and photoacoustic imaging in elastic media, see, e.g., Refs. 52-58. The quantitative values of the experimental phantom could not be studied since the measurement system was not calibrated to measure absolute pressure values and quantitative prior information on the phantom was not available.
If compared to time reversal, the Bayesian approach is computationally more expensive since it requires solving a large system of equations. In this work, this system of equation was solved iteratively in a matrix-free form. The computation times for the MAP estimates varied between 1 and 20 h depending on the detector geometry. However, the convergence criteria of the algorithms were set very tight, which lead to long computation times, and in practice it may be possible to relax this condition. This would lead to less accurate estimates especially in the areas that are not enclosed by the sensors and increasing values of the relative errors. Convergence of the algorithm, when the marginal densities of the posterior covariance were solved, was slow. In fact, the residual remained quite large, which we believe is related to slow convergence of the cross-covariance values. The standard deviation values, on the other hand, seemed to converge and are reasonable when compared to each other and the results of 2D simulations. Thus, computational efficiency of the algorithms still needs to be improved and their converge needs to be studied in more detail. Further, it could be possible to utilize a model reduction approach, for example, using Bayesian approximation error modeling, 34,35,37,59,60 to decrease the memory requirements and speed up the computations. On the other hand, the computational cost of the FIG. 10. The contour surface of the reconstructed image obtained from the mouse head data using the Bayesian approach (first column) and time reversal (second column).
FIG. 11. (Color online) Photoacoustic images of the mouse head. From top to bottom: the reconstructed image obtained using the Bayesian approach (first row) and the reconstructed image obtained using time reversal (second row). Images represent maximum intensity projections along axis directions x, y, and z.
Bayesian approach can be justified with the quantitative information that can be provided with the approach. That is, the method can be used to provide a probability distribution with mean and standard deviation of the parameters of interest, i.e., initial pressure, in each voxel of the domain. In addition, the Bayesian approach is advantageous when the uncertainty of the image reconstruction grows, e.g., with less sensors, more limited-view sensor geometry, more modelmismatch, since it can take into account uncertainties in parameters, models, and geometries.

ACKNOWLEDGMENTS
This work has been supported by the Academy of Finland (Project Nos. 286247, 314411, and 312342 Centre of Excellence of Inverse Modelling and Imaging), Instrumentarium Science Foundation, Saastamoinen Foundation, and Jane and Aatos Erkko foundation. F.L. acknowledges financial support from EPSRC project EP/ K009745/1 "Dynamic High Resolution Photoacoustic Tomography System" and the Netherlands Organization for Scientific Research (NWO), Project No. 613.009.106/2383. In addition, we gratefully acknowledge the support of NVIDIA Corporation (Santa Clara, CA) with the donation of Tesla K40 GPU used for this research.