TomoSim: a tomographic simulator for DOAS

TomoSim comes as part of project ATMOS, a miniaturised DOAS tomographic atmospheric 1 evaluation device, designed to fit a small drone. During the development of the project, it became 2 necessary to write a simulation tool for system validation. TomoSim is the answer to this problem. 3 The software has two main goals: to mathematically validate the tomographic acquisition method; 4 and to allow some adjustments to the system before reaching final product stages. This measurement 5 strategy was based on a drone performing a sequential trajectory and gathering projections arranged 6 in fan beams, before using some classical tomographic methods to reconstruct a spectral image. 7 The team tested three different reconstruction algorithms, all of which were able to produce an 8 image, validating the team’s initial assumptions regarding the trajectory and acquisition strategy. All 9 algorithms were assessed on their computational performance and their ability for reconstructing 10 spectral "images", using two phantoms, one of which custom made for this purpose. In the end, the 11 team was also able to uncover certain limitations of the TomoSim approach that should be addressed 12 before the final stages of the system. 13

theoretical background with which the paper was built; Section 4 describes the design, the rationale 48 behind it and the technical choices that have been taken in the making of TomoSim; in Section 2 there 49 is a description of the results that were obtained through the simulator, including reconstructions, 50 running times and Section 5 is dedicated to the conclusions that were taken using this piece of software, 51 as well as some foreseeable future developments.  Fundamentally, it is an absorption spectroscopy technique, therefore based on Lambert-Beer's law. 55 This law was actually first formulated by Pierre Bouguer in 1729. At the time, he wrote that "in a 56 medium of uniform transparency, the light remaining in a collimated beam is an exponential function 57 of the length of the path in the medium" [? ]. This theory can thus be written as in Equation 1. 58 I(λ) = I 0 (λ) · exp(−σ(λ) · c · L) (1) In which I 0 (λ) is the source intensity of a light beam, I(λ) the intensity of the light that reaches the 59 detector, σ is the absorption cross-section of the chemical compound being measured, c its concentration 60 and L the optical path of the light. Finally, λ is the wavelength of the radiation.
In the laboratory, this equation can be (and is) used directly and with few obstacles, since there 64 are very few uncontroled variables, and that which exists is controlled for. In the open atmosphere, Scattered sunlight DOAS systems have their own specific particularities and special properties.

77
For instance, the optical path is unknown. It does not use direct sunlight, so there was scattering, but 78 exactly where this takes place cannot be determined a priori. Scattering also implies that there is a 79 fraction of the source's light that does not reach the detector. Besides these, one must account for all the 80 other passive DOAS common effects, such as the fact that there are many absorbers that have spectral In Equation 3, there is more than one absorber, which is denoted by index i in the sum. A(λ, . . .) 86 denotes the fraction of light that gets scattered into the detector, M and R are Mie and Rayleigh 87 scattering coefficients, and the integral is performed on the whole optical path, s.

88
Typically, we measure a trace gas's atmospheric contribution by its total column. This quantity is 89 essentially the integral of the compound's number density, in molecules/cm 3 , over a column that goes Where I re f and A re f are, respectively, the reference light intensity and reference scattered light ].

126
Tomography has had its mathematical basis set by Johannes Radon, a German mathematician that 127 proved that it is possible to represent a function in R in the space of straight lines L through its line 128 integrals. In the tomographic case, these integrals represent a measurement on a ray that is traversing 129 the field of analysis. Each set of line integrals (rays), characterised by a given projection angle, is called to parametrise a ray of light, with the ray being written (in two dimensions): In Equation 7, X 1 and Y 1 are the coordinates for the entry point (of the ray in the analysis field) 140 and X 2 and Y 2 are the exit point coordinates. α is the parametric value. If the ray is totally contained 141 within the field of analysis, this value varies between 0 and 1; otherwise, it has its minimum at the 142 entry point and maximum at the exit.

143
The parametrical representation of the line integral, allows one to recursively calculate all 144 intersections between the ray and the grid defined by the orthogonal lines described above. The 145 differences between intersection points render the lengths of each ray contained within each pixel. The   The overarching division between tomographic reconstruction algorithms is on the level of their 165 nature, which can be analytical or algebraic (iterative). Other subdivisions come from the geometry 166 and the type of technology used for the particular application on which reconstruction is being run. In 167 medical imaging, the most common analytical method is the Filtered BackProjection algorithm (FBP).

168
FBP is based on Fourier's Slice Theorem, which states that the one dimensional Fourier Transform projections can be thought of as a matrix, called sinogram, as has been introduced in this same section.

186
In this matrix, the lines refer to the projection number, and the columns deal with the detectors (for can also be thought of as a matrix, in which each pixel has a given value, which gives it its intensity 189 (and/or colour). Finally, there is the system matrix, which is the matrix that contains the lengths of MLEM algorithms were first published in the medical imaging community in 1982, by Shepp and 213 Vardi [? ]. With this algorithm, image corrections are ruled by Equation 10, which also iterates over k.  These studies and more are addressed in another paper, which should be submitted shortly, and 246 in which the authors have conducted a deeper and more systematic literature review on the subject.

248
This section presents, analyses and discusses results obtained by the application of the techniques 249 and methods described in the previous two sections, i.e., the tomographic reconstruction of the 250 phantoms which were also presented in Section 4. Sinogram examples: the new spectral phantom projection data at a projection interval of 1 degree. On the left, the projection data before resorting; on the right, the parallel projection data obtained after resorting the fan-beam line integrals.

259
Images corresponding to the trace gas distribution within the ROI were reconstructed using 260 iterative and analytical methods. In Figure 3, one can see the reconstruction results for the three tested 261 methods when applied to the new spectral phantom; Figure 4 shows the graphical representation of 262 the reconstruction errors for the spectral phantom and is accompanied by Table 1  ∆. This is, also as expected, confirmed by the upward trend of the error when increasing the value 280 of the angular projection interval, as can be seen numerically in Table 1 and qualitatively in Figure 5. algorithm in this study (SART is also geometry independent, but this particular implementation 290 expects a parallel projection sinogram as input).

291
As stated in section 4.5, three different kinds of error influence the reconstruction results:

301
The three methods were also evaluated as to how they perform computationally, by measuring 302 the time it took to produce the images in Figure 3 using a Paperspace P4000 cloud computing instance.

303
In this regard, the fastest method was FBP, which took around 3 seconds to reconstruct. The second 304 was MLEM, with around 50 seconds for 1000 iterations, and finally came SART, with 1 minute and 50 305 seconds for 1 iteration. One relevant observation comes from the fact that MLEM was significantly 306 faster than SART, even taking into account the difference in optimisation, which was not an expected 307 result and may indicate some reconstruction enhancing technique on SART's side, as the literature 308 seems to indicate that this technique is faster than MLEM [? ].

309
All things considered, the FBP algorithm produces a very good reconstruction, equivalent to 310 SART's, while being more than 10 times faster, indicating that for this kind of application and with this 311 kind of projection information, it is the best reconstruction algorithm. TomoSim is a simulation platform for a drone-mounted atmospheric monitoring system based 315 on DOAS. Although the physical device has not yet been assembled, the team has already compiled 316 a final (or very close) design, which is schematically represented in Figure 6. The reasoning behind the custom design was to increase the maximum payload and allow longer flight times. The team 318 chose to use a DJI S900 frame (hexacopter), with custom-made 368 mm carbon fibre arms, longer and 319 lighter than the original. The increased empty space allows the replacement of the default propellers 320 by 17" carbon fibre units, coupled to 6 E1200 motors. This propeller-motor configuration is not only 321 significantly more powerful than the default assembly, but are also more efficient. According to the    1). When the 2 nd moment is complete, the system has a set of 359 fan beam distributed spectra, which can be equated to projections in a tomography problem. The angular interval between the points on which the drone stops is denoted ∆. The (not represented) angular interval between rays within the fan is, in this particular experiment, also equal to ∆ for reconstruction purposes. Image constructed using Google Maps in 2019.

361
In medicine, a phantom is a model that emulates certain properties of human or animal tissue.

362
Researchers use these models to evaluate therapeutic or diagnostic methods. In the imaging field, 363 phantoms are known matrices with a given size that were designed to mimic the types of bodies that  Table 2.

375
During simulation, a phantom is totally contained within the ROI and some Gaussian noise is 376 added to it. The phantom shares the same grid as the discretised ROI and each pixel has a value

381
Any tomographic reconstruction requires the previous and detailed knowledge of the ray 382 geometry of the problem. This implies that the space being reconstructed is discretised, so that it can 383 be addressed through computational routines. In this case, the discretisation consists in overlaying 384 a 100 x 100 pixel grid (10m square pixels, considering a 1km diameter circular drone trajectory). By 385 applying Siddon's algorithm to this geometry, the lengths that each ray traverse in each pixel of the 386 grid are retrieved, assembling the system matrix. The system matrix is a complete description of the  with reference to this simulator, f is the original image and g is the reconstructed image.

422
The initial goal of the TomoSim software project was to develop a simulation platform with which 423 to recreate the tomographic reconstruction of the column density distribution for a number of target 424 atmoshperic trace gases.

425
The software program was written using the Python language and some numeric calculation 426 libraries, such as NumPy and SciPy. Using these two libraries had two main effects: on the one hand, it 427 enabled the programmers to easily create and manipulate matrices and vectors (images, for instance), 428 and on the other, they greatly improved the running speed of the code, since their core is written in 429 lower level languages (namely C).

430
The simulations that the software performs prove that, if the final device is programmed to 431 comply to trajectory and acquisition requirements, reconstruction is perfectly achievable, even with 432 relatively low projection numbers (comparing with medical imaging procedures). This brings another 433 significant conclusion which is that the devised acquisition definitions, which produce a set of fan 434 beam arrays, provide sufficient projection information to run the reconstruction and achieve plausible 435 results.

436
TomoSim runs three algorithms on the projection data in order to produce the spectral mapping be directly compared to the SART algorithm, due to differences in the optimisation levels of both for t (which correspond to P 1 and P 2 ). Selection is made by determining the returned value of t that 480 maximises the euclidean distance between the produced point and P 1 .  Figure A2 is a graphical representation for the reasoning behind the geometric error estimation.

483
There are two types of error in this image: the ones that come from the RTK-GPS system (positioning 484 error, denoted p ) and the ones that come from the gimbal (pointing error, denoted γ ). TomoSim 485 considers these errors to be normally distributed, and the two values correspond to their standard 486 deviation. To introduce the error into the simulation, the software calculates the theoretical P 1 from the 487 β and D values (see Figure A1) and then adds a normally distributed random number that respects 488 p , retrieving the true P 1 . This new point is used to draw the theoretical line P 1 P 2 and the pointing 489 error is added using the same process as in P 1 . Given the very low gimbal error, the small angles 490 approximation (sin θ = θ) is used to determine the theoretical value of P 2 , on the drone's circular 491 trajectory. Finally, the software adds again the positioning error, in the same manner as it had on P 1 .

492
As a finishing remark, it is important to note that in Figure A2, all errors are extremely exaggerated as 493 they would not be visible otherwise, due to the huge size difference between them and the trajectory.

495
As stated in Section 4, the simulation uses a phantom matrix to perform its reconstructions.

496
However, this is not purely the input to the simulator routine. Some transforms, introducing random 497 variation, are run beforehand.

498
As described in Figure A3, the simulator starts by loading the phantom data. This is a 300 by 499 300 pixels image, which is created by running the TomoPhantom [? ] software. This matrix, which 500 is parametrically stored, is always the same. The data it contains are 64 bit floating point numbers, 501 ranging from 0 to 1 (0 is black; 1 is white). This image is then normalised to range between 0 and 255, 502 and its data type is changed to unsigned 8 bit integers.

503
The extremes of concentration values are randomly taken from a uniform distribution between 504 1e15 and 1e17, rendering min_val and max_val. These two values will be used as concentration limits 505 for the phantom matrix, with min_val representing the baseline (0 in the phantom matrix) and max_val 506 representing the maximum value in the ROI (255 in the phantom matrix).

507
The projection operator which geometrically describes the projection system is applied to the 508 phantom matrix, rendering the sinogram. Before reconstruction, the program adds poissonian noise to 509 this matrix.  Figure A3. The data flowchart of the simulation routine. The phantom is entered to the routine as a fixed matrix, but is transformed so that each time the program runs, the reconstruction is randomly different and more in line with what happens in nature.