Systematic Comparison of Vectorial Spherical Radiative Transfer Models in Limb Scattering Geometry

A comprehensive inter-comparison of seven radiative transfer models in the limb scattering geometry has been performed. Every model is capable of accounting for polarisation within a fully ::::::::: polarization :::::: within :: a spherical atmosphere. Three models (GSLS, SASKTRAN-HR, and SCIATRAN) are deterministic, and four models (MYSTIC, SASKTRAN-MC, Siro, and SMART-G) are statistical using the Monte Carlo technique. A wide variety of test cases encompassing different atmospheric conditions, solar geometries, wavelengths, tangent altitudes, and Lambertian surface reflectances have been de5 fined and executed for every model. For the majority of conditions it was found that the models agree to better than 0.2% in the single scatter test cases and better than 1% in the multiple scatter scalar and vector test cases :::: scalar :::: and ::::::: vectorial :::: test :::: cases :::: with :::::::: multiple :::::::: scattering :::::::: included, with some larger differences noted at high values of surface reflectancein multiple scatter. For the first time in limb geometry, the effect of atmospheric refraction was compared among four models that support it (GSLS, SASKTRAN-HR, SCIATRAN, and SMART-G). Differences among most models in multiple scatter with :::: with 10 ::::::: multiple :::::::: scattering :::: and refraction enabled was less than 1%, with larger differences observed for some models. Overall the agreement among the models with and without refraction is better than has been previously reported in both scalar and vector ::::::: vectorial modes. Copyright statement.

analytic weighting functions calculated, and two-and three-dimensional atmospheres can be handled. The primary application of SASKTRAN has been as the forward model for retrievals of ozone (Degenstein et al., 2009), stratospheric aerosols (Bourassa et al., 2007), and nitrogen dioxide (Sioris et al., 2017) from the OSIRIS instrument. However, SASKTRAN has also been used in a variety of projects unrelated to OSIRIS. SASKTRAN has been adapted to process limb retrievals from other instruments, including stratospheric aerosols from SCIAMACHY measurements  and stratospheric ozone from OMPS-130 LP measurements (Zawada et al., 2018). SASKTRAN was also used to analyse data from an acousto-optical tuneable filter based instrument, the Aerosol Limb Imager (ALI, Elash et al. (2016)), which is conceptually similar to the VIS-NIR channel of ALTIUS.
SASKTRAN-HR has various options that control the accuracy of the solution, but the main one is the number of diffuse profiles, i.e. the number of discretizations used in solar zenith angle to compute the multiple scatter field. The model has 135 been configured to use the number of diffuse profiles required to obtain approximately 0.2 % accuracy as a function of solar conditions shown in Zawada et al. (2015).

SCIATRAN
The SCIATRAN software package provides tools for modelling radiative transfer processes in the ultraviolet to thermal infrared spectral range. A detailed review of available algorithms, selected comparisons, and applications is given by Rozanov et al. 140 (2014) : , ::: and ::::: more :::::::::: information :::: can :: be :::::: found :: at : https://www.iup.uni-bremen.de/sciatran/. SCIATRAN contains databases (or code modules) of optical properties and climatologies, and a set of engines to solve the RTE. The only solver capable of simulating the vector ::::::: vectorial radiance field is the Discrete Ordinates Method-Vector (DOM-V) solver. The DOM-V solver uses the discrete ordinates method to simulate the pseudo-spherical solution used to initialize the spherical integration.
In the limb viewing geometry SCIATRAN can operate in two modes, fully spherical and approximate spherical. In the 145 approximate spherical mode, the single scatter radiance is calculated accounting for the full sphericity of the Earth, while the multiple scattered signal is approximated by several pseudo-spherical calculations. In fully spherical mode, the approximate spherical solution is iterated in a fully spherical geometry to account for sphericity effects. The comparisons shown use :::::::::: SCIATRAN :::: uses the fully spherical mode of SCIATRAN :: in ::: the ::::: shown ::::::::::: comparisons.
SCIATRAN accounts for refractive effects and calculates approximate weighting functions. SCIATRAN has been used in 150 numerous applications spanning multiple research areas. One of SCIATRANs ::::::::::: SCIATRAN's : primary applications is its use as the forward model for SCIAMACHY limb scatter retrievals of including, but not limited to, ozone (Rozanov et al., 2007), water vapour (Rozanov et al., 2011b), and stratospheric aerosols (Von Savigny et al., 2015;Malinina et al., 2018). SCIATRAN has also been successfully used in the inversion of data products from the observations of other limb missions, including the retrievals of stratospheric aerosol from OSIRIS measurements , and ozone and stratospheric aerosol from as benchmark models and to study new effects. Statistical models are typically orders of magnitude slower than deterministic models and are thus usually not used in operational retrieval methods. They also contain inherent random noise, driven by the number of photons used in the simulation, which may need to be considered depending on the application.

SASKTRAN-MC
The SASKTRAN RTM contains a MC mode (SASKTRAN-MC) based upon the Siro algorithm (Oikarinen et al., 1999). The primary purpose of SASKTRAN-MC is to serve as a benchmark for the SASKTRAN-HR model.

190
Siro is capable of simulating radiances where the atmosphere varies three-dimensionally (not only in altitude). Siro is commonly used as a reference model for both studying limb scattered radiance and in comparisons with other RTMs. Oikarinen et al. (1999) used Siro to demonstrate the importance of multiple scattering for limb scatter instruments, and to simulate the effects of a three-dimensionally ::::::::::::::: two-dimensionally varying reflective surface (Oikarinen, 2002). In Oikarinen (2001) the effect of polarization on limb scatter radiance was assessed in detail using Siro. Siro played a key role in the RTM inter-comparison 195 study performed by Loughman et al. (2004) as one of the MC reference models.

SMART-G
SMART-G (Speed-up Monte-carlo Advanced Radiative Transfer code with GPU) is a radiative transfer solver for the coupled ocean-atmosphere system with a wavy interface (Ramon et al., 2019) or any surface spectral BRDF boundary condition. It is based on the MC technique, works in either plane-parallel or spherical-shell geometry, and accounts for polarization. The 200 vector ::::::: vectorial code is written in CUDA (Compute Unified Device Architecture) and runs on GPUs (Graphic Processing Units). Physical processes included in the current version of the code are elastic scattering, absorption, reflection, thermal emission, and refraction.
The radiances at any level of the domain can be estimated using the local estimate variance reduction method (Marchuk et al., 2013). Benchmark values are accurately reproduced for clear (Natraj and Hovenier, 2012) and cloudy atmospheres  While all four models listed above use the Monte Carlo technique, there is one difference :::: There :: is : a :::::: subtle :::::::: difference :: in ::: the :::: way :: the :::::::::: backwards ::::: Monte ::::: Carlo ::::::: method :: is ::::::::::: implemented that can be noticed in the subsequent comparisons. One option is to trace rays through the atmosphere, and calculate the scattering probability at each layer interface. This gives the possibility of photons not scattering and directly escaping the atmosphere, which is important for estimates of radiative fluxes (not directly applicable 215 for limb scatter measurements). A consequence of this is that at higher ::::: longer : wavelengths and higher tangent altitudes where the atmosphere is optically thin, a large number of photons are required to reduce the statistical noise to acceptable levels. In this study this technique is used by MYSTIC and SMART-G.
An alternative technique is to force every photon traced backwards from the observer to scatter. Random numbers are generated to determine the scatter location, not if scattering actually occurs. Photons can then be weighted by the optical 220 thickness to account for the probability of scattering. A benefit of this technique is that the number of photons required to hit a desired noise floor is more uniform in wavelength and altitude space, but the technique is more specific to limb scattering measurements. Siro and SASKTRAN-MC both use the same force-scatter technique :::::::: technique ::::: where ::::: every :::::: photon :::::: traced :: is ::::: forced :: to :::::: scatter.
SMART-G includes an option to force additional scattering in limb mode, but only for the first (single) scatter. The option 225 has a similar effect to the technique used by Siro/SASKTRAN-MC in that it reduces the variance of the calculation for optically thin scenarios, but it is not exactly equivalent.
Overall, all of the mentioned techniques solve the radiative transfer equation with the same level of accuracy. The only difference is the number of photons required to reach a desired level of precision. A more in-depth discussion of the computational efficiency of the different techniques for different scenarios is presented in Sec. 4.5.

Model Test Cases
Test cases are designed to explore the aspects of the RTMs that are applicable for past, present, and future satellite-based limbscattering measurements. All tests are performed for the following range of tangent altitudes, solar angles, surface reflectance, atmospheric constituent conditions, and wavelengths: -80 tangent altitudes from 0.5 km to 79.5 km (inclusive) with a spacing of 1 km.
-3 atmospheric constituent conditions: pure Rayleigh scattering, Rayleigh scattering and ozone absorption, and Rayleigh + stratospheric aerosol scattering and ozone absorption 240 -11 wavelengths provided in Table 2.
These test cases span the reasonable conditions that have been, or are currently, in use by operational limb scatter instruments for retrievals of typical atmospheric constituents.
One of the challenges in performing an inter-comparison of RTMs is ensuring that the inputs are the same across every model.
In this study, care was taken so that the input parameters were specified in a way that can be assimilated by every model in the study. Stratospheric aerosol is specified as a log-normal distribution of Mie scattering particles with a median radius of 80 nm and a mode width of 1.6, with scattering parameters (cross sections, Mueller matrices, and Legendre moments) calculated Table 1. Solar zenith angles, solar azimuth angles, and solar scattering angles used in the test cases.

A Note on Atmospheric Gridding
The test cases specify the atmospheric state parameters on a 1 km grid from 0 km to 100 km, but leave the interpolation scheme up to the individual RTM. The two standard choices either assume that the atmospheric state ::::::: (number ::::::: densities ::: of   various ::::::: species, :::::::::: temperature, :::: and ::::::: pressure) : varies linearly between grid points, or that the atmosphere consists of 1 km homogeneous layers where the atmospheric state parameters are constant. How the discretized atmospheric state maps to an effective continuous quantity is of particular importance since any retrieved quantity must be interpreted the same way.
The RTMs GSLS, SASKTRAN (HR and MC), and Siro assume linear interpolation and handle it through analytic methods.
All four models calculate optical depth exactly, assuming a linearly varying extinction (see Oikarinen et al., 1999;Loughman 280 et al., 2015, for more detail). For SASKTRAN-MC and Siro this is all that is required and the linear variation of the atmosphere is accounted for without approximation. GSLS and SASKTRAN-HR must make additional approximations in the calculation of the source function and we refer back to Loughman et al. (2015) and Zawada et al. (2015) respectively for the exact methods used.

Discussion and Results
The main challenge in interpreting, and attributing , differences between models is that the true answer is not known. All of the set of models that agree with each other to a level that cannot be attributed to a concrete difference in a single model.
For the single scattering test cases the level was determined to be 0.2 %, and 1 % for the multiple scattering test cases. Any RTM that is found to have disagreements above these levels is excluded from the MMM for the relevant test case, causing the models composing the MMM to vary between different test cases. the models in scalar or vector ::::::: vectorial mode. Therefore, for the purpose of brevity, no comparisons of strictly scalar mode are shown. It is understood that the results for the polarised ::::::: polarized : comparisons are equally applicable to scalar calculations.

Single Scatter
Simplifying assumptions made for the single scatter calculation are minimal, and thus we expect differences to be relatively small. We do not expect zero differences due to the various methods of gridding in the vertical dimension of the atmosphere. As 305 originally done in the limb geometry by Siro (Oikarinen et al., 1999) and discussed extensively in Loughman et al. (2004Loughman et al. ( , 2015, the calculation of optical depth may be performed analytically assuming an extinction that varies linearly in altitude. However, the integration of the source function cannot be performed the same way and an approximation must be made. For example, SASKTRAN-HR creates a second order spline of the source function across an integration cell (Zawada et al., 2015), but other models may assume a constant source function or do something more sophisticated. All of this is further complicated by any The excellent agreement in single scatter is expected due to the relative simplicity of the calculation. Fundamentally each RTM solves the single scatter problem in the same way with minimal assumptions, the primary purpose of this test is to ensure that the inputs to RTM are configured correctly. The agreement of 0.1-0.2 % here sets a baseline that differences above this 325 level in more complex test cases cannot be explained by differing treatment ::::::: different :::::::: treatments : of input data.

Multiple Scatter
Differences in multiple scatter ::::::: radiance :::: with :::::::: multiple :::::::: scattering ::::::: enabled : are expected to be larger than those seen in single scatter owing to the extra complexity of the radiative transfer problem. The discrete models must deal with discretizations of the multiple scattering source term and may also make fundamental approximations for the sake of computational speed.

330
Comparatively the statistical models employ a simpler technique.  Fig. 4 we see that differences between each RTM and the MMM (MYSTIC, SASKTRAN-HR, SASKTRAN-MC, SCI-ATRAN, and SMART-G) is usually less than 1 % with a few exceptions. Siro shows a low bias on the order of 1-4 % at 351 nm, which is most pronounced at low solar zenith angles. The bias is not present when the Lambertian surface albedo is set to 0 instead of 1. This bias was present in Loughman et al. (2004), however it was misattributed to a high bias in the other 335 RTMs as Siro was used as the reference model. Internal testing suggests that the difference may not be directly related to ground scattering, and instead is an error that compounds on each succesively higher order of scatter. The error is strongest near 351 nm and a surface albedo of 1 because this is the condition where higher orders of scatter have the largest contribution to the observed radiance.
GSLS at 351 nm has a distinct pattern in altitude with variations on the order of 1 % that are relatively consistent across solar 340 geometry. The exact cause of these variations are unknown but are likely caused by discretizations of the multiple scatter source field calculation, however as stated these variations are fairly small. At 675 nm GSLS shows significant deviations of up to 4 % depending on the solar geometry at altitudes near 25 km. The deviation is only present when the Lambertian surface albedo is 1 and is almost non-existent with a 0 albedo. Testing has shown that the difference is present for all atmospheric composition scenarios and is thought to be due to approximations made in the ground to line of sight scatter calculation ::::::: multiple :::::: scatter 345 ::::::::: calculation, :::: and : is :::::::: currently ::::: under ::::::::::: investigation.
In order to isolate the polarization effects, we consider two quantities: the Degree of Linear Polarization (DOLP) and the Linear Polarization Orientation (LPO). DOLP is defined as which is the fraction of radiation that is linearly polarized, and LPO is defined as which indicates the direction of linear polarization. The absolute differences in DOLP and LPO for the Rayleigh scattering, ozone absorption, and stratospheric aerosol scattering case are shown in Fig. 6 and Fig. 7 respectively.

Refraction
All of the models considered thus far with the exceptions of SASKTRAN-MC and MYSTIC support atmospheric refraction to some level. While Siro has support for refraction it was not tested as part of this study. GSLS and SASKTRAN-HR neglect refraction of the incoming solar rays. Furthermore GSLS and SASKTRAN-HR neglect refraction for multiple scattering effects, only implementing refraction for the line of sight ray. SCIATRAN and SMART-G implement refraction in a generic way 400 accounting for all solar and multiple scattering effects. SASKTRAN-HR has since been updated to include refractive effects for incoming solar rays and multiple scatter effects but the calculations here use only line of sight refraction. Figure 9. Ratio of refracted to unrefracted I for ::: with : multiple scatter ::::::: scattering :::::: enabled, Rayleigh scattering + ozone absorption + strato-calculations using the provided atmospheric temperature and pressure. There exist various methods :::::: Various ::::::: methods ::::: exist to do this calculation and they may not be the same between each RTM. Differing methods of ray-tracing and integration can also lead to small differences. Since SCIATRAN (and SMART-G) included refractive effects for the incoming solar ray and multiple scatter terms, it is possible that this :::: solar ::::::::: refraction could be the source of some minor differences. The solar geometries tested 425 here have been limited to SZA ≤ 80 • , where refraction of the incoming solar ray is expected to be minimal. Further study could examine the effect of refraction at higher ::::: larger solar zenith angles.

Timing
We have considered a basic run time comparison between the models in the study. While a useful exercise, there are technical challenges in standardizing the hardware used to execute the models, and in the case of SMART-G which uses a GPU the stan-430 dardization is not possible. More importantly every model has settings that involve an accuracy/speed trade-off. For example, with the deterministic models, there are various discretization setting that can dramatically affect the speed of the calculations.
Harmonizing the balance between accuracy and speed between all of the RTMs is impractical, instead we opt for a simple order of magnitude timing estimate. The time taken to execute all of the multiple scatter ::::::: enabled, : polarized tests for each model without refraction is shown in Table 4. These timing numbers should be interpreted as the time required to execute a 435 wide variety of test cases, individual RTMs may be significantly more or less efficient in specific cases, however analyzing these differences is beyond the scope of this study. For the CPU based models a scaled runtime value is also supplied where the runtime on equivalent hardware has been approximated using the relative ::::::::::: multithreaded CPU benchmark values from https:// www.passmark.com/. The deterministic models, GSLS, SCIATRAN, and SASKTRAN-HR, all have runtimes of a similar order of magnitude taking anywhere from 0.3 s to 1 s to execute a single wavelength, solar geometry, and atmospheric composition 440 on average.
Analyzing the timing of the Monte Carlo models is inherently more challenging as the calculations also contain statistical noise. It is common to benchmark models for a set number of photons, however the number of photons used is not comparable between Siro/SASKTRAN-MC and MYSTIC/SMART-G since the MC technique is not the same. Instead, the precision of the calculation must be directly compared, which is shown for a typical condition in Fig. 10 Table 4. Estimated time to execute all multiple scatter ::::::: scattering ::::::: enabled, no refraction tests. This includes three effective surface albedos, three different atmospheric compositions, 11 wavelengths, 9 solar geometries, and 80 lines of sight. The MC models were executed to the precision shown in Fig. 10 (see text for more detail). a The scaled runtime is calculated by scaling every CPU to the computational power of the AMD 3900x using the relative benchmark values from https://www.passmark.com/ as of November 6th 2020. SMART-G NVIDIA Titan V 59.2 N/A than 0.1% precision in cases where the atmosphere heavily scatters (low altitudes, shorter wavelengths), and worse precision at higher altitudes and longer wavelengths where the atmosphere is optically thin. As mentioned earlier, this is since photons in Siro are forced to scatter, while photons in MYSTIC may pass through the atmosphere without interaction.
The runtime for the GPU based MC model, SMART-G, is ∼1-2 orders of magnitude less than the other MC models. The most natural comparison is between SMART-G and SASKTRAN-MC since they have similar precision for this case. Here,

465
A systematic comparison has been performed between seven radiative transfer models operating in the limb scatter geometry. The seven models are capable of handling the sphericity of the atmosphere, and compute the Stokes vector accounting for polarisation :::::::::: polarization. The test cases cover a wide variety of solar angles, Rayleigh scattering, ozone absorption, Mie scattering, and surface reflectances.
In single scatter, the deterministic models GSLS, SASKTRAN-HR, and SCIATRAN agree within 0.1 % for all observed 470 conditions. The statistical models MYSTIC, SASKTRAN-MC, Siro, and SMART-G all agree to a level better than the :: at ::: the :::: level :: of precision of the calculation which is approximately ∼0.2 %.
For almost all conditions in multiple scatter :::: with ::::::: multiple ::::::::: scattering ::::::: enabled, the agreement between the fully spherical models in the multiple scatter cases is within 1% when refraction is disabled for I with a few exceptions: Figure 10. Precision estimates for the MC models for :::