Mitigation of bias sources for atmospheric temperature and humidity in the mobile Weather & Aerosol Raman Lidar (WALI)

Lidars using vibrational and rotational Raman scattering to continuously monitor both the water vapor and temperature profiles in the low and middle troposphere offer enticing perspectives for applications in weather prediction and studies of aerosol/cloud/water vapor interactions by deriving simultaneously relative humidity and atmospheric optical properties. Several heavy systems exist in European laboratories but only recently have they been downsized and ruggedized for deployment in the field. In this paper, we describe in detail the technical choices made during the design and calibration of the new Raman channels for the mobile Weather and Aerosol Lidar (WALI), going over the important sources of bias and uncertainty on the water vapor and temperature profiles stemming from the different optical elements of the instrument. For the first time, the impacts of interference filters and non-common-path differences between Raman channels, and their mitigation, are investigated in detail, using horizontal shots in a homogeneous atmosphere. For temperature, the magnitude of the highlighted biases can be much larger than the targeted absolute accuracy of 1°C defined by the WMO. Measurement errors are quantified using simulations and a number of radiosoundings launched close to the laboratory.

The relative error on R is equal to the constraint on WVMR, i.e. 5%. An assessment of the relative error on Q is performed considering the RR filter parameters given in Table 2.

The results, summarized in Table 1, have very important implications. Typically, Q' must be 6 to 10 times more accurate than R' to deliver meaningful results in terms of temperature. Raman cross-sections being larger for the RR channels than for the H2O VR channel, the main difficulties shift from constraints linked to signal-to-noise ratio (SNR) to also encompass strong constraints linked to instrumental biases.

SNR as used in Table 1

SNRR, typically limited by the H2O channel, must be above ~20 and SNRQ must be above ~125 to satisfy the requirements given above. Such high values can be reached by increasing the laser power and pulse repetition frequency (PRF), or enlarging the integration over altitude and time, as SNR usually scales with the square roots of the energy and of the number of averaged samples.

We derive the errors expected on RH given those on temperature and WVMR at the bottom of

with P pressure and Pwv,sat the water vapor saturation pressure in hPa, T temperature in °C.
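To illustrate how errors on WVMR and temperature combine into an error on RH, a minimal Python sketch is given here. It does not reproduce the expression used in this paper: it assumes a Magnus-type approximation for the saturation pressure over liquid water, the usual conversion from mixing ratio to vapor pressure, and independent errors added in quadrature. The function names and the 1000 hPa pressure are hypothetical; the 6.5 g/kg and 11°C inputs are the ground-level values reported for the horizontal shots.

```python
import math

EPSILON = 0.622  # ratio of molar masses, water vapor / dry air


def saturation_pressure_hpa(t_celsius):
    """Magnus-type approximation over liquid water, in hPa (assumed form)."""
    return 6.1094 * math.exp(17.625 * t_celsius / (t_celsius + 243.04))


def relative_humidity(wvmr_g_per_kg, t_celsius, p_hpa):
    """RH in % from WVMR (g/kg), temperature (deg C) and pressure (hPa)."""
    r = wvmr_g_per_kg / 1000.0                  # kg/kg
    vapor_pressure = r * p_hpa / (EPSILON + r)  # hPa
    return 100.0 * vapor_pressure / saturation_pressure_hpa(t_celsius)


def rh_error(wvmr, t, p, rel_err_wvmr, abs_err_t):
    """RH and its error, by finite differences combined in quadrature."""
    rh0 = relative_humidity(wvmr, t, p)
    d_wvmr = relative_humidity(wvmr * (1.0 + rel_err_wvmr), t, p) - rh0
    d_t = relative_humidity(wvmr, t + abs_err_t, p) - rh0
    return rh0, math.hypot(d_wvmr, d_t)


# 5 % error on WVMR and 1 deg C error on temperature (illustrative inputs)
rh, err = rh_error(wvmr=6.5, t=11.0, p=1000.0, rel_err_wvmr=0.05, abs_err_t=1.0)
print(f"RH = {rh:.1f} %, combined error = {err:.1f} %RH")
```

With these inputs the temperature term is the larger of the two contributions, which is why the accuracy requirement on Q is so stringent.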

Sources of bias
Biases arising from inaccurate measurement of any of the estimated factors of Eqs. (3-7), or from a variation after that measurement due to instabilities in the instrument, must also be smaller than the aforementioned values of 2-5% for WVMR and 0.12-0.4% for temperature, an especially difficult goal to reach. Their impact must be mitigated either by careful design or by precise estimation.

The expected (i.e. noiseless) values of R and Q can be detailed as:

with x̄ denoting the expected value of variable x, Kj and Oj(z) the instrumental constant and overlap factor of channel j, respectively. To simplify our discussion, we choose to incorporate any deviation that affects the ratios without a range-dependence into the instrumental constant

proportionately larger with the diameter of the receiver. Because the optical path of each channel is independently aligned, this always induces different overlap factors even when sharing the same telescope. This large effect must be calibrated and corrected, yet its impact was never discussed before in the RR lidar literature, despite being three times as large in other systems with 450 mm receivers. This impact can be mitigated by illuminating the filters at normal incidence, where the derivative of CWL as a function of AOI (see Eq. (17)) is minimal.

θ' calculations. f: receiver focal length, Drec: receiver diameter, ϕ: full lidar field-of

active surface and of angle of incidence, is now specified on the cathodes of PMTs used at 400 nm wavelength (Hamamatsu (2007), Section 4.3.3). The amplitude was found to be much larger by Simeonov et al. (1999), with significant impact. This effect has been limited in all our lidars by simply placing the cathode plane as far as possible before the focal plane, while still avoiding vignetting. It can still be responsible for differences of overlap factors between channels.
• Uncalibrated PMT gain or digitizer baseline variations will of course induce bias in the channel system constants. We will see how to mitigate these effects.
• Slight variations of overlap or channel transmittance after calibration will be directly responsible for bias. In the next sub-section, we discuss how they can appear.
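The benefit of normal incidence on the interference filters can be illustrated with the textbook tilt-tuning relation, CWL(θ') = CWL(0)·sqrt(1 − sin²θ'/n_eff²). The sketch below assumes this form; the effective index of 2.0, the 355 nm center wavelength and the 0.5° alignment error are hypothetical values chosen for illustration, not parameters of the WALI filters.

```python
import math


def cwl_nm(cwl_normal_nm, aoi_deg, n_eff):
    """Center wavelength (nm) of an interference filter at a given AOI."""
    s = math.sin(math.radians(aoi_deg)) / n_eff
    return cwl_normal_nm * math.sqrt(1.0 - s * s)


def cwl_shift_pm(cwl_normal_nm, aoi_deg, tilt_error_deg, n_eff):
    """CWL change (picometres) caused by a small tilt error around a nominal AOI."""
    shifted = cwl_nm(cwl_normal_nm, aoi_deg + tilt_error_deg, n_eff)
    nominal = cwl_nm(cwl_normal_nm, aoi_deg, n_eff)
    return (shifted - nominal) * 1000.0


N_EFF = 2.0          # hypothetical effective index of the filter
CWL0_NM = 355.0      # hypothetical center wavelength at normal incidence
TILT_ERROR = 0.5     # hypothetical alignment error, degrees

shift_normal = cwl_shift_pm(CWL0_NM, 0.0, TILT_ERROR, N_EFF)
shift_tilted = cwl_shift_pm(CWL0_NM, 5.0, TILT_ERROR, N_EFF)
print(f"shift around 0 deg: {shift_normal:.2f} pm")
print(f"shift around 5 deg: {shift_tilted:.2f} pm")
```

Because the shift is quadratic in angle near 0°, the same alignment error displaces the passband far more for a filter operated at 5° than for one operated at normal incidence.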

Overlap measurement with horizontal shots and limitations
Range-dependent biases influence the lower part of the profiles just like overlap factors, at varying distances from the emitter depending on both the quality of the alignments and characteristics of the receiving optics. Two methods are used in the literature to estimate the overlap factors of a Raman lidar: i) an iterative Klett inversion of elastic and Raman channels sharing the same telescope is easy to achieve (Wandinger and Ansmann, 2002) but inefficient when non-common path errors are involved, whereas ii) the method of aiming the lidar horizontally (e.g. Sicard et al., 2002; Chazette and Totems, 2017) is sometimes impractical but more direct and yields more accurate results in a horizontally homogeneous atmosphere over a range of 1 to 2 km. In the context of RR measurements, it is necessary to implement the latter, and also to measure the ratios of overlap factors, rather than the overlap factors themselves, thus avoiding errors due to an imprecise estimation of atmospheric extinction.

Considering a horizontal line of sight in a supposedly homogeneous atmosphere, the expected values of ratios R and Q can be expressed as:

where R̄(z∞) and Q̄(z∞) are the values observed when all overlap factors have become constant at a sufficiently large range from the lidar, noted z∞, after which variations of the optical path inside the reception channels become negligible. Δα = α(407nm) − α(387nm) is the difference of atmospheric extinction between the two VR wavelengths.

To evaluate z∞, we introduce in Figure 1 b) parameters that characterize the overlap of a paraxial or coaxial lidar (e.g. Kuze et al., 1998): i) ze = 2e/ϕ, the range at which the emitted laser beam located at distance e from the receiver axis enters the field of view, whose full size is ϕ; ze is null for a coaxial system; ii) H = Drec/ϕ, the so-called hyperfocal distance, the minimum range from which the beam originating from a point still fully enters the field stop; iii) HIF = 2Drecf/(f'θ'max), that we might call the filter hyperfocal distance, which is, similarly to the former, the minimum range from which the image of a point does not exceed θ'max, the AOI on the IF that significantly changes its transmittance. z∞ is above the maximum of those three, which is usually HIF. If we use for θ'max the AOI value causing 1°C bias on temperature per Eq. (10) and Eq. (15), we find: z∞ > HIF = 780 m. Note that z∞ can reach several km with misaligned filters.

If for instance the lidar can be mounted on a rotating platform capable of aiming horizontally, the overlap ratios can be estimated with suitable precision by averaging over time and range (and correcting for the differential extinction on the VR ratio):

However, assumptions are made for this estimation, namely:
• As explained above, the atmosphere is assumed to be homogeneous in WVMR and temperature (down to <0.5°C) up until z∞, whereas the overlap ratios must be constant (down to <0.4%) after z∞. Also, the maximum range (with sufficient SNR) of the lidar must exceed z∞, implying nighttime measurements for the Raman channels. Therefore, the effects generating overlap variation after a few hundred meters must be prevented.
• The lidar is assumed to retain the exact same overlap functions when aiming horizontally and vertically. Considering a field of view around 1 mrad, the stability of the emission and reception optical paths must be better than ~10 µrad between these two positions. This is feasible for a small refractor but difficult for a Raman system such as WALI, with a heavy laser and large reflector.

These difficulties make it extremely challenging to estimate the overlap ratios with an accuracy better than a few percent. This is enough for the WVMR, but we find that a correction must be applied by comparing with in-situ sounding for temperature measurements by Raman lidar.
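The principle of the horizontal-shot estimation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the estimator used for WALI: the channel ratio is taken as already averaged over time, the atmosphere is perfectly homogeneous, the reference value is the mean of the ratio over a far-range window where overlap is assumed constant, and the differential extinction acts one-way on the VR ratio. The function name, the synthetic overlap shape and the value of Δα are hypothetical.

```python
import numpy as np


def overlap_ratio(z_m, ratio, z_inf_m, z_max_m, delta_alpha_per_m=0.0):
    """Estimate a ratio of overlap factors from a horizontal channel ratio.

    z_m: ranges (m); ratio: time-averaged R or Q at each range.
    delta_alpha_per_m: extinction difference between the two channels
    (0 for the RR ratio Q, alpha(407 nm) - alpha(387 nm) for the VR ratio R).
    """
    z_m = np.asarray(z_m, dtype=float)
    ratio = np.asarray(ratio, dtype=float)
    # Undo the differential extinction so that a homogeneous atmosphere
    # yields a range-independent ratio wherever overlap is complete.
    flattened = ratio * np.exp(delta_alpha_per_m * z_m)
    # Normalize by the mean over the far range, where overlap is assumed constant.
    far = (z_m >= z_inf_m) & (z_m <= z_max_m)
    return flattened / flattened[far].mean()


# Synthetic check with hypothetical values
z = np.arange(7.5, 3000.0, 7.5)                  # 7.5 m range resolution
true_ratio = 1.0 + 0.06 * np.exp(-z / 150.0)     # 6 % deviation at short range
delta_alpha = -2.0e-5                            # m^-1, hypothetical
measured = 0.8 * true_ratio * np.exp(-delta_alpha * z)
est = overlap_ratio(z, measured, 800.0, 2000.0, delta_alpha)
print(f"max abs error: {np.max(np.abs(est - true_ratio)):.1e}")
```

Working with the ratio directly means that only the extinction difference between the two wavelengths enters the correction, not the absolute extinction.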

Implementation and bias mitigation on the WALI system
In this section, we describe the WALI instrument from the emitter to the reception channels, characterizing the critical elements in the framework of WVMR and temperature measurements. The system has evolved from its previous implementation described in Totems

presenting the main lidar sub-systems is shown in Figure 2, and a summary of its characteristics is given in Table 2.

coordinate. Manually applying curvature to the fiber, as suggested by so-called "mode scrambling" devices, did not make the energy distribution more uniform so much as create unwanted losses (effect not shown). Even for a centered input, the energy radial distribution, i.e. the percentage of the total output in a given radial bin, which will therefore hit a well-aligned filter at the same AOI, is uniform. We conclude that while minimizing effects of filter misalignment, the use of fiber optics does not substantially make the angle of incidence on the interference filters independent from the image position in the focal plane of the telescope, in

We plot both the raw spectrum and the Fourier transform spectrometer noise floor after 1000 profile integrations, to highlight the very weak features observed at 780 to 910 nm, and the high associated uncertainty. Due to the noise level, and given the dichroic plate residual transmittance of the laser wavelength, we can only ascertain that the fluorescence power spectral density (PSD) around 400 nm is lower than 10⁻⁶ times the peak laser PSD, although no feature can be detected in this spectral domain. Note that fluorescence between 400 and 500 nm was indeed observed using a broadband excitation from a fibered LED at 340 nm (not shown). Nevertheless, the amount of rejection observed for a 355 nm excitation is sufficient to exclude an adverse impact of the OH-rich fiber optics for Raman lidar measurements.

we chose to implement a splitter-based configuration, favouring a compact system (25x25 cm, easier to confine) and normal incidence on the filter, at the expense of SNR. Indeed, designing the filters for a correct CWL at 5° incidence (as in the cited work) instead of 0° dramatically narrows the filter angular acceptance, as can be deduced by differentiating Eq. (15) with respect to incidence θ'. In the WALI polychromator, the output from the fiber is collimated by a near-UV achromat with 50 mm focal length, resulting in a 22 mm diameter beam. Dichroic beamsplitters with adequate cut-on wavelengths are used to separate channels. On each separated channel, an aspheric lens condenses light on the PMT surface, located 4 mm before the focal plane. A cage system assembly holds all parts with great stability; however, beamsplitters are not always perfectly aligned at 45° in the stock cage cubes. That is why all filter, lens and PMT sub-assemblies are mounted on tiltable mounts to allow precise alignment at normal incidence.

They were characterized on the Fourier transform spectrometer (described in section 3.2.2) prior to mounting, using fibered LEDs peaking at 340, 385 and 405 nm as the light source; the beam was collimated by the same near-UV achromat with 50 mm focal length. We give the measurement results for the RR filters in Figure 6 and Table 3.

We then put the condensing lens used in the polychromator in front of the PMT, and studied its response as a function of AOI on the lens+PMT assembly, which is shown in Figure 8 b). The input beam had the nominal size in the polychromator, i.e. ~22 mm in diameter. We find that the curve corresponds well to the measured sensitivity profile, smoothed by its convolution with the spot on the PMT. The problem is that at normal incidence, the derivative of sensitivity with incidence is 2-5% per degree. In the future, the condensing lenses will be replaced with afocal beam reducers to reduce this dependency.

The baseline induced by the detection chain is found to vary between channels and in time. It is also subject to electromagnetic (EM) interference causing parasitic signals of both high frequency, mostly due to the flashlamp high peak current radiating over the system, and low frequency, probably due to other neighboring electronics. For this reason, the channel baselines are evaluated regularly (by averaging 1000 shots with PMT gain set to zero, every 8 minutes), smoothed and corrected (Lj in Eq. (2)). However, for the Raman channels (H2O and RR2

repeating parasitic spikes were found to jam the channels (especially photon counting) starting at altitude 6-7 km.

Table 1 (2% on VR channels, 0.4% on RR channels). In Figure 10 a), we show the experimental calibration of G versus U as well as second-degree polynomial fits for each channel. The relative error on the VR and RR channel gain ratios approximated by these models is plotted in Figure 10 b), with the measurement uncertainty. This uncertainty is mostly due to variations of atmospheric parameters and laser energy during calibration. Since all relative errors are well centered, we compute that the possible error for the gain ratio with these models is ~1.3%. This is compatible with WVMR measurements but not with temperature measurements. Therefore, the PMT gain should only be adapted on the VR channels, and the RR channels should be kept at a fixed gain value.

We wish to emphasize here that baselines Lj and background signals Bj in Eq. (3) must be estimated separately for the analog and photon-counting recorded profiles (which have no baseline, and a smaller but non-zero background value due to the suppression of electronic noise). Otherwise, the merged signal will show discontinuities at the cut-off altitude, and biases at high altitude at dusk and dawn. Their impacts are typically much larger than the requirements of Section 2.2.
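The separate treatment of analog and photon-counting profiles can be sketched as below. This is a simplified illustration, not the WALI processing chain: the moving-average smoothing, the estimation of the background from the farthest range bins, and all function names and synthetic values are assumptions made for the example.

```python
import numpy as np


def smooth(profile, width):
    """Moving average with edge padding (assumed smoothing method)."""
    kernel = np.ones(width) / width
    padded = np.pad(profile, width // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")[: profile.size]


def correct_analog(profile, dark_shots, n_background_bins, width=11):
    """Subtract the smoothed dark baseline, then the sky background."""
    # Baseline: average of shots recorded with the PMT gain set to zero.
    baseline = smooth(dark_shots.mean(axis=0), width)
    debiased = profile - baseline
    # Background: mean of the farthest bins, assumed free of lidar signal.
    background = debiased[-n_background_bins:].mean()
    return debiased - background


def correct_photon_counting(profile, n_background_bins):
    """Photon-counting profiles have no baseline: remove the background only."""
    return profile - profile[-n_background_bins:].mean()


# Synthetic example with hypothetical values
rng = np.random.default_rng(0)
bins = np.arange(2000)
signal = 100.0 * np.exp(-bins / 200.0)
true_baseline = 0.5 + 0.2 * np.sin(2.0 * np.pi * bins / 500.0)
dark_shots = true_baseline + rng.normal(0.0, 0.05, size=(1000, bins.size))
analog = signal + true_baseline + 3.0      # 3.0: analog sky background
counting = signal + 2.0                    # 2.0: photon-counting background

corrected = correct_analog(analog, dark_shots, n_background_bins=200)
corrected_pc = correct_photon_counting(counting, n_background_bins=200)
print(f"max analog residual: {np.max(np.abs(corrected - signal)):.3f}")
```

Applying the analog baseline to the photon-counting profile, or sharing a single background value between the two, would leave an offset between them at the merging altitude.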

Qualification on the atmosphere
In this section, we qualify the WALI system starting with the measurement of its overlap factor ratios, followed by its calibration and comparisons with radiosoundings. Remaining biases are highlighted and corrected, and experimental measurement errors are evaluated.

Experimental set-up and strategy
We put the lidar into operation in our laboratory near Saclay (48°42'42"N 2°08'54"E) over a period of two weeks in May 2020. It was placed on a rotating platform below a trapdoor equipped with silica windows for zenith shots, and in front of a window at a height of about 9 m above the ground level (agl) for horizontal shots. During the latter, the lidar aimed north, less than 5° above the horizon (beam elevation <80 m per km of range). In that direction, land use is fields up to 800 m range, buildings and trees between 800 m and 2 km range, and fields again up to 5.5 km range.

To calibrate and qualify the lidar measurements, we use radiosoundings launched two to three times daily from the operational Météo-France station located in Trappes (48°46'27"N 2°00'35"E, 12.3 km WNW from the lidar near Saclay, approximately upstream in the prevailing winds).
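The separation between the two sites can be checked from the coordinates given above with a short Python sketch. It assumes a spherical Earth of mean radius 6371 km and the haversine formula; the helper names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical-Earth assumption


def dms_to_deg(degrees, minutes, seconds):
    """Convert degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60.0 + seconds / 3600.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (km) between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Lidar site near Saclay and radiosounding station in Trappes (north, east)
lidar = (dms_to_deg(48, 42, 42), dms_to_deg(2, 8, 54))
trappes = (dms_to_deg(48, 46, 27), dms_to_deg(2, 0, 35))

distance_km = haversine_km(*lidar, *trappes)
print(f"lidar to Trappes: {distance_km:.1f} km")
```

The result is consistent with the distance quoted in the text, which is the horizontal scale over which atmospheric variability between the lidar and the radiosondes has to be considered.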

Measurement of overlap ratios with horizontal shots
The overlap factors and their ratios were estimated on signals averaged over 3 hours after sunset on December 19th, 2019, with a rather mild, non-turbulent but hazy atmosphere (aerosol extinction coefficient 0.32 km⁻¹ at 355 nm with Ångström exponent ~1.5, 11°C ground temperature, and WVMR at ground level around 6.5 g kg⁻¹). With a planetary boundary layer (PBL) height of ~900 to 1000 m, and slow gradients of temperature (−1 to −4°C km⁻¹) and WVMR (−0.8 to −1.2 g kg⁻¹ km⁻¹) in that PBL (as measured by radiosoundings launched from Trappes at ~12:00UTC and 0:00UTC, presented in the next subsection), conditions were excellent for a homogeneous atmosphere within the first 5 km at least.

The estimated overlap factors of the different channels, with atmospheric extinction fitted between 800 and 2000 m, are shown in Figure 11 a). Full geometrical overlap is obtained as expected between 150 and 200 m, but the curves differ by several percent between the Raman channels. Atmospheric extinction drifts from the estimated value after 2 km.

The estimated ratios of overlap factors ORR and ORQ are plotted in Figure 11 b) and c), at 7.5 m resolution (thin line) and after smoothing (thick line, final correction used hereafter). Peak divergence is 5 to 7%, at ~150 m. Convergence within 1% happens at ~400 m, but oscillations of lower amplitude persist until ~3 km. We note that for ORQ, deviations do not exceed the ±0.7% required to maintain bias below 1°C. They are nevertheless corrected.

In order to debias WVMR and temperature measurements from residual errors on ORR and ORQ, we perform a three-step calibration:
• First step: we exclude the first 1500 m agl of the profiles when fitting rH2O in-situ vs R' and Q' vs T in-situ to estimate K and f respectively. This initial calibration is shown in Figure 12 a) & d).
• Second step: using these first estimates, we then plot the ratios between the lidar observables R' & Q' and the expected observables deduced from the in-situ measurements and these initial calibration parameters. This provides an estimate of the remaining biases on ORR and ORQ, which we find to be up to ~4% and ~1.8% respectively. This represents a small correction to the overlap ratios estimated while shooting horizontally, but remains larger than the requirements of precision specified in

In the three steps, data with SNR lower than 10 for R' and 30 for Q' are rejected so as to limit the impact of noise present at higher altitudes.

Figure 12. Results of calibration on 12 nighttime and 24 daytime radiosoundings launched from Trappes between May 20th and June 2nd, 2020 for WVMR (upper row) and temperature (lower row), in three steps: calibration on measurements above 1500 m (a/d) with samples as crosses (one color per radiosonde) and calibration curve in black; residual overlap ratio estimation (b/e) with samples as crosses, mean ratio in blue and model in red; calibration on all results (c/f). Daytime samples are limited to SNRs above 10 for R' (WVMR) and 30 for Q' (temperature).

The reliability of this calibration over time has been tested by comparing to the same exercise performed two months later at the end of July 2020. After calibration in the same conditions as in May, we found K decreased by ~7.3%, and the temperature associated with a given value of Q' to be ~2.1°C higher. However, ORR and ORQ were still accurate within the reachable precision, i.e. ~0.2%. It was later shown that a malfunction of the laser seeder was responsible for a slow drift of the emitted wavelength. Thus, although a regular verification of the calibration is necessary, the measurement of the overlap ratios is reliable.

Figure 13. Residual deviations between lidar and Trappes radiosoundings in terms of WVMR, temperature and relative humidity, for nighttime (a/b/c) and daytime (d/e/f), with mean deviation (thick lines), and RMS error (colored rectangles). The error corresponding to noise levels on the lidar signal is shown as darker rectangles. Cloudy profiles have been discarded. Daytime measurements are limited to SNRs above 5 for R (WVMR) and 20 for Q (temperature).

In Figure 13, we examine the residual deviations between the lidar and the same series of radiosoundings used for the calibration. RH has been derived using Eq. (14) from lidar-estimated WVMR and temperature, and the pressure profile given by radiosoundings. For each parameter rH2O, T and RH, we plot for daytime and nighttime profiles the mean and RMS deviations averaged over large range bins as colored bars, as well as the propagated signal error as darker shaded areas. This allows us to compare the observed random error to what could be expected from the level of noise on the lidar measurements. Note that only profiles with good

Table 4. Statistics of observed differences for rH2O, T, and RH: experimental Mean Differences (MD), Root-Mean Square Differences (RMSD), averaged over two different range bins, in the low troposphere (1-2 km) and the free troposphere (5-6 km). Comparison to the "natural" atmospheric variability between the lidar and RS sites as modelled by the ECMWF/IFS ERA5 reanalyses (difference over considered period between grid points nearest to each of the two sites), and to the theoretical root-mean-square error (RMSE) derived from the variance of the

To support the above interpretation, in Table 4 we compare the experimental mean difference and RMS difference plotted in Figure 13, averaged over two altitude ranges (low troposphere, LT, 1 to 2 km, and free troposphere, FT, 5 to 6 km), to i) the natural variability of the atmosphere between the radiosondes at Trappes and the lidar at LSCE, as modelled by ERA5 reanalyses of the ECMWF/IFS weather model, ii) the expected random error given the noise level on the RR signals. Nighttime and daytime values are indicated in the LT, only nighttime values in the FT.

measured laser linewidth or short-term wavelength drift were shown to be negligible in the WALI system.

After a measurement of RR/VR channel ratios during horizontal shots, which showed the significant impact of the above phenomena (up to 5% bias on ratios below 300 m, ~1% higher), we calibrated and de-biased the WALI measurements using radiosondes launched from the nearby Trappes station of Météo-France. Between the de-clouded lidar measurements and the radiosonde profiles, the remaining mean differences are small (below 0.1 g/kg on water vapor, 1°C on temperature) and RMS differences are consistent with the expected error from lidar noise, calibration uncertainty, and horizontal inhomogeneities of the fields between the lidar and radiosondes. On relative humidity we thus reach a goal of ~10%RH random error and 5%RH systematic error up to 9 km by night and 1.5 km by day, with 40 min time integration and progressive vertical integration of 15 to 360 m at 10 km. The systematic error on RH is dominated by bias on temperature, whereas the random error is dominated by noise on water vapor measurements.

Thus exhaustively qualified, the WALI system may be applied in the near future to exercises assimilating thermodynamic profiles in weather models, as is expected within the WaLiNeAs