The novel HALO mini-DOAS instrument: Inferring trace gas concentrations from air-borne UV/visible limb spectroscopy under all skies using the scaling method

We report on a novel six-channel optical spectrometer (hereafter called the mini-DOAS instrument) for aircraft-borne nadir and limb measurements of atmospheric trace gases, liquid and solid water, and spectral radiances in the UV/vis and near-IR spectral ranges. The spectrometer was developed for measurements aboard the HALO (http://www.halo.dlr.de/) research aircraft during dedicated research missions. Here we report on the relevant instrumental details and the novel scaling method used to infer the mixing ratios of UV/vis absorbing trace gases from their absorption measured in limb geometry. The uncertainties of the scaling method are assessed for NO2 and BrO measurements. First results are reported along with complementary measurements and comparisons with model predictions for a selected HALO research flight from Cape Town to Antarctica, which was performed during the research mission ESMVal on 13 September 2012.


Introduction
In the past three decades, aircraft-borne UV/vis spectroscopy has developed into a powerful tool to study the photochemistry and radiative properties of the atmosphere. Building on the pioneering work of Noxon (1975) and later Noxon et al. (1979), who exploited ground-based spectroscopic observations of zenith-scattered skylight to monitor stratospheric NO2 (and later O3, BrO and OClO, see below), the discovery of the ozone hole in 1985 and the need to unravel its formation mechanisms [...]

[...] measured absorptions of the scaling gas (hereafter denoted P) and the targeted gases (hereafter denoted X), preferentially monitored in the same wavelength region. The latter is convenient in order to eliminate any wavelength dependence of the atmospheric Rayleigh and Mie scattering (see [...] and their supplement, and below). The in situ measured concentration and the remotely observed absorption of the scaling gas P can then be used to infer an effective light path length (or distribution) common to the gases P and X (see section 3 below). The underlying assumption is a horizontally constant trace gas concentration along the line of sight, equal to the in situ measured concentration. One drawback of the scaling method is its (moderate) sensitivity to the relative vertical profile shapes (but not the absolute concentrations) of the involved trace gases. This sensitivity is best dealt with by using a scaling gas P with a profile shape similar to that of the target gas X. The relative profile shapes of both gases can then be taken from in situ measurements performed during dives of the aerial vehicle, from any a priori knowledge, and/or from chemistry transport models (CTMs, e.g., CLaMS, SLIMCAT) or chemistry climate models (CCMs, e.g., EMAC).
The latter is very convenient since the limb measurements are often used to validate the predictions of the respective CTMs together with the other complementary measurements performed on board the respective research aircraft.
The present study explores the scaling method in more detail together with its uncertainties and potential errors.
The paper is structured as follows. In section 2 the instrument is described and characterized. Details of the employed methods are provided in section 3. These include the spectral retrieval, radiative transfer calculation, complementary measurements, CTM and CCM modelling, and a description of the scaling method and its uncertainties. Section 4 describes sensitivity studies of the retrieval method by comparing inferred [NO2] using different CTM and CCM trace gas profile predictions and different scaling gases. Finally, our results for inferred [NO2] and [BrO] are inter-compared with complementary measurements and model predictions for a HALO flight from Cape Town to Antarctica during austral spring 2012 (section 5). Section 6 concludes the study.
Its major design criteria for air-borne measurements are a low weight (several to tens of kg), a low power consumption (200 W), multiple channels of moderate spectral resolution (i.e., ranging from several tenths of a nm in the UV to several nm in the near-IR) for UV/vis/near-IR analysis of the skylight received from nadir and simultaneously in scanning limb direction, stable optical imaging, and finally ease of operation, either by on-board operators (e.g., on HALO) or fully automated for deployments on unmanned aircraft, such as the NASA Global Hawk [...]. On HALO the mini-DOAS instrument is installed in the unpressurized so-called 'boiler room' located in the rear of the HALO aircraft, which is not accessible during [...] fuselage and has a weight of 4 kg. The three limb telescopes point to the starboard side of the aircraft, perpendicular to the aircraft fuselage axis, and are moveable from +3° to −93° in steps of less than 0.005°. During the flight they are commanded to compensate for the changing roll angle of the aircraft (see below), while the three nadir telescopes are held fixed. The six telescopes have diameters of 1.2 cm each, and six silica fiber bundles conduct the collected skylight from the telescopes to the spectrometers. At the spectrometer end, the fibers are linearly arranged and placed at the entrance slits of the spectrometers.
At the telescope end, the fibers are linearly arranged as well and positioned in the focal point of the telescope lenses, forming fields of view (FOVs) of 3.15° in the horizontal and 0.38° in the vertical for the UV and visible telescopes, and 1.68° in the horizontal and 0.76° in the vertical for the near-IR telescopes (for the other details see Table 1). Finally, an industrial miniature camera is attached to the telescope aperture plate and oriented towards the sky's limb for monitoring the investigated sky area simultaneously with the spectroscopic measurements.

Spectrometer unit
The six grating spectrometers are assembled in a Czerny-Turner configuration with the specifications given in Table 1. In order to clearly identify each spectrometer and the corresponding telescope, they are labeled by wavelength range and numbered 1 through 6. Spectrometers UV1, VIS3, and NIR5 (odd numbers) are used in nadir viewing geometry, and spectrometers UV2, VIS4 and NIR6 (even numbers) are used in limb viewing geometry. All spectrometers are mounted onto the lid of a vacuum-tight container. The lid also accommodates vacuum-tight connectors and feed-throughs for the fiber bundles and the connection to the detector electronics.

Atmos. Meas. Tech. Discuss., doi:10.5194/amt-2017-89, 2017. Manuscript under review for journal Atmos. Meas. Tech. Discussion started: 27 April 2017. © Author(s) 2017. CC-BY 3.0 License.

Prior to each mission the vacuum-tight spectrometer container is evacuated to some 10⁻⁵ mbar (leakage rate 2 × 10⁻⁵ mbar·l/s) to keep the spectrometers and detectors free from contamination and the optical imaging stable. The whole container is immersed in a vessel filled with 7 l of water/ice in order to stabilise the spectrometer and detector temperatures at around 0 °C. The whole spectrometer unit is further insulated using a combination of silica vacuum insulation panels (thermal conductivity of 0.008 W/(m·K)) and a more flexible polyvinylidene fluoride (PVDF) foam (thermal conductivity of 0.037 W/(m·K)). Prior to a flight, the water/ice vessel is filled with approx. 4 kg of ice and 3 l of cooled water, providing a latent heat of melting of 1300 kJ. When operating under arctic conditions, i.e. with an already cooled instrument prior to flight preparations, constant temperatures are maintained for 10 hours or more, showing that average heat flows during operation are well below 36 W. In a worst-case scenario, i.e. in very hot and humid ambient conditions in the tropics (e.g., in Manaus/Brazil in fall 2014, or the Maldives in August 2015), the instrument has to be cooled further by adding ice and removing liquid water prior to the flight. Under these conditions, the average heat flow during flight preparation and measurement flight is around 80 W, and therefore in the present configuration the instrument is limited to 3–4 hours of stable temperatures (ΔT ≤ ±1 °C). Therefore, after some experience had been gained with the instrument's heat budget, three Peltier elements were additionally mounted on the spectrometer container lid.
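The heat budget quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a textbook latent heat of fusion of 334 kJ/kg for water ice (a value not stated in the text) and ignoring the sensible heat of the cooled water; the function names are illustrative only.

```python
# Latent heat of fusion of water ice (textbook value, an assumption here).
LATENT_HEAT_ICE_KJ_PER_KG = 334.0


def latent_heat_kj(ice_mass_kg: float) -> float:
    """Heat (kJ) absorbed when the given mass of ice melts at 0 degC."""
    return ice_mass_kg * LATENT_HEAT_ICE_KJ_PER_KG


def mean_heat_flow_w(heat_kj: float, hours: float) -> float:
    """Average heat flow (W) that uses up heat_kj within the given time."""
    return heat_kj * 1000.0 / (hours * 3600.0)


def hold_time_hours(heat_kj: float, heat_flow_w: float) -> float:
    """Time (h) until heat_kj is used up at a constant heat flow."""
    return heat_kj * 1000.0 / heat_flow_w / 3600.0


ice_budget_kj = latent_heat_kj(4.0)           # about 1300 kJ for 4 kg of ice
flow_arctic_w = mean_heat_flow_w(1300.0, 10)  # heat flow if the ice lasts 10 h
hold_tropics_h = hold_time_hours(1300.0, 80)  # upper bound at 80 W

print(f"latent heat of 4 kg ice: {ice_budget_kj:.0f} kJ")
print(f"heat flow for a 10 h hold time: {flow_arctic_w:.1f} W")
print(f"hold time at 80 W: {hold_tropics_h:.1f} h")
```

With these assumptions, 4 kg of ice gives roughly the quoted 1300 kJ, a 10-hour hold time corresponds to about 36 W, and an 80 W heat flow exhausts the ice within about 4.5 hours, which is of the same order as the 3–4 hours stated for tropical conditions.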

Control unit
The power supply, the read-out electronics for the six spectrometers, the controllers for the telescope motion, the control board for the Peltier elements, the house-keeping electronics, and a single-board personal computer for instrument control, data storage, and communication with the operator in the aircraft cabin are integrated into two removable electronic boxes mounted above the spectrometer unit (yellow boxes in Figure 1). The measurements and control processes, including the read-out of the aircraft attitude data and the motion control of the three limb telescopes, are controlled by LabVIEW software running on the single-board computer. Finally, the whole instrument is mounted on a custom-built rack of 45 × 47 × 54 cm³. The total weight of the instrument is 57 kg, including the water/ice, and it consumes 100–200 W of 28 V DC power provided by the aircraft, depending on the power consumption of the Peltier elements.

Pre-flight test measurements
Prior to each mission, the instrument is optically and electronically characterized in the laboratory for a subset of parameters. This characterization includes recording the dark currents and offset voltages of the CCD detectors, the line shapes and the optical dispersion, and trace gas absorption spectra, as well as measuring the telescopes' fields of view and aligning the telescopes to the major aircraft axis (roll angle).
Dark current and offset voltage: Dark current and offset voltages of the CCD detectors are recorded prior to each flight for post-flight data processing (Platt and Stutz, 2008).
Slit function: The spectrometer slit function and wavelength dispersion are monitored in the laboratory and in the field prior to each flight using HgNe and Kr emission lamps (see Table 1). Moreover, since test measurements in the laboratory show that the slit functions are sensitive to the spectrometer's temperature, their T-dependence is extensively studied and monitored in the laboratory. For example, it is found that the width of the slit function is most sensitive at low temperatures, with a sensitivity of 0.005 nm/K (0.04 channels/K). However, due to the thermal stability of the instrument, a temperature-sensitive slit function does not need to be taken into account for most spectral retrievals.
The effective field of view (FOV_eff) of the telescopes is made up of three contributions, which are (a) the optical FOV of the telescope (FOV_opt), (b) the lag time between aircraft movement and telescope attitude correction (Δ_attit), and (c) the play of the telescope gear (Δ_gear). These are discussed in the following paragraphs.
FOV_opt (a): The optical FOV of the telescopes is measured in the laboratory in advance of the deployment to any mission. FOV_opt is listed in Table 1. The vertical FOV_opt in the UV/vis is ≈ 0.38°.
Telescope attitude control (b): In order to maintain the targeted elevation angle (EA) of the telescopes relative to the horizon during flight, the changing roll angle of the moving aircraft has to be corrected for. The aircraft's attitude data are received from the aircraft sensor data system (BAsic HALO Measurement And Sensor system, BAHAMAS for short) aboard the HALO aircraft at a frequency of 10 Hz and with a time delay < 1 ms via an Ethernet UDP broadcast. Due to the continuous movement of the aircraft and the time delay between data transmission and actual motor movement, a small difference between the targeted and the actual telescope angle can be expected. Tests involving a continuous and arbitrary sampling of the aircraft roll angle and the telescope position yield a mismatch of both angles with a standard deviation of Δ_attit ≈ 0.17° to 0.18° (Fig. 1 in the supplement).
Telescope gear (c): In addition, the pointing precision is limited by the play of the telescope's gear (Δ_gear). The gear play (Δ_gear ≈ 0.05°) is determined from the shift of the recorded radiance maximum when the telescope's FOV is measured by scanning in opposite directions.
Gaussian summation of contributions (a) to (c) gives a FOV_eff ranging between 0.54° (during mission ML-Cirrus (Voigt et al., 2016)) and 0.64° (during the TACTS/ESMVal missions (e.g. Müller et al., 2016)) for the VIS4 telescope, for which the tests are carried out. Arguably, it is of the same size for the other limb telescopes.
Telescope alignment to the major aircraft axis: After integration of the instrument into the aircraft, the telescope angle with respect to the aircraft is calibrated by placing a Ne gas lamp at a distance of 15 m and at the same height as the telescopes in the line of sight of the telescopes. The lamp is modified so that light is only emitted through a narrow (∼ 5 mm) slit. Scanning over the lamp again gives the field of view of the telescope, whose maximum is used to determine the angle that represents a horizontal line of sight with respect to the horizon. Under the assumption of a 2 cm uncertainty in the height of the lamp relative to the aperture plate (1 cm at each side), the angle uncertainty is 0.076°. When the aircraft is grounded, the aircraft roll angle given by the aircraft attitude data has a standard deviation of 0.2°. Accordingly, the systematic error in telescope alignment is Δ_align [...]
The systematic misalignment (Δ_align) can be tested independently by observation of the radiance 'knee', i.e. the apparent maximum in the relative radiances received from a set of elevation angles in limb direction, which is wavelength dependent (see Figure 5 in Deutschmann et al. (2011) and Figure 5 in Weidner et al. (2005)). Figure 2 in the supplement shows measured and modelled relative radiances in the UV and visible wavelength ranges, indicating a systematic misalignment below 0.2°.
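The 0.076° quoted for the lamp calibration follows from simple geometry: a height offset of 2 cm seen from 15 m away. A minimal sketch of this arithmetic (the function name is illustrative, not taken from the instrument software):

```python
import math


def pointing_uncertainty_deg(height_offset_m: float, distance_m: float) -> float:
    """Angle (degrees) subtended by a vertical offset at a given distance."""
    return math.degrees(math.atan2(height_offset_m, distance_m))


# 2 cm height uncertainty of the lamp at 15 m distance, as stated in the text.
angle_deg = pointing_uncertainty_deg(0.02, 15.0)
print(f"{angle_deg:.3f} deg")
```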

DOAS retrieval
The spectral retrieval is based on the DOAS method (Platt and Stutz, 2008). The primary products of the DOAS spectral retrieval are so-called differential slant column densities (dSCDs) given in molecules per cm² (Platt and Stutz, 2008), i.e., the amount of absorption measured in a foreground spectrum relative to a background (Fraunhofer) spectrum. Since the details of the spectral retrieval and its uncertainties have been described in previous studies (Harder et al., 1998; Aliwell et al., 2002; Weidner et al., 2005; Dorf et al., 2006; Butz et al., 2006; Kritten et al., 2010; [...]), only those details which depart from our previous work are discussed here. Table 2 provides a brief summary of the different DOAS settings and typical dSCD errors. Table 3 lists the absorption cross sections used in the analysis together with their uncertainties as stated in the literature. In all spectral retrievals a polynomial of degree 2 is included to compensate for broad-band extinction features in the radiative transfer of the atmosphere, together with a Fraunhofer reference, a Ring spectrum, and an additional Ring spectrum multiplied by λ⁴ as suggested by Wagner et al. (2009). The trace gas cross section spectra are calculated by convolving the literature absorption cross sections listed in Table 3 with the measured dispersion and a Gaussian line shape describing the Hg line at 404 nm (UV) or the Kr line at 450 nm (vis). Inaccuracies in wavelength calibration due to small changes in the instrument's optics and errors in the wavelength calibration of the fitted spectra are accounted for during the spectral retrieval. All trace gas cross sections are linked together, and the package of trace gas cross sections is allowed to shift against the Fraunhofer reference and the Ring spectra, which are linked together. Typical spectral shifts for both groups of spectra are well below 1 detector pixel. [...] (Tables 2 and 3). All five intervals are different but show significant overlap (Table 2).
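The linear core of such a retrieval can be illustrated with a toy fit. This is a minimal sketch and not the retrieval code used for the mini-DOAS data: it assumes the optical depth is a linear combination of known cross sections plus a polynomial of degree 2, and it omits the Ring spectra, the wavelength shift, and the instrument convolution described above. The gas names, cross sections, and dSCD values are synthetic placeholders.

```python
import numpy as np


def doas_fit(wavelength_nm, optical_depth, cross_sections, poly_degree=2):
    """Fit optical_depth = sum_i dSCD_i * sigma_i + polynomial (least squares).

    cross_sections maps a gas name to its cross section sampled on
    wavelength_nm. Returns a dict mapping each gas name to its fitted dSCD.
    """
    # Normalised wavelength axis keeps the polynomial columns well conditioned.
    span = wavelength_nm.max() - wavelength_nm.min()
    x = (wavelength_nm - wavelength_nm.mean()) / span
    names = list(cross_sections)
    columns = [cross_sections[name] for name in names]
    columns += [x**k for k in range(poly_degree + 1)]
    design = np.column_stack(columns)
    # Scale each column to unit maximum so that tiny cross sections and the
    # O(1) polynomial terms are treated on an equal numerical footing.
    scale = np.abs(design).max(axis=0)
    coeffs, *_ = np.linalg.lstsq(design / scale, optical_depth, rcond=None)
    coeffs = coeffs / scale
    return dict(zip(names, coeffs[: len(names)]))


# Synthetic spectra (all numbers hypothetical, chosen only to exercise the fit).
wl = np.linspace(424.0, 490.0, 400)
sigma = {
    "gas_a": 5e-19 * (1.0 + 0.3 * np.sin(2.0 * np.pi * wl / 3.0)),
    "gas_b": 1e-20 * (1.0 + 0.5 * np.cos(2.0 * np.pi * wl / 7.0)),
}
true_dscd = {"gas_a": 2.17e16, "gas_b": 5.0e18}
x = (wl - wl.mean()) / (wl.max() - wl.min())
tau = sum(true_dscd[g] * sigma[g] for g in sigma) + 0.02 + 0.01 * x - 0.005 * x**2

fitted = doas_fit(wl, tau, sigma)
for gas, value in fitted.items():
    print(f"{gas}: fitted {value:.3e}, true {true_dscd[gas]:.3e}")
```

Because the synthetic optical depth is noise-free and exactly linear in the fit parameters, the fit recovers the input dSCDs; with real spectra the residual and the fit errors listed in Table 2 would have to be considered.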
O3 is retrieved in the 335–362 nm wavelength region of the Huggins band in order to achieve a larger spectral overlap with the other targeted gases in the UV spectral range, which is found necessary in support of the scaling method (see section 3.6). Here O3, BrO, NO2, and O4 references are included in the spectral retrieval (Table 2). The average error in the inferred O3 dSCD is 6.4 × 10¹⁶ molec/cm² for the UV spectral range. It is noteworthy that the spectral retrieval for O3 could be improved by using the stronger ozone absorption bands of the Huggins band occurring towards the lower wavelength end of the UV spectrometer (310 nm), but then the reduced spectral overlap with the other gases as well as the much stronger absorption would degrade the quality of the O3 scaling method.
O4 is retrieved in a spectral window ranging from 350 to 370 nm in order to allow fitting of the collisional band ¹Σ⁺_g + ¹Σ⁺_g (ν = 1) at 360.5 nm (Table 2).
The BrO analysis window covers 342–362 nm, i.e., the vibrational transitions (3,0), (4,0), (5,0), and (6,0) of the A ²Π_3/2 ← X ²Π_3/2 electronic transition. Reference spectra of O3 for 223 K and 293 K (the latter orthogonalised to the 223 K reference spectrum) are included in the spectral retrieval together with reference spectra of NO2, CH2O, and O4 (for the other parameters see Table 2). Figure 2 (bottom left) shows an example of the retrieval of BrO from a limb spectrum collected in the lowermost arctic [...]
[...] Here the focus is put on the spectral retrieval of O3, O4, and NO2, since the former two gases are used for the scaling method and the latter complements the measurements of NO and total NOy by the AENEAS instrument (see section 3.4.2) on board HALO. The spectral retrieval of IO, C2H2O2 and water vapor is not discussed further in this manuscript.
Ozone is analyzed in the 450–500 nm wavelength range of the Chappuis absorption band. The center of both fitting windows is thus shifted by 20 nm relative to NO2. In the spectral retrieval, absorption cross sections of NO2 at 223 K, together with O4 and water vapor (Table 2), are included. The average error in the inferred O3 dSCD is 4 × 10¹⁷ molec/cm² in the visible spectral range.
The ¹Σ⁺_g + ¹Δ_g absorption of O4 at 477.3 nm is analyzed in the 460–490 nm wavelength range with the same combination of reference spectra as those used in the O3 retrieval (Table 2). For O4 the average retrieval error is 5.6 × 10⁴¹ molec²/cm⁵.

NO2 is thus analyzed in a relatively wide spectral window ranging from 424 to 490 nm, covering sub-bands of the electronic transition ²B₁ ← ²A₁, which supports small dSCD errors while maintaining the stability of the least-squares fit involved in the spectral retrieval. Reference spectra of O3 at 223 K and 293 K (the latter orthogonalised to the 223 K spectrum), O4 and water vapor are included in the retrieval (Table 2). Figure 2 (top right) shows an example of a spectral retrieval of NO2 with a dSCD of (2.17 ± 0.05) × 10¹⁶ molec/cm² for a limb spectrum taken within the framework of the ESMVal mission close to [...]

Determination of the amount SCD_ref
In order to obtain the total slant column density (SCD), which is needed to solve the inversion problem, the amount of absorption SCD_ref contained in the Fraunhofer reference needs to be determined and added to the measured dSCD, i.e.

$$\mathrm{SCD} = \mathrm{dSCD} + \mathrm{SCD}_{\mathrm{ref}} \qquad (1)$$

where SCD_ref is determined using (a) the so-called Langley method (i.e., a regression of dSCD as a function of total air [...]
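The Langley-type regression named above can be illustrated as follows. This is a minimal sketch under the common assumption that dSCD depends linearly on the air mass, dSCD = slope × air mass − SCD_ref, so that the negative intercept of a straight-line fit estimates SCD_ref. The text does not spell out the exact regression used, and the numbers below are synthetic.

```python
def langley_scd_ref(air_mass, dscd):
    """Straight-line fit of dSCD versus air mass.

    Returns (slope, scd_ref), where scd_ref is the negative intercept.
    """
    n = len(air_mass)
    mean_x = sum(air_mass) / n
    mean_y = sum(dscd) / n
    sxx = sum((x - mean_x) ** 2 for x in air_mass)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(air_mass, dscd))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, -intercept


# Synthetic example: hypothetical column per unit air mass and reference amount.
true_slope = 3.0e15    # molec/cm^2 per unit air mass (made-up value)
true_scd_ref = 5.0e15  # molec/cm^2 (made-up value)
air_mass = [2.0, 4.0, 6.0, 8.0, 10.0]
dscd = [true_slope * a - true_scd_ref for a in air_mass]

slope, scd_ref = langley_scd_ref(air_mass, dscd)
print(f"slope = {slope:.3e}, SCD_ref = {scd_ref:.3e}")
```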

Radiative transfer modelling
The radiative transfer is simulated in 2D (and in selected cases in 3D, see supplement Figure [...] For the forward simulations of the trace gas absorptions measured in limb direction, the RT model is further fed with simulated trace gas curtains along the flight track (for details see section 3.5 and Figure 3, panels a and b). [...]
[...] In post-processing, the chemiluminescence detector data are calibrated using the UV photometer data. FAIRO was first deployed on HALO during the TACTS/ESMVal mission (July to September 2012); its performance was excellent during all 13 flights.

AENEAS
NO and NOy measurements on board HALO are performed using a two-channel chemiluminescence detector (AENEAS, Atmospheric nitrogen oxide measurement system) in combination with a catalytic conversion technique (Ziereis et al., 2000; Stratmann et al., 2016). A commercial two-channel chemiluminescence detector (ECO PHYSICS, Switzerland) is modified for use on board research aircraft. The chemiluminescence technique is widely used for the detection of atmospheric NO and relies on the emission of light in the near infrared following the reaction of NO with O3 (e.g. Drummond et al., 1985). Heated gold tubes in combination with CO or H2 as reducing agent are frequently used to convert all species of the odd nitrogen family to NO (Bollinger et al., 1983; Fahey et al., 1985), which is subsequently detected by chemiluminescence. The conversion efficiency of the gold converter is quantified using gas phase titration of NO and O3 before and after each flight, with a conversion efficiency of typically more than 98%. The statistical detection limit is 7 pmol/mol for the NO measurements and 8 pmol/mol for the NOy measurements for an integration time of 1 s. The overall uncertainty for the NO and NOy measurements is 8% (6.5%) for volume mixing ratios of 0.5 nmol/mol (1 nmol/mol).

TRIHOP
The TRIHOP instrument is a three-channel quantum cascade laser infrared absorption spectrometer capable of the subse- [...]
CLaMS is a Lagrangian CTM system developed at Forschungszentrum Jülich, Germany. The specific model setup is de- [...] tropopause or the polar vortex edge). It should be noted that the present CLaMS simulation is not optimized to reproduce photochemical processes in the lower troposphere. Therefore, the employed chemistry setup contains only reactions of importance within the stratosphere (Grooß et al., 2014), and it contains neither sources of larger hydrocarbon compounds (e.g. VOCs and NMHCs) nor any interactions of the chemical compounds with clouds.
The ECHAM/MESSy Atmospheric Chemistry (EMAC, http://www.messy-interface.org/) model is a numerical chemistry and climate simulation system that includes sub-models describing processes in the troposphere and middle atmosphere and their interaction with oceans, land and human influences (Jöckel et al., 2010). It uses the second version of the Modular Earth Submodel System (MESSy2) to link multi-institutional computer codes. The core atmospheric model is the 5th generation European Centre Hamburg general circulation model (ECHAM5, Roeckner et al., 2006). Here, we analyse data of the RC1SD-base-10a simulation (Jöckel et al., 2016) sampled along the aircraft flight track with the submodel S4D (Jöckel et al., 2010).

The time resolution is the model time step length, i.e., 12 minutes for the applied model resolution. For the RC1SD-base-10a simulation, EMAC has been nudged towards ERA-Interim reanalysis data (Dee et al., 2011) to reproduce the "observed" synoptic situation in the model (for details see Jöckel et al., 2016). The model is applied in the T42L90MA-resolution, i.e.

The scaling method
The scaling method makes use of the information on the relevant radiative transfer gained from a scaling gas P, which is measured simultaneously in situ and remotely (along the line of sight), and of the remotely measured absorption of the target gas X to infer the absolute concentration [X] (Raecke, 2013; Großmann, 2014; Werner et al., 2017; [...]). Ideally, the absorption bands of X and P (Table 2) are close to each other in order to diminish the influence of wavelength-dependent Rayleigh and Mie scattering on the results. The potential advantages of the scaling method over optimal estimation come from largely removing uncertainties in radiative transfer due to aerosols and clouds.
Mathematically, the method evolves along the following lines. The total measured SCD (= dSCD + SCD_ref, Eq. 1) can be [...]
For the atmospheric layer of interest j, i.e. the altitude range around aircraft altitude which the limb line of sight penetrates and where most of the absorption is picked up, the concentrations of both gases can be expressed as [...]
By noting that for weak absorbers (i.e. those with optical densities much smaller than unity) the BoxAMFs B_Xj and B_Pj are the same for both gases X and P when measured in the same wavelength range, the ratio of Eqs. (4) and (5) yields [...]
Further, by defining so-called α-factors (α_X and α_P), which describe the fraction of the absorption in layer j relative to the total atmospheric absorption for both gases, i.e. [...], the main equation of the scaling method can be written as

$$[X]_j = \frac{\alpha_X}{\alpha_P} \cdot \frac{\mathrm{SCD}_X}{\mathrm{SCD}_P} \cdot [P]_j = \alpha_R \cdot \mathrm{SCD}_R \cdot [P]_j$$

Here [P]_j is the in situ measured concentration of the scaling gas (e.g., O3, O4, ...), averaged over the time of spectrum integration, and SCD_X and SCD_P are obtained from Eq. (1). α_R and SCD_R are the ratios of the α-factors and the SCDs, respectively. Equations (8) and (10) [...] the lower and upper wavelength end of the spectral retrieval for each gas. In agreement with [...], it is found that α_R may only change by as much as a few percent in our applications. Thus, the error is negligible compared to the other errors discussed in the following section.
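The logic of the scaling method can be illustrated with a small numerical experiment. This is a minimal sketch under stated assumptions: the slant column is taken as a sum over layers of concentration × BoxAMF × layer thickness, the α-factor of layer j is the share of that sum contributed by layer j, and both gases share the same BoxAMFs. The layer values, BoxAMFs, and function names are made up for illustration. The "model" profiles deliberately differ from the true ones by constant factors, to show that only the relative profile shapes enter.

```python
def slant_column(conc, box_amf, thickness_cm):
    """Sum over layers of concentration * BoxAMF * layer thickness."""
    return sum(c * b * z for c, b, z in zip(conc, box_amf, thickness_cm))


def alpha_factor(conc, box_amf, thickness_cm, j):
    """Fraction of the total absorption that is picked up in layer j."""
    total = slant_column(conc, box_amf, thickness_cm)
    return conc[j] * box_amf[j] * thickness_cm[j] / total


def scale_concentration(alpha_x, alpha_p, scd_x, scd_p, p_in_situ):
    """[X]_j = (alpha_X / alpha_P) * (SCD_X / SCD_P) * [P]_j."""
    return (alpha_x / alpha_p) * (scd_x / scd_p) * p_in_situ


# Hypothetical five-layer atmosphere; layer j = 2 is the flight level.
thickness = [2.0e5] * 5                # cm
box_amf = [1.5, 3.0, 40.0, 2.0, 1.2]   # shared by X and P (weak absorbers)
x_true = [1e8, 2e8, 5e8, 9e8, 6e8]     # target gas, molec/cm^3
p_true = [3e11, 5e11, 1e12, 3e12, 2e12]  # scaling gas, molec/cm^3
j = 2

# "Measured" slant columns follow from the true atmosphere.
scd_x = slant_column(x_true, box_amf, thickness)
scd_p = slant_column(p_true, box_amf, thickness)

# Model profiles with the right shape but wrong absolute values.
x_model = [3.0 * v for v in x_true]
p_model = [0.5 * v for v in p_true]
alpha_x = alpha_factor(x_model, box_amf, thickness, j)
alpha_p = alpha_factor(p_model, box_amf, thickness, j)

retrieved = scale_concentration(alpha_x, alpha_p, scd_x, scd_p, p_true[j])
print(f"retrieved [X]_j = {retrieved:.3e}, true [X]_j = {x_true[j]:.3e}")
```

In this idealised setting the retrieved concentration equals the true one even though the model profiles are wrong in absolute terms. Errors appear only when the modelled profile shapes, and hence the α-factors, deviate from reality.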

Errors of the scaling method
The errors and uncertainties of the scaling method fall into two categories: random (presumably Gaussian distributed) errors and systematic errors. The sources and magnitudes of both are discussed in the following.

Random errors of the scaling method
The random errors and sensitivities of the scaling method towards all input parameters are addressed by inspecting the Gaussian error propagation of Eq. (12). The uncertainty Δ[X]_j is calculated from [...]
In the following we discuss the different contributions to Δ[X]_j in Eq. (13). The magnitudes of the contributions are summarised in Table 4. When using O4 as scaling gas, the altitude- and temperature-dependent O4 concentration (in terms of molec²/cm⁶) can easily be calculated with an uncertainty of ≤ ±1% (Greenblatt et al., 1990; Pfeilsticker et al., 2001; Thalman and Volkamer, [...]
[...] and references therein) are assumed for the simulations, because such a scenario may represent the most severe disturbance of the radiative field in the UV/vis spectral range. The configuration of the cloud field is described in the supplement (Figure 3).
For the cloudy sky, α_R is narrowly distributed within a range of typically Δα_R ≤ ±5% around the clear-sky case, with some outliers within an interval of Δα_R ≤ ±15% (Figure 4 in the supplement). A notable finding is that α_R follows the assumed concentration ratio of the target gas and scaling gas, albeit with a somewhat damped amplitude, i.e. within an interval of 0.6 ≤ α_R ≤ 1.8, whereas the concentration ratio ranges between 0.2 and 1.7. In conclusion, the scaling method largely removes the uncertainties in the concentration retrieval due to the complexity of the radiative transfer in the UV/vis spectral range for a cloudy atmosphere, but the modelled α_R largely depends on the relative profile shapes of the target gas and scaling gas.
Overall this finding is in agreement with the recent findings of .
(b) Uncertainties in α_R due to small-scale variability not covered by the CTM are addressed by a comparison of CLaMS [...] where the vertical profile is sampled. In order to test how this uncertainty propagates into Δα_R, all simulated trace gas profiles are artificially shifted by 500 m upwards and downwards, and the largest and smallest α_R are then used as uncertainty bounds for each measurement geometry.
During most flight sections, Δα_R is dominated in equal parts by the uncertainty due to Mie scattering and sub-grid variability. However, if the vertical gradient of the involved trace gases is strong around flight altitude (e.g. at 08:00–09:00 UTC in Figure 3), the vertical sampling uncertainty is the dominating effect (Figure 6 in the supplement). The resulting uncertainties are typically Δα_R ≈ 10% to 20% for O3 and NO2, and in rare cases of large vertical gradients up to Δα_R ≈ 50%.
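How the individual contributions combine can be sketched with a generic Gaussian error propagation. The example below assumes that the inferred concentration is the product α_R · (SCD_X / SCD_P) · [P]_j and that all inputs are independent, so that relative uncertainties add in quadrature. The exact expression used in the study is not reproduced here, and the input values are illustrative only: 10% for α_R (lower end of the range given above), about 2.3% for SCD_X (the 0.05 / 2.17 NO2 fit error quoted earlier), an assumed 2% for SCD_P, and 1% for [P]_j.

```python
import math


def relative_uncertainty(rel_alpha_r, rel_scd_x, rel_scd_p, rel_p_in_situ):
    """Relative uncertainty of a product of independent factors (quadrature)."""
    return math.sqrt(
        rel_alpha_r**2 + rel_scd_x**2 + rel_scd_p**2 + rel_p_in_situ**2
    )


# Illustrative inputs; only the first and second are motivated by the text.
total_rel = relative_uncertainty(
    rel_alpha_r=0.10,
    rel_scd_x=0.05 / 2.17,
    rel_scd_p=0.02,      # assumed value
    rel_p_in_situ=0.01,  # assumed value
)
print(f"relative uncertainty of [X]_j: {100.0 * total_rel:.1f} %")
```

With these inputs the α_R term dominates the total, which is in line with the statement that the profile shape information is the main source of random error.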

Potential systematic errors of the scaling method
In our study a priori information on the profile shapes is either taken from CTM/CCM modelling, or in the case of O4 from calculations. It is thus necessary to consider how uncertainties in the predicted profile shapes propagate into the inferred [...] Figure 5. Evidently, including the cloud cover reduces α_R in O4 scaling but does not significantly change α_R in O3 scaling. Most striking is the influence of (broken) clouds on the O4 scaling, as evidenced by the large reduction in the calculated α_R for measurements prior to 8:00 UTC and after 13:00 UTC. Some proxy information on the cloud cover below the aircraft can be inferred from the colour index calculated from backscattered radiances at 600 nm / 430 nm received by the nadir VIS3 channel (panel c in Figure 5). Unlike for the time period between 9:00 and 12:30 UTC, when a more or less uniform cloud layer prevailed below the aircraft, the broken cloud cover past 13:00 UTC caused inferred [NO2]_O4 to become [...]
In conclusion, the profile shape dependence of the scaling method mandates a careful choice of the scaling gas: O3 appears more appropriate as a scaling gas for the detection of gases of low tropospheric and large stratospheric abundance when probed from an aircraft flying in the middle and upper troposphere and lowermost stratosphere (e.g., NO2 and BrO), while O4 appears to be more suited for gases of large concentrations in the lower troposphere (e.g., CH2O, C2H2O2, IO, and in polluted environments HONO and NO2) when probed from low-flying airborne vehicles.
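The colour index used above as a cloud proxy is a ratio of radiances at two wavelengths. A minimal sketch of such a calculation follows; averaging over a ±2 nm band around each wavelength is an assumption made here to reduce noise, not a setting taken from the text, and the spectrum is synthetic.

```python
def band_mean(wavelength_nm, radiance, centre_nm, half_width_nm):
    """Mean radiance of all samples within centre_nm +/- half_width_nm."""
    values = [
        r for w, r in zip(wavelength_nm, radiance)
        if abs(w - centre_nm) <= half_width_nm
    ]
    if not values:
        raise ValueError(f"no samples near {centre_nm} nm")
    return sum(values) / len(values)


def colour_index(wavelength_nm, radiance, numerator_nm=600.0,
                 denominator_nm=430.0, half_width_nm=2.0):
    """Ratio of band-averaged radiances, here 600 nm over 430 nm."""
    return (
        band_mean(wavelength_nm, radiance, numerator_nm, half_width_nm)
        / band_mean(wavelength_nm, radiance, denominator_nm, half_width_nm)
    )


# Synthetic spectrum whose radiance falls off as 1/wavelength (made-up shape).
wl = [float(w) for w in range(420, 611)]
rad = [1000.0 / w for w in wl]
ci = colour_index(wl, rad)
print(f"colour index (600 nm / 430 nm): {ci:.3f}")
```

A spectrally flat radiance gives a colour index of 1, while a spectrum that is brighter in the blue, as in the synthetic example, gives a value below 1.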

EMAC versus CLaMS profile predictions
Next, the sensitivity of inferred [NO2] to the predicted O3 and NO2 curtains is investigated. NO2 mixing ratios are retrieved using trace gas curtains predicted by CLaMS (Figure 3) and EMAC (Figure 7). The retrieved NO2 mixing ratios agree within the random errors during most flight sections (Figure 8, panel b). However, some differences between the models have an impact on the retrieval results, such as the higher spatial and temporal resolution of the CLaMS model. For example, a local maximum in [NO2] is predicted by CLaMS between 13:00 and 13:30 UTC but not by EMAC (Figures 3 and 7, respective panel f, and Figure 9, panel c [...] Figure 8, panel a, and Figure 9, panel e). Two reasons for these differences can be identified. First, there is a discrepancy in predicted tropospheric BrO concentrations between the models, which leads to a difference in calculated α_BrO at all altitudes. Below 9 km altitude, CLaMS predicts 3–5 ppt, while EMAC predicts concentrations close to zero (Figure 9, panel b, dashed and dotted lines). This discrepancy is probably due to missing tropospheric sinks in the CLaMS model (sect. 3.5). Hence, the EMAC-predicted [BrO] profile is expected to be more realistic. Secondly, while the extent of the polar vortex is predicted roughly in the same manner, the treatment of subsidence and methane degradation differs between the models. This can be observed by comparing measured and predicted methane mixing ratios in flight sections B and D (Figure 8, panel c, and Figure 9, panel g). For both flight sections, measurements indicate air mass ages of up to 4.5 years in combination with strong dehydration (Rolf et al., 2015) and denitrification (Jurkat et al., 2017). However, the subsidence of O3 appears to be overestimated in the CLaMS model, since the vertical profile of measured O3 concentrations is more accurately represented by EMAC (Figure 9, panel a).

In conclusion, differences in the relative profile shapes predicted by the employed models, and in their spatial and temporal resolution, influence the retrieval results of the scaling method. These differences are particularly large if fundamental properties of the atmosphere, e.g. the presence of BrO in the troposphere or the subsidence in the polar vortex, are treated differently by the models. In most cases, the inferred mixing ratios agree regardless of which model predictions (CLaMS or EMAC) are used.
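The model dependence summarised above can be illustrated with a minimal sketch. It assumes that the scaling equation takes the form [X] = (α X / α P ) · (SCD X / SCD P ) · [P], where α is the model-derived fraction of the slant absorption attributed to the flight-altitude layer, SCD is the measured slant column density, and [P] is the in situ mixing ratio of the scaling gas. The function names, all numerical values, and the agreement criterion are hypothetical and serve only to show how different model-predicted α values propagate into the inferred mixing ratio; they are not data from the flight.

```python
def scaled_mixing_ratio(alpha_x, alpha_p, scd_x, scd_p, p_in_situ):
    """Infer the target-gas mixing ratio [X] from the scaling gas P.

    Assumed form: [X] = (alpha_x / alpha_p) * (scd_x / scd_p) * [P].
    The result has the same unit as p_in_situ.
    """
    if alpha_p <= 0 or scd_p <= 0:
        raise ValueError("alpha_p and scd_p must be positive")
    return (alpha_x / alpha_p) * (scd_x / scd_p) * p_in_situ


def agree_within_error(x_a, x_b, sigma_a, sigma_b):
    """True if two retrievals differ by no more than their combined random error."""
    return abs(x_a - x_b) <= (sigma_a ** 2 + sigma_b ** 2) ** 0.5


# Hypothetical measurement inputs (illustrative only).
scd_no2 = 4.0e16        # slant column density of the target gas, molec cm^-2
scd_o3 = 2.0e19         # slant column density of the scaling gas, molec cm^-2
o3_in_situ_ppb = 300.0  # in situ mixing ratio of the scaling gas, ppb

# Hypothetical alpha factors derived from two different model curtains.
no2_model_a = scaled_mixing_ratio(0.30, 0.25, scd_no2, scd_o3, o3_in_situ_ppb)
no2_model_b = scaled_mixing_ratio(0.33, 0.25, scd_no2, scd_o3, o3_in_situ_ppb)

# Hypothetical random error of 0.06 ppb for each retrieval.
consistent = agree_within_error(no2_model_a, no2_model_b, 0.06, 0.06)
print(f"model A: {no2_model_a:.3f} ppb, model B: {no2_model_b:.3f} ppb, "
      f"agree within error: {consistent}")
```

In this sketch only the ratio of the α factors differs between the two retrievals, so a 10 % change in that ratio translates directly into a 10 % change in the inferred mixing ratio.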
Sample results and discussion

Finally, we discuss the mini-DOAS observations from the flight on 13 September 2012 in the context of complementary measurements and model predictions (Figures 8 and 9). Besides the mini-DOAS measurements of O 3 , NO 2 , and BrO, complementary instrumentation provided information on the following gases: O 3 from the Fairo instrument, NO and total NO y from the AENEAS instrument, and CO and CH 4 from the TRIHOP instrument (section 3.4). These measurements are further compared with the predictions of CLaMS and EMAC, which support the interpretation with respect to the atmospheric dynamics and photochemistry. Most notable is the joint detection of NO, NO 2 , and total NO y (and of BrO) in a remote location, such as the Antarctic troposphere and lowermost stratosphere, since such measurements are rare or, to date, nonexistent.
Overall, mixing ratios of BrO and NO 2 are inferred for the whole flight with a time resolution of 30 s and a resulting spatial resolution of 6 km, although radiative transfer implies further averaging of 200 km along the line of sight (perpendicular to the flight direction) and of 10 km along the flight direction. A conservative estimate for the detection limits at low mixing ratios is 2 ppt and 10 ppt for BrO and NO 2 , respectively. Measurements of CH 4 , which is well mixed in the troposphere and degrades in the stratosphere, provide a measure of the stratospheric age of the air. Accordingly, the flight is subdivided into five flight sections A to E (Figure 8, panel c) in order to distinguish data recorded in the midlatitude lowermost stratosphere (flight sections A and E), polar winter vortex air (flight sections B and D) and the polar troposphere (flight section C). In September 2012 the tropospheric CH 4 mixing ratio at Cape Grim, Tasmania was 1778 ppb (http://www.csiro.au/greenhouse-gases/).
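The relation between the quoted time and spatial resolution is simple arithmetic: a 30 s integration time and a 6 km sampling distance imply an aircraft speed of about 200 m/s. The following minimal sketch reproduces this relation and flags retrieved values below the conservative detection limits quoted above. The function names and the sample mixing ratios are hypothetical; only the 30 s, 6 km, 2 ppt and 10 ppt figures are taken from the text.

```python
# Conservative detection limits quoted in the text, in ppt.
DETECTION_LIMIT_PPT = {"BrO": 2.0, "NO2": 10.0}


def implied_speed_m_s(resolution_km, integration_time_s):
    """Aircraft speed implied by a sampling distance and an integration time."""
    return resolution_km * 1000.0 / integration_time_s


def along_track_resolution_km(speed_m_s, integration_time_s):
    """Distance travelled during one integration interval."""
    return speed_m_s * integration_time_s / 1000.0


def above_detection_limit(gas, mixing_ratio_ppt):
    """True if a retrieved mixing ratio reaches the detection limit of the gas."""
    return mixing_ratio_ppt >= DETECTION_LIMIT_PPT[gas]


speed = implied_speed_m_s(6.0, 30.0)
resolution = along_track_resolution_km(speed, 30.0)
print(f"implied speed: {speed:.0f} m/s, along-track resolution: {resolution:.0f} km")

# Hypothetical retrieved values, for illustration only.
for gas, value_ppt in [("BrO", 1.5), ("BrO", 6.0), ("NO2", 25.0)]:
    status = "detected" if above_detection_limit(gas, value_ppt) else "below detection limit"
    print(f"{gas} at {value_ppt} ppt: {status}")
```

Note that this covers only the sampling distance along the flight track; the additional averaging introduced by radiative transfer (200 km along the line of sight, 10 km along the flight direction) is not modelled here.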

Inferred BrO mixing ratios are around 4 ppt / 7 ppt in flight section B and 6 ppt / 8 ppt in flight section D, based on retrievals using CLaMS / EMAC in the scaling method, respectively (panel a of Figure 8; differences between both retrievals are discussed above in section 4.2). These concentrations are at the higher end of comparable BrO measurements in the same altitude range (12-13 km) reported in the literature (Harder et al., 1998; Dorf et al., 2006; Hendrick et al., 2007; Werner et al., 2017), which could be caused by the subsidence of stratospheric air from higher altitudes discussed above. Panel e of Figure 9 shows the vertical BrO profile retrieved from the ascent of the dive at 65° S. The retrieved [BrO] O3,EMAC and [BrO] O3,CLaMS are both below the detection limit of 2 ppt in the altitude range below 9.5 km, even when using RT calculations based on CLaMS, which predicts 3 ppt BrO in the troposphere. Hence, below 9.5 km altitude BrO could not be detected above the detection limit. The amount and distribution of halogen oxides such as BrO (panel e) in the troposphere are a matter of current debate (Harder et al., 1998; Fitzenberger et al., 2000; Van Roozendael et al., 2002; Saiz-Lopez and von Glasow, 2012; Volkamer et al., 2015; Wang et al., 2015; Schmidt et al., 2016; Sherwen et al., 2016; Werner et al., 2017) and are of significant scientific interest due to their potential influence on tropospheric ozone chemistry (von Glasow et al., 2004) and thus radiative forcing (Sherwen et al., 2017). Reported tropospheric background profiles at polar latitudes include those by Fitzenberger [...]

[...] scaling is insensitive. This is consistent with the expectation that a scaling gas with a profile shape similar to that of the target gas is best suited for the method. The comparison of retrievals involving a CTM (CLaMS) and a CCM (EMAC) reveals that results agree within the random error, as long as the fundamental properties of the atmosphere are represented in a similar way (e.g. presence or absence of a trace gas in the troposphere). Further, the comparison indicates that CTM/CCM curtains with spatial resolutions close to those of the measurements are desirable.
The present study shows the applicability of the scaling method to HALO mini-DOAS measurements of NO 2 and BrO at altitudes between 3.5 and 15 km under all sky conditions. It can be argued that the scaling method replaces the major a priori used in the traditional optimal estimation (i.e. the aerosol and cloud profile) by a different a priori (i.e. the relative trace gas profile shape). The latter a priori has the advantages that (a) it is more homogeneous in space and time on the scales relevant for the air-borne DOAS measurements, and (b) it can be predicted more reliably by modern CTMs/CCMs as compared to the presence of aerosols and clouds. Thus, the scaling method provides a novel and reliable means for inferring trace gas concentrations from air-borne UV/vis limb measurements. The significantly decreased dependency on aerosol and cloud properties increases the ability to make use of already recorded data and decidedly widens the applicability of air-borne UV/vis limb spectroscopy as a means of investigating atmospheric photochemistry.