Formaldehyde total column densities over Mexico City: comparison between MAX-DOAS and solar absorption FTIR measurements

Formaldehyde (HCHO) total column densities over the Mexico City Metropolitan Area (MCMA) were retrieved using two independent measurement techniques: Multi Axis Differential Optical Absorption Spectroscopy (MAX-DOAS) and Fourier Transform Infrared (FTIR) Spectroscopy. For the MAX-DOAS measurements, the software QDOAS was used to calculate differential Slant Column Densities (dSCDs) from the measured spectra and subsequently the Mexican MAX-DOAS Fit retrieval code (MMF) to convert from dSCDs to Vertical Column Densities (VCDs). The direct-solar absorption spectra 5 measured with FTIR were analyzed using the PROFFIT retrieval code. Typically the MAX-DOAS instrument reports higher VCDs than those measured with FTIR, in part due to differences found in the ground-level sensitivities as revealed from the retrieval diagnostics from both instruments. Three MAX-DOAS datasets using measurements conducted towards the east, west or both sides of the measurement plane were evaluated with respect to the FTIR results. The retrieved MAX-DOAS HCHO VCDs where 5%, 9% and 28% larger than the FTIR which, supported with satellite data, could demonstrate a large horizontal 10 inhomogeneity in the HCHO abundances. A time-dependent comparison revealed that the vertical distribution of this pollutant, guided by the evolution of the mixing layer height, can play an important role in how the results are affected. Apart from the reported seasonal and diurnal variability of HCHO columns within the urban site, background data from measurements at a high-altitude station, located only 60 km away are presented.

produced HCHO may account for up to 80% while during the night and before sunrise the primary sources, such as vehicle emissions, dominate the HCHO concentrations at the surface. In a study conducted by Lei et al. (2009), the impact of primary HCHO on pollution in Mexico City was analyzed. The authors indicate that HCHO emitted by primary sources dominates the concentration of this carbonyl both in the morning and at night, and HCHO decreases by approximately 1/3 in the afternoon.
In this study, we use a time series of more than 6 years to perform an unprecedented comparison of the HCHO total vertical 5 column amount measured with two independent techniques. Retrieval diagnostics from both the MAX-DOAS and FTIR results are used to characterize the difference and to improve the agreement and correlation between coincident data pairs. The seasonal and diurnal variability of HCHO columns is reported from a measurement site in the Mexico City urban area, as well as from a remote site in a high-altitude station located only 60 km away. Together with space observations, these results do not only serve to understand the local conditions in which this pollutant is emitted, produced and transported within the Mexico City 10 Metropolitan Area (MCMA), but will also provide confidence in the validation activities of model results as well as of the current and future satellite missions.

Methodology
In this section we describe the two independent measurement techniques, based on FTIR spectroscopy and MAX-DOAS, used to retrieve HCHO vertical column densities over two measurement sites. One is in the south of the MCMA, at the Universidad At the Altzomoni remote site, a high-resolution FTIR (Bruker Optics, IFS120/5) is operated remotely with a microwave antenna that allows us to have communication with the station. This instrument allows a maximal optical-path difference of 257 cm, recording spectra typically at 0.005 cm −1 resolution, and its solar tracker uses an astronomical telescope-mount that is controlled by the Camtracker software (Gisi et al., 2011;Gisi, 2012). Further details about the instrumental setup are provided elsewhere (Baylon et al., 2017;Plaza-Medina et al., 2017;Taquet et al., 2019). 5 Vertical Column Densities (VCD) are retrieved from FTIR solar absorption spectra in 4 spectral microwindows in the region between 2763-2782 cm −1 , using the spectroscopic line-data compilation AMT16 (Toon et al., 2016), available at http://mark4sun.jpl.nasa.gov/toon/linelist/linelist.html (last access: 20 January 2020), a Tikhonov L1 constraint and an a priori VMR profile taken as a 41-year mean  from the Whole Atmosphere Community Climate Model (WACCAM, version 4) model run over the particular measurement sites. Although this a priori profile would correspond to the background 10 HCHO concentration and not to the polluted conditions in Mexico City, this will not affect the resulting profile as a Thikonov-L1 constraint is used and just the relative shape of the vertical a priori profile is relevant for the constraint. The retrieval strategy and error analysis follows the recipe described by Vigouroux et al. (2018). The data are processed using the retrieval code PROFFIT9 (Hase et al., 2004) and the random errors at UNAM and Altzomoni are estimated to be around 5% (1.1x10 15 molec/cm 2 ) and 10% (0.2x10 15 molec/cm 2 ) of the corresponding mean columns (22.1x10 15 molec/cm 2 , 2.18x10 15 molec/cm 2 ) 15 (Vigouroux et al., 2018).

The MAX-DOAS measurements
A MAX-DOAS instrument, designed and built by the Spectroscopy and Remote Sensing Group at CCA-UNAM, was used to conduct sky measurements in the UV-Vis region of the electromagnetic spectrum. The MAX-DOAS instrument, which is collecting data since 2013, is installed on the roof of CCA-UNAM (same location as the FTIR-Vertex instrument) and forms 20 part of a small network (Arellano et al., 2016) of MAX-DOAS instruments that covers part of the MCMA.
The MAX-DOAS has a theoretical field of view of 0.31 • (Arellano et al., 2016) and performs measurements sequences with a telescope's azimuth angle of 85 • with respect to the north. Each elevation scan of the MAX-DOAS measurements starts with a zenith measurement. This is followed by a number of measurements towards different elevation angles, starting at low angles, all towards the same azimuthal direction. Approaching the zenith direction, the scan is continued in reverse order for 25 the elevation angles towards the opposite azimuthal direction, from large elevation angles towards small elevation angles. The measurement sequence, as specified in Friedrich et al. (2019) is: 90 • zenith, 0, 2, 6, 13, 23, 36, 50, 65, 82 • W, 82, 65, 50, 36, 23, 13, 6, 2, 0 • E. At the end of each scan, a dark spectrum is taken (closed shutter). The dark spectrum is subtracted from all other spectra measured during the sequence (including the zenith spectrum). A detailed instrument description and measurement strategy can be found in Arellano et al. (2016) and Friedrich et al. (2019). 30 Before conducting retrievals, spectra are filtered with the objective to remove all spectra either with too low light conditions (10% or less of the maximum possible intensity level) or saturated spectra in the retrieval region. Differential Slant Column Densities (dSCDs) are retrieved from the collected spectra at different elevation angles, following a DOAS approach using the QDOAS software developed at the Belgian Institute for Space Aeronomie (Danckaert et al., 2013). A wavelength calibration 4 https://doi.org/10.5194/amt-2020-208 Preprint. Discussion started: 1 September 2020 c Author(s) 2020. CC BY 4.0 License. was conducted in QDOAS by applying a nonlinear least-squares fit to a solar atlas (Chance and Kurucz, 2010). HCHO was retrieved in the 324.6 to 359 nm wavelength range, specific details are provided in Table 1. Cross-sections were convolved with the instrumental slit function and a wavelength calibration file created using a mercury lamp. O 4 was retrieved following settings described in Friedrich et al. (2019).  (2013) Ring spectrum Calculated with QDOAS according to Chance and Spurr (1997) and normalized as in Wagner et al. (2009) HCHO VCDs were retrieved using the MMF code (Friedrich et al., 2019). MMF uses HCHO dSCDs and converts them 5 into VCDs in a two steps process for each scan: first, the O 4 slant column density information is used to retrieve an aerosol profile. In the second step, the retrieved aerosol profile information is used together with the HCHO dSCDs to retrieve the trace gas profile. Both parts follow a procedure that consists of a forward model and an inversion algorithm. A constrained least squares fit is used in both steps, but the aerosol retrieval uses Tikhonov regularization and the trace gas retrieval uses optimal estimation. The forward model uses the radiative transfer code VLIDORT v2.7 (Spurr et al., 2001;Spurr, 2006Spurr, , 2013. 10 The inputs to VLIDORT are calculated using temperature and pressure information from daily radiosonde measurements and aerosol single scattering optical depths and asymmetry factors from the AERONET (Aerosol Robotic Network) data base for Mexico City.
To run MMF retrievals, the absorption cross section was taken at a wavelength in the middle of the wavelength interval used for the QDOAS retrieval: for O 4 retrieval it was 360 nm and for HCHO it was 338 nm. For aerosol, the a priori profile 15 shape was taken from 1 year of ceilometer data, averaged each hour. The a priori aerosol data for total optical depth were time-interpolated from the co-located AERONET station in Mexico City. For HCHO retrieval, a single a priori profile and covariance matrix taken from the chemical transport model WRF-Chem was used. The surface albedo used in this study was set to 0.07.

HCHO horizontal distribution from OMI satellite observations
In order to assess the horizontal inhomogeneity of HCHO in the MCMA, an average distribution map of HCHO was constructed from data between 2005 and 2018 from the Ozone Monitoring Instrument (OMI) satellite instrument and is presented in Figure   5 1. The OMHCHO Version-3 data product (Chance, 2007) was downloaded from the EARTHDATA web portal. The map was generated using the ReferenceSectorCorrectedVerticalColumn field from the SAO OMI product (González Abad et al., 2015). Only data with cloud fraction of 20% or less and the field MainDataQualityFlag set to 0 was used. From this average HCHO distribution map, a strong horizontal inhomogeneity over the MCMA is evident and motivates the investigation of the differences in computing VCDs using MAX-DOAS measurements conducted towards the eastern (V1) or western (V2) sides 10 of the scanning plane, or using all available measurements (V3), as will be described below.

Diurnal and seasonal variability of HCHO in Mexico City
A large data set of measurements taken at the UNAM site between January 2013 and May 2019 allowed us to study the diurnal and seasonal variability of HCHO. Figure 2 shows the time series of HCHO VCDs hourly means from the MAX-DOAS (V3) and from FTIR instruments located at this urban site. Additionaly, HCHO VCDs from a high resolution FTIR instrument 15 were measured from the remote site at Altzomoni providing relevant information about the variability of the background concentrations (see section 3.4). One can see that both instruments located at UNAM report values in the same order of magnitude, however, higher values in MAX-DOAS measurements than the FTIR instrument are apparent. This is in part attributed to the larger ground level sensitivity of the MAX-DOAS instrument with respect to the FTIR as will be shown later. are one order of magnitude higher than the background measurements at the mountain site Altzomoni (green). The same applies to their variance. The difference between the MAX-DOAS and FTIR measurements at UNAM is mainly explained by their averaging kernel and is discussed in detail in the text.
Compared to the instrument at UNAM, the FTIR instrument located at Altzomoni, presents lower HCHO VCDs hourly means values. This is to be expected since that instrument is located at a higher altitude level (>1700 m higher than UNAM) and therefore above the mixed-layer most of the time and thus probes cleaner atmospheric columns. 8 https://doi.org/10.5194/amt-2020-208 Preprint. Discussion started: 1 September 2020 c Author(s) 2020. CC BY 4.0 License.

Differences in the MAX-DOAS viewing directions
A detailed comparison between VCDs retrieved using the MAX-DOAS and FTIR measurement techniques was conducted and is explained in this section. The correlation between the coincident hourly mean vertical columns from FTIR and MAX-5 DOAS measured at UNAM are shown in Figure 5. The plots shown in the left column contain the retrieved VCDs without any correction. Three different data sets are presented in the different rows from top to bottom corresponding to three product versions: V1 retrieved VCDs from MAX-DOAS measurements conducted towards the east, V2 retrieved VCDs from MAX-DOAS measurements conducted towards the west and V3 retrieved VCDs from MAX-DOAS measurements conducted towards both sides of the scanning plane. The correlation coefficient is provided, along with the linear regression information when 10 forced to zero (red) and not constrained (green). Black lines represent the 1:1 relation. The resulting total column averaging kernel from the retrievals of the FTIR and MAX-DOAS data sets are presented in the third column of Figure 5, which already explains partly the relation between the vertical columns. The total column averaging kernel is the sum of the rows of the averaging kernel in partial columns [molecules/cm 2 / molecules/cm 2 ]. The averaging kernel of the FTIR is lower than 1.0 within the mixed layer (2280-4000 m a.s.l.), containing the largest amount of HCHO, and thus the VCD is underestimated. It can be 15 seen that the V1 data set has a better agreement (slope closer to 1) than V2 and V3. It is interesting to note that particularly the VCDs retrieved from MAX-DOAS using both measurement directions (lowest row) have an enhanced sensitivity in comparison to the FTIR retrieval, which partly explains the slope 1.28 and the 28% overestimation with respect to the FTIR retrievals.
For the correlation plots presented in the middle column of Figure 5, the retrieved MAX-DOAS profiles were smoothed with the FTIR averaging kernels resulting typically in lower MAX-DOAS VCD values. This smoothing process simulates how 20 the HCHO profile retrieved by the MAX-DOAS should be seen by the FTIR instrument. The smoothed profiles are calculated by multiplying the averaging kernel of the FTIR retrieval with the retrieved MAX-DOAS profile (Rodgers, 2000). For the V3 data set, the smoothing by the FTIR kernel improves the slope from 1.28 to 1.11, much more might not be expected as the vertical information in the MAX-DOAS profiles is limited to less than two degrees of freedom and do not represent the true atmospheric profiles. For a fair comparison, the effect of the different a prioris (Figure 6) in the retrieval is taken into account 25 and the new a priori for both retrievals is the average of the MAX-DOAS profile retrieval of V3.
To further investigate the large HCHO inhomogeneity already shown in Figure 1, VCDs from the MAX-DOAS products using different viewing directions (V1, V2 and V3) were analyzed independently. Figure 7 shows correlation plots between the coincident HCHO VCDs when using dSCDs measured towards the east (a) or the west (b) compared to HCHO VCDs computed using both sides of the scanning plane. Black lines represent the 1:1 relation and a linear regression not constrained 30 to zero is shown in red. The correlation between data sets for the east (V1) and both sides (V3) result in a Pearson's correlation coefficient of 0.887 with a slope of 0.630 and an offset of 6.781x10 15 molec/cm 2 . When comparing west V2 versus V3, the calculated Pearson's correlation coefficient is 0.908 with a slope of 0.722 and an offset of 5.616x10 15 molec/cm 2 . VCDs retrieved using measurements from both sides of the scanning plane are in general larger than VCDs retrieved using data from measurements of only one of the sides. This result can be explained by the larger amount of information available for the 9 https://doi.org/10.5194/amt-2020-208 Preprint. Discussion started: 1 September 2020 c Author(s) 2020. CC BY 4.0 License.  DOAS-apriori FTIR-apriori V3 avg DOAS-Constraint diag(Sa) Figure 6. A priori information for the MAX-DOAS and FTIR retrievals. The blue trace is the a priori used in the MAX-DOAS retrieval, the black dashed trace shows the square root of the Sa-matrix in the MAX-DOAS retrieval regularized according to optimal estimation (OE).
As commonly used for FTIR retrievals that form part of the NDACC network, an a priori taken from the WACCAM model was used. The green trace is the average of the MAX-DOAS retrieved HCHO profiles (V3, both sides) and gives an idea about the vertical distribution of HCHO above UNAM. This averaged profile is used as common a priori for the improved intercomparison between MAX-DOAS and FTIR. retrievals when dSCDs in different elevation angles and both scanning directions are used (Figure 8). However, a conclusive explanation from this analysis cannot be derived without investigating the time-dependent differences observed using different viewing directions.

5
The hourly differences between VCDs computed using the eastern or western sides of the scanning plane is investigated further, therefore simulated VCDs were calculated in order to compare them with measured VCDs. Simulated VCDs east-west differences are the result of the different amount of information in the retrievals in V1 and V2. The true profile has much higher HCHO concentrations in the polluted mixing layer than what the a priori reflects. The retrieval using both sides of the measurement plane contains more information originating from the measurements and allows the result on an optimal-10 estimation based retrieval to be less close to the a priori. The V1 and V2 retrievals do not always have the same amount of information, as the filtering criteria for the spectra do not act similarly throughout the day and spectral measurements are selected in an unbalanced way. Among the factors affecting the uneven amount of information used in the retrievals include permanent or temporal obstacles and the time-dependent probability of saturation of the spectra when viewing towards or close to the sun. This means that even if the atmosphere around the measurement site would be perfectly homogeneous in the horizontal plane, the columns retrieved using V1 and V2 data sets might be slightly different. We try to explain this by the different sensitivities and their averaging kernels (AK, AK tot east , AK tot west and their difference ∆AK tot east−west ), through the following equations.
x ret − x apr = AK(x true − x apr ) + (1) 5 Equation 1 (Rodgers, 2000) describes how the retrievd profile is related to the true profile and how other quantities as the total column can be derived. It is described by the following equation, where g tot defines the total column operator for profiles Here we choose x in the units of partial columns [molec/cm 2 ], so that AK tot is without units and shown in the last column of Figure 5.
If the errors 1 and 2 of each harmonised retrieval pairs have a random and sistematic component, the sistematic component is canceled out for each pair, while the random components result in 1,2 which should be zero in the average of a sufficient large number of measurement pairs.
If we assume that the retrieved profile x v3 using both sides of the measurement plane is to our current knowledge the best estimation for the HCHO profile x true , we use it to simulate the expected differences ∆col east−west . 10 This expected or simulated difference, which evidently depends on the time of the day, is calculated according to equation 6 and the results are shown in Figure 9 (red curve), along with the measured VCDs hourly differences. In this case, the specific distinction is made between the calculated east-west differences of the retrieved VCDs (blue), a simulated VCD difference explained above (red) and the resulting difference between these (calculated-simulated) in black dots.
The averaging kernels from the V1 and V2 retrievals, allows us to estimate and forecast a difference, because of their 15 different sensitivities. This effect, dominant after 15 h LT, most likely depends on the number of dSCDs available for the MMF retrievals that could be significantly reduced as the sun is closer to the viewing angles and do not pass the filtering criteria due to saturation. Alternatively, the forward model in QDOAS could be having more difficulties to explain the measured spectra, so that the errors in the retrieved dSCDs could be enhanced and thus unweighted in the MMF model calculations during these afternoon hours due to saturation or the presence of clouds towards the west. The detailed reasons for imbalanced information 20 content between the east and west retrievals are, however, not yet investigated. Nevertheless, the calculation of the red trace uses the HCHO profile retrieved in the V3, as it is probably the best estimation available, but it is of course not the true profile and the meaningfulness of equation 6 is therefore limited to be qualitative. Without grouping in different hours of the day, both differences originated from the gradient and from the information content cancel partly out and would not show a conclusive result.

25
As can be seen in Figure 9, the simulated curve resembles the calculated VCD differences (blue curve) and both curves present higher positive differences in the morning hours and a decline towards negative values in the afternoon and hence indicate that a large part of the observed difference is due to differences in information content and not necessarily due to 13 https://doi.org/10.5194/amt-2020-208 Preprint. Discussion started: 1 September 2020 c Author(s) 2020. CC BY 4.0 License. Figure 9. Average differences between the retrieved columns using the east and west measurement sides, as function of the hour of the day, are presented in the blue trace. The red trace simulates how the information available from the retrievals would produce a difference in the columns, assuming that the retrieval of the V3 data set using both measurement sides best describes the atmospheric state (see text for details). The black points is the difference between the found differences and the "forecasted" differences, and should show the part of the difference which might be related to a real inhomogeneity and a gradient between the east and the west HCHO concentrations in the mixing layer. The error bars represent the standard error and show that the measurement amount is large enough to calculate a statistically significant mean difference for each day hour, and the grouping for different hours is necessary. different real distributions. However, the difference between the calculated and simulated curves, shown as black dots, gives us a better indication of the relative HCHO abundances along the east-west viewing direction. During the morning hours, this calculated difference has positive values demostrating that the abundance of HCHO on the eastern side of the scanning plane is higher than in the western side. After 12 h LT, conditions change so that larger HCHO VCDs are measured towards the western 5 side of the scanning plane, peaking at 13-14 h. Afterwards, the situation appears to slightly return to the circumstances where HCHO is more evenly distributed in both directions and might be even a bit higher towards the east at around 16 h.

Time-dependent MAX-DOAS vs FTIR comparison
In the previous section, it was shown that analyzing the MAX-DOAS viewing directions independently can in part explain the large horizontal inhomogenity around the UNAM urban site. We now investigate the behaviour in the correlation between  h LT and 0.49 at 13 h, while slopes vary between 1.62 (12 h) and 0.93 (15-16 h). It is interesting to point out that the slope steadily increases between 9 h (1.21) and noon (1.62), the latter being the precise inflection point where the MAX-DOAS 5 instrument reports a significant change in the HCHO VCDs horizontal distribution (Figure 9). At 13 h (1.45) the slope starts to decrease until it stabilizes at 15-16 h (0.93). On the other hand, the correlation coefficient steadily decreases between 9 hours (0.71) and 13 h (0.49), increasing afterwards, reaching a final value of 0.6 at 16 h.
The relation between the MAX-DOAS and the FTIR VCDs is described by the scatter (the Pearson correlation coefficient), the slope and constant bias. As we already have seen in the comparison between the V3 MAX-DOAS data product (both sides) 10 with respect to the single sides, having a slope of 1.0 does not ensure that both retrievals are correct and similar to the true atmospheric state, but it rather means that both sensitivities are similar.
Based on Equation 1 (Rodgers, 2000) and assuming that the variability in the HCHO concentration profile can be described by a Gaussian probability distribution with the mean profile (x mean ) and a covariance matrix Sa (unfortunately not known), the Pearson's correlation and the slope in the scatter plot of two retrievals can be calculated using the averaging kernels and the 15 errors of both retrievals.
Neither the retrieved FTIR profile nor the MAX-DOAS profile retrieval have sufficent degrees of freedom, therefore the strategy of using the profile information from one instrument together with the averaging kernel of the other instrument is not too promising.
Here we try to evaluate the consistency of the two retrievals differently. Starting with Equation 2 and taking into acount that 20 both instruments are measuring coincidently the same atmospheric state x true .
The average of the product of the columns of both instruments is given by following equation: To simplify the interpretation we assume that the Averaging Kernels of both instruments are more or less constant and independent in i. This assumption is valid for the FTIR-Averaging Kernel ( Figure 10, Equation 7), but it is not the case for the 25 MAX-DOAS measurements and it is therefore an approximation, which might be verified in each case. Another simplification is the assumption that, the retrievals are corrected using the same apriori x apr profile, which is also the mean of the true profiles x true of the ensembles. So the average of ∆col DOAS and ∆col F T IR are zero.

5
If the correlation plot is limited to just this hour as shown in Figure 10, the Sa just describes the variability in that hour. For a certain time of the day it is more probable that the variability and covariance matrix Sa is described by a single dominant If the errors σ DOAS and σ F T IR in the columns of MAX-DOAS and FTIR can be neglected, the resulting Pearson's correlation coefficient would be 1.0 according to equation 9 and the slope would be given by (AK tot DOAS · v)/(AK tot F T IR · v), the quotient of the weighted averaged total column averaging kernel elements using the weights v.

10
In Mexico City, we could assume that at 9 h LT the mixing layer is well mixed with HCHO up to a certain height with a constant concentration but with 0 or at least a constant HCHO value above this height.
The variability of the concentration profile with a fixed shape (v = λ v e v , with λ v = |v|) from one day to another, has a strong impact on the Pearson's correlation coefficient R (more variation with respect to the errors (FTIR and MAX-DOAS) results in a better R-value), but not on the value of the slope. The slope is given by the averaging kernels of the two instruments 15 and the shape of the variable profile v, and for the simple assumption described above, that the only Eigenvector is constant in the mixing layer but 0 above it, the slope is the fraction of the mean averaging kernel elements in the mixing layer (MAX-DOAS/FTIR). In case we cannot explain the experimental measured slopes, we learn that there are some other processes involved, which are not described by the simplified Sa. Maybe there might be sometimes also a pollution plume above this well mixed layer. 20 The individual plots in Figure 10 showing the correlations and their slopes for each hour allow us to demonstrate that the assumption that the mixing layer height dominates the variability and that such simplification is valid on a certain hour, instead of simply cross validating the retrievals. The validation is therefore given due to the fact that a plausible variability for each hour explains the slope and correlation for different hours, rather than that the slope and the correlation is close to 1.0.

Background HCHO variability in a remote site 25
In order to investigate background HCHO levels, HCHO VCDs retrieved from measurements conducted with the high resolution FTIR spectrometer (Bruker HR120/5) in Altzomoni are presented in Figure 11.

Discussion & Conclusions
In this contribution we present a comparison between HCHO total column densities retrieved from two independent measurement techniques: MAX-DOAS and solar absorption FTIR. Both measurement techniques, although based on spectroscopy, exhibit a very different measurement strategy and geometry. Albeit these differences, a good agreement was obtained between 5 both instruments. Due to the versatility of the retrieval code used to process the MAX-DOAS data, VCDs were retrieved using measurements conducted towards different viewing directions. Retrieval products were obtained employing measurements conducted exclusively towards the east, the west, or using measurements conducted towards both sides of the measurement station. Considering the FTIR results as the reference, MAX-DOAS VCDs from these data sets where 5%, 9% and 28% larger than FTIR, respectively.

10
Reasons for the overestimation of the MAX-DOAS over the FTIR results are attributed to an enhanced ground level (lowest few kilometers of the atmosphere) sensitivity of the former with respect to the latter. However, the intrinsic differences between the two measurement techniques could also account for the discrepancies found in this study. In the first place, both measurement techniques have different sampling geometries and strategies. The MAX-DOAS instrument measures spectra at different elevation angles, leading to an altitude-averaged measurement in the lower atmosphere. From these measurements, 15 HCHO VCDs are computed using a numerical code -MMF- (Friedrich et al., 2019). For the MAX-DOAS instrument, typically, a complete cycle encompassing several observations may take several minutes, which could be a disadvantage during periods of rapidly varying radiation transport conditions in the atmosphere, such as varying cloud cover, aerosol load, or during sunrise and sunset. The distance from which photons are scattered and reach the instrument's telescope is variable, and depending on atmospheric conditions it has been calculated to be between 0.6 and 6 km (Platt and Stutz, 2008). Meanwhile, the FTIR in-20 strument receives direct sunlight, so that the air mass sampled by the instrument comes from a well defined cone angle smaller than the solar disc (external field of view around 7 mrad at UNAM and 2 mrad at Altzomoni) and the time resolution of the FTIR-spectra are described by the measurement duration and frequency. The measurement duration and time distance are 1 min duration and every 5 minutes at UNAM and 7 min duration and every 21 minutes at Altzomoni. After computations, direct sun total HCHO column is delivered. Further characterization of the differences between both measurement techniques, for example by using the same sampling geometry, is work in progress for the Spectroscopy and Remote Sensing Group at CCA-UNAM and in the future will provide further insights about the differences and possible biases between the two techniques due to distinctive spectroscopic characteristics and retrieval approaches.

5
Moreover, this research provided the opportunity to study in more detail horizontal HCHO inhomogeneities in the MCMA, identifying diurnal and seasonal variabilities of the abundance of HCHO total columns which in the future could be used to further study primary vs. secondary HCHO in the MCMA and develop specific analysis strategies focused on the identification and disaggregation of freshly emitted and secondary produced HCHO in the boundary layer of the MCMA. Satellite-based data has been used to corroborate the spatial inhomogeneity of the HCHO total column over the MCMA as shown in Figure 1, 10 strengthening the importance to continue these type of inhomogeneity studies at different azimuthal angles, in different zones of the MCMA and also focusing on other atmospheric constituents.
Identifying and characterizing horizontal inhomogeneities with respect to the abundance of molecules present in air can also be of service when making decisions regarding location and azimuth measurement angles for future MAX-DOAS stations in the MCMA. Future work includes to study horizontal inhomogeneities at other stations of the MAX-DOAS network as well as 15 horizontal inhomogeneities of other chemical species, such as nitrogen dioxide, which is routinely retrieved as well from the spectra measured by the MAX-DOAS instruments located in the MCMA.
It is worth mentioning that these type of strong spatial heterogeneity scenarios have been observed in different areas of the planet and specific studies of atmospheric constituents have been or are currently being performed in other urbanized and densely populated areas such as North America (Boeke et al., 2011;Chance et al., 2000;Millet et al., 2008), China (Cheng et al.,20 2015; Zhang et al., 2019), particularly in the Beijing-Tianjin-Hebei region (Zhu et al., 2018) and the Yangtze River Delta area (Chan et al., 2019;Hong et al., 2018;Tian et al., 2018;Wang et al., 2016); South Asia (Rana et al., 2019) and more specifically India (Chutia et al., 2019;Surl et al., 2018) and Pakistan (Khan et al., 2018;Khokhar et al., 2015). In addition specific case studies have been conducted globally (Wittrock et al., 2006) or in the Southern Hemisphere (Ahn et al., 2019), the Atlantic Ocean (Behrens et al., 2019) and the East China Sea (Tan et al., 2018). Findings of these studies imply enhanced abundance of 25 HCHO over highly populated areas, areas with increased industrial activity, zones exhibiting biogenic emissions and biomass burning activities, along major highways and in some instances identifying cases of regional transport of pollutants.
Megacities are in constant evolution, exhibiting continuous changes in territorial extension, population size and spatial redistribution, as well as in the types of socio-economical activities performed every day. In many cases the spatial growth is uneven, resulting on areas of the city more prone to emissions or accumulation of pollutants due to chemical transformations or The quantified diurnal and seasonal variability of HCHO as well as the characterized horizontal inhomogeneity in the MCMA, presented in this study, could be attributed to direct emissions or secondary formation of HCHO from released precursors from anthropogenic and/or biogenic sources that form part of the MCMA and constantly influence its atmospheric 5 composition. Identification of either primary emissions or secondary formation of HCHO is outside the scope of this study, however, based on the results presented here and previous research conducted by Garcia et al. (2006), future analyses could include studying other molecules present in the atmosphere as tracers of primary or secondary HCHO.
In terms of further characterizing HCHO horizontal inhomogeneity, the Tropospheric Emissions: Monitoring of Pollution (TEMPO) instrument (Zoogman et al., 2017), an airborne mission to be launched in the near future, will allow us to corroborate 10 the hourly horizontal changes in the HCHO distribution over the HCHO that have been identified in this study. Author contributions. CR is responsible for the QDOAS retrieval setup and parameter choices, for the setup and running of the MMF code, for processing HCHO OMI data as well as for data analysis and interpretation. CR wrote parts of the Abstract and parts of Sect. 2, 3 and 4. CG is responsible for developing and optimizing various HCHO retrievals for Altzomoni and UNAM. Contributed to the developement of the NDACC retrieval strategy which was finally applied. CG has taken care of the measurements and focused in the separation of fresh emitted HCHO and secondary produced HCHO in the boundary layer of Mexico City. WS wrote parts of Sect. 2 and 3. MMF is responsible for the retrieval MMF code development and testing, the retrieval chain setup from spectra to profiles, and the retrieval parameter choices for 20 MMF and software support. MMF wrote parts of Sect. 2. DR is responsible for running parts of the MMF code. CAMR is responsible for processing HCHO OMI data and assisting on creating Fig. 1. JA and AB provided technical support for instruments and data management.
MG was involved in the data interpretation and wrote Sect. 1 and parts of section 2, 3 and 4. WS, AB and MG are responsible for running the FTIR retrievals at UNAM and Altzomoni and for data analysis and interpretation. TB and FH provided the HR 120/5 spectrometer located in Altzomoni and developed the setup of the spectrometer and solar tracker. Provided assistance to take the spectrometer in Altzomoni into