This work is distributed under the Creative Commons Attribution 4.0 License.
Maximizing the Use of Pandora Data for Scientific Applications
Abstract. As part of the Pandonia Global Network (PGN), Pandora spectrometers are widely deployed around the world. These ground-based, remote-sensing instruments are federated such that they employ a common algorithm and data protocol for reporting on trace gas column densities and lower atmospheric profiles using two modes based on direct-sun and sky-scan observations. To aid users in the analysis of Pandora observations, the PGN standard quality assurance procedure assigns flags to the data indicating high, medium, and low quality. This work assesses the suitability of these data quality flags for filtering data in the scientific analysis of nitrogen dioxide (NO2) and formaldehyde (HCHO), two critical precursors controlling tropospheric ozone production. Pandora data flagged as high quality are assured to be scientifically valid and are often more abundant for direct-sun NO2 columns. For direct-sun HCHO and sky-scan observations of both molecules, large amounts of data flagged as low quality also appear to be valid. Upon closer inspection of the data, independent uncertainty is shown to be a better indicator of data quality than the standard quality flags. After applying an independent uncertainty filter, Pandora data flagged as medium or low quality in both modes can be demonstrated to be scientifically useful. The utility of this filtering method is demonstrated by correlating contemporaneous but independent direct-sun and sky-scan observations. When evaluated across 15 Pandora sites in North America, this new filtering method increased the availability of scientifically useful data by as much as 90 % above that tagged as high quality. A method is also developed for combining the direct-sun and sky-scan observations into a single dataset by accounting for biases between the two observing modes and differences in measurement integration times. This combined dataset provides a more continuous record useful for interpreting Pandora observations against other independent variables, such as hourly observations of surface ozone. When Pandora HCHO columns are correlated with surface ozone measurements, data filtered by independent uncertainty exhibit relationships as strong as, and more robust than, those from high-quality data alone. These results suggest that Pandora data users should carefully assess data across all quality flags and consider their potential for useful application to scientific analysis. The present study provides a method for maximizing the use of Pandora data, with the expectation of more robust satellite validation and comparisons with ground-based observations in support of air quality studies.
Status: final response (author comments only)
RC1: 'Comment on amt-2024-114', Manuel Gebetsberger, 17 Aug 2024
# General Comments
## Overview
The presented manuscript proposes an alternative data flagging procedure to the standard PGN flagging for HCHO and NO2 column densities, retrieved from MAX-DOAS and direct-sun measurements, in order to increase the amount of data that can be used for scientific studies. As such, the topic of the manuscript is highly important for users of PGN data products. It can help data users and readers of this manuscript to better understand the standard flagging and, most importantly, to apply their own filter criteria with the presented approach, or even go beyond it. The authors use the linear correlation coefficient as a metric to validate their novel approach for both species, although the focus and interest are more on HCHO. The correlation of HCHO to surface O3, and airborne data for both HCHO and NO2, are presented as case studies.
The motivation to increase the sample for scientific analysis is certainly important, but the reason why data are flagged still needs to be taken into account. Unfortunately, the main part and the supplemental review of the quality flags are described rather vaguely, with both missing parts and incorrect statements about the current flagging. Therefore, the manuscript would benefit greatly from a more in-depth analysis of the standard quality flags to highlight the driving quality indicators that lead to the flagging, and from corrections to its description of the current flagging.
## Title
The title is misleading in terms of its applicability to PGN data products. The presented study focuses on HCHO and NO2, with a strong emphasis on HCHO. However, the Pandora data pool also covers O3, SO2, and H2O, which are not demonstrated in the manuscript. For both O3 and SO2 there are no MAX-DOAS data products available, which limits the presented combined approach to HCHO and NO2. H2O would be available from both the direct-sun and MAX-DOAS measurements but was not considered here. Therefore, the presented approach is not generic enough to be applied to all Pandora data products, which should be properly reflected in the title.
## Comment On Quality flags
The quality flags are propagated from L1 to L2Fit to L2 and end up in the different clusters for high, medium, and low data quality, each of which can be unassured, assured, or unusable. This means that if a single retrieval is identified as low data quality on the L1 side, it cannot be of better quality at higher levels. If the number of dark cycles is already too low, if saturated data occurred on the L0 side, or if the spectrometer temperature is too far from the temperature characterized in the laboratory, data will already be flagged as medium or low quality. The same applies at higher levels: if, for instance, the L1 retrieval is of high quality but the spectral fitting wrms threshold is exceeded due to a spectral signal which cannot be captured by the retrieval polynomials, data are flagged into the categories based on the threshold which is exceeded. A minimal sketch of this worst-flag propagation is given below.
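To make the propagation concrete, here is a minimal Python sketch of the worst-flag logic described above, assuming a retrieval can never be of better quality at a higher level than at a lower one. The level names and the integer flag encoding (0 = high, 1 = medium, 2 = low) are illustrative assumptions, not the actual PGN encoding.

```python
QUALITY = {0: "high", 1: "medium", 2: "low"}

def propagate_flags(l1_flag: int, l2fit_flag: int, l2_flag: int) -> str:
    """Final quality is the worst (maximum) flag assigned at any processing level."""
    return QUALITY[max(l1_flag, l2fit_flag, l2_flag)]

# Example: a high-quality L1 spectrum whose spectral fit exceeds the wrms
# threshold (medium at L2Fit) can only end up medium or worse overall.
print(propagate_flags(0, 1, 0))  # -> "medium"
```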
The thresholds for some of the quality indicators come from the Gaussian Mixture Regression model approach. This approach is applicable to individual datasets, such as P25s1 HoustonTX. However, the PGN flagging does not use instrument-specific flags; it uses a PGN average over multiple datasets. This could indeed lead to some datasets being flagged too strictly and some too leniently. Has this approach been tested, for instance using a HoustonTX-specific threshold of the wrms so as not to bypass the L1 filter? (See the sketch below.)
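As an illustration, a site-specific wrms threshold could be derived from a two-component Gaussian mixture fitted to that site's wrms distribution. This is only a hypothetical sketch in the spirit of the mixture-model approach mentioned above; the component count, the 3σ cut, and the synthetic data are assumptions, not the actual PGN procedure.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic wrms sample: a "good fit" population plus a "poor fit" tail.
wrms = np.concatenate([rng.normal(0.002, 0.0005, 900),
                       rng.normal(0.010, 0.0030, 100)])

gmm = GaussianMixture(n_components=2, random_state=0).fit(wrms.reshape(-1, 1))
good = int(np.argmin(gmm.means_))             # component with the smaller mean
mu = gmm.means_[good, 0]
sigma = float(np.sqrt(gmm.covariances_[good, 0, 0]))

# Site-specific cut: flag retrievals beyond 3 sigma of the good-fit component.
threshold = mu + 3 * sigma
print(f"site-specific wrms threshold: {threshold:.4f}")
```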
More generally, the wrms is THE quality indicator of the spectral fitting. By bypassing this quality indicator as a flagging criterion, potential slant column biases introduced by spectral signals are ignored. The same applies to the unusable categories (20, 21, 22), which should not be ignored.
On the other hand, there might be quality indicators which are too strict and can lead to the filtering of valid retrievals, which is the motivation of this manuscript. Those quality indicators would need to be identified. Perhaps one or two quality indicators are responsible for the majority of the filtering of 'valid' retrievals. It would be interesting to see whether simply removing them already leads to the same effect as the proposed approach.
# Specific Comments
Here I refer to the lines of the manuscript:
18 change "PGN standard quality assurance" to "PGN standard quality flagging", since the assurance part does not change the high, medium, or low quality categorization.
25 Have other uncertainty components been analyzed?
26 The statement "independent uncertainty filter" is confusing. Does it refer to the reported "independent uncertainty" component in the L2 file or to the presented approach which uses the "independent uncertainty"?
56 "interferences" would refer more to an optical problem. With respect the Delrin-problem, it was HCHO that was measured and retrieved, but it was not the just atmospheric HCHO in the lightpath.
60 replace https://blickm.pandonia-global-network.org to https://www.pandonia-global-network.org/ since not all data-users do have a blickm account. And blickm is a monitoring tool without providing reports, software, or data to download.
65 LuftBlick with capital "B"
75 With respect to the direct-sun HCHO retrieval, I would additionally cite the readme (https://www.pandonia-global-network.org/wp-content/uploads/2023/11/PGN_DataProducts_Readme_v1-8-8.pdf), since Spinei et al. is the originator of the MAX-DOAS retrieval but not of the direct-sun one.
91 At the end I would mention that the approach has been demonstrated solely for HCHO and NO2.
92 H2O is also an official data product provided by the PGN, with rcodes wvt1 and nvh3 for direct sun and MAX-DOAS retrievals, respectively.
105 Not all Pandoras are stabilized at 20 °C; some are measuring at 15 °C, which depends strongly on the location and environment where the instrument is set up.
111 The latency correction is not applied in a characterization step, and is also not characterized in the laboratory, since that would require opening the spectrometer and flipping the CCD.
112 Stray light characterizations are not applied in processor 1.8, which limits the stray light correction to the simple stray light method, i.e., subtracting the signal below 290 nm. The stray light correction matrix method is currently not applied.
161 Here the user might benefit from the information that the highest angle is used as a reference in the spectral fitting. This further means that if this angle is contaminated by an obstruction (e.g. a tree), a spectral signal could enhance the wrms and further affect the data product. What was the azimuth angle of all the datasets used in this study? Were the instruments looking in the same direction?
185-189 Uncertainties are not used in any part of the processor 1.8 flagging procedure. It is true that the "total uncertainty" of processor 1.7, which is the independent uncertainty, was used, but it was removed from the flagging criteria in 1.8. The reason was that this parameter was too dependent on the instrument's sensitivity and on the schedule/routines the Pandora is measuring, since longer exposure times typically have larger independent uncertainties. However, processor version 1.8 provides a detailed uncertainty budget which might also be of interest as a decision criterion for which data to use.
190-202 What is the reason for "much of the data unavailable"? Which parameter is the driving quality indicator?
226 It is expected that the independent uncertainty overlaps across all quality flag categories, since it does not reflect any issues on the L1 side (e.g. a small number of cycles) or the L2Fit side (spectral features which cannot be captured). It would also not reflect an air mass factor error on the L2 side if, for example, the instrument has been using the wrong PC time. This problem highly impacts the L2 columns in terms of the diurnal shape, which is of interest for satellites like TEMPO. Pandora25s1 at HoustonTX has this problem; here the periods have been categorized as unusable (20, 21, 22) by the quality assurance part. How does the presented approach account for such situations if no quality assurance has been applied to the dataset? This would not be reflected in the independent uncertainty, because the instrument can still be properly aligned and looking into the sun.
255 Figure 6. How is the independent uncertainty related to the atmospheric variability parameter?
265 How is this threshold defined, and what is the objective approach behind it?
270-272 Is this improvement related to the data removal due to the wrms < 0.01 filter?
Figure 10 As soon as MAX-DOAS comes into the recipe, the approach is not applicable to O3 and SO2. It would also be necessary to analyze H2O to demonstrate the applicability in a broader context.
322-324 This strong increase is great! However, since the flagging approach takes into account neither L1-related problems nor potential slant column biases in the spectral fitting (covered by the wrms), some justification is missing as to whether each retrieval is really usable or not. Is this increase already attributable to bypassing one or two of the standard flagging criteria? And if so, which ones?
355-360 Is the R^2 the proper measure to demonstrate the applicability? Under the assumption of a linear correlation, the correlation and R^2 should remain similar whether 100 or 1000 data points are used. If the R^2 values differ significantly, could this imply undersampling or a wrong assumption about the relationship? Can the R^2 between two different populations be compared directly? The relationship in Figure 12 appears somewhat non-linear for the MAX-DOAS columns. Can you provide some uncertainty range for the R^2, for example via cross-validation or bootstrapping approaches (a sketch is given below)? Or is there any expected correlation from the literature between HCHO and surface O3 which supports a certain R^2 value to which the sample should converge?
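For concreteness, here is a minimal sketch of a bootstrap confidence interval for R^2, as suggested above. The function name, the 95 % interval, and the synthetic data are illustrative assumptions only.

```python
import numpy as np

def bootstrap_r2(x: np.ndarray, y: np.ndarray,
                 n_boot: int = 2000, seed: int = 0) -> tuple:
    """Return the (2.5th, 97.5th) percentiles of bootstrapped R^2."""
    rng = np.random.default_rng(seed)
    n = len(x)
    r2 = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)      # resample (x, y) pairs with replacement
        r = np.corrcoef(x[idx], y[idx])[0, 1]
        r2[i] = r * r
    return tuple(np.percentile(r2, [2.5, 97.5]))

# Example with synthetic, noisily correlated data:
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = 0.8 * x + rng.normal(scale=0.6, size=300)
print(bootstrap_r2(x, y))  # interval brackets the population R^2 of 0.64
```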
487 What is meant by other methods?
524 Is the mean bias value showing some seasonality due to different mixing heights in summer and wintertime? In summer, the MAX-DOAS would see a smaller fraction of the total column than in wintertime, which could indicate a smaller mean bias in winter than in summer.
Table 2 Can you provide any uncertainty ranges of the R^2?
565 Is the wrms threshold of 0.001 site-specific or generally applicable? How is this 0.001 related to the wrms threshold of 0.01 reported in Figure 10 and on line 265?
575 It would be very interesting to see why so many data points are discarded, and whether this is related to one or two quality indicators. I encourage the authors to look into the L2 file, where all the needed information is reported (see the example of L2 flag propagation).
RC2: 'Comment on amt-2024-114', Anonymous Referee #2, 29 Oct 2024
Overview:
Rawat et al. in their manuscript "Maximizing the Use of Pandora Data for Scientific Applications" present a methodology to increase the amount of "scientifically usable" columnar NO2 and HCHO data from the Pandonia Global Network by applying filtering criteria different from the PGN standard. The approach consists of using an independent uncertainty (detector photon noise propagated to slant columns) threshold to eliminate poor-quality data. This threshold is derived from the independent uncertainty distribution for high-quality flagged data as μ + 3σ. The data are further filtered by nrms (> 0.01) and by the maximum horizontal distance estimate for tropospheric columns (> 20 km), while measurements with < 10 % relative error are restored; a sketch of this recipe is given below. The filtering results are verified by conducting linear regression analysis between different combinations of standard PGN quality-flagged tropospheric column vs total column data of NO2 and HCHO. The main assumption is that the data are "scientifically useful" if the correlation R2 is consistent for various flag combinations of tropospheric column (scattered sky) vs direct-sun measurements after filtering.
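A minimal Python sketch of the filtering recipe as summarized above. The column names and the tabular layout are hypothetical; the thresholds follow the description in this review, not an official PGN implementation.

```python
import pandas as pd

def filter_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the mu + 3*sigma independent-uncertainty filter described above."""
    hq = df[df["quality_flag"] == "high"]["independent_unc"]
    thresh = hq.mean() + 3 * hq.std()

    keep = (
        (df["independent_unc"] <= thresh)    # mu + 3*sigma cut
        & (df["nrms"] <= 0.01)               # reject noisy spectral fits
        & (df["horiz_distance_km"] <= 20.0)  # tropospheric-column distance cut
    )
    # Restore measurements with small relative error despite the other criteria.
    restore = (df["independent_unc"] / df["column"].abs()) < 0.10
    return df[keep | restore]
```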
The focus of the paper is to better understand the quality of trace gas column measurements and "recover" PGN data potentially incorrectly labelled as low quality. This is a relevant topic for a publication in AMT considering the importance of the PGN for satellite validation and air quality research. However, in its current version this paper does not add any new knowledge about the quality of the measurements or the "physical" reasons for accepting more measurements.
Major comments:
The main assumption, that the data are "scientifically useful" if the linear correlation R2 is consistent for various flag combinations of tropospheric columns from scattered sky vs total columns from direct-sun measurements, is not fully proven. While I agree that they have separate analysis "paths", they do not have to be correlated (e.g. sampling different air masses due to differences in observation geometries), and they can be correlated for the wrong reasons (e.g. the effect of clouds and aerosols, observation geometry). Actually, the only times they could be correlated are under totally cloud-free, homogeneous conditions with perfect instrument performance, i.e. the high-quality flagged data.
The authors need to show how the parameter subset that goes into quality flag determination changes because of their filtering, to convince readers that the resulting data are scientifically acceptable. There are certain metrics (e.g. wavelength shift) that have less impact on the DOAS fitting and air mass factor quality than others (e.g. clouds). The value of this paper would be to identify such main "drivers" of data quality based on a very detailed evaluation of instrumental and atmospheric uncertainties in PGN data.
In general, poor quality in direct-sun DOAS fitting results can arise from instrumental problems (tracker pointing issues, coherent light interference, internal stray light, filter wheel issues, spectrometer changes leading to wavelength and slit function drifts, inaccurate location or time, etc.) and atmospheric ones (presence of clouds, leading to changes in photon path and spectral saturation; spatial stray light). Poor quality in scattered sky data is mainly due to the presence of cumulus clouds at the higher scan angles, pointing at obstructions, presence of clouds in the reference spectrum, pointing close to the sun, changes in scattering conditions between the scan measurements, etc. There are two parameters that reflect the data quality to first order: the nrms of the DOAS fitting residuals and the relative column error. Nrms is instrument and fitting window dependent, and thresholds can be determined from the fitting data (see the sketch below). Also, an nrms of 0.01 is a very large value for typical trace gas DOAS fits to be valid.
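As an illustration of determining thresholds from the fitting data themselves, the following sketch derives an instrument-specific nrms cut from the empirical distribution. Here nrms is assumed to be the root mean square of the DOAS fit residuals, and the 95th-percentile cut and the synthetic data are arbitrary illustrative choices.

```python
import numpy as np

def nrms_of_fit(residuals: np.ndarray) -> float:
    """RMS of the spectral fit residuals (optical depth units)."""
    return float(np.sqrt(np.mean(residuals ** 2)))

def empirical_nrms_threshold(nrms_values: np.ndarray, q: float = 95.0) -> float:
    """Threshold at the q-th percentile of the observed nrms distribution."""
    return float(np.percentile(nrms_values, q))

# Synthetic example: for healthy trace gas fits the derived threshold is
# typically well below 0.01, consistent with the remark above.
rng = np.random.default_rng(2)
nrms_values = rng.lognormal(mean=np.log(2e-3), sigma=0.4, size=5000)
print(empirical_nrms_threshold(nrms_values))
```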
Many examples were based on data from the Houston, Texas metropolitan area, a near-coastal region with a relatively high occurrence of partly cloudy conditions. The presented results of the proposed filtering suggest that more than 90 % of scattered sky data were not impacted by clouds. This might be overly optimistic.
Direct sun HCHO depends on spectrometer stray light properties and/or some other potential optical interferences. As a result, caution should be taken when interpreting the measurements. For example, collocated instruments often will not produce the same HCHO total columns.
It appears that there are interpolation errors in Figures 8 and 9: constant Y values for changing X values.
I find the idea of using the correlation improvement between column (HCHO) and surface (O3) measurements to be a weak argument for selecting the data quality of the measurements. The goal should be to derive this (surface-to-column) dependence based on the best-quality data, not to force it through selection of the data. Multiple studies have shown that column-to-surface ratios depend on a number of meteorological, emission, and photochemistry conditions, so using them as an argument in favor of the new filtering might not convince the broader scientific community.
GCAS measurements depend on surface reflectance, aerosol, and trace gas profiles and need their own validation. These measurements are typically considered less accurate than ground-based measurements. Again, using them as a verification tool seems inappropriate.
A detailed description of the standard PGN data flags and the parameters that go into their determination is missing.
More details are needed for Figure S1: wavelengths, how the residual stray light was determined, etc.
Citation: https://doi.org/10.5194/amt-2024-114-RC2