Long-term validation of MIPAS ESA operational products using MIPAS-B measurements
- 1Karlsruhe Institute of Technology, Institute of Meteorology and Climate Research, Karlsruhe, Germany
- 2Freie Universität Berlin, Institute of Meteorology, Berlin, Germany
- 3European Space Agency (ESA-ESRIN), Frascati, Italy
- 4Istituto di Fisica Applicata “N. Carrara” (IFAC) del Consiglio Nazionale delle Ricerche (CNR), Firenze, Italy
Abstract. The Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) was a limb-viewing infrared Fourier transform spectrometer that operated from 2002 to 2012 aboard the Environmental Satellite (ENVISAT). The final re-processing of the full MIPAS mission Level 2 data was performed with the ESA operational version 8 (v8) processor. This MIPAS data set not only includes retrieval results of pressure-temperature and the standard species H2O, O3, HNO3, CH4, N2O, and NO2, but also vertical profiles of volume mixing ratios of the more difficult-to-retrieve molecules N2O5, ClONO2, CFC-11, CFC-12 (included since v6 processing), HCFC-22, CCl4, CF4, COF2, and HCN (included since v7 processing). Finally, vertical profiles of the species C2H2, C2H6, COCl2, OCS, CH3Cl, and HDO were additionally retrieved by the v8 processor.
The balloon-borne limb-emission sounder MIPAS-B was a precursor of the MIPAS satellite instrument. Several flights with MIPAS-B were carried out during the 10-year operational phase of ENVISAT at different latitudes and seasons, including both operational periods where MIPAS measured with full spectral resolution (FR mode) and with optimized spectral resolution (OR mode). All MIPAS operational products (except HDO) were compared to results inferred from dedicated validation limb sequences of MIPAS-B. To enhance the statistics of vertical profile comparisons, a trajectory match method has been applied to search for MIPAS coincidences along 2-day forward/backward trajectories running from the MIPAS-B measurement geolocations. This study gives an overview of the validation results based on the ESA operational v8 data comprising the MIPAS FR and OR observation periods. This includes an assessment of the data agreement of both sensors taking into account combined errors of the instruments. The difference between retrieved temperature profiles of both MIPAS instruments generally stays within ±2 K in the stratosphere. For most gases, namely H2O, O3, HNO3, CH4, N2O, NO2, N2O5, ClONO2, CFC-11, CFC-12, HCFC-22, CCl4, CF4, COF2, and HCN, we find a 5 % to 20 % agreement of the retrieved vertical profiles of both MIPAS instruments in the lower stratosphere. For the species C2H2, C2H6, COCl2, OCS, and CH3Cl, however, larger differences within 20 % and 50 % appear in this altitude range.
Gerald Wetzel et al.
Status: open (until 18 Aug 2022)
RC1: 'Referee report on amt-2022-114', Anonymous Referee #1, 02 Aug 2022
Review of "Long-term validation of MIPAS ESA operational products using MIPAS-B measurements", by G. Wetzel et al.
General comments:
This manuscript describes a comprehensive validation effort, using balloon-borne profiles from MIPAS-B, to study the biases and associated variability and uncertainties in the retrievals of a large number of VMR profiles from MIPAS aboard ENVISAT (MIPAS-E); some trajectory-based studies are used to provide additional "coincidences" between the balloon and satellite profiles, and thus more statistical analyses. This work covers the upper troposphere to the mid- to upper stratosphere. While there is no discussion of any systematic temporal changes (or trends), in part because the time period covered (2002-2012) is not quite long enough to study this well based on a few balloon flights, there are some noted differences between the two separate time periods when MIPAS-E was observing in different modes (the original full spectral resolution mode, FR, and the post-2004 optimized resolution mode, OR). One of the main conclusions is that the harder-to-measure species (for both instruments) show poorer overall agreement than the species with stronger signals (and I assume that this is probably not too unexpected).
The comparisons are presented in fairly simple ways in a consistent fashion, which makes the large number of plots easier to digest (however, there is an issue with some of the font sizes, see comments later on). The summary Table is a good way to provide top-level conclusions, even if this can be somewhat oversimplified and difficult to do with a broad brush when differences change somewhat rapidly with altitude. Adding some suggested explanations for the larger differences (especially when outside the combined estimated error bars) could be useful, if possible and if not completely speculative. A few more comments regarding other relevant work (in particular, satellite-to-satellite intercomparison results from the SPARC Data Initiative) would be recommended and welcome, as this could reinforce the impression that MIPAS-E might have a real bias (or not), at least in a few specific cases (the same could be done versus ACE-FTS data in particular, since Raspollini et al. have already discussed some of those comparisons, although using a multi-satellite approach as done by the SPARC DI would be viewed as more comprehensive, even if multi-satellite means have potential issues as well, if some measurements are clearly less desirable than others). One should not forget that MIPAS-B is not necessarily "perfect data" either, so untangling a real bias versus just a relative bias can be difficult. It would also help if estimated systematic error bars for the MIPAS-E results were included in one of the Tables, since these values are provided for MIPAS-B, and the combined uncertainties are used (so estimates of error bars for MIPAS-E exist as well). Using a lower to mid-stratospheric range might be good enough for this, or one could consider separating this into two Tables - for two regions where the error bars might be significantly different; I am open to either approach, as long as more information is provided regarding the 'typical' satellite error bars (in tabular form). Otherwise, the manuscript is written in a fairly easy to follow manner, and I have no major objections or issues.
After a few improvements, which do add up to almost (but not quite) a major revision (see below for more details), I would recommend that this work proceed to publication in AMT, since this topic is well-suited for the AMT Journal (and there is also not much discussion in this manuscript regarding composition changes or processes in the stratosphere, for example). More specific (and also some very minor editorial-type) comments follow.
Specific comments:

L200-201, here, why is a 2-sigma type of criterion not used, namely assigning the term "significant difference" only when twice the SEM is smaller than the bias itself? This is more in agreement with what most scientific studies would take "significance" to mean (and if you disagree, please give some argument on this topic in your reply and in the text). The main impact might be in the Table of overall conclusions, where you discuss what may be "significant" (or unexplained) differences. Whether many of the plots should be changed is something else to think about - I am not necessarily arguing for this (but please be very specific regarding the meaning of the error bars given in these plots, 1-sigma or 2-sigma; it seems that you list and show 1-sigma results... yes?).
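[Editorial note: for concreteness, the two significance criteria at issue can be written out; this is an illustration of the referee's point, not text from the manuscript. Here n is the number of coincident profile pairs, s the sample standard deviation of the differences, and Δ̄ the mean difference (bias).]

```latex
% Standard error of the mean over n coincident profile pairs:
\mathrm{SEM} = \frac{s}{\sqrt{n}}
% Criterion apparently used in the manuscript (1-sigma):
\left|\bar{\Delta}\right| > \mathrm{SEM}
% Criterion suggested by the referee (roughly 95 % confidence):
\left|\bar{\Delta}\right| > 2\,\mathrm{SEM}
```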
L203, it would not be out of the question that unexplained errors in MIPAS-B could also be invoked to better "explain" relative differences between the two retrievals, at least in some cases; also, unusual atmospheric variability could be partly responsible for a lack of "perfect coincidence" (and trajectories and their associated results are not "perfect" either). I just think that assigning "all" the "blame" for significant (enough) differences to MIPAS-E is not the only possible solution; perhaps you could admit to this without it invalidating the usefulness of MIPAS-B or these results, as I am not suggesting this at all either... "Unexplained relative biases" might well be a more reasonable way of wording this, for example. It would also be useful to mention what the estimated systematic errors for MIPAS-E are, in Table 1, since this could also give the reader some feeling for which retrieval might be expected to be more accurate, if this is sometimes possible to say. However, certain factors like spectroscopic uncertainties (for example) would likely affect both retrievals in the same way, and if this sort of error were a dominant source of error, then neither instrument would be expected to be significantly more accurate (for absolute measurements) than the other... Just showing error bars for MIPAS-B is not really justified, in my view, and since such error bars do exist for MIPAS-E, why not give the reader some feeling for this as well? Are there enough issues in terms of the different satellite retrievals that this becomes a difficult problem to formulate? My issue here is that you have used some estimates, so why not provide at least a first-order example in Table 1, or a similar Table? Tables do oversimplify things, especially if there is a fair amount of altitude dependence in the estimated uncertainties (error bars), but having something would be better than nothing. Please clarify, inasmuch as possible.
L238, it would be good to add just a sentence or so on the main differences between the current manuscript and the Raspollini et al. (2020) document, since these seem to deal with largely the same results. In fact, stating how ACE-FTS comparisons have enhanced these comparisons could be illuminating, especially when the MIPAS-B/MIPAS-E differences are (significantly) larger than one might have expected. On the same topic, I find that you should add at least a few sentences, when appropriate, regarding the results of the SPARC Data Initiative, for some of the species, especially when MIPAS-E biases (with respect to the satellite instrument mean) appear to follow the same tendency that is found here (although it is also interesting if they do not follow this tendency); I realize that the satellite intercomparisons can also be subject to discussion regarding where the "real truth" might be, as it is not necessarily found by showing a multi-instrument climatological mean. Nevertheless, I believe that it is a problem not to mention that document at all (or, actually, the more recent update by Hegglin et al., 2021) and give that work some credit in terms of at least relative bias identification for MIPAS-E; these sorts of studies rely on many more profiles and therefore, in principle, biases can be more robustly identified (although they are also relative biases, and exact knowledge of truth is always a difficult question). On this topic (SPARC DI), I recommend that at least a sentence or so be considered for each of a few of the species mentioned in this manuscript (H2O, N2O, CH4, HNO3, and NO2 are the main ones - while MIPAS-E ozone, in particular, is not seen to have significant issues), if it seems relevant/appropriate - but doing a bit more homework on this issue and adding some additional relevant text would be a change for the better.
L252, it would also be an improvement if you carried out a "gedanken" experiment, in order to at least roughly estimate what altitude uncertainty might be required to lead to such temperature differences (is it 100 m or more than 1 km, say?).
L267-270, see the comment for L252 also, is your thinking regarding H2O pure speculation or would there be a reasonable change in altitudes that could account for the observed relative biases in H2O (how large a change in z, if this is something one can explore "on the back of the envelope", without running full retrieval tests?). If this is just pure speculation, it is probably best to remove the text, I would say. If there are changes that can account for both the T and H2O differences, that might start to be more believable.
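[Editorial note: a back-of-the-envelope version of the altitude-shift estimate asked for in the two comments above, assuming the observed difference is dominated by a vertical registration error Δz; the gradient values below are illustrative, not taken from the manuscript.]

```latex
% First-order effect of an altitude registration error \Delta z
% on temperature T and on a trace-gas mixing ratio q:
\Delta T \approx \frac{dT}{dz}\,\Delta z , \qquad
\frac{\Delta q}{q} \approx \frac{d\ln q}{dz}\,\Delta z
% e.g. with a stratospheric temperature gradient of order 2-3 K/km,
% a 2 K difference would already require \Delta z of order 1 km.
```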
L289, "on the order of". Also, how do the SPARC DI results compare to these biases in MIPAS-E, i.e. is MIPAS-E on the high side versus other satellite data as well? If so, this might help your argument; if not, it may be more difficult to decide what to conclude - but adding a brief comment on this topic could well be useful.
L300, change "stated" to "mentioned". See my comment above for SPARC DI relevance, please check for CH4 and N2O as well.
L305, how large is the NO2 photochemical correction compared to the differences (pre-correction) between the two data sets? That is, does the correction actually improve the level of agreement? Again, the SPARC DI results might help the interpretation here (worth a try).
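[Editorial note: schematically, a photochemical correction of this kind rescales the measured NO2 from the measurement solar zenith angle to a common reference angle using a photochemical model; the notation below is introduced here for illustration only and is not the authors' formula.]

```latex
% Scale a measurement at solar zenith angle \chi_{meas} to a common
% reference angle \chi_{ref} using modelled NO2:
q_{\mathrm{corr}}(z) = q_{\mathrm{meas}}(z)\,
  \frac{q_{\mathrm{model}}(z,\chi_{\mathrm{ref}})}
       {q_{\mathrm{model}}(z,\chi_{\mathrm{meas}})}
```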
Table 3: Please be a little more consistent in the comments when mentioning whether differences are estimated to be significant or not (this gets back to the 1-sigma versus 2-sigma question as well); in particular, there is a mention for H2O regarding differences [generally] being within the combined systematic errors, but why not be more specific also for CH4 and N2O?
Figure 1: This Figure could have larger fonts for the readers to be able to read the y-axis label (altitude) and the x-axis as well (one can remove the words "Volume Mixing Ratio" and just write "H2O (ppmv)" or "H2O / ppmv"). The larger labels in the plots do allow one to understand which species is shown, but the x-axis and y-axis labels could still be improved; at the same time, this would also allow for a larger font size in the numbers shown along the axes.
Figures 2 and similar: I should have mentioned this in the quick review, but the font size for the listed differences in these Figures is probably too small for readers to see well enough on a printed page (without using a zoom feature on the electronic version, even if this is the most likely use of published material these days). It would be good to reduce the unnecessary text and enable larger font sizes for the main comments; also, some things can be abbreviated and some can better be described in the legend(s) instead of inside the various annotated plots.
L389-390, do the differences between the OR and FR time periods suggest anything regarding the validity of the MIPAS-E data sets (absolute values and scatter or precision)? For example, is the OR mode (in some cases at least) maybe less robust or accurate than the FR mode, or is this too difficult to really ascertain?
Very minor (editorial-type) comments:

L24, change "where" to "when".
L32, I suggest: "This includes an assessment of the data agreement between both sensors, taking into account the combined errors from both instruments."
L36, "a 5-20% level of agreement between the retrieved... For C2H2,...larger differences (within 20-50%) appear in this altitude range."
L43, "... operated between 2002 and 2012."
L52, ...logistical requirement that the satellite...
L75, solar time of 10:00...
L77, During each orbit, approximately...
L89, ...investigations, it was decided...
L91, back in operation
L97, "...was steadily increased..." [since this did happen]
L99-100, "... anomaly occurred, resulting in the loss..."
L141, comparable to or slightly better than
L142, overview of
L150, consistent with
L155, retrievals [plural might be better here]
L250, change MIPAS to MIPAS-B for extra clarity.
L257, add a comma before "we carefully looked at".
L281, the statistical agreement between the two data sets...
L321-322, suggesting the need for a more careful use...
L342, Deviations for CFC-11...are somewhat larger, up to ...
L346, is also clearly seen if one considers previous comparisons...
L350, is only available for CCl4 profiles
L361, positive bias for MIPAS-E...
L369, which is at the limit of the combined systematic errors.
L375, There is general agreement between both instruments between ...
L387, add a comma before "exceeding"
L393, available for COCl2.
L406, negative bias in MIPAS-E ...
L422, acts as a precursor for the stratospheric aerosol layer
L426, The agreement between the VMR profiles
L448, a somewhat poorer agreement
L456, "on the quality of the MIPAS satellite data."
RC2: 'Comment on amt-2022-114', Anonymous Referee #3, 02 Aug 2022
Review of amt-2022-114

The manuscript provides a validation overview of the MIPAS ESA operational products using measurements from a balloon version of MIPAS. Overall I found the manuscript well-written and easy to follow, and it provides information that is relevant to users of the operational MIPAS data. I would recommend publication after a few issues are considered. I have a few minor corrections and a larger concern about how the error analysis is performed, but I don't think accounting for any of my suggestions will be particularly difficult for the authors.

General comments:

My main concern is how the combined error (Eq. 3) is calculated when comparing the two instruments. One issue is that this seems to neglect any potentially correlated error source between the two instruments. The classic example is a spectroscopic error: if both retrievals use the same spectroscopic database, this equation will overestimate the combined error; but there may also be correlated effects from non-LTE errors or other effects absent in both forward models. I understand that these errors may not be perfectly correlated between the two instruments due to differences in retrieval methods (I see different microwindows were mostly used), spectral resolution, etc., but if the dominant source of estimated systematic error between the two measurements is a potentially correlated error like a spectroscopic error, this draws into question some of the conclusions made. It is not easy to correctly account for these correlations, but I would suggest at least stating what the dominant source of systematic error is for each case and analyzing whether it is potentially correlated between the two instruments. Since some of the main conclusions of the paper recommend caution in areas where the observed differences between MIPAS-E and MIPAS-B are larger than the estimated systematic error, it is critical that the estimated systematic error is interpreted correctly.

A full validation of every species measured by MIPAS is a monumental task, and the MIPAS-E to MIPAS-B comparisons done by the authors are one piece of that puzzle. This is fine; I don't think the authors need to include more data or analysis, but as a naive MIPAS-E data user some of the results are hard to interpret on their own. The main takeaway that I get is that I should go read the MIPAS product quality document instead (to which a version of this manuscript serves as input). Once again, this is not a problem by itself, but the manuscript could use some further explanation of how this work fits into the larger body of MIPAS validation efforts.
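[Editorial note: the referee's concern can be made explicit. Assuming Eq. (3) of the manuscript combines the two instruments' systematic errors in quadrature, a fully correlated common component (e.g. shared spectroscopic errors) cancels in the difference and should not be double-counted; the following is a schematic, not the manuscript's formula.]

```latex
% Uncorrelated quadrature combination (presumably Eq. 3):
\sigma_{\mathrm{comb}}^2 = \sigma_B^2 + \sigma_E^2
% If each total error contains a fully correlated common component
% \sigma_c of equal magnitude, the error relevant to the
% MIPAS-B minus MIPAS-E difference is smaller:
\sigma_{\mathrm{diff}}^2 = \sigma_B^2 + \sigma_E^2 - 2\,\sigma_c^2
```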
Specific comments:

Section 2.1, l. 68: "hereinafter also referred to as MIPAS-E...". There are some places where simply "MIPAS" (mostly in the remainder of this section) is used to refer to MIPAS-E. I would recommend always using MIPAS-E when referring to MIPAS on ENVISAT, for clarity.

Section 2.1, l. 78: "in steps of 3 km below 45 km". What about above 45 km?

Section 2.1, l. 94: "... an equivalent improvement in the vertical and horizontal (along-track) sampling". I understand that details of the sampling for the FR/OR modes of MIPAS can be found elsewhere, but the horizontal/vertical sampling of each mode should be stated in this section, especially since the change in MOPD is stated.

Section 2.1, l. 118: "All molecules except HDO ...". Is there a fundamental reason why HDO could not be validated as well?

Section 2.2: Has there been any validation of the MIPAS-B data products separate from MIPAS-E that could be mentioned here?

Section 2.2, l. 125: "... MIPAS-B performance is superior, in terms of NESR ...". Is the improvement in NESR from the averaging of spectra, or is there an instrumental difference that provides better NESR?

Section 2.3, l. 200: "A bias between both instruments is considered significant if the SEM is smaller than the bias itself." Should twice the SEM be used here instead, to be at the ~95% confidence level?

Section 2.3, l. 204: "Since the vertical resolution of the atmospheric parameter profiles of both instruments is of comparable magnitude, a smoothing by averaging kernels has not been applied to the observed profiles". I assume that the error estimates for both instruments also do not include the classical "smoothing error"? If they do, this would inflate the error estimates, since both instruments have a similar vertical resolution.

Section 3, l. 226: "Trajectory matches are based on diabatic 2-day forward and backward trajectories with a collocation criterion of 1 h and 500 km as described in section 2." Is it possible to demonstrate how well the trajectory matching is working? If I understand correctly, there are conditions where the measurement locations are collocated closely enough that trajectory matching is not necessary; maybe these can be used to show the effectiveness of the trajectory matching.

Section 3.1, l. 245: "... although the standard deviations exceed the expected precision ...". Could this be because the trajectory matching introduces some variance into the comparisons as well? Even if the trajectory matching is perfect, the collocation is still only within 500 km and 1 h, which would contain some atmospheric variability.

Section 3.1, l. 252: "A possible reason for this difference between both MIPAS sensors could be an inaccuracy in the altitude assignment...". Presumably this error is included in the error budgets of the instruments, so you should be able to quantify whether this could actually be the case.

Section 3.2, l. 259: "FR and OR mode comparisons show different vertical shapes of the differences between MIPAS-E and MIPAS-B". Is there a significant difference between the retrieved vertical resolution of H2O in the FR and OR modes? Particularly with the strong altitude gradient of H2O, a small change in vertical resolution could cause a large observed difference. In general, for every species, I wonder how much of the difference between the two modes can be explained by the changing averaging kernel.

Section 3.6: Are there any estimates of how much error could be introduced by an imperfect photochemical correction? I am wondering if there could be some effect where the balloon flights tend to occur at a similar time each day, so that you don't average over an ensemble of random SZA differences, but I'm just throwing things out there.

Technical corrections:

Section 3.1, l. 250: Is MIPAS here referring to MIPAS-B?
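[Editorial note: to make the quoted collocation criterion concrete, below is a minimal sketch of a 1 h / 500 km match test against trajectory points. The data structure and function names are illustrative assumptions for this page, not the authors' actual trajectory-matching code.]

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

@dataclass
class AirParcel:
    """A trajectory point or a measurement geolocation (names are illustrative)."""
    time: datetime
    lat: float  # degrees north
    lon: float  # degrees east

def great_circle_km(a: AirParcel, b: AirParcel) -> float:
    """Haversine great-circle distance between two geolocations, in km."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * asin(sqrt(h))

def is_match(traj_point: AirParcel, sat_obs: AirParcel,
             max_dt: timedelta = timedelta(hours=1),
             max_dist_km: float = 500.0) -> bool:
    """Apply the 1 h / 500 km collocation criterion quoted in the review."""
    return (abs(traj_point.time - sat_obs.time) <= max_dt
            and great_circle_km(traj_point, sat_obs) <= max_dist_km)

def find_matches(trajectory, sat_observations):
    """Collect satellite observations matching any point of a 2-day
    forward/backward trajectory started at the balloon geolocation."""
    return [obs for obs in sat_observations
            if any(is_match(p, obs) for p in trajectory)]
```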