the Creative Commons Attribution 4.0 License.
The novel GOME-type Ozone Profile Essential Climate Variable (GOP-ECV) data record covering the past 26 years
Abstract. We present the GOME-type Ozone Profile Essential Climate Variable (GOP-ECV) data record covering the 26-year period from July 1995 until October 2021. It is derived from a series of five nadir-viewing ultraviolet-visible(-near-infrared) satellite instruments of the GOME-type, including GOME/ERS-2, SCIAMACHY/ENVISAT, OMI/Aura, GOME-2/MetOp-A, and GOME-2/MetOp-B, which are merged into a single coherent long-term time series. It provides monthly mean ozone profiles at a spatial resolution of 5° x 5° latitude by longitude. The profiles are given as partial columns for 19 atmospheric layers ranging from the surface up to 80 km. The underlying profile retrieval algorithm is the Rutherford Appleton Laboratory scheme, which has sensitivity to both tropospheric and stratospheric amounts of ozone. The merged profile record has been developed by the German Aerospace Center (DLR) in the framework of the European Space Agency's Climate Change Initiative+ (ESA-CCI+) ozone project (Ozone_CCI+). Profiles from the individual instruments are first harmonized through careful inspection and elimination of inter-sensor deviations and drifts and then merged into a combined record. In a further step, the merged time series is harmonized with the GOME-type Total Ozone Essential Climate Variable (GTO-ECV) data record, which is based on nearly the same satellite sensors. GTO-ECV possesses an excellent long-term stability and with the homogenization an improvement of the robustness and stability of the merged profiles can be achieved. For this purpose, an altitude-dependent scaling is applied that utilizes ozone profile Jacobians obtained from a Machine Learning approach. We found that climatological ozone distributions derived from the final GOP-ECV data record agree with spatial and temporal patterns obtained from other long-term data records.
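As a rough illustration of the harmonize-then-merge idea described in the abstract, the following sketch removes an offset and a linear drift of one instrument relative to a reference over their overlap period and then averages the two series. It is a sketch under stated assumptions, not the GOP-ECV algorithm: the function names `harmonize` and `merge`, the synthetic data, and the unweighted mean are all hypothetical.

```python
import numpy as np

def harmonize(series, reference, time):
    """Remove offset and linear drift of `series` relative to `reference`.

    The fit uses only months where both series are finite (the overlap).
    Assumption: the fitted correction is extrapolated outside the overlap.
    """
    overlap = np.isfinite(series) & np.isfinite(reference)
    diff = series[overlap] - reference[overlap]
    slope, intercept = np.polyfit(time[overlap], diff, deg=1)
    return series - (slope * time + intercept)

def merge(*series):
    """Unweighted mean of all series, ignoring missing months."""
    return np.nanmean(np.vstack(series), axis=0)

# Synthetic monthly partial columns for one layer (hypothetical numbers, in DU).
time = np.arange(120, dtype=float)              # months since start
truth = 20.0 + 2.0 * np.sin(2 * np.pi * time / 12)
reference = truth.copy()
reference[:24] = np.nan                         # reference starts later
other = truth + 1.5 + 0.01 * time               # constant offset plus drift
other[96:] = np.nan                             # other instrument ends earlier

merged = merge(harmonize(other, reference, time), reference)
```

In this toy setup the merged series recovers the synthetic truth because the inter-sensor difference is exactly linear; real differences would also depend on latitude, longitude, and layer, as the abstract indicates.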
Competing interests: At least one of the (co-)authors is a member of the editorial board of Atmospheric Measurement Techniques.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.
Status: final response (author comments only)
RC1: 'Comment on amt-2024-196', Anonymous Referee #1, 06 Mar 2025
The manuscript by Coldewey-Egbers et al. presents a very interesting and innovative merged ozone profile data set from nadir observations (GOP-ECV). The authors introduce the main aim of the work, the harmonization procedure used for the merging of the data sets and the scaling procedure with respect to the GTO-ECV time series. This paper fits the scope of AMT, it is well written and scientifically sound. I found the given explanations overall convincing, with clear descriptions of the multiple, and sometimes complicated, steps. From my side, I only have some minor comments on specific aspects and some technical corrections.
Specific comments/questions
- I am wondering about the usage of the SCIAMACHY data set over the period 2002-2004, as reported in Tab. 1. Is it only used over these two years for the generation of GOP-ECV? Does the usage until 2012 have a negative impact on the merged dataset? I would suggest including a short explanation of this choice in the manuscript. Regarding limb observations, Sofieva et al. (2017) did not recommend using the first months of SCIAMACHY due to some unexplained features in the anomalies (for SAGE-CCI-OMPS it is used from August 2003). Have you noticed any larger discrepancy in nadir data at the beginning of the SCIAMACHY period?
- OMI time series is used in this work as reference for the other datasets, also to remove drifts. Does the drift affecting OMI total column time series or its row-anomaly, e.g. Torres et al. (2018), Gaudel et al. (2024) supplements, have any potential impact on this choice?
- Just a couple of clarifications regarding the used neural network approach, as I am not very familiar with this. Is the described NN approach a sort of ozone profile retrieval? Are the derivatives extracted in a second step or directly provided by the NN?
In the simplest case from Tab. 2, are you feeding the NN only with TOC for each class and letting the hidden layers find a mapping between TOC and profile shape? It seems to me that in this case there could be profiles with the same TOC but different shapes even within the same class. I just wonder how the NN is able to distribute the TOC variations vertically without having unique solutions.
How do you get to the number 420 in Table 2? I understand the 242 possible combinations in Table 3, as you have 2 hidden layers and 11 possibilities for each, times 2 options for the inputs, but I could not get to 420 combinations in Tab.1.
- A side note: is the spiky shape of the profiles in Fig. 5 a feature of the RAL retrievals? Most of them tend to have three local maxima.
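The count of 242 combinations that the referee derives for Table 3 (11 options for each of 2 hidden layers, times 2 input options) can be checked by simple enumeration. The sketch below uses placeholder values, since the actual candidate layer sizes and input sets are not given on this page; only the counts come from the comment.

```python
from itertools import product

# Placeholders: 11 candidate sizes per hidden layer, 2 input configurations.
layer_options = range(11)
input_options = ("inputs_a", "inputs_b")

combos = list(product(layer_options, layer_options, input_options))
n_combos = len(combos)  # 11 * 11 * 2
```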
Technical corrections
Line 13: I would add “presented in this manuscript” after “the homogenization”.
Line 24: “banned” → “prevented”
Line 30-31: “the middle latitudes of the Northern Hemisphere” → “at northern mid-latitudes”
Line 59: Add a , after “data sets”.
Lines ~60: You could mention the advantages/disadvantages of using limb or nadir data to retrieve profiles in terms of vertical and spatial resolution.
Line 79: “allows us to generate” → “enables the generation of”
Line 80: “in particular important” → “particularly important”
Line 82: What is meant by “investigation of changes in the profile”? Stratospheric ozone trends?
Line 85: “enables us to assess” → “facilitates the assessment of”
Line 97: Add , after “ozone profiles”.
Line 98: The UVN acronym was already introduced in the previous page.
Line 160: Add , after “level-2 products”.
Line 202: Also at northern mid-latitudes SCIAMACHY has a positive bias.
Lines 205-207: I would move these last three lines to the beginning of the paragraph (line 194), as these are general considerations about the seasonal cycle.
Line 216: “drift” is repeated two times.
Lines 220-222: Do you plot in Fig. 2 the fit, for example, to (GOME-OMI) anomalies (as you state in the text) or to OMI-GOME?
Line 227-229: I find it hard to read the sentence starting with “From these deviations…”. I suggest reformulating it, for example: “From the time series of the offsets in each available spatial bin, at first, we calculate averages for each calendar month (“climatologies”) and then we average them over five broad latitude bands…”.
Line 242: “aligning” or “harmonizing”?
Line 262: “in particular as to the…” → “in particular in terms of the...”
Line 328: Remove , after “requires”.
Line 360: “of the parameters total ozone…” → I would add “of the parameters, i.e. total ozone…”
Line 371: Add , after “in advance”
Line 402: “only for example poleward of 50° N for 120°-180°” → “mostly at latitudes poleward of 50° N and at 120°-180° E.”
Line 436: “measurements from” → “measurements over”; “data from” → “data over”.
I would remove Line 446 as it repeats what is said in the previous lines.
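The two-step averaging in the reformulation suggested for Lines 227-229 (calendar-month climatologies per spatial bin, then averages over broad latitude bands) could be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the band edges, the synthetic offsets, and the omission of area weighting are all assumptions.

```python
import numpy as np

def monthly_climatology(offsets, months):
    """Average offsets per calendar month in every spatial bin.

    offsets: (n_time, n_lat, n_lon); months: (n_time,) with values 1..12.
    Returns an array of shape (12, n_lat, n_lon).
    """
    return np.stack(
        [np.nanmean(offsets[months == m], axis=0) for m in range(1, 13)]
    )

def band_average(clim, lats, band_edges):
    """Average a climatology over longitude and over latitude bands.

    Returns an array of shape (12, n_bands). No area weighting is applied.
    """
    zonal = np.nanmean(clim, axis=2)                      # (12, n_lat)
    bands = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        in_band = (lats >= lo) & (lats < hi)
        bands.append(np.nanmean(zonal[:, in_band], axis=1))
    return np.stack(bands, axis=1)

# Synthetic example on a 5-degree grid over two years.
lats = np.arange(-87.5, 90.0, 5.0)                        # 36 bin centres
months = np.tile(np.arange(1, 13), 2)                     # 24 time steps
offsets = months[:, None, None] * np.ones((1, lats.size, 72))

band_edges = [-90, -60, -30, 30, 60, 90]                  # hypothetical bands
clim = monthly_climatology(offsets, months)
bands = band_average(clim, lats, band_edges)
```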
RC2: 'Comment on amt-2024-196', Anonymous Referee #2, 27 Mar 2025
“The novel GOME-type Ozone Profile Essential Climate Variable (GOP-ECV) data record covering the past 26 years”, by Coldewey-Egbers et al., presents a new data record of homogenized vertically resolved ozone data from nadir-viewing satellites that have been operating since the mid 1990’s. The authors describe their homogenization methodology, which includes both inter-satellite harmonization and then homogenization to the existing GTO-ECV record, which is based on total ozone column measurements from a similar subset of satellites. Given the significant uncertainties and controversies around recent ozone trends in both the troposphere and stratosphere, this new homogenized record will be a helpful contribution to the community. A careful description and documentation of how these types of data records are constructed is both necessary and a good fit for the AMT journal. Overall, this article does a good job of describing the chosen methodology for constructing the GOP-ECV record. However, some of the methodological choices seem to be lacking a clear justification, and my questions/concerns around these choices form the basis for my major questions, which I outline below. Other than these issues, there are a number of generally minor grammatical/typo/clarity issues that I identify in the minor comments section following these more major concerns:
- I understand the general motivation for wanting to create an ozone profile dataset whose vertically integrated column (i.e., TOC) matches that from an independent TCO dataset (which is believed to be stable and accurate for long term trends). But the number of adjustments being made to the profile dataset is concerning to me and confounds the interpretation of likely analyses with this data set. For example, each individual instrument is detrended and bias corrected relative to OMI as a function of longitude, latitude, and level, and then the entire merged GOP-ECV record is bias corrected to GTO-ECV. It is challenging for me to understand how this process distorts the original measurements, and whether or not the resulting GOP-ECV can really be trusted for trend studies or considered as an independent trend estimate given the way it is tied to GTO-ECV. One could imagine an alternative merging where the source trend records are preserved (i.e., only a bias offset is applied to homogenize the profile records). Then it would make sense to me that GOP-ECV could be considered an independent source for trends. With the described methodology, I think the interannual variability of the source records is preserved, which is scientifically useful, but it is not obvious how users of this data should interpret trends. Some discussion of this issue is critical to include, as it is highly likely that users will want to use this data set for ozone trend studies.
- An obvious, and much simpler, method for scaling the merged profiles would be to apply a multiplicative scaling uniformly to the profiles so that their TOC matches that of GTO-ECV. Some better justification of the complicated method used here (i.e., clustering, classification, neural network to get Jacobian) is needed. By adopting the method used in this paper, the authors are implicitly acknowledging that there are altitude-dependent adjustments that should be made to the profile data, but the justification for why one would expect this to be necessary is not clearly stated. In addition to providing a better justification for the complicated altitude-dependent scaling algorithm applied here, it would seem simple enough to provide some information on how much different the altitude-dependent scaling is from a simple (altitude independent) scaling of the L3 profiles to match GTO-ECV.
- Even if the proposed altitude dependent scaling method is justified, I’m concerned that the specific methodological choices made are not optimal, and some further explanation or exploration of sensitivity to methodological choices is needed. The described method seems to be a mixture of ML methods applied to both level 2 (i.e., individual ozone profile retrievals) and level 3 (monthly mean) data. For example, the clustering algorithm is applied to level 2 data, but then the classification is applied to L3 monthly mean merged profiles (e.g., Fig. 7) and used (in Sect 4.3) in the NN (at least, as far as I can tell this is what is happening. The description in Sect 4.3 about exactly what data is used in the NN is confusing and needs to be more specific about, e.g., which TOC is being used as input). If I’m understanding the description correctly, there is an assumption here that the L3 Jacobians should behave in the same way as the L2 Jacobians for a given class, but in reality the L3 values are in some (many?) cases a linear combination of multiple different classes of ozone profiles, so it’s not obvious to me that the Jacobian methods developed with L2 data can be simply transferred to L3 data (or vice versa). Conceptually, it makes more sense to me to do the clustering/classification directly on the L3 profile data, and then train the NN with the L3 TOC on these classes (presumably there are fewer classes in L3 space?). Or, alternatively, do everything in L2 space (including the altitude-dependent scaling) before creating the L3 monthly means and merging. There may be justifications for mixing L2 and L3, but the mixing should be more clearly described and justified.
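To make the contrast raised in the second major comment concrete, the sketch below compares a uniform multiplicative scaling with an altitude-dependent adjustment that distributes the column mismatch according to layer weights. The manuscript's scaling equation is not reproduced on this page, so `scale_weighted` is a hypothetical stand-in, and the profile shape, weights, and target column are invented for illustration.

```python
import numpy as np

def scale_uniform(profile, toc_target):
    """Multiply every layer by the same factor so the column sum matches."""
    return profile * (toc_target / profile.sum())

def scale_weighted(profile, toc_target, weights):
    """Distribute the column mismatch over layers according to `weights`.

    `weights` stands in for a profile Jacobian (change in partial column
    per change in total column); it is normalized to sum to 1 here.
    """
    w = weights / weights.sum()
    return profile + w * (toc_target - profile.sum())

layers = np.arange(19)                                    # 19 partial columns
shape = np.exp(-0.5 * ((layers - 8) / 3.0) ** 2)
profile = 300.0 * shape / shape.sum()                     # column of 300 DU
weights = np.exp(-0.5 * ((layers - 6) / 2.0) ** 2)        # hypothetical
toc_target = 306.0                                        # hypothetical target

uniform = scale_uniform(profile, toc_target)
weighted = scale_weighted(profile, toc_target, weights)
difference = weighted - uniform                           # per-layer contrast
```

Both variants reproduce the target column, so the per-layer `difference` sums to zero; it is the kind of quantity the referee asks the authors to report.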
Minor issues:
Line 2: Why does the dataset end in October 2021 when GOME-2 continues past this time?
Line 14-16: I don’t necessarily doubt that this sentence is true (that GOP-ECV agrees with other long-term data records), but it has not really been demonstrated in this paper, and it is therefore inappropriate to include this assertion in the paper.
Line 24: Instead of “banned” I think you mean “halted”?
Line 70: “An homogeneous” should be “A homogeneous”
Line 73 (and elsewhere): I don’t think it is appropriate or necessary to abbreviate “with respect to”.
Line 97: Bad grammar in this sentence. Could fix by changing “…climate data record of ozone profiles measurements…” to “…climate data record, ozone profile measurements…”
Line 98: “are combined viz.” is an awkward way to introduce a list. Suggest using a colon instead.
Line 111: I’ve never heard of a daytime. I think you mean local time.
Line 129: It seems suboptimal to use v2 for OMI but v3 for the other four sensors, especially given that OMI is the reference. What are the implications of this?
Line 131-132: I’m confused by the reference here to you using L3 data from CDS. I thought you were constructing an L3 data set from the L2 profiles? This is alluded to on line 133.
Line 132-133: What do these versions mean?
Line 134-135: What do you mean “Monthly mean profile information is provided on a 1° x 1° grid”? I thought you were gridding the data. Maybe you are just describing what Keppens et al did, but this sentence is confusing and the information on their grid size doesn’t seem relevant.
Line 159: The word “basically” seems ambiguous and unnecessary here.
Line 185: Shouldn’t the left side of this equation be “sigma^2”, not just “sigma”. Or alternatively should there not be a square root on the righthand side?
Figure 2: OMI seems to show a strong negative trend over the 2005-2010 time period which isn’t present in GOME or SCIAMACHY. Is this real? What are the implications for this of using OMI as the reference?
Line 215: “first order polynomial”. Do you mean a linear fit?
Line 217: Exemplarily is an odd word to use here.
Line 258: “due the” should be “due to the”
Line 276: It’s awkward to start the numbered list with “and”
Line 351: “ozon” should be “ozone”
Line 358: The NN is trained on the samples shown in Fig. 5, which are L2 profiles if I understand correctly. But then later the NN is applied to L3 data. How applicable is it to do this? See my major question above.
Line 361: Which total ozone is being used as input here? Is it the integration of the 19 partial columns in each profile under consideration, or is it the TOC from GTO-ECV? Ultimately it is the GTO-ECV that is used as input for applying the altitude-dependent scaling (i.e., in eq. 7), so one would think that is what should be used for training. But I’m not sure that is even possible to do that for training. Some clarification is needed here.
Line 370: “providing” should be “provides”
Line 398: Exemplarily is an odd word choice and could be removed.
Figure 12: Why is there a longitudinal structure around Antarctica in the troposphere that seems to (somewhat) mimic the stratosphere? Is this present in the input data or an artifact of the altitude dependent scaling? Also, please change this figure to use a sequential color scale rather than a rainbow one (for reasons such as this, e.g., https://theconversation.com/how-rainbow-colour-maps-can-distort-data-and-be-misleading-167159).
Line 478-479: This assertion may be true but it has not been carefully demonstrated in this manuscript. The wording could be softened here to recognize that a more thorough comparison to other data sets is still needed.
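The Line 185 comment concerns consistency between a standard deviation and a variance. The manuscript's equation is not reproduced on this page, so the following is only a generic illustration using a sample variance with assumed symbols $x_i$, $\bar{x}$, and $N$: either the left-hand side is squared or the right-hand side carries a square root.

```latex
\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( x_i - \bar{x} \right)^2
\qquad \Longleftrightarrow \qquad
\sigma = \sqrt{ \frac{1}{N-1} \sum_{i=1}^{N} \left( x_i - \bar{x} \right)^2 }
```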
Citation: https://doi.org/10.5194/amt-2024-196-RC2