Validation of MIPAS IMK/IAA V5R_O3_224 ozone profiles

We present the results of an extensive validation program of the most recent version of ozone vertical profiles retrieved with the IMK/IAA (Institute for Meteorology and Climate Research/Instituto de Astrofísica de Andalucía) MIPAS (Michelson Interferometer for Passive Atmospheric Sounding) research level 2 processor from version 5 spectral level 1 data. The time period covered corresponds to the reduced spectral resolution period of the MIPAS instrument, i.e., January 2005–April 2012. The comparison with satellite instruments includes all post-2005 satellite limb and occultation sensors that have measured the vertical profiles of tropospheric and stratospheric ozone: ACE-FTS, GOMOS, HALOE, HIRDLS, MLS, OSIRIS, POAM, SAGE II, SCIAMACHY, SMILES, and SMR. In addition, balloonborne MkIV solar occultation measurements and groundbased Umkehr measurements have been included, as well as two nadir sensors: IASI and SBUV. For each reference data set, bias determination and precision assessment are performed. Better agreement with reference instruments than for the previous data version, V5R_O3_220 (Laeng et al., 2014), is found: the known high bias around the ozone vmr (volume mixing ratio) peak is significantly reduced and the vertical resolution at 35 km has been improved. The agreement with limb and solar occultation reference instruments that have a known small bias vs. ozonesondes is within 7 % in the lower and middle stratosphere and 5 % in the upper troposphere. Around the ozone vmr peak, the agreement with most of the satellite reference instruments is within 5 %; this bias is as low as 3 % for ACE-FTS, MLS, OSIRIS, POAM and SBUV.

Published by Copernicus Publications on behalf of the European Geosciences Union.

Introduction
In order to improve the predictive quality of atmospheric models, their constraints must be well refined. For this, the atmospheric processes underlying the fluctuation of the budget of atmospheric constituents should be understood well enough. For instance, despite expectations for a slow recovery of the stratospheric ozone layer in the coming decades, record or very low temperatures occurred in 2006 and 2011, leading to some of the deepest ozone holes over Antarctica. Understanding such ozone fluctuations is impossible without well-resolved high-quality measurements of vertical profiles of this important stratospheric gas. The pole-to-pole day-and-night measurements of ozone provided by the MIPAS (Michelson Interferometer for Passive Atmospheric Sounding) instrument in 2002-2012 represent an important data set for this purpose.
MIPAS is an instrument that was carried on the European Envisat satellite; along with ∼30 other atmospheric trace gases, MIPAS measured vertical profiles of ozone. MIPAS measured day and night, and pole to pole, providing more than 1300 profiles per day. The failure of a MIPAS mirror slide in 2004 led to the division of the 10 years of MIPAS data into two operational periods: 2002-2004, when the instrument measured with high spectral resolution (usually referred to as the "full-resolution (FR) period"), and 2005-2012, when the instrument measured with lower spectral but better vertical resolution (the "reduced-resolution (RR) period"). The MIPAS data from these two periods are evaluated separately.
In this paper we present the results of an extensive validation of vertical ozone profiles retrieved from MIPAS reduced-resolution spectra with the IMK/IAA research processor. The MIPAS IMK/IAA (Institute for Meteorology and Climate Research/Instituto de Astrofísica de Andalucía) data set has been used as part of the SPARC (Stratosphere-troposphere Processes And their Role in Climate) Data Initiative (Tegtmeier et al., 2013) and in the HARMOZ (HARMonized data set of Ozone profiles) databank (Sofieva et al., 2013). The ozone data set from the MIPAS IMK/IAA processor was selected to be used in the framework of the European Ozone Climate Change Initiative project, after an extensive round-robin intercomparison of four existing MIPAS processors: the ESA (European Space Agency) operational processor with the scientific prototype hosted at IFAC (Institute of Applied Physics) Florence (Raspollini et al., 2013), a research processor hosted at ISAC (Institute of Atmospheric Sciences and Climate) Bologna (Carlotti et al., 2001, 2006), a research processor hosted at the University of Oxford (http://www.atm.ox.ac.uk/MORSE/), and the IMK/IAA processor. See Laeng et al. (2014) for a homogenized description of the four MIPAS processors and for details of the analysis performed. In the rest of this paper, "MIPAS data set" will refer to the MIPAS IMK/IAA data set.

MIPAS IMK/IAA V5R_O3_224 profiles
The processing scheme of the MIPAS IMK/IAA research processor and its adaptation to the reduced-resolution spectra of MIPAS are described in von Clarmann et al. (2003) and von Clarmann et al. (2009). As shown in Laeng et al. (2014), all four MIPAS processors have a high bias around the ozone vmr (volume mixing ratio) peak (approximately 35 km) compared to ozonesondes, lidars, ACE-FTS (Atmospheric Chemistry Experiment - Fourier Transform Spectrometer) and MLS (Microwave Limb Sounder). Though the IMK/IAA processor had the smallest bias, ozone mixing ratios were still higher by up to 0.2 ppmv (parts per million by volume) than those of MLS. In addition, the ozone from the MIPAS IMK/IAA processor (labeled as "KIT processor" in Laeng et al., 2014) had a peak of particularly poor vertical resolution at 35 km, and the position of the ozone vmr peak was slightly higher than in the reference instruments, causing the high bias around the ozone vmr maximum.
The version of ozone profiles used in the analysis by Laeng et al. (2014) was V5R_O3_220. In the production of this version, the microwindows from both MIPAS band A (685-970 cm⁻¹) and band AB (1020-1170 cm⁻¹) were used. The displaced ozone vmr maximum and the peak in vertical resolution both appeared at heights where the microwindows from the AB band were activated. It was already pointed out in Glatthor et al. (2006) that the exclusive use of band A microwindows can lower the ozone values at heights corresponding to the ozone vmr maximum. The reason for this is possibly an inconsistency in the spectroscopic data for the ozone bands located in MIPAS band A vs. band AB. Another possible explanation is interband calibration inconsistencies. Hence, in order to minimize the high bias, a new version of ozone was produced, namely version V5R_O3_224.
The differences with respect to the version V5R_O3_220 used in the round-robin exercise are the following:
- No microwindows from band AB were used at heights below 50 km; this reduced the bias around the ozone vmr maximum and fixed the problem of the displacement of the ozone vmr peak (see Fig. 1 for a comparison of mean ozone profiles from versions V5R_O3_220 and V5R_O3_224). As one can see in Fig. 1, the values at the ozone vmr maximum of version V5R_O3_224 are slightly larger than the values of V5R_O3_220. However, the bias of V5R_O3_224 around the ozone vmr maximum is still smaller than the bias for the three other MIPAS processors; to demonstrate this, we overplotted the bias of V5R_O3_224 on the bias panel of the comparison with MLS from Laeng et al. (2014); this is shown in Fig. 2.
- To compensate for the loss of information implied by dropping the AB microwindows at heights below 50 km, three times more microwindows were used in the A band in this height range (see Table 1). This improved the previously poor vertical resolution around the ozone vmr maximum; Fig. 3 shows the vertical resolution of the previous version (left panel) and of the version under validation (right panel) for a typical midlatitude retrieval. Note that the vertical resolution was not improved at the expense of larger uncertainties: the error around the problematic height was even reduced in the new version. The oscillating behavior of the vertical resolution comes from the fact that the retrieval is performed on a grid finer than the original tangent height grid: the vertical resolution is better at grid points close to a tangent altitude of the measurement and worse between two adjacent tangent altitudes.
- The altitude-dependent strength of the regularization has been changed. The regularization matrix is now
$$ \mathbf{R} = \mathbf{L}_1^{T}\,\mathrm{diag}(\gamma_i)\,\mathbf{L}_1 + \mathbf{D}, $$
where $\mathbf{L}_1$ is an (n−1)×n finite differences matrix, $\gamma_i$ are the altitude-dependent regularization strengths, and $\mathbf{D}$ is a matrix which is zero except for the diagonal values referring to the uppermost altitudes, which ties ozone to values near zero there.
- The strength of the constraint, $\gamma_i$, was taken constant up to 70 km (in contrast to 65 km for version V5R_O3_220).
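The structure of such a regularization term can be sketched numerically. The following is only an illustration, assuming the form R = L1ᵀ diag(γ) L1 + D, which is consistent with the definitions of L1, γ_i and D given in the list above; the function names, the grid size and the numerical values of γ and D are hypothetical and are not taken from the processor.

```python
import numpy as np


def first_difference_matrix(n):
    """Return the (n-1) x n matrix whose row i computes x[i+1] - x[i]."""
    L1 = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    L1[idx, idx] = -1.0
    L1[idx, idx + 1] = 1.0
    return L1


def regularization_matrix(gamma, d_diag):
    """Assumed form R = L1^T diag(gamma) L1 + D (illustrative, see lead-in).

    gamma  : n-1 altitude-dependent regularization strengths
    d_diag : n diagonal entries of D (zero except at the uppermost levels)
    """
    gamma = np.asarray(gamma, dtype=float)
    d_diag = np.asarray(d_diag, dtype=float)
    n = d_diag.size
    if gamma.size != n - 1:
        raise ValueError("gamma must have one entry fewer than d_diag")
    L1 = first_difference_matrix(n)
    return L1.T @ np.diag(gamma) @ L1 + np.diag(d_diag)


# Hypothetical 5-level grid: smoothing everywhere, diagonal constraint on top level only.
R = regularization_matrix([1.0, 1.0, 2.0, 2.0], [0.0, 0.0, 0.0, 0.0, 10.0])
```

In this sketch the first-difference term leaves a constant profile unpenalized, so only the diagonal term D acts on it; this is what allows D to tie the uppermost levels to values near zero.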
The data used in this paper come from two versions: V5R_O3_224 (2005-April 2011) and V5R_O3_225 (May 2011-April 2012). For V5R_O3_224, the ECMWF temperature profiles that are used as a priori for the temperature retrieval were obtained from the NILU (Norwegian Institute for Air Research) data server, while for V5R_O3_225 the temperatures were taken directly from ECMWF, since NILU no longer makes ECMWF profiles available. No relevant ozone differences were found in response to this change.

Overview of reference instruments
The reference data sets used in this study are summarized in Table 2. All spaceborne limb and occultation instruments that have flown and measured tropospheric/stratospheric ozone vertical profiles at the same time as MIPAS are included.
We also include the comparison with two nadir sensors, IASI (Infrared Atmospheric Sounding Interferometer) and SBUV (Solar Backscatter Ultraviolet), as well as with the vertical profiles from MkIV balloon measurements and Umkehr measurements. We do not include ozonesondes and lidars because an extensive comparison with these was made in Laeng et al. (2014) for the previous version of the data. The IMK/IAA MIPAS ozone data set was found to deviate by less than 5 % from ozonesondes (10 % for tropical regions).

Comparison methodology
For all satellite reference data sets except MLS, the optimal ratio (number of collocations) / (distance between measured air parcels) was achieved with the collocation criteria of 5 h and 500 km. For the dense sampling of MLS, the collocation criteria were tightened to 4 h and 250 km. Note that the time interval of 4 h cannot be made shorter because it must be larger than the difference in Equator crossing local times of the carrying platforms (which are 10:00 LT for Envisat carrying MIPAS and 13:30 LT for Aura carrying MLS); otherwise the set of tropical collocations would be reduced. For the MkIV and Umkehr data sets, the collocation criteria were set to 24 h and 1000 km. Application of the collocation criteria produced the set of matched pairs reported in Table 2. All the plots in this study, including climatologies, were produced from the collocated measurements. Figure 4 shows the latitudinal distributions over months of collocated measurements of MIPAS with each satellite reference instrument.

All reference data sets except Umkehr were interpolated onto the MIPAS retrieval grid, which is a fixed altitude grid with 1 km steps between 6 and 44 km and 2 km steps between 44 and 70 km. Data sets delivered on an altitude grid were interpolated linearly. As the MIPAS IMK/IAA processor has a reliable pressure-altitude relation (see Sect. 6.3.4 of Laeng et al., 2014), the data sets provided on a pressure grid were interpolated via pressure in the logarithmic domain using MIPAS pressures. Data sets provided in number density units were also transformed into volume mixing ratio by using the temperatures from the MIPAS retrieval. For GOMOS (Global Ozone Monitoring by Occultation of Stars), number density was converted into mixing ratio using ECMWF and MSIS-90 (Mass Spectrometer Incoherent Scatter-90) air density profiles at occultation locations.

The discrepancies between the vertical resolutions of the limb and occultation reference data sets and the vertical resolution of MIPAS do not exceed a factor of 1.5-2. For these data sets, sensitivity tests were performed and showed that within these margins, the application of averaging kernels is not relevant. Hence, no averaging kernels were applied when comparing with limb and occultation data sets. Nadir sensors have a vertical resolution which is quite different from that of MIPAS. When comparing with IASI, the MIPAS data set was convolved with the genuine IASI averaging kernels. At the time when the analysis described in this paper was performed, no averaging kernels for individual SBUV ozone profiles were available; hence the comparison with SBUV was performed without taking into account the discrepancies in vertical resolution. For the comparison with Umkehr, the MIPAS data set was transformed into Dobson units (DU) on Umkehr layers; the details are described in Sect. 6.
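The matching and regridding steps described above can be illustrated with a minimal sketch. It assumes a spherical Earth for the distance calculation and uses hypothetical dictionary keys (`time_h`, `lat`, `lon`) and function names; it is not the code used for this study, only one possible way to apply a time/distance criterion and an interpolation that is linear in the logarithm of pressure.

```python
import math
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical-Earth assumption


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding slightly above 1 for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def is_collocated(obs_a, obs_b, max_hours=5.0, max_km=500.0):
    """True if two observations satisfy both the time and the distance criterion.

    Each observation is a dict with the hypothetical keys 'time_h' (hours since
    a common epoch), 'lat' and 'lon' (degrees).
    """
    dt = abs(obs_a["time_h"] - obs_b["time_h"])
    dist = great_circle_km(obs_a["lat"], obs_a["lon"], obs_b["lat"], obs_b["lon"])
    return dt <= max_hours and dist <= max_km


def interp_log_pressure(p_target, p_ref, vmr_ref):
    """Interpolate vmr_ref, given at pressures p_ref, linearly in log(pressure)."""
    x_ref = -np.log(np.asarray(p_ref, dtype=float))  # increases with height
    order = np.argsort(x_ref)                        # np.interp needs ascending x
    x_target = -np.log(np.asarray(p_target, dtype=float))
    # Outside the reference range np.interp returns the end values (no extrapolation).
    return np.interp(x_target, x_ref[order], np.asarray(vmr_ref, dtype=float)[order])


# Example: a pair 4.5 h and about 445 km apart passes the 5 h / 500 km
# criterion but fails the tighter 4 h / 250 km criterion used for MLS.
mipas_obs = {"time_h": 0.0, "lat": 0.0, "lon": 0.0}
ref_obs = {"time_h": 4.5, "lat": 0.0, "lon": 4.0}
passes_default = is_collocated(mipas_obs, ref_obs)
passes_mls = is_collocated(mipas_obs, ref_obs, max_hours=4.0, max_km=250.0)
```

The default arguments mirror the 5 h and 500 km criteria quoted in the text; the criteria for MLS or for the MkIV and Umkehr data sets would be passed explicitly.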
To assess the bias between MIPAS and a reference instrument, we calculate the mean difference on n collocated pairs:
$$ b = \frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{MIPAS}_i - \mathrm{REF}_i\right), $$
or, in short notation, $\overline{(\mathrm{MIPAS} - \mathrm{REF})}$. The percentage bias with respect to a reference instrument is calculated as follows:
$$ b_{\mathrm{rel}} = 100\,\% \times \frac{\overline{(\mathrm{MIPAS} - \mathrm{REF})}}{\overline{\mathrm{REF}}}. $$
One could argue that the same normalization should be used for all instruments; in other words, the denominator in the last equation should be $\overline{\mathrm{MIPAS}}$. It is however our choice to show the biases with respect to reference instruments, in order to obtain independent estimates of the bias. This of course implies that the biases with respect to different reference instruments calculated in this way cannot be directly compared to each other, except if the reference instruments have very similar mean profiles.
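The bias calculation described above (mean difference over the collocated pairs, and the same mean difference normalized by the mean reference profile) can be sketched as follows. The array layout (pairs along the first axis, altitude levels along the second) and the use of NaN for missing values are assumptions made for this illustration.

```python
import numpy as np


def mean_bias(mipas, ref):
    """Absolute and percentage bias per altitude level.

    mipas, ref : arrays of shape (n_pairs, n_levels) on a common grid,
                 with NaN where a value is missing (assumed convention).
    Returns (bias, bias_percent), each of shape (n_levels,).
    """
    mipas = np.asarray(mipas, dtype=float)
    ref = np.asarray(ref, dtype=float)
    valid = np.isfinite(mipas) & np.isfinite(ref)   # use only complete pairs
    n = valid.sum(axis=0)
    diff_sum = np.where(valid, mipas - ref, 0.0).sum(axis=0)
    ref_sum = np.where(valid, ref, 0.0).sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        bias = diff_sum / n                         # mean(MIPAS - REF)
        bias_percent = 100.0 * diff_sum / ref_sum   # relative to mean(REF)
    return bias, bias_percent


# Two collocated pairs on two levels; the second pair has no MIPAS value at level 2.
mipas = [[5.5, 8.8], [5.5, np.nan]]
ref = [[5.0, 8.0], [5.0, 8.0]]
bias, bias_percent = mean_bias(mipas, ref)
```

Normalizing by the reference mean rather than by the MIPAS mean follows the choice stated in the text; levels without any valid pair yield NaN.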
An assessment of precision is performed by analyzing the residual variance of the MIPAS data set with respect to reference data sets, namely, by comparing the standard deviation of the differences,
$$ \mathrm{STOD} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\mathrm{MIPAS}_i - \mathrm{REF}_i - \overline{(\mathrm{MIPAS} - \mathrm{REF})}\right)^2}, $$
with the combined error of the two instruments,
$$ \mathrm{CE} = \sqrt{\sigma_{\mathrm{MIPAS}}^2 + \sigma_{\mathrm{REF}}^2}, $$
where the overline denotes the mean over the n collocated pairs and $\sigma_{\mathrm{MIPAS}}$ and $\sigma_{\mathrm{REF}}$ are the random error estimates of the two data sets, both quantities being calculated on collocated profiles only (see von Clarmann, 2006). Such analysis is partly impeded by the fact that not all reference instruments provide full random error estimates (all but one in the last column of Table 2). Since for most reference instruments only measurement noise is reported, the estimated error of the differences between coincident measurements is expected to be lower than the standard deviation of the differences. In addition, the latter quantity includes the natural variability of ozone within the given collocation criteria. Thus, only upper estimates on the reliability of the MIPAS precision from these reference measurements can be made. Laeng et al. (2014) presented an approach to precision validation of vertical ozone profiles by a method not involving any reference instrument, and concluded that MIPAS IMK/IAA precision estimates for ozone are close to reality.

Comparison with satellite measurements
Figures 5 and 6 present the estimated bias and precision assessment of MIPAS ozone profiles with respect to reference instruments. To avoid overloading the bias summary plots, the satellite reference data sets were subdivided into two classes according to their biases with respect to ozonesondes: those having a known small bias in the main ozone layer (20-30 km) and those having a slightly larger bias in the main ozone layer. For this purpose, for each data set, the estimation of the bias with respect to ozonesondes was taken from the latest validation study performed on a data set from the same instrument and processor. The latest validation studies of reference instruments and the biases found in them are summarized in the last column of Table 2. We would like to point out that these bias estimates are in agreement with estimates obtained in Hubert et al. (2012, 2014) even though different versions of data sets were used in those studies.
The left panels of Figs. 5 and 6 represent the percentage bias with respect to the reference instruments. The curves in the right panels of Figs. 5 and 6 represent the residual variance, which is calculated as the relative difference between the standard deviation of the differences (STOD) and the combined errors (CE):
$$ \mathrm{RV} = \sqrt{\mathrm{STOD}^2 - \mathrm{CE}^2}, $$
where STOD² ≥ CE². This residual variability (RV) estimates how large the natural variability within the collocation window of 5 h and 500 km must be to justify the observed spread, if the random error estimates of both instruments were realistic and complete. Ideally, for exactly matched pairs, RV should be zero for correctly characterized data. Large values of RV under calm atmospheric conditions characterized by smooth ozone distributions hint at underestimated random errors for at least one of the instruments. In a highly variable atmosphere (e.g., at high latitudes, particularly where polar vortices are involved) RV will be large even for perfectly characterized measurements. Negative values of STOD² − CE² indicate that the random error estimates of at least one of the data sets under comparison are overconservative.
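The precision diagnostic described above can be illustrated with a short sketch. It assumes that the combined error at each level is the root mean square, over the collocated pairs, of the quadrature sum of the two reported random errors, and it works in absolute units; both choices, as well as the function name and array layout, are assumptions of this example and not a statement of how the figures were produced.

```python
import numpy as np


def residual_variability(mipas, ref, err_mipas, err_ref):
    """Return (STOD, CE, RV) per altitude level.

    All inputs have shape (n_pairs, n_levels) and are assumed free of gaps.
    STOD : standard deviation of the differences (bias removed, n-1 normalization)
    CE   : combined random error (assumed RMS of the quadrature sums)
    RV   : sqrt(STOD^2 - CE^2), set to NaN where STOD^2 < CE^2
    """
    diff = np.asarray(mipas, dtype=float) - np.asarray(ref, dtype=float)
    stod = diff.std(axis=0, ddof=1)
    err2 = np.asarray(err_mipas, dtype=float) ** 2 + np.asarray(err_ref, dtype=float) ** 2
    ce = np.sqrt(err2.mean(axis=0))
    rv2 = stod ** 2 - ce ** 2
    # A negative rv2 means the combined error already exceeds the observed spread.
    rv = np.where(rv2 >= 0.0, np.sqrt(np.maximum(rv2, 0.0)), np.nan)
    return stod, ce, rv


# Synthetic example: large scatter at level 1, very small scatter at level 2.
ref = np.full((4, 2), 5.0)
offsets = np.array([[3.0, 0.1], [-3.0, -0.1], [3.0, 0.1], [-3.0, -0.1]])
errors = np.full((4, 2), 2.0)
stod, ce, rv = residual_variability(ref + offsets, ref, errors, errors)
```

In this synthetic case the second level returns NaN, which corresponds to the situation in which a curve is absent from the right panels because the combined error exceeds the standard deviation of the differences.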
With this in mind, it becomes clear that the larger RV values in Figs. 5 and 6 of some instruments (e.g., ACE-FTS, GOMOS, and POAM) do not necessarily indicate a less complete error budget but simply reflect the fact that a larger fraction of these measurements were taken at higher latitudes in winter and spring (cf. Fig. 4), where the natural ozone variability is large.
For a number of instruments (MLS, OSIRIS, SAGE II, HALOE, both SMILES), the residual variance at 25-45 km altitude is only about 4 %. Allowing for some small residual natural variability also here, and taking into account that the error budget of some of these instruments includes measurement noise only but no uncertainties of randomly varying parameters, it seems fair to conclude that the MIPAS random error estimates are close to actual values. This confirms the earlier findings of Laeng et al. (2014) and Sofieva et al. (2014).
At lower altitudes our assumption of direct comparability without application of averaging kernels appears to be driven beyond its limit: different altitude resolutions imply that different instruments may see a different fraction of tropospheric air, which adds to the residual variability. Above about 45 km, ozone sampling is instrument-specific and thus can be significantly displaced with respect to the nominal geolocation of the reported profile (von Clarmann et al., 2009, their Table 4). Therefore, ozone comparisons between instruments can be affected when the natural geographical and temporal variability of ozone is high, which results in enhanced residual variability.
One sees in Fig. 5 that for the main ozone layer (20-30 km) the bias of MIPAS with respect to the known small-biased data sets is within 7 %, while in the upper troposphere (15-20 km) and at 30-40 km heights the bias is within 5 %. Around the ozone vmr peak, the agreement with ACE-FTS, MLS and OSIRIS (Optical Spectrograph and InfraRed Imager System) is within 3 %. The bias with respect to MLS is similar to the bias with respect to SAGE II in the main ozone layer: it is positive and of the order of 4-5 %. Between 30 and 45 km, the bias with respect to MLS is of the same sign and magnitude, while the bias with respect to SAGE II is twice as large. The smallest bias in Fig. 5 is observed vs. ACE-FTS: it remains within 2 % in the stratosphere and takes both signs. In turn, at 46-56 km heights, the bias with respect to ACE-FTS is the largest in absolute value on this panel: it is negative and reaches 15 %. Above 60 km, all reference instruments except SCIAMACHY (Scanning Imaging Absorption Spectrometer for Atmospheric CHartographY) demonstrate that MIPAS ozone is biased high. The best agreement above 60 km is observed with two other Envisat sensors: the positive bias vs. GOMOS does not exceed 8-21 %, and the bias vs. SCIAMACHY does not exceed 17 % when it is negative and does not exceed 22 % when it is positive at 69-70 km heights. MIPAS is biased high by 0-7 % vs. all instruments collected in Fig. 5 between 20 and 40 km altitude.
Figure 6 provides biases vs. instruments which are known to have larger (between 5-7 and 20 %) biases vs. ozonesondes. In general, the comparisons resemble those of the small-biased instruments: the biases are always positive for 25-40 km heights, except for IASI, with values between 0 and +10 %. Below and above this height range, the spread among the instruments is larger and covers both positive and negative values between −20 and +20 %. The IASI bias in the UTLS (upper troposphere/lower stratosphere) is the largest on this panel, going up to 20 % in the main ozone layer. Such a bias of IASI at this altitude was already reported by, e.g., Dufour et al. (2012). Above 30 km the sensitivity of IASI drops, as shown by the averaging kernels (Keim et al., 2009), and the profiles just reproduce the a priori.
The Sub-Millimetre Radiometer (SMR) data set provides quite large error bars over the whole height range of ozone profiles, which leads to a large combined error that sometimes exceeds the standard deviation of differences. In this case the estimate of the square of the residual variability is negative, which is why there is no green SMR curve between 32 and 45 km heights.
A relatively small (within 4 % in the stratosphere) bias with respect to POAM (Polar Ozone and Aerosol Measurement; brown curves in Fig. 6) goes along with a large estimate of residual variability: the residual variability derived from MIPAS-POAM is three times as large as that derived from most of the other instruments. The reasons are twofold: MIPAS-POAM coincidences occur at high latitudes only (see the POAM panel in Fig. 4), and POAM possibly underestimates its uncertainties. In the Northern Hemisphere (NH), POAM coincidences occur in the region impacted by the springtime breakdown of the polar vortex. In the Southern Hemisphere (SH), many coincidences occur near the edge of the winter vortex. Both of these regions can be expected to have large geophysical variability. Assuming that MIPAS error estimates are realistic, as suggested in Laeng et al. (2014), this large residual variance could also be an indication that the POAM uncertainties are underestimated. Similar conclusions can be drawn for IASI (yellow curve in Fig. 6).
Comparisons with HALOE (Halogen Occultation Experiment), HIRDLS (High Resolution Dynamics Limb Sounder), SCIAMACHY and both SMILES (Superconducting Submillimeter-Wave Limb-Emission Sounder) processors all expose a similar behavior of MIPAS ozone data relative to these instruments: MIPAS is positively biased by less than 10 % in the stratosphere.
The SCIAMACHY curve is absent at most heights in the right panel of Fig. 6 because the combined error of the SCIAMACHY-MIPAS comparison exceeds almost everywhere the standard deviation of differences, which gives a negative STOD² − CE² quantity. This agrees with the conclusions of the analysis performed in the framework of the Ozone_cci project: SCIAMACHY seems to overestimate its uncertainties by up to a factor of 2.5.
The SBUV curve is absent in the right panel of Fig. 6 because the version 8.6 of SBUV data which is used for this analysis is provided without error estimates.
The behavior of the bias around the ozone vmr maximum in the comparison with small-biased data sets shows a systematic high MIPAS bias at this height range (left panels of Figs. 5 and 6), while some of the data sets with a not-so-small bias with respect to ozonesondes have zero bias with MIPAS, for instance SBUV, SCIAMACHY, SMILES_NICT and POAM. This observation should be put into context, namely, that the separation of reference instruments into "small-biased" and "not so small-biased" was done based on their comparison with ozonesondes, which do not go higher than 30 km. Compared to instruments with a known small bias vs. ozonesonde data, MIPAS is always biased high around the ozone vmr maximum (left panel of Fig. 5). In contrast, the MIPAS bias is smaller and sometimes even zero in comparison to the instruments which have a slightly larger bias vs. ozonesondes (see left panel of Fig. 6). The interesting point is that the comparison to ozonesondes which led to the binning into the two instrument groups ends below the altitude of the ozone vmr maximum. This means that the behavior of the instruments with respect to ozonesondes can be extrapolated to larger altitudes. Furthermore, all comparisons to satellite reference instruments reveal a local maximum of the bias around 44 km, independent of the sign of the bias in this altitude region. This hints towards an artefact in MIPAS data, visible as a small bulge in the profiles in Fig. 1. The reason for this artefact is still unidentified. For a large number of reference instruments, the natural variability within the collocation radius necessary to explain the scatter if the error estimates were realistic is only about 5 %, although some of the error estimates include only measurement noise. Thus, this defines an upper limit by which the MIPAS error can be underestimated. We analyzed the latitude dependence of the residual variability at several altitude cross sections (25, 35 and 50 km) and did not find any significant artefacts that can be used for diagnostics of the quality of MIPAS ozone data. Finally, comparisons with seven data sets (MLS, SAGE, OSIRIS, as well as HALOE, HIRDLS and both SMILES data sets) agree on the estimates of about 4 % natural variability within the 5 h and 500 km collocation window between 23 and 48 km.
Figure 7 shows the scatter plots with small-biased solar occultation and limb measurements. The axes of this plot correspond to ozone volume mixing ratios derived from MIPAS and the reference instrument, and the color scale denotes the heights, as indicated at the right of the plot. In order to make all four plots comparable, we restricted the height range to the uppermost height of OSIRIS data points, 54 km. A large scatter around the straight line of unit slope through the origin indicates that the noise in one or both data sets and/or the natural variability within the chosen collocation window is important. An offset from the ideal line hints at an additive bias; a slope different from unity hints at a multiplicative bias, and a curved line is an indication of a nonlinear or altitude-dependent bias. For high ozone values, the distribution of data points is centered not exactly around the reference line but shifted below it, which confirms the high bias of MIPAS ozone data near the ozone vmr maximum. The area above the reference line around 3.5 ppmv and 50 km (most obvious for the correlation with ACE-FTS) corresponds to the local low bias of MIPAS ozone which is clearly visible in Figs. 5 and 6. Except for these issues, the data points in the scatter plots are confined to a narrow band around the reference line in all cases. One notes in Fig. 7 that the width of the distribution of data points around the reference lines appears to be larger for MLS and OSIRIS than for ACE-FTS and SAGE. However, since the number of data points is much smaller for the latter than for the former, and because this representation is not normalized with respect to the number of data points, no conclusion on errors or variability can be drawn from this. The scatter plot with OSIRIS is cut by a zero or close-to-zero line on the OSIRIS side: this reflects that in the OSIRIS processors negative vmrs are cut off, filtered or replaced by a fixed value close to zero. In contrast, the MIPAS IMK/IAA processor allows retrievals of negative vmrs, although they are unphysical, which avoids biasing the statistics.
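The reading of a scatter plot in terms of additive and multiplicative bias, as described above, can be made quantitative with a straight-line fit. The sketch below uses an ordinary least-squares fit of MIPAS against the reference; this choice of dependent variable, the synthetic numbers and the function name are assumptions for illustration, and an ordinary fit ignores the noise in the reference data.

```python
import numpy as np


def fit_offset_and_slope(ref_vmr, mipas_vmr):
    """Least-squares line mipas = slope * ref + offset.

    offset != 0 hints at an additive bias, slope != 1 at a multiplicative bias.
    """
    slope, offset = np.polyfit(np.asarray(ref_vmr, dtype=float),
                               np.asarray(mipas_vmr, dtype=float), 1)
    return slope, offset


# Synthetic data: 5 % multiplicative and 0.1 ppmv additive bias, no noise.
ref_vmr = np.linspace(0.5, 9.0, 50)
mipas_vmr = 1.05 * ref_vmr + 0.1
slope, offset = fit_offset_and_slope(ref_vmr, mipas_vmr)
```

With real collocations, a fit per altitude range would be needed to separate an altitude-dependent bias from a genuinely multiplicative one.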
A comparison of the evolution of ozone distributions over height with time as measured by MIPAS and MLS is shown in Fig. 8. One observes that MLS (upper panel) and MIPAS (middle panel) see the same atmospheric variability, which in addition is consistent with the seasonal cycle of monthly zonal mean ozone curves from a climatology comparison of the SPARC Data Initiative; cf. Fig. 6 and the left bottom panel of Fig. 8 in Tegtmeier et al. (2013). A clear seasonal cycle can be seen in the lower panel in Fig. 8, where the monthly means of percent differences of measurements for collocated pairs are shown. Note that this seasonal cycle is present in absolute as well as in relative differences with MLS and OSIRIS (see Fig. 9). This analysis was performed for reference instruments that have a known small bias, similar sampling and coverage, and sufficient time overlap: GOMOS, MLS, and OSIRIS; similar patterns (phase-shifted in the SH compared to the NH) were found in comparisons with all three instruments in all latitude bins. The reason for this seasonality in the bias is currently under investigation: it could be due to a possibly multiplicative nature of the bias or to a time dependence of the ozone vmr values themselves. It could also partly arise from systematic differences in tangent pressure and/or temperature between measurements. Note that this seasonality of the bias does not affect the trend calculation from the paper by Eckert et al. (2014) because the seasonal cycle is fitted in their regression model. The annual and latitudinal variation of the bias with MLS has also been investigated. No systematic latitudinal variations have been detected.
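One simple way to quantify a seasonal cycle in a time series of monthly mean differences is a least-squares fit of a constant plus an annual harmonic. The sketch below is a generic illustration on synthetic data; it is not the regression model of Eckert et al. (2014), and the function name and the numbers are hypothetical.

```python
import numpy as np


def fit_annual_cycle(t_years, bias):
    """Fit bias(t) = offset + b_sin*sin(2*pi*t) + b_cos*cos(2*pi*t).

    t_years : time in fractional years; bias : monthly mean differences.
    Returns (offset, amplitude) of the fitted annual harmonic.
    """
    t = np.asarray(t_years, dtype=float)
    y = np.asarray(bias, dtype=float)
    design = np.column_stack([np.ones_like(t),
                              np.sin(2.0 * np.pi * t),
                              np.cos(2.0 * np.pi * t)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    offset, b_sin, b_cos = coef
    return offset, np.hypot(b_sin, b_cos)


# Synthetic monthly series of 88 months (January 2005 to April 2012):
# a mean bias of 4 % with an annual cycle of 1.5 % amplitude.
t = np.arange(88) / 12.0
series = 4.0 + 1.5 * np.sin(2.0 * np.pi * t + 0.3)
offset, amplitude = fit_annual_cycle(t, series)
```

Applying such a fit to both the absolute and the relative differences would be one way to check whether the cycle persists after normalization, which bears on the additive versus multiplicative interpretation discussed in the text.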
Finally, five reference instruments presented here are used together with the previous version of MIPAS IMK/IAA data, V5R_O3_220, in the HARMOZ databank (Sofieva et al., 2013): ACE-FTS, GOMOS, OSIRIS, SCIAMACHY, and SMR. The bias with respect to these five data sets from HARMOZ agrees well with the analysis of the bias presented in Fig. 6 of Sofieva et al. (2013).

Comparison with Umkehr measurements
Umkehr measurements are based on zenith sky observations of solar radiation at two wavelengths in the UV part of the solar spectrum. One wavelength is strongly absorbed by ozone and the other is not. The ratio of the two is measured as a function of SZA (solar zenith angle). From these observations the optimum statistical solution is found. The vertical resolution of Umkehr ozone profiles is derived from the analysis of the averaging kernel matrix, where the full width at half maximum (FWHM) is 5 km, taking into account that the bottom layers (pressure between surface and 250 hPa) are derived as double layers. The retrievals are done on days with clear sky conditions (clear zenith). The method was developed to minimize the a priori contribution to the retrieval (Petropavlovskikh et al., 2005). The data set is also corrected for the stray light contribution, which reduces the typical offset of Umkehr profiles in the upper layers, making the data set optimized for the calculation of monthly means. Above 32 hPa the operational Umkehr retrieval is known to underestimate ozone by as much as 5-10 % when compared to the SBUV profiles (Kramarova et al., 2013). The problem is corrected in the data set used in this analysis by including estimates of the stray light contributions to the observed Umkehr measurements.
For the sake of brevity, we show here the comparison only with data points from the Boulder station (40° N), where almost daily profiles from 2005 to 2012 have been taken with a Dobson instrument. The comparison with four stations at different latitudes can be found in Laeng and Petropavlovskikh (2013).
As Umkehr has a known bias on individual profile levels, but the current retrieval algorithm is optimized for monthly mean calculations, we first compare monthly mean values from both instruments. The left side of Fig. 10 shows the monthly mean ozone values (in DU) of Umkehr and MIPAS overpasses as a function of time and atmospheric pressure in 2005-2012 at the Boulder station (40° N, 105° W). The color code on the right side represents ozone (DU) in Umkehr layers (the top pressure of a layer is half of the pressure at its bottom). The vertical axes are log10(pressure). The right side of Fig. 10 represents the absolute and relative differences between Umkehr and MIPAS profiles as a function of time and pressure. The relative differences are mostly within ±10 %, with the exception of the layer between 32 and 16 hPa, where differences are larger (±20 %). The seasonal cycle in the absolute bias, as observed in the comparison with satellite instruments, can also be observed in the comparison of MIPAS with Umkehr at the Boulder station. However, unlike in the satellite case, this seasonal cycle is less pronounced in the relative bias plots (see right column of Fig. 10). This hints that the bias vs. Umkehr is dominated by its additive component. This seasonal cycle in the absolute bias is also well pronounced in the comparison with the Syowa station situated at a high southern latitude (69° S) (Laeng and Petropavlovskikh, 2013). However, in comparisons at the Lauder (45° S) and Mauna Loa (19.5° N) stations, there is no clear indication of seasonality of the absolute bias of MIPAS with respect to Umkehr (Laeng and Petropavlovskikh, 2013).
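For readers unfamiliar with layer amounts in Dobson units, the following sketch shows a generic hydrostatic conversion of a volume mixing ratio profile into a partial column between two pressure levels. It is not necessarily the procedure used for this analysis: the constant gravity, the dry-air molar mass, the log-pressure interpolation of the profile and all names are assumptions of the example. The layer edges follow the rule quoted above that the top pressure of a layer is half of its bottom pressure.

```python
import numpy as np

N_A = 6.02214076e23   # Avogadro constant, 1/mol
M_AIR = 0.0289644     # molar mass of dry air, kg/mol
G = 9.80665           # standard gravity, m/s^2 (assumed constant with height)
DU = 2.6867e20        # molecules per m^2 in one Dobson unit


def layer_edges_hpa(p_bottom_hpa, n_layers):
    """Pressure edges of layers whose top pressure is half the bottom pressure."""
    return p_bottom_hpa / 2.0 ** np.arange(n_layers + 1)


def partial_column_du(p_hpa, vmr_ppmv, p_bot_hpa, p_top_hpa, n_sub=200):
    """Ozone partial column (DU) between two pressure levels.

    Integrates vmr over pressure (hydrostatic balance) with the trapezoidal
    rule on n_sub sub-intervals; the profile is interpolated in log(pressure).
    """
    p = np.asarray(p_hpa, dtype=float)
    v = np.asarray(vmr_ppmv, dtype=float)
    order = np.argsort(p)                               # ascending pressure for np.interp
    p_fine = np.linspace(p_top_hpa, p_bot_hpa, n_sub + 1)
    v_fine = np.interp(np.log(p_fine), np.log(p[order]), v[order])
    integral = np.sum(0.5 * (v_fine[1:] + v_fine[:-1]) * np.diff(p_fine))  # ppmv * hPa
    molecules_per_m2 = integral * 1e-6 * 100.0 * N_A / (M_AIR * G)         # hPa -> Pa
    return molecules_per_m2 / DU


# A constant 1 ppmv between 32 and 16 hPa corresponds to roughly 12.6 DU.
edges = layer_edges_hpa(32.0, 2)
column = partial_column_du([100.0, 10.0, 1.0], [1.0, 1.0, 1.0], 32.0, 16.0)
```

With mixing ratios of several ppmv in this pressure range, the resulting layer amounts are of the order of tens of DU.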
To evaluate the bias at the level of individual profiles, the distributions of individual MIPAS and Umkehr values in two Umkehr layers, layer 5 (32-16 hPa) and layer 7 (approximately 8-4 hPa), were compared (Fig. 11). The histograms have the same shapes and numbers of modes, but there is an offset in the position of the modes. MIPAS is systematically biased high with respect to the Umkehr measurements. Similar high and low biases of MIPAS in the relevant altitude ranges have not been found in comparisons with any satellite instruments (see Sect. 5). For this reason we tentatively assign the biases to the Umkehr measurements.
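One simple way to quantify an offset between the modes of two histograms is sketched below. It is an illustration under stated assumptions rather than the procedure behind Fig. 11: both samples are binned on a common grid of fixed width, only the single most populated bin of each sample is considered, and the function names and bin width are hypothetical.

```python
from collections import Counter


def mode_bin_center(values, bin_width):
    """Return the center of the most populated bin of width bin_width.

    Bins are aligned to multiples of bin_width. If several bins share
    the highest count, the lowest one is returned.
    """
    counts = Counter(int(v // bin_width) for v in values)
    index, _ = max(counts.items(), key=lambda item: (item[1], -item[0]))
    return (index + 0.5) * bin_width


def mode_offset(sample_a, sample_b, bin_width):
    """Offset between the dominant modes of two samples (a minus b)."""
    return mode_bin_center(sample_a, bin_width) - mode_bin_center(sample_b, bin_width)
```

A positive value of `mode_offset(mipas_values, umkehr_values, bin_width)` would correspond to the MIPAS distribution being shifted towards higher ozone values. Distributions with several modes would need each mode to be matched separately, which this sketch does not attempt.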

Comparison with MkIV balloon measurements
Figure 12 presents the comparison of MIPAS ozone measurements with the three MkIV balloon profiles (Toon, 1991) within the MIPAS reduced resolution period. The first two MkIV profiles, from 20 September 2005 and 22 September 2007, were measured when MIPAS was temporarily inactive, and no matches were found within 24 h and 1000 km. All three flights for which the matches were found took place in September. The profiles were hence compared to September means of MIPAS at 30-40° N latitudes.
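The coincidence criterion of 24 h and 1000 km quoted above can be sketched as a simple test on time difference and great-circle distance. The haversine formula, the mean Earth radius of 6371 km, the tuple layout `(time, latitude, longitude)`, and the function names are assumptions for illustration; the collocation code actually used for the comparison is not described in the text.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp against rounding so that asin never receives a value above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def is_match(reference, candidate, max_hours=24.0, max_km=1000.0):
    """True if candidate lies within max_hours and max_km of reference.

    reference, candidate: (datetime, latitude_deg, longitude_deg) tuples.
    """
    time_ok = abs(reference[0] - candidate[0]) <= timedelta(hours=max_hours)
    distance = great_circle_km(reference[1], reference[2],
                               candidate[1], candidate[2])
    return time_ok and distance <= max_km
```

When no candidate passes such a test, a climatological fallback such as the September means used here is the remaining option.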
For all three flights, no indication of a high MIPAS bias near the ozone vmr peak, which was observed in the comparison with satellite instruments, is found: the September means agree well with the MkIV profiles over the entire altitude range. For the profile from the sunrise of 23 September 2007, three collocated MIPAS profiles were found (green lines). For the closest of these three profiles, the maximum deviation from the MkIV profile is 0.3 ppmv at 16 km height, and near the ozone vmr peak the agreement is excellent. It should be kept in mind, though, that a few balloon flights do not provide enough statistics to obtain a significant bias estimate. Hence we can draw no conclusions as to whether this comparison corroborates the biases that were observed in the satellite comparisons.

Conclusions
Ozone vertical profiles retrieved from the MIPAS spectra with the IMK/IAA research processor, version V5R_O3_224, were compared with ozone vertical profiles from ACE-FTS, GOMOS, HALOE, HIRDLS, IASI, MLS, OSIRIS, POAM, SAGE II, SBUV, SCIAMACHY, SMILES (JAXA and NICT), and SMR, as well as with MkIV balloon profiles and Umkehr measurements. A better agreement with reference instruments than for the previous version, V5R_O3_220 (Laeng et al., 2014), was demonstrated. The high bias near the ozone vmr peak has been significantly reduced by the use of spectral information from the MIPAS band A only, three times more microwindows [...] The agreement with satellite limb and solar occultation reference instruments that have a known small bias vs. ozonesonde data (ACE-FTS, GOMOS, MLS, OSIRIS, SAGE II) is within 7 % in the lower and middle stratosphere (20-40 km) and 5 % in the upper troposphere. Around the ozone vmr peak, the agreement with most of the satellite reference instruments is within 5 %; this bias is as low as 3 % for ACE-FTS, MLS, OSIRIS, POAM and SBUV.
The agreement with HIRDLS, POAM and SCIAMACHY is typically within 7 % in the lower and middle stratosphere and 10 % in the upper troposphere. In the lower mesosphere, the best agreement (up to 22 %) is observed with GOMOS and SCIAMACHY. Near the ozone vmr peak, the agreement with ACE-FTS is better than 1.5 %, the agreement with MLS is better than 2 %, and the agreement with OSIRIS is better than 2.5 %. The bias with respect to ACE-FTS for 15-45 km is better than 3 %. Good agreement with three MkIV balloon profiles is observed. The known high bias of Umkehr data is confirmed; the agreement of monthly means is within 20 % for 32-16 hPa and within 10 % for the other altitude layers provided by Umkehr data.
The MIPAS random error estimates are approximately realistic, which confirms the earlier findings of Laeng et al. (2014).
Overall, this MIPAS data set has a small bias with respect to standard small-biased data records. Combining these results with the findings of Eckert et al. (2014), we conclude that the MIPAS data set can be used for climatological studies in an altitude range from 10 to 60 km.

Figure 2. Bias assessment of four MIPAS processors with respect to MLS from Laeng et al. (2014) with bias of V5R_O3_224 overplotted (orange curve).

Figure 3. Vertical resolution (left panel) and uncertainty estimates (right panel) of MIPAS ozone profiles from the versions V5R_O3_220 (blue lines) and V5R_O3_224 (green lines) on geolocation 20050219T181646Z.

Fig. 1 of this paper demonstrates that the previous version and the current version under validation are almost identical in the altitude range covered by ozonesondes.

Figure 4. Monthly latitudinal distributions of collocated measurements of MIPAS with reference instruments, in percent. Note that the color scales are different on each panel.

Figure 5. Bias estimation and residual variability (Eq. 6) of MIPAS ozone profiles with respect to reference instruments that are small-biased compared to ozonesondes.

Figure 6. Bias estimation and residual variability (Eq. 6) of MIPAS ozone profiles with respect to reference instruments that have a known bias compared to ozonesondes. No bias correction has been applied.

Figure 7. Scatter plots of MIPAS ozone measurements with collocated measurements from small-biased solar occultation measurements (top panels) and small-biased limb measurements (bottom panels).

Figure 9. Evolution of absolute (upper panels) and relative (bottom panels) monthly mean differences between MIPAS and the reference instrument in 2005-2012 at 30-60° N for MLS (left) and OSIRIS (right).

Figure 10. Monthly mean ozone values (in DU) of Umkehr (top left panel) and MIPAS (bottom left panel) and monthly means of relative (top right panel) and absolute (bottom right panel) MIPAS-Umkehr differences in 2005-2012 at Boulder station (#067), 40° N.

Table 2. Description of reference data sets.