IMK/IAA MIPAS temperature retrieval version 8: nominal measurements

A new global set of atmospheric temperature profiles is retrieved from recalibrated radiance spectra recorded with the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS). Changes with respect to previous data versions include a new radiometric calibration considering the time-dependency of the detector non-linearity, and a more robust frequency calibration scheme. Temperature is retrieved using a smoothing constraint, while tangent altitude pointing information is constrained using optimal estimation. ECMWF ERA-Interim is used as temperature a priori below 43 km. Above, a priori data are based on data from the Whole Atmosphere Community Climate Model Version 4 (WACCM4). Bias-corrected fields from specified dynamics runs, sampled at the MIPAS times and locations, are used, blended with ERA-Interim between 43 and 53 km. Horizontal variability of temperature is considered by scaling an a priori 3D temperature field in the orbit plane in a way that the horizontal structure is provided by the a priori while the vertical structure comes from the measurements. Additional microwindows with better sensitivity at higher altitudes are used. The background continuum is jointly fitted with the target parameters up to 58 km altitude. The radiance offset correction is strongly regularized towards an empirically determined vertical offset profile. In order to avoid the propagation of uncertainties of O3 and H2O a priori assumptions, the abundances of these species are retrieved jointly with temperature. [...] long-term drift encountered in the previous [...] The consistency between high resolution [...] and the reduced resolution [...] pronounced temperature differences between [...] and [...] data are [...] in elevated stratopause situations. The fact that the phase of temperature waves seen by MIPAS is not locked to the wave phase found in ECMWF analyses demonstrates that our retrieval provides independent information and does not merely reproduce the prior information.
Since the results presented in this paper are a team effort, the following list of author contributions is by no means comprehensive. Instead only the most prominent contributions are listed. [...] the retrieval setup, coordinated, partly performed related test calculations, contributed graphics, and [...] the final editorial responsibility for this paper. TvC wrote large parts of the text, organized related discussions and cared about TUNER compliance of error estimates. BF provided the parameterized NLTE approach and implemented the updated frequency calibration and offset correction. BF, MGC and MLP took care that the retrieval setup was developed in a way that inter-consistency with the retrieval setups of middle and upper atmospheric measurement modes was maintained. Furthermore they provided the CO2 uncertainties. MLP and BF built the a priori temperature distributions from the WACCM data. NG was responsible for spectroscopy issues, error estimation calculations, carried out some of the retrieval tests, and contributed graphics. UG provided and maintained the retrieval software including data archive. [...] further has [...] for level-1b data import and quality control. SK [...] A. [...] the retrievals and visualized the results. SK contributed graphics. AK [...] for L1-related issues. A. Laeng [...] to quality control. DM [...] the WACCM calculations. GPS identified issues to be solved and took care of quality control. All authors suggested solutions for various problems encountered during the development phase and critically discussed the results as well as the manuscript.

line of sight. The combined temperature and tangent altitude retrieval is the first step in the chain of retrievals and is preceded only by the determination of a frequency shift (see Section 3.2).

Retrieval Method
In order to put the improvements discussed later into the context of the pre-existing retrieval scheme, we recapitulate the main features of the MIPAS temperature retrieval scheme here. MIPAS spectra are analyzed with a constrained nonlinear least squares fit. The updated guess of the state vector x_{i+1} at iteration i + 1 is calculated from the previous estimate x_i as

$$x_{i+1} = x_i + \left(K^T S_{y,noise}^{-1} K + R\right)^{-1} \left[K^T S_{y,noise}^{-1} \left(y - F(x_i)\right) - R \left(x_i - x_a\right)\right] \tag{1}$$

Here K is the Jacobian containing the partial derivatives ∂y_m/∂x_n; superscript T indicates transposed matrices; S_{y,noise} is the covariance matrix representing measurement noise; R is a regularization matrix; y is the vector of measurements under consideration; F(x_i) is the vector of the respective simulated measurements, based on the Karlsruhe Optimized and Precise Radiative Transfer Algorithm (KOPRA, Stiller, 2000); and x_a is the vector of prior information on x. The vertical grid for the retrieval of the temperature profile is 0, 4 [...] fitting residuals related to an entire limb scan are minimized simultaneously rather than in sequence (Carlotti, 1988). Contrary to the original global-fit approach, which was an un-regularized maximum likelihood retrieval, we do use regularization (Section 3.3). Adequate a priori information above the uppermost MIPAS tangent altitude proved to be of particular importance (Section 3.4). Also contrary to the original global-fit method, our retrieval scheme supports consideration of horizontal variability along the line-of-sight direction (Section 3.5), where the respective element of x associated with a certain altitude is a scaling factor for the horizontal temperature distribution. Temperature is fitted jointly with a correction of the tangent altitude of the line of sight in order to minimize mutual error propagation. The temperature and tangent altitude fit uses spectral lines of CO2, because excellent prior knowledge on the vertical distribution of this gas is available, and because no rapid changes of its mixing ratios are expected.
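The iteration described above can be illustrated with a minimal sketch. It assumes the standard form of a regularized Gauss-Newton step built from the quantities named in the text (K, S_{y,noise}, R, y, F, x_a); the function name `update_step`, the toy linear forward model and all numbers are hypothetical and have nothing to do with the actual KOPRA-based processor.

```python
import numpy as np

def update_step(x_i, x_a, y, forward, jacobian, S_y_inv, R):
    """One regularized Gauss-Newton step (assumed form):
    x_{i+1} = x_i + (K^T S^-1 K + R)^-1 [K^T S^-1 (y - F(x_i)) - R (x_i - x_a)]
    """
    K = jacobian(x_i)
    residual = y - forward(x_i)
    lhs = K.T @ S_y_inv @ K + R
    rhs = K.T @ S_y_inv @ residual - R @ (x_i - x_a)
    return x_i + np.linalg.solve(lhs, rhs)

# Hypothetical toy problem: linear forward model F(x) = A x,
# three measurements, two unknowns, noise-free data.
A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
forward = lambda x: A @ x
jacobian = lambda x: A
x_true = np.array([2.0, -1.0])
y = forward(x_true)
S_y_inv = np.eye(3) / 0.1**2   # inverse noise covariance (sigma = 0.1)
R = 1e-6 * np.eye(2)           # very weak constraint towards x_a
x_a = np.zeros(2)

x = x_a.copy()
for _ in range(3):
    x = update_step(x, x_a, y, forward, jacobian, S_y_inv, R)
```

For a linear forward model the first step already reaches the constrained solution and further steps leave it unchanged; in the nonlinear case the Jacobian and the simulated spectra are re-evaluated at every iteration.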
The retrieval relies on specific parts of the spectra, called "microwindows", which contain maximum information on the target quantities but suffer least interference from gases of unknown abundance (see Section 3.6). This [...]
The fit is carried out using a maximum a posteriori scheme (Rodgers, 2000), using the mean of the spectral calibration scale over the entire FR and RR period, determined in a previous step, as a priori, with a priori variances of (0.00035 cm⁻¹)² for the FR and (0.0007 cm⁻¹)² for the RR measurements. The actual spectral correction for any wavenumber can be determined using the linear regression function.

Regularization
Corresponding to the structure of the retrieval vector x, R has a block-diagonal structure, and the choice of the regularization can be made independently for each group of variables.
In general we use a regularization term which is composed of a smoothing component R_smooth and a diagonal component R_diag:

$$R = R_{smooth} + R_{diag} \tag{2}$$

Here the diagonal component R_diag is formally equivalent to the inverse of an a priori covariance matrix without information on inter-altitude correlations. For the regularization term R_smooth, the following implementation of the altitude dependence has replaced the approach by Steck and von Clarmann (2001), which was used in our retrievals up to version 5:

$$R_{smooth} = L^T \, \mathrm{diag}(\gamma_1, \ldots, \gamma_{N-1}) \, L \tag{3}$$

Here L is an (N − 1) × N first order finite difference operator as suggested by Tikhonov (1963), Twomey (1963) and Phillips (1962), but scaled with the respective grid width to yield difference quotients. The γ-values control the altitude dependence of the strength of the regularization.
The regularization term used for the parameter temperature is R_T = R_smooth with the values of all γ_i being set to 0.49 K⁻² in the entire altitude range.
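As an illustration of such a smoothing constraint, the following sketch builds a first order difference operator scaled by the grid width and combines it with altitude-dependent weights. It assumes that the smoothing term has the form L^T diag(γ) L; the function name `smoothing_matrix` and the 1 km example grid are hypothetical, and only the value 0.49 is taken from the text.

```python
import numpy as np

def smoothing_matrix(z_km, gamma):
    """Return L^T diag(gamma) L for a first order difference operator L.

    z_km  : retrieval altitude grid (N values)
    gamma : scalar or array of N-1 weights controlling the constraint strength
    """
    z = np.asarray(z_km, dtype=float)
    n = z.size
    dz = np.diff(z)                      # N-1 grid widths
    L = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    L[idx, idx] = -1.0 / dz              # difference quotients, not plain differences
    L[idx, idx + 1] = 1.0 / dz
    g = np.broadcast_to(np.asarray(gamma, dtype=float), (n - 1,))
    return L.T @ np.diag(g) @ L

# Hypothetical 1 km grid from 0 to 10 km with a constant weight
z = np.arange(0.0, 11.0, 1.0)
R_T = smoothing_matrix(z, 0.49)
```

A constraint of this type penalizes only differences between adjacent levels: a constant offset of the whole profile is not penalized at all, which is why no a priori profile values are pulled into the result by the smoothing term alone.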

The tangent altitudes are constrained towards the line-of-sight engineering information. The respective block of R can be understood as an inverse a priori uncertainty covariance matrix describing both the relative pointing uncertainties between adjacent tangent altitudes and the absolute pointing uncertainty of the entire limb scan as a whole. The relative pointing a priori uncertainties were assumed to be 60 m in the RR measurement mode and 150 m in the FR measurement mode, in terms of 1σ standard deviations. The standard deviation of the absolute pointing uncertainty, representing the possible altitude shift of the entire limb scan, was assumed to be 900 m.

A priori temperature and trace gas distributions
In older nominal mode retrieval versions problems occurred which could be traced back to the use of inadequate a priori temperature distributions for altitudes above the uppermost MIPAS tangent altitude. Here, neither reliable analysis data are available, nor can MIPAS vertically resolve the temperature profile. Older retrieval versions used the NRLMSISE-00 climatology (Picone et al., 2002) at these altitudes. However, this climatology has systematic biases (Emmert et al., 2020) and does not capture short-term variations occurring in dynamically active episodes such as elevated stratopause events. Due to missing MIPAS measurement information at related altitudes, this error propagated into the MIPAS temperatures in the nominal scan range. These temperature retrieval errors further propagated noticeably into retrievals of trace species, e.g. ozone (Laeng et al., 2018).

For IMK/IAA MIPAS version 8 temperature retrievals ECMWF ERA-Interim analysis fields (Dee et al., 2011) were used as a priori at altitudes up to 43 km, because ERA-5 was not available for the MIPAS time period when the processing was started. A priori temperatures above 53 km are based on Whole Atmosphere Community Climate Model (WACCM, Marsh, 2011; Marsh et al., 2013) Version 4 (WACCM4) fields of a specified dynamics run, which provided output specifically for the MIPAS measurement geolocations and times. Since a specified dynamics run was used, the actual atmospheric conditions including stratospheric warming events and elevated stratopauses were sufficiently well reproduced.
The WACCM temperatures were bias-corrected using MIPAS version 5 middle and upper atmosphere measurements, which cover an altitude range of 18-102 km, but are performed less frequently (García-Comas et al., 2014). Multi-annual averages of MIPAS-collocated WACCM differences were used to construct an altitude- and latitude-dependent seasonal correction, independently for am and pm observations. Between 43 and 53 km, a smooth transition between ECMWF and bias-corrected WACCM temperatures is obtained by linear interpolation along with hydrostatic correction of pressures at the given geometric altitudes.
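The transition between the two a priori sources can be sketched as a linear blend in altitude. This is only an illustration under simplifying assumptions: the weight is taken to vary linearly between 43 and 53 km, the hydrostatic correction of the pressures mentioned above is omitted, and the function name `blend_apriori` and the sample temperatures are hypothetical.

```python
import numpy as np

def blend_apriori(z_km, t_ecmwf, t_waccm, z_low=43.0, z_high=53.0):
    """Blend two temperature profiles given on the same altitude grid.

    Below z_low the ECMWF profile is used, above z_high the (bias-corrected)
    WACCM profile, with a linear transition in between.
    """
    z = np.asarray(z_km, dtype=float)
    w = np.clip((z - z_low) / (z_high - z_low), 0.0, 1.0)   # weight of WACCM
    return (1.0 - w) * np.asarray(t_ecmwf, dtype=float) + w * np.asarray(t_waccm, dtype=float)

# Hypothetical example values (K)
z = np.array([40.0, 43.0, 48.0, 53.0, 60.0])
t_blend = blend_apriori(z, np.full(5, 250.0), np.full(5, 260.0))
```

In this example the blended profile equals the ECMWF value up to 43 km, the WACCM value from 53 km upwards, and the arithmetic mean of both at 48 km.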
CO2 distributions are imported from an SD-WACCM4-based climatology. From MIPAS V5 data, gas profiles are generated for interfering species and for the initial guess profiles of O3 and H2O, both of which are fitted jointly with the target state variables. The a priori of the latter is a zero profile, while the regularisation is of Tikhonov type.

Horizontal variability
Typically, a locally spherically symmetric atmosphere is assumed in profile retrievals. That is to say, within one profile retrieval the atmospheric state is assumed to be a function of altitude only and does not vary with latitude or longitude. Since limb measurements used for one profile retrieval cover, depending on the measurement mode, about 1600 to 2200 km in the horizontal, this horizontal homogeneity assumption is not without problems. Depending on the computational effort spent on accurate radiative transfer modelling, a fully tomographic retrieval as suggested by Carlotti et al. (2001, 2006) or Steck et al. (2005) often is beyond reach. As a first step, horizontal inhomogeneities of temperature have been considered in the trace gas retrievals since MIPAS version 4 by retrieving a horizontal temperature gradient applicable in a range of ±400 km around the tangent point (Kiefer et al., 2010). For retrievals based on level-1b spectra of version 7 onwards we go a step further and consider a full a priori 3D temperature field, generated from ECMWF ERA-Interim data, extended by NRLMSISE-00 data above 60 km.

During the retrieval, the temperatures of this 3D a priori field are scaled at each altitude in a way that the horizontal structure is provided by the a priori while the vertical structure comes from the measurements. The respective component of the retrieval vector x is the 1D vector of scaling factors. Roughly speaking, the result is a temperature profile which provides the best spectral fit under the assumption that the a priori horizontal structure of the temperature field is correct. The information on the horizontal temperature variability enters through the a priori but the vertical structure is provided by the MIPAS temperature retrievals.
Additionally, the retrieval of a horizontal gradient directly from the spectra of a single limb sequence is performed. However, the horizontal gradients are strongly regularized towards zero below 60 km, where ECMWF ERA-Interim temperature fields are available, and above 70 km, the topmost tangent altitude of MIPAS nominal measurement mode. In between, the regularization of the temperature gradient is chosen weaker in order to better exploit the information on the horizontal temperature gradient provided by the measurements.
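The scaling of the a priori field can be pictured with a small sketch. It assumes a field sampled on (altitude, horizontal position) in the orbit plane and one scaling factor per altitude; the function name `scaled_field` and all numbers are hypothetical.

```python
import numpy as np

def scaled_field(t_apriori, scale):
    """Scale each altitude level of an a priori field by one factor.

    t_apriori : array of shape (n_alt, n_horizontal), temperatures in K
    scale     : array of shape (n_alt,), retrieved scaling factors
    """
    return np.asarray(scale, dtype=float)[:, None] * np.asarray(t_apriori, dtype=float)

# Hypothetical field: two altitudes, three positions along the line of sight
t_apriori = np.array([[220.0, 222.0, 224.0],
                      [250.0, 249.0, 248.0]])
scale = np.array([1.01, 0.98])
t_field = scaled_field(t_apriori, scale)
```

Because every level is multiplied by a single factor, the ratios between horizontal positions at a given altitude are those of the a priori, while the vertical profile at any fixed position is set by the factors.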

Microwindows
The retrieval does not use the entire measurement data but only parts of the spectra which are particularly sensitive to the target species, so-called 'microwindows' (see von Clarmann and Echle, 1998, for the rationale behind this approach). For the combined temperature and tangent altitude retrieval CO2 lines are used, because the mixing ratio distribution of CO2 is well known and only weakly structured. This reduces the number of unknowns in the retrieval.
In order to have more information on temperature at high altitudes, additional microwindows were included since data version V5. [...] on the one hand implies the consideration of line mixing (omitted in previous data versions), but on the other hand allows the same microwindow selection to be used for the analysis of MIPAS nominal and middle atmosphere measurements (García-Comas et al., 2020). So apart from the increased information gain for higher altitudes, this choice will lead to a better inter-consistency between the two datasets. Depending on the tangent altitude, certain data points within a microwindow can be discarded to avoid interference by lines of species other than CO2.
[...] differences between the idealized modeled line-shapes and the true super- or sub-Lorentzian pressure broadening; and (d) the emission by non-gaseous components of the atmosphere like clouds, aerosols, volcanic ash or meteoric dust. Since these non-[...] to 33 km altitude in previous data versions and set to zero above. It turned out, however, that consideration of the background continuum up to altitudes of 58 km significantly improved the robustness of the retrievals and removed known biases in retrieved state variables. The cause of the continuum signal from high altitudes is presumably meteoric dust (Neely III et al., 2011). The relevance of a high-reaching continuum signal was first discovered by Haenel et al. (2015) in the context of the retrieval of SF6.

Only a smoothing constraint is applied to the continuum retrieval up to 58 km, without any diagonal term. Above, the continuum is regularized exclusively by a diagonal term and an a priori of zero. Formally, an individual continuum profile is retrieved per microwindow, but the continuum values are not only constrained in the altitude domain but also in the frequency domain. The latter smoothing constraint avoids unrealistic jumps of the value of the background continuum between adjacent microwindows.

Offset correction
Besides the background continuum, we also retrieve a radiance offset profile which is meant to correct the radiance zero level calibration. While the continuum is additive to the absorption coefficient and appears in the exponent of Beer's law, the offset correction is directly additive in the radiance space. When radiative transfer is linear, which is the case for high tangent altitudes, the offset correction and the background continuum cannot be distinguished and the simultaneous retrieval of both leads to a nullspace of solutions. This problem is solved by strongly constraining the background continuum to zero above 58 km, while the vertical offset profile is strongly regularized towards an empirically determined offset correction profile (Kleinert et al., 2018), which is used as a priori for the fit of the zero level correction. The actual offset per microwindow and per altitude is retrieved using both R_smooth and R_diag. The diagonal term corresponds to a variance roughly a factor of two larger than the offset uncertainty obtained by Kleinert et al. (2018), in order to account for possible unknown uncertainties. No regularization of the offset in the frequency domain has been applied, i.e., the offset can vary independently between microwindows.

Joint fit of O3 and H2O
Ideally, microwindows contain only signal of the target species and are free of any interfering signal. In general, however, such microwindows do not exist. In particular, H2O and O3 have sizeable contributions in the microwindows of the temperature retrieval. Since the temperature retrieval is the first step in the retrieval chain, no actual information on the highly variable trace gas abundances is available.
To avoid the mapping of inadequate assumptions on the actual H2O and O3 abundances, these species' mixing ratio profiles are jointly retrieved with temperature. Since the microwindows of the temperature retrieval have not been optimized for joint retrieval of H2O and O3, the resulting mixing ratios are discarded. The only purpose of this joint-fit approach is to avoid related error propagation.

Spectroscopy
The HITRAN 2016 spectroscopic database (Gordon et al., 2017) was used for CO2, whose lines provide the information on temperature and the tangent altitude, as well as for most interfering species. Exceptions are O3 and HNO3, for which the dedicated MIPAS spectroscopic database provided by Flaud et al. (2003) was used.

Non-local thermodynamic equilibrium
Typically, radiative transfer in the stratosphere is calculated assuming that the atmosphere is in local thermodynamic equilibrium (LTE). Test calculations, however, have provided evidence that the consideration of non-LTE (NLTE) populations of vibrational states involved in the contributing CO2 bands makes a difference also for temperature retrievals in the MIPAS nominal observational altitude range. The non-LTE effects are only moderate here and thus a full-blown non-LTE retrieval using all the machinery developed by Funke et al. (2005) seems undue. Instead we use a non-LTE parameterization that accounts for the temperature dependence of vibrational non-LTE populations in an approximate manner (manuscript in preparation) which is briefly explained in the following.
Considering a simple 2-level system under non-LTE conditions, upper and ground state populations n_1 and n_0, respectively, are related by

$$n_1 = \frac{P + R}{L + A} \, n_0 \tag{4}$$

with the collisional productions and losses P and L, respectively, radiative losses A, and non-thermal productions R (e.g., radiative production by solar absorption). In this equation, only P = L exp(−∆E/kT), with ∆E being the energy difference between upper and ground state, is temperature dependent. Hence, Eq. 4 can be separated into a temperature dependent term a exp(−∆E/kT) and a temperature independent term b with a = L/(L + A) n_0 and b = R/(L + A) n_0.
The radiative transfer algorithm uses population ratios r = n_NLTE/n_LTE with n_LTE = n_0 exp(−∆E/kT). Using Eq. 4 and the identity b exp(∆E/kT) = r − a, the population ratio r(T, z) can be expressed as a function of the ratio r(T_0, z) at a reference temperature T_0 as

$$r(T, z) = U(z) + \left[ r(T_0, z) - U(z) \right] \exp\left[ \frac{E(z)}{k} \left( \frac{1}{T} - \frac{1}{T_0} \right) \right] \tag{5}$$

with U = a and E = ∆E for the simple case described above. An updated version of the Generic RAdiative traNsfer AnD non-LTE population algorithm (GRANADA) (Funke et al., 2012) computes the parameter profiles U(z) and E(z) for realistic and more complex situations (i.e., multi-level systems, non-linear interactions by VV collisions, etc.), allowing for a temperature parameterization of non-LTE population ratios in a local approximation. A seasonal and latitudinal climatology of r(T_0, z), U(z) and E(z) for the local times of MIPAS ascending and descending overpasses has been calculated offline with GRANADA and is considered by the KOPRA radiative transfer model in the forward calculations (Funke and Höpfner, 2000) to estimate the non-LTE population ratios of vibrational states 01101, 02201, 10011, and 11101, involved in the observed ¹²C¹⁶O₂ bands for the actual temperatures, at each line-of-sight path segment during the retrieval iterations. This approach seems to be a fair compromise between rigor and efficiency.
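The temperature parameterization of the population ratio can be sketched as follows. The expression used here is the one that follows algebraically from r = a + b exp(∆E/kT) evaluated at T and at T_0, so it should be read as an illustration of the two-level case only; the function name `population_ratio` and the numerical values of a, b and ∆E are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def population_ratio(T, r_ref, T_ref, U, E):
    """Non-LTE/LTE population ratio at temperature T (two-level sketch).

    r_ref : ratio at the reference temperature T_ref
    U     : temperature independent part (a in the two-level case)
    E     : energy parameter in J (Delta E in the two-level case)
    """
    return U + (r_ref - U) * math.exp(E / K_B * (1.0 / T - 1.0 / T_ref))

# Hypothetical two-level system: Delta E roughly corresponds to 667 cm^-1
dE = 6.62607015e-34 * 2.99792458e10 * 667.0   # h * c[cm/s] * wavenumber[cm^-1]
a, b = 0.9, 1.0e-3
T_ref = 250.0
r_ref = a + b * math.exp(dE / (K_B * T_ref))  # direct two-level ratio at T_ref

r_220 = population_ratio(220.0, r_ref, T_ref, a, dE)
r_direct = a + b * math.exp(dE / (K_B * 220.0))
```

Here `r_220` and `r_direct` agree, which shows that tabulating r(T_0, z), U(z) and E(z) is sufficient to recover the ratio at any other temperature in this simple case.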

Numerical issues
The accuracy of the numerical integration in the radiative transfer modelling has been improved in several places. In order to achieve a more accurate numerical integration of the radiance over the field of view, five pencil beams are now used throughout, while older retrievals (up to version 5) used only three pencil beams at some tangent altitudes. Also, in order to improve the numerical accuracy, a finer wavenumber grid is used for calculation of the monochromatic absorption cross sections (0.00048828125 cm⁻¹ instead of 0.001 cm⁻¹). The convolution of the spectrum with the apodization function (Norton and Beer, 1976) now includes a wider wavenumber range. Additionally, a more conservative rejection threshold for lines so small that they are deemed not to contribute in any sizeable way to the total signal has been chosen. Further, it was found that it is advantageous to recalculate the absorption cross-sections during each iteration in the first seven layers above each tangent altitude. Formerly this costly line-by-line calculation was performed only during the first iteration and the cross-sections were re-used in all layers except the layer above the tangent altitude of the respective line of sight. However, when the temperature profile varies from iteration to iteration, the mass-weighted mean temperatures and pressures of the respective layer change, which is better accounted for by the new approach.

In the retrieval code an 'oscillation detection' has been activated which identifies failure of convergence in the sense that the iteration flips back and forth between two minima of the cost function according to [...]. In this case x_{i+1} is set to (x_{i+1} + x_i)/2, and one further iteration step is performed.
In version 8, 99.95% of the retrievals converged successfully. This is an improvement compared to versions 5 and 7, with 99.37% and 99.85% convergence rate, respectively.
The relevant sources of error are measurement noise, gain calibration, frequency calibration (spectral shift), mixing ratios of CO2, uncertainties in spectroscopic data and the spectral line shape of the instrument. We first discuss the relevant error sources of the MIPAS temperature retrieval and report the assumed input uncertainties of the error estimation. In order to comply with the TUNER (Towards Unified Error Reporting, von Clarmann et al., 2020) recommendations, we report uncertainties of chiefly random nature and systematic nature separately (Sections 4.2 and 4.3, respectively). All reported uncertainties are standard deviations (1σ).
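The averaging step applied after an oscillation has been detected can be sketched as below. The detection criterion itself is not reproduced here; the test used in this sketch (the new iterate lies much closer to the iterate before last than to the last one) is purely an assumption for illustration, as are the function name `damp_if_oscillating` and the threshold `rel_tol`.

```python
import numpy as np

def damp_if_oscillating(x_prev, x_curr, x_next, rel_tol=0.1):
    """Replace x_next by the mean of x_next and x_curr if the iteration oscillates.

    Assumed criterion (hypothetical): x_next has essentially jumped back to
    x_prev, i.e. |x_next - x_prev| < rel_tol * |x_next - x_curr|.
    Returns the (possibly damped) iterate and a flag.
    """
    step = np.linalg.norm(x_next - x_curr)
    back = np.linalg.norm(x_next - x_prev)
    if step > 0.0 and back < rel_tol * step:
        return 0.5 * (x_next + x_curr), True
    return x_next, False

# Hypothetical iterates: the third one jumps back close to the first one
x_prev = np.array([1.0, 1.0])
x_curr = np.array([3.0, 3.0])
x_damped, flagged = damp_if_oscillating(x_prev, x_curr, np.array([1.05, 1.0]))
x_kept, flagged_2 = damp_if_oscillating(x_prev, x_curr, np.array([3.5, 3.5]))
```

In the first call the iterate is replaced by the midpoint between the two states the iteration was flipping between, from which one further iteration step would then start; in the second call the iterate is left unchanged.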

Every single profile retrieval comes with a noise estimate, while parameter errors, model errors and so forth are provided as mean uncertainties for the representative conditions listed above. The total estimated error ranges from 0.5 K at 24 km and northern polar winter conditions to 2.3 K at 12 km and northern midlatitude daytime conditions.
In general the uncertainties are small in the lower stratosphere and then slowly increase towards higher altitudes (see Figs. 1-2). They also increase towards the tropopause region, and exhibit a strong increase below. This explains, together with the variation of the tropopause altitude, why errors for a given altitude in the tropopause region were found to vary strongly between different limb scans. The retrieval proves to be particularly susceptible to errors just above the lowermost tangent altitude used.
This illuminates why error estimates for northern and southern nighttime midlatitudes (middle panels in Fig. 2) differ so much in the tropopause region. This difference is merely caused by the different fraction of useful (not cloud-contaminated) limb scans reaching down into the troposphere. This behaviour is also seen in Figure 3 which shows the propagation of measurement noise into the retrieved temperatures. High uncertainties are found just above the lowermost tangent altitude used (red solid line). The implication for the data user is that error estimates in the tropopause region can be regarded as fairly reliable in a statistical sense but can deviate for single profiles as described above.

Error sources
Following the terminology of von Clarmann et al. (2020)

Measurement errors
The following measurement errors were found to make a sizeable contribution to the overall error budget: measurement noise, gain calibration error, instrument line shape uncertainty, and frequency calibration (spectral shift) uncertainties. Error propagation was performed using linear theory, applied to forward radiative transfer. The propagation of measurement noise was evaluated by means of Eq. 20 of von Clarmann et al. (2020), while the propagation of other measurement errors was estimated on the basis of sensitivity studies for the given atmospheric conditions.
Measurement noise, as estimated from the imaginary part of the spectra, is reported in the level-1b data. In the spectral region used for the temperature retrievals, values are in the range 15-33 nW/(cm² sr cm⁻¹) after apodization.

Gain uncertainties were estimated from scaling ratios between overlapping channels deduced from dedicated IF16 measurements over the mission (see Fig. 11 of Kleinert et al. (2018)). They are estimated to be 1.4% during the FR period and 1.1% during the RR period. For the instrument line shape errors we used the estimates of modulation loss through self-apodization and its uncertainties, as presented by Hase (2003).
Although a spectral shift correction is carried out in a step preceding the combined temperature and pointing retrieval (see Section 3.2), a residual frequency calibration error is considered. It is estimated as the root-mean-square difference between the frequency corrections obtained from the shift retrievals and the linear regression line of these spectral shifts over wavenumber.
The resulting uncertainty is 0.00029 cm⁻¹.
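The estimate of the residual frequency calibration error can be illustrated by a short sketch: a straight line is fitted to the retrieved shifts as a function of wavenumber, and the root-mean-square scatter about this line is evaluated. The function name `residual_shift_rms` and all sample values are hypothetical and are not taken from the MIPAS processing.

```python
import numpy as np

def residual_shift_rms(wavenumber, shift):
    """RMS difference between spectral shifts and their linear regression line.

    Returns (rms, slope, intercept); units follow the inputs (e.g. cm^-1).
    """
    nu = np.asarray(wavenumber, dtype=float)
    s = np.asarray(shift, dtype=float)
    slope, intercept = np.polyfit(nu, s, 1)     # highest power first
    resid = s - (slope * nu + intercept)
    return float(np.sqrt(np.mean(resid**2))), float(slope), float(intercept)

# Hypothetical shifts: a linear trend plus a scatter of +/- 2e-4 cm^-1
nu = np.array([700.0, 800.0, 900.0, 1000.0])
scatter = 2.0e-4 * np.array([1.0, -1.0, -1.0, 1.0])
shift = 1.0e-6 * nu + 1.0e-4 + scatter
rms, slope, intercept = residual_shift_rms(nu, shift)
```

The scatter pattern in this example is orthogonal to both a constant and a linear trend, so the fit recovers the trend exactly and the RMS equals the imposed scatter amplitude.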
Uncertainties in pointing and radiance offset (zero calibration) were not explicitly considered in the error estimation, because these quantities were simultaneously retrieved with temperature.

Parameter errors
During the temperature retrieval, the concentrations of all interfering gases except O3 and H2O are assumed to be known and treated as parameters. In preceding MIPAS retrievals, climatological distributions of these interfering gases were used for this purpose. Accordingly, the climatological variability determined the uncertainty. For MIPAS version 8 retrievals, results from preceding MIPAS data processing were already available and could be used as estimates of the actual concentrations. The respective uncertainties reduce to the uncertainties of the preceding retrieval. Resulting temperature uncertainties are below 0.1 K for all interfering species that were not jointly fitted.
Since CO2 lines are used for the temperature retrieval, results are deemed particularly sensitive to the assumed CO2 mixing ratios. These were taken from the WACCM4 runs described in Section 3.4. Respective estimated 1σ uncertainties are reported in Table 3. In the troposphere and stratosphere, these are based on considerations of CO2 uncertainties according to the IPCC Fifth Assessment Report and uncertainties due to the seasonal variability of CO2. Above the middle mesosphere, they were estimated after comparisons between WACCM CO2 and measurements from space, mainly SABER and ACE, as shown in López-Puertas et al. (2017).
[...] We further assume that these uncertainties are fully correlated between different lines. We concede that these assumptions can be challenged. However, since we report the temperature uncertainties caused by spectroscopic uncertainties separately, data users endowed with a different degree of optimism can easily rescale the resulting error estimates. Uncertainty estimates provided along with the spectroscopic data compilation by Flaud et al. (2003) appear to be less optimistic than ours. However, preliminary validation results do not support the resulting larger temperature bias.
For former MIPAS temperature data, uncertainties due to the neglect of non-local thermodynamic equilibrium and unaccounted horizontal variability of the atmospheric state were reported. These error sources are not considered here, because non-local thermodynamic equilibrium effects and the horizontally varying atmosphere are explicitly modeled (see Sections 3.11 and 3.5).

Random errors
Random errors are errors which explain the standard deviation of the differences between measurements of the same state variable by two different instruments. The main sources of random errors of MIPAS temperature are measurement noise, spectral shift, gain calibration uncertainties, and the uncertainties of CO2 mixing ratios. Measurement noise is random by its nature. Spectral shift originally has a more systematic characteristic, but the residual frequency calibration error after correction is random. According to our definition, gain calibration uncertainties are also random. While they are obviously systematic within one gain calibration period, they contribute in the long run rather to the standard deviation of differences between measurement systems than to the bias. Similar considerations apply to the uncertainties in CO2 mixing ratios, which we consider as random, although they are presumably positively correlated among subsequent measurements. The adequacy of this classification of uncertainties into random and systematic components will be critically tested in a dedicated validation study. None of the other random error components, e.g., mixing ratios of interfering species, makes a sizeable contribution to the error budget.
For most atmospheric conditions and altitudes, the random temperature uncertainty varies between 0.4 and 0.8 K. Occasional excursions up to 1.3 K are encountered above 60 km altitude (Tables A1-A9 and Figures 1-2).
As a rule of thumb, measurement noise is, everything else unchanged, larger for colder and smaller for warmer atmospheres. For the other random error components, no such simple dependence of the error on the atmospheric state can be provided.
For some applications the error covariances are relevant. These depend both on the structure of the Jacobian of the inverse problem and on the covariances of the ingoing uncertainties. While it is hard to fully quantify the latter, we present a sample error correlation matrix which characterises the former in the Appendix (Table B1). The correlation matrix allows the construction of an approximate covariance matrix for any given retrieval noise.
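How a covariance matrix can be built from a correlation matrix and a profile of noise standard deviations is shown in the following sketch. The function name `covariance_from_correlation` and the two-level example numbers are hypothetical; they do not reproduce Table B1.

```python
import numpy as np

def covariance_from_correlation(corr, sigma):
    """Covariance S_jk = corr_jk * sigma_j * sigma_k."""
    sigma = np.asarray(sigma, dtype=float)
    return np.asarray(corr, dtype=float) * np.outer(sigma, sigma)

# Hypothetical example: two altitudes, noise of 0.4 K and 0.8 K, correlation 0.5
corr = np.array([[1.0, 0.5],
                 [0.5, 1.0]])
sigma = np.array([0.4, 0.8])
S_noise = covariance_from_correlation(corr, sigma)
```

The diagonal of the result contains the variances (0.16 K² and 0.64 K² here) and the off-diagonal elements the covariances (0.16 K²).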

Systematic errors
Systematic errors are, regardless of their origin, errors which explain the bias between measurements of the same state variable 430 by different instruments observing the same part of the atmosphere. The main sources of systematic error in MIPAS temperatures below the mid-mesophere are uncertainties in spectroscopic data and instrument line shape uncertainties. To classify these as systematic is admittedly an idealization, because the actual conditions will somehow modulate the actual resulting errors; e.g., the impact of the uncertainty of the line intensity of an interfering gas depends on the abundance of the interfering species, which may vary randomly. Since, however, the CO 2 lines chosen for the temperature retrieval are strong lines and 435 only weakly interfered by transitions of other species, this random modulation of systematic errors is deemed negligible and the classification of related temperature uncertainties as chiefly systematic seems justified.
The other source of systematic error in MIPAS temperatures is uncertainties in the instrument line shape. Since the same set of coefficients is used for all measurements, this error is of clearly systematic nature. However, it must be kept in mind that modulations of the related initially systematic error by the variable sensitivity of the retrieval that depends on the actual state of the atmosphere will generate a certain random component.
At all altitudes except the uppermost ones, the error budget is dominated by these systematic errors. With this in mind, it is a particularly grave deficit that uncertainties in spectroscopic data are so vaguely characterized with respect to their confidence limits and correlation characteristics.

Since our retrieval decomposes the inverse problem profile by profile, vertical correlations of measurement noise are represented by the respective covariance matrix. Related correlation coefficients are listed in Table B1. Errors due to spectral shift are expected to be almost fully correlated in the altitude domain because the frequency calibration correction is performed individually for entire limb scans. Since frequency calibration corrections are constrained towards the long-term mean, a positive error correlation in the time domain also has to be expected.

As stated above, positive correlations are expected for gain calibration errors of measurements recorded within one gain calibration period. This leads to positive error correlations in altitude and between subsequent limb scans. The typical length of a gain calibration period is one day, occasionally two days. Errors due to uncertain mixing ratios of CO2 are also expected to be correlated in altitude and between subsequent limb scans. Correlation lengths depend on the actual spatial and temporal extension of the CO2 anomalies.

Averaging kernels and vertical resolution
The vertical resolution of the temperature profiles, estimated as the full width at half maximum of the respective row of the averaging kernel matrix, varies around 3 km in the altitude range up to 40 km (Fig. 4). Above, it gradually deteriorates towards 7 km at 70 km. A local maximum of vertical resolution values of approx. 3.3 km is typically found at the tropical tropopause layer (around 15 km altitude) and is attributed to particularly cold temperatures. The actual values of the vertical resolution are provided for each limb scan along with the data on the MIPAS data server (http://www.imk-asf.kit.edu/english/308.php). The averaging kernels are generally well-behaved in the sense that they peak at their nominal altitude. That is to say, the temperature retrieval at altitude z is most sensitive to the true temperature at altitude z. Further, the kernels are fairly symmetric. This rules out major information displacement by the retrieval. The pronounced side-wiggles are a typical feature of a retrieval on a grid that is much finer than the measurement grid. This does not point to a weakness of the retrieval set-up. Instead, the often smoother averaging kernels of retrievals on coarser retrieval grids just do not represent these features because the Jacobians do not resolve them. If the column of an averaging kernel matrix is understood as the response of the retrieval to a delta perturbation of the true profile, the so-called delta perturbation on a coarse grid perturbs a much wider part of the atmosphere and thus is not comparable to our fine-grid averaging kernels.
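The full width at half maximum of an averaging kernel row can be estimated along the following lines. This is a sketch under stated assumptions, not the processor's implementation: the function name is hypothetical, the half-maximum crossings are found by linear interpolation, and only the main peak is evaluated so that side-wiggles do not enter the width.

```python
import numpy as np

def fwhm_of_kernel_row(z, row):
    """Full width at half maximum of the main peak of one averaging kernel row.

    z   : altitude grid in km, strictly increasing
    row : averaging kernel row on that grid
    """
    z = np.asarray(z, dtype=float)
    row = np.asarray(row, dtype=float)
    k = int(np.argmax(row))
    if row[k] <= 0.0:
        raise ValueError("row has no positive peak")
    half = 0.5 * row[k]

    # Walk away from the peak until the row drops to or below half maximum.
    lo = k
    while lo > 0 and row[lo] > half:
        lo -= 1
    hi = k
    while hi < row.size - 1 and row[hi] > half:
        hi += 1

    def crossing(i_out, i_in):
        # i_out lies at or below half maximum, i_in above it.
        if row[i_out] > half:  # peak truncated by the edge of the grid
            return z[i_out]
        frac = (half - row[i_out]) / (row[i_in] - row[i_out])
        return z[i_out] + frac * (z[i_in] - z[i_out])

    return crossing(hi, hi - 1) - crossing(lo, lo + 1)

# Toy kernel row on a 1 km grid (made-up shape, not a MIPAS kernel).
z = np.arange(0.0, 71.0, 1.0)
row = np.exp(-0.5 * ((z - 30.0) / 1.3) ** 2)
width = fwhm_of_kernel_row(z, row)
```

Because of the linear interpolation, the result for a smooth kernel on a coarse grid is only approximate.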

Temperature differences with respect to previous data versions
Preceding versions of MIPAS temperature data were already quite a mature and well validated data product (e.g., Wang et al., 2004, 2005). It has already been shown that MIPAS sees the expected temperature features in the middle atmosphere (e.g., von Clarmann et al., 2009). Thus, it does not come as a surprise that for most parts of the atmosphere, the differences between the new improved temperature data and the previous ones are small (Fig. 5). Major differences are observed only near the stratopause and above. These are attributed to the use of the extended set of microwindows (see Section 3.6) and to the new WACCM-based prior information (see Section 3.4), which is expected to represent the actual conditions much better than the MSIS-based climatology used before.
In this Section we concentrate on improvements with respect to the previous data version for cases where problems with the older data had already been identified, and we investigate to which degree MIPAS provides additional information with respect to pre-existing knowledge on temperature and line-of-sight pointing.

Drifts
The technical aspects of the drifts in MIPAS data due to detector aging have already been discussed in Section 2. Here we assess to which degree the revised non-linearity correction in the level-1b processing succeeded in reducing related drifts in temperature. Figure 6 shows a time series of temperature differences between MIPAS version 8 and version 5 data. The altitude range of this example is 35 to 40 km and the latitudinal coverage is global. Results for other altitudes are similar.
A relative drift between the data sets is obvious, and comparison with the data by McLandress et al. (2015) clearly suggests that the temporal development of the MIPAS V8 data is more realistic than that of the V5 data. This means that the new MIPAS non-linearity correction successfully reduces the negative temperature drift.
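A relative drift of this kind can be quantified, for instance, by a least-squares linear fit to the time series of differences. The sketch below is purely illustrative: the text does not state how the drift was evaluated, the function name is hypothetical, and the series is synthetic rather than MIPAS data.

```python
import numpy as np

def relative_drift(t_years, diff_k):
    """Least-squares linear trend of a difference time series.

    Returns (slope in K per year, offset in K at the first valid time).
    """
    t = np.asarray(t_years, dtype=float)
    d = np.asarray(diff_k, dtype=float)
    ok = np.isfinite(t) & np.isfinite(d)  # skip gaps in the record
    t0 = t[ok][0]
    # Fitting against time relative to t0 keeps the problem well conditioned.
    slope, offset = np.polyfit(t[ok] - t0, d[ok], 1)
    return slope, offset

# Synthetic monthly series of V8 minus V5 differences (made-up numbers).
t = 2005.0 + np.arange(84) / 12.0
diff = 0.05 * (t - 2005.0) + 0.1
slope, offset = relative_drift(t, diff)
```

A simple straight-line fit ignores autocorrelation and seasonal structure in the differences, so its formal uncertainty would tend to be too optimistic.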

In time series of MIPAS V5 data products, jumps in atmospheric state variables can often be seen between the full spectral resolution period (2002-2004) and the reduced spectral resolution period (2005-2012). Although methodical development work was never targeted at removing these jumps, as a side effect of other retrieval optimization work, the full resolution and the reduced resolution datasets have become much more consistent with each other in the sense that these jumps are now largely reduced.
An illustration of this inconsistency problem is given in Fig. 7. The top row shows monthly temperature means of V8 data in 10° bins for FR (July 2003) and RR (July 2009) data. There is no obvious inconsistency. However, the lower row of Fig. 7 shows that the differences between V8 and V5 monthly mean data clearly differ for the full resolution data (i.e., V8 minus V5 for the FR measurement period, lower left) and the reduced resolution data (V8 minus V5 for RR, lower right).
To further clarify this inconsistency, the difference between reduced resolution and full resolution monthly mean data was calculated separately for data versions V5 and V8. From Figure 8 it is obvious that the differences in V5 (left panel) are much more pronounced than those in V8. The structure of the remaining differences in V8 can also be seen in the V5 differences, suggesting that this is a real atmospheric feature, since mean temperatures of July 2009 and 2003 can be expected to differ somewhat. The result of this analysis is that our V8 data is much more consistent between the MIPAS FR and RR measurement periods than preceding data versions.

The dependence of retrieved temperatures above about 60 km on the prior information is caused by the fact that MIPAS cannot resolve the shape of the temperature profile above the highest tangent altitude. This problem has motivated us to replace at these altitudes the climatological NRLMSISE-00-based prior information with prior information from a debiased specified dynamics WACCM run (see Section 3.4). As a test case, an elevated stratopause event in February 2009 was chosen. A discussion of this episode and independent evidence of this event are reported, e.g., in Funke et al. (2017). The onset of this event was in the beginning of February, and in the second half of February the temperature anomaly reached altitudes relevant to MIPAS retrievals. Figure 9 shows the difference between V8 and V5 temperature for February 20, 2009. The different behaviour of the retrievals is evident. Globally, differences in the data versions are confined to altitudes above 60 km and occasionally exceed 5 K. Here the positive temperature differences hint at too low temperatures in version 5 even at altitudes where MIPAS has measurement information. This is a result of error correlations with altitudes above about 68 km where the retrieval has to rely on the shape of the a priori profile. The too cold temperatures in V5 (showing up as positive differences V8-V5) compensate the too warm a priori-driven temperatures above 70 km to best fit the measured radiance signal.
At northern polar latitudes the inclusion of the new a priori information, which better reflects the actual conditions, is more drastic. The warm region above 70 km is not represented by the V5 NRLMSISE-00-based a priori, and this error propagates downward to 40 km, showing up as temperature oscillations with too warm temperatures in V5 (negative V8-V5 difference) around 50 km and too cold temperatures (positive V8-V5 differences) around 42 km altitude. In summary, the new data version better represents this event not only at altitudes above the uppermost tangent altitude (around 68 km) but also below, because the inadequate temperature profile above the uppermost tangent altitude in V5_T_221 triggered, via error correlations, temperature errors also at altitudes where MIPAS is able to resolve the temperature profile.

https://doi.org/10.5194/amt-2020-459 Preprint. Discussion started: 16 December 2020. © Author(s) 2020. CC BY 4.0 License.

5.3 Case study on differences with respect to ECMWF temperatures: Temperature waves
MIPAS is able to reveal structures in temperature profiles independently of the a priori information. We demonstrate this by two examples of features in temperature profiles which might be attributed to gravity waves. In the left panels of Fig. 10, temperature profiles for MIPAS retrievals (black lines) and the corresponding ECMWF-based a priori (red lines) are shown, while the right panels show the differences of temperature profile minus the respective vertically smoothed temperature profile for retrieval and a priori data. Smoothing is done with a boxcar of 10 km width.
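The separation of the wave-like residual from the smooth background can be sketched as follows. This is a minimal illustration, assuming a regular altitude grid and a boxcar that is truncated near the profile ends; the edge treatment, the function name, and the synthetic profile are assumptions, not details given in the text.

```python
import numpy as np

def wave_residual(z_km, temp_k, width_km=10.0):
    """Temperature profile minus its boxcar-smoothed version."""
    z = np.asarray(z_km, dtype=float)
    t = np.asarray(temp_k, dtype=float)
    if z.size < 2 or z.size != t.size:
        raise ValueError("need matching z and temperature arrays of length >= 2")
    dz = z[1] - z[0]  # regular grid spacing assumed
    half = int(round(0.5 * width_km / dz))  # levels on each side of the centre
    smooth = np.empty_like(t)
    for i in range(t.size):
        lo = max(0, i - half)
        hi = min(t.size, i + half + 1)
        smooth[i] = t[lo:hi].mean()  # window is truncated at the profile ends
    return t - smooth

# Synthetic profile: linear background plus a 6 km wave (made-up numbers).
z = np.arange(10.0, 61.0, 1.0)
background = 220.0 + 0.5 * (z - 10.0)
wave = 2.0 * np.sin(2.0 * np.pi * z / 6.0)
residual = wave_residual(z, background + wave)
```

Applying the same function to the retrieved profile and to the a priori profile gives two residual profiles whose vertical wavelength and phase can then be compared.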
We rule out that the retrieved wave structure is a numerical artefact of the retrieval caused by too weak regularization, because the MIPAS result agrees well with the ECMWF ERA-Interim analysis, which shows very similar structure. The upper panels in Figure 10 show an example. The example shown in the lower panels demonstrates that MIPAS is able to retrieve such structures independently of the a priori information. There (as in many other cases) we find wave structures in both datasets, MIPAS and ECMWF analyses, with similar vertical wavelength but different phase. The retrieval scheme chosen does not employ any mechanism that would be able to map a vertically shifted structure in the prior information onto the result. Therefore these results prove that structures in vertical profiles, and in particular these wave structures, are independent MIPAS measurement information.

Contrary to other MIPAS data processors (Raspollini et al., 2013; http://www.atm.ox.ac.uk/MORSE/), the IMK/IAA processor retrieves the pointing information in terms of tangent altitudes from the spectra, using the engineering information as a Bayesian constraint only, but not as a hard constraint (see Section 3.3). The comparison between the retrieved data and the level-1b engineering information has been used in the past to characterize the MIPAS pointing, and to improve the algorithm involved in the calculation of the line of sight (Kiefer et al., 2007). Meanwhile several improvements of this algorithm have been implemented, and now the comparison reveals the following:
1. The engineering information on the tangent altitudes has changed in a noticeable manner between data versions V5 and V8. Mean changes between engineering tangent altitudes exceeded 600 m at most altitudes.
2. Mean differences in retrieved tangent altitudes (V8-V5) are smaller than about 100 m at altitudes below 40 km and steadily increase above to values of 600 m at 60 km altitude.
5. No discernible latitude dependence was found in these differences.
6. These results confirm that indeed quite independent tangent altitude information is retrieved by the IMK/IAA MIPAS processor and that the retrieval is not over-constrained towards the engineering information.

and (f) a radiance offset correction for each microwindow and each tangent altitude. Beyond new level-1b radiance spectra, improvements with respect to older data versions refer to the following upgrades of the retrieval scheme: The frequency calibration correction scheme is made more robust. Additional microwindows were included to obtain more information from high altitudes. A non-LTE parameterization that accounts for the temperature dependence of vibrational non-LTE populations in an approximate manner has been adopted. Better temperature a priori information is used for higher altitudes, taking the actual conditions better into account. Trace gas mixing ratios from previous MIPAS data versions are used to model the contributions of interfering species. An empirical background continuum is retrieved to altitudes up to 58 km instead of 32 km only. An improved offset calibration correction has been used. Due to their significant contribution to the signal in CO2 microwindows, mixing ratios of O3 and H2O were jointly fitted. Forward calculations were based on updated spectroscopic data. A TUNER-compliant error budget is provided.

The developments described above led to the following improvements in the MIPAS temperature data: The drift caused by the non-linearity correction applied in the course of the radiance calibration has been reduced. Results from the high spectral resolution period (2002-2004) and the reduced spectral resolution period (2005-2012) are now more consistent. Temperature profiles for situations where the temperature profile above the altitude range covered by MIPAS tangent altitudes deviates strongly from the climatological mean, e.g., elevated stratopause situations, are now much more realistic. Compared to previous data versions, a larger fraction of the retrievals converged. We have shown that, although ECMWF ERA-Interim temperature fields are used to constrain the temperature retrievals, vertical temperature wave information can be retrieved which is independent of the prior information used.
The further evaluation of MIPAS version 8 temperatures is deferred to a dedicated validation study. This work is confined to measurements recorded in nominal and UTLS measurement modes. Temperature retrievals from spectra recorded in the middle and upper atmospheric measurement modes are reported in a companion paper by García-Comas et al. (2020).
Acknowledgements. Spectra used for this work were provided by the European Space Agency. We would like to thank the MIPAS Quality Working Group for enlightening discussions, and Claus Zehner for particularly helpful support. This study was partly funded by DLR under contract number 50EE1547 (SEREMISA). The IAA team was supported by MCIU under projects ESP2017-87143-R and PID2019-110689RB-

For some applications the error covariances are relevant. In Table B1 we present a sample error correlation matrix. The error correlation matrix is a covariance matrix of retrieval noise in which each element is divided by the product of the two corresponding standard deviations. The result is a matrix of correlation coefficients that can be used to construct an approximate covariance matrix for any given retrieval noise.

implemented the updated frequency calibration and offset correction. BF, MGC and MLP took care that the retrieval setup was developed in a way that inter-consistency with the retrieval setups of middle and upper atmospheric measurement modes was maintained. Furthermore they provided the CO2 uncertainties. MLP and BF built the a priori temperature distributions from the WACCM data. NG was responsible