Level 1b error budget for MIPAS on ENVISAT

The Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) is a Fourier transform spectrometer that measures the radiance emitted by the atmosphere in limb geometry in the thermal infrared spectral region. It was operated on board the ENVISAT satellite from 2002 to 2012. Calibrated and geolocated spectra, the so-called level 1b data, are the basis for the retrieval of atmospheric parameters. In this paper we present the error budget for the level 1b data of the most recent data version 8 in terms of radiometric, spectral, and line of sight accuracy. The major changes of version 8 compared to older versions are also described. The impact of the different error sources on the spectra is characterized in terms of spectral, vertical, and temporal correlation, because these correlations have an impact on the quality of the retrieved quantities. The radiometric error is on the order of 1 to 2.4 %, the spectral accuracy is better than 0.3 ppm, and the line of sight accuracy at the tangent point is around 400 m. All errors are well within the requirements, and the achieved accuracy allows atmospheric parameters to be retrieved from the measurements with high quality.

The basis for the retrieval is formed by spectrally and radiometrically calibrated and geolocated spectra, the so-called level 1b data. The quality of these data is essential for the quality of the retrieved species, and a good error estimate is required in order to estimate the precision and accuracy of the retrieved atmospheric parameters (see, e.g., Blumstein et al., 2007; Jarnot et al., 2006).
In this paper, we give an overview of the quality of the MIPAS level 1b data. We investigate the different error sources and quantify the precision and accuracy of the calibrated spectra. The different types of errors are discussed, and the errors are characterized in terms of spectral and vertical correlation as well as correlation in time. The latter is very important for trend analyses. In Sects. 2 and 3, an overview of the instrument and the level 0 to 1b processing is given, respectively. The following sections treat the different error sources and discuss measurement noise (Sect. 4), radiometric accuracy (Sects. 5 and 6), spectral accuracy (Sect. 7), and line of sight accuracy (Sect. 8). All error sources are summarized in Sect. 9, and they are characterized in terms of spectral, vertical, and temporal correlation.

The MIPAS instrument
The heart of the instrument is a Michelson-type interferometer with two input and two output ports. It allows two-sided interferograms with a maximum optical path difference (MOPD) of up to ±20 cm to be measured. One input port receives radiation from the atmosphere, while the second input port looks at a cold plate of high emissivity cooled to 70 K. Each output port is equipped with four detectors (A1 to D1 and A2 to D2 for the two ports, respectively) covering the spectral range from 685 to 2410 cm⁻¹. The spectra from the eight detectors are combined into five spectral bands (denoted A, AB, B, C, and D in Fig. 1) in the level 1b product. The spectral coverage of the individual detectors is different for the two ports (see Fig. 1) in order to ensure full spectral coverage even if one detector fails. Channel A2, which is optimized for the spectral range of band A, also covers the range of band AB, and channel B1, which is optimized for band AB, also covers band B. The long-wavelength channels A1, A2, B1, and B2 use photoconductive mercury cadmium telluride (MCT) detectors, while photovoltaic MCT detectors are used in the short-wavelength channels C1, C2, D1, and D2.
In nominal measurement mode, the instrument looks in the rearward direction in limb geometry (Fig. 2). The altitude of the tangent point corresponds to the center of the instrument's instantaneous field of view (IFOV). The MIPAS IFOV size is 0.0523° (in elevation) × 0.523° (in azimuth), which is roughly equivalent to 3 km (vertically) × 30 km (horizontally) at the tangent point.
One interferogram, taken at one tangent altitude, is called a sweep, and a set of sweeps taken at different tangent altitudes is called an altitude scan or simply a scan. The movement of the interferometer mirrors changes direction from one sweep to the next. One scan is always composed of an odd number of sweeps (17 in Fig. 2), such that the same tangent altitude is sampled with the opposite sweep direction from one altitude scan to the next. The two sweep directions are named forward and reverse, respectively.
MIPAS is equipped with an internal blackbody. For radiometric calibration, the instrument points towards the internal blackbody or into deep space, i.e., at a tangent altitude of about 210 km. The radiometric gain is determined from pairs of blackbody and deep space measurements on a daily basis, and additional deep space measurements are performed for offset determination several times per orbit. In order to enhance the signal-to-noise ratio, several spectra are co-added for the calibration measurements. Gain and offset are determined individually for the two sweep directions of the interferometer. For more details on the instrument, see Fischer et al. (2008).
The MOPD, and with it the spectral resolution, was modified during the mission. From 2002 until March 2004, the full optical path difference of ±20 cm, corresponding to a spectral sampling of 0.025 cm⁻¹, was applied for atmospheric measurements. Radiometric calibration measurements were performed with a reduced MOPD of ±2 cm. Due to increasing anomalies in the velocity of the interferometer drive unit, measurements were suspended in March 2004. In order to minimize the risk of an instrument failure, the following measurements were taken with an MOPD of ±8.2 cm, identical for atmospheric and calibration measurements. The interferograms are cut to a length of ±8.0 cm during level 1b processing, corresponding to a spectral sampling of 0.0625 cm⁻¹. After a short test phase in August 2004, measurements were resumed with the reduced MOPD at the beginning of 2005. The shorter measurement time (1.8 instead of 4.5 s per interferogram) was used to increase the vertical sampling from 17 to 27 sweeps per scan in nominal mode, leading to an optimized trade-off between spectral and spatial resolution. The first measurement period, with full spectral resolution, is referred to as the full resolution (FR) period, and the period with reduced MOPD as the optimized resolution (OR) period.
The change of the spectral resolution in 2004 also required an adaptation of the radiometric calibration measurements. A new trade-off between measurement time and noise in the calibration data had to be found (Kleinert and Friedl-Vallon, 2004). Table 1 lists the main characteristics of the calibration measurements for FR and OR measurements.
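The spectral sampling values quoted for the two mission phases are consistent with the usual Fourier transform spectrometer relation, sampling = 1/(2 × MOPD). The following minimal sketch only reproduces that arithmetic; the function name is ours and is not part of the level 1b processor.

```python
def spectral_sampling(mopd_cm: float) -> float:
    """Spectral sampling interval (cm^-1) for a two-sided interferogram
    recorded out to +/- mopd_cm, using the relation 1 / (2 * MOPD)."""
    if mopd_cm <= 0:
        raise ValueError("MOPD must be positive")
    return 1.0 / (2.0 * mopd_cm)


# Full resolution: interferograms recorded to +/-20 cm.
fr_sampling = spectral_sampling(20.0)
# Optimized resolution: interferograms cut to +/-8.0 cm in level 1b processing.
or_sampling = spectral_sampling(8.0)
print(fr_sampling, or_sampling)
```

Note that the OR value follows from the ±8.0 cm length after cutting, not from the ±8.2 cm that is actually recorded.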

Level 1b processing
The measured signal undergoes several processing steps onboard before being sent to ground. The onboard processing includes numerical filtering and decimation, bit truncation, and packetizing. Furthermore, the signals of C1 and C2 as well as D1 and D2 are equalized (to match detector responses) and averaged onboard to bands C and D, respectively. The main steps of the level 1b processing are summarized below. A more detailed description of the level 1b processing is given in Kleinert et al. (2007) and Lachance et al. (2013).
Spike detection and correction. In case of spikes in the interferograms due to cosmic rays or transmission errors, the affected interferograms are either discarded or the spikes are corrected by a simple correction algorithm. The values of the affected data points are divided by 2 until they are below a threshold defined by the adjacent points not affected by the spike. Calibration data with spikes are discarded; scene data are corrected.
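The halving rule described above can be sketched as follows. The text does not specify how the threshold is derived from the adjacent points, so the sketch assumes, purely for illustration, that it is the larger magnitude of the two immediate neighbors; the function name and the sample values are hypothetical.

```python
def correct_spike(samples, index):
    """Halve the spiked sample until it no longer exceeds a threshold.

    Assumption (not from the source): the threshold is the larger magnitude
    of the two directly adjacent, unaffected samples.
    """
    threshold = max(abs(samples[index - 1]), abs(samples[index + 1]))
    if threshold <= 0:
        raise ValueError("neighbors give no usable threshold")
    value = samples[index]
    while abs(value) > threshold:
        value /= 2.0  # repeated division by 2, as described in the text
    return value


# Hypothetical interferogram excerpt with a spike at position 2.
excerpt = [1.0, 1.5, 40.0, 1.2, 0.9]
fixed = correct_spike(excerpt, 2)
print(fixed)
```

With these example numbers the spike is halved five times (40, 20, 10, 5, 2.5, 1.25) before it falls below the neighbor-based threshold of 1.5.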
Fringe count error detection and correction. In case of fringe count errors during turnaround, the measured interferogram is shifted by an integer number of sampling points. These errors are corrected by shifting the interferograms back accordingly.
Detector nonlinearity correction. Due to the nonlinear behavior of the photoconductive detectors, their response is dependent on the total photon flux. The nonlinearity has been characterized on ground and in flight. A first-order correction of the nonlinearity consists of scaling each interferogram according to the incident photon flux. Since the interferogram DC is not measured, the peak-to-peak value of the AC-coupled digitized interferogram, ADC_max−min, is used as a measure for the total photon flux. The interferograms of the nonlinear detectors A1, A2, B1, and B2 are scaled according to their ADC_max−min values before radiometric calibration.
Radiometric calibration. A two-point calibration according to Revercomb et al. (1988) is performed using measurements of an internal blackbody and deep space measurements. The radiometric calibration is performed separately for the forward and reverse interferogram sweep directions.
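A two-point calibration of this kind can be illustrated for a single spectral point as below. This is a simplified sketch, not the operational algorithm: it assumes that deep space contributes negligible radiance and that the blackbody has unit emissivity, and the blackbody temperature, complex gain, and offset used in the example are invented values.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e10    # speed of light, cm/s
K = 1.380649e-23     # Boltzmann constant, J/K


def planck_nw(wavenumber, temperature):
    """Blackbody radiance in nW cm^-2 sr^-1 cm at a wavenumber in cm^-1."""
    c1 = 2.0 * H * C ** 2
    c2 = H * C / K
    return 1e9 * c1 * wavenumber ** 3 / math.expm1(c2 * wavenumber / temperature)


def calibrate(scene, blackbody, deep_space, wavenumber, bb_temperature):
    """Two-point calibration of one (possibly complex) spectral point.

    Assumes zero deep space radiance and a blackbody emissivity of 1.
    """
    ratio = (scene - deep_space) / (blackbody - deep_space)
    return planck_nw(wavenumber, bb_temperature) * ratio.real


# Synthetic check with invented instrument gain and offset (complex values).
nu, t_bb = 1000.0, 250.0           # illustrative values only
gain, offset = complex(2.0, 0.5), complex(30.0, -4.0)
true_radiance = 1200.0
raw_bb = gain * planck_nw(nu, t_bb) + offset
raw_ds = offset
raw_scene = gain * true_radiance + offset
calibrated = calibrate(raw_scene, raw_bb, raw_ds, nu, t_bb)
print(calibrated)
```

Because the deep space spectrum is subtracted from both numerator and denominator, the instrument offset and the complex gain cancel, and the synthetic scene radiance is recovered.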
Spectral calibration. In order to correct for a drift of the wavelength of the reference laser, the spectral axis is scaled by a spectral correction factor. This factor is determined from the spectral position of well-characterized atmospheric lines.
Geolocation assignment. The level 1b processor reports the geolocation with each measured spectrum, i.e., the altitude and position over the Earth geoid of the line of sight (LOS) tangent point at the time of the measurement. The geolocation is determined at the measurement time using the satellite attitude and position, the pointing azimuth and elevation mirror angles, scanning mirror nonlinearity characterization data, an atmospheric refraction model, and LOS calibration data.

Improvements of the level 1b processing
The general level 1 processing has not changed throughout the mission. In detail, however, the processing has undergone several improvements with new processing versions. In the following, we describe the main improvements for the most recent processing version 8.
Improved nonlinearity characterization. The analysis of in-flight characterization measurements throughout the mission revealed that the photoconductive detectors are subject to aging. The response slowly decreases, and with this the detectors become more linear over time. Moreover, the characterization work has shown that the relation between the size of the interferogram peak (ADC_max−min) and the total photon flux depends on the instrument temperature and on the degree of ice contamination. In consequence, new parameters for nonlinearity correction have been determined from in-flight characterization measurements, depending on time after launch, instrument temperature, and degree of ice contamination. Parameters from in-flight characterization had already been applied to data version 7, but they have again been improved for version 8.
Improved gain calibration. Although gain measurements were acquired on a daily basis, the gain function used for radiometric calibration was previously updated only once per week. The gain variation is usually sufficiently slow that the error introduced by the temporal drift of the gain function is below 1 %. In some situations, however, the gain variation is captured significantly better when using the daily gain measurements (as far as they are available). Therefore, the daily gain measurements are used for processing version 8.
Improved spectral calibration. In earlier versions, the spectral calibration factor (SCF) was calculated and updated every four elevation scans. The long-term analysis of the SCF has shown that the reference laser is much more stable than expected and that the variation of the SCF over time was dominated by the noise of the determination. Therefore, the SCF is now updated only once per day (together with the radiometric gain function), and mean spectra over one full orbit and the appropriate altitude range are used to determine the spectral calibration factor.
Improved LOS calibration. From the LOS calibration data, an annual cycle and a negative trend can be deduced. The cycle and trend have been characterized, and a corresponding correction has been applied to the tangent altitude information.

Measurement noise
The measurement noise of the scene spectra is given by the noise equivalent spectral radiance (NESR). It is determined from the imaginary part of the calibrated spectra after high-pass filtering. NESR₀ denotes the NESR at zero input radiation to the instrument. The NESR₀ has been calculated on ground and in flight. Some examples of NESR spectra together with the requirement are shown in Fig. 3. In order to better compare FR and OR measurements, the NESR₀ values of the FR measurements as well as the requirements have been scaled according to the different spectral resolution, i.e., they have been multiplied by √(0.025/0.0625). The NESR does not change much over the mission and is below the requirement in most of the spectral range (from about 740 to about 2140 cm⁻¹). For atmospheric measurements, the NESR is larger than the NESR₀ because of the increasing photon load on the detectors. The NESR for atmospheric measurements at low tangent altitudes (below 10 km) is about 20 % to 50 % larger than the NESR₀, depending on the strength of the atmospheric signal in the different bands (not shown).
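The resolution scaling applied to the FR values amounts to a factor of about 0.63. A minimal sketch of that arithmetic follows; the function name and the example NESR value are ours and purely illustrative.

```python
import math


def fr_to_or_scale(fr_sampling=0.025, or_sampling=0.0625):
    """Factor applied to FR NESR values to compare them with OR values."""
    return math.sqrt(fr_sampling / or_sampling)


scale = fr_to_or_scale()
# Hypothetical FR NESR value in nW cm^-2 sr^-1 cm, for illustration only.
scaled_nesr = 50.0 * scale
print(round(scale, 4), round(scaled_nesr, 2))
```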
The variation of the NESR throughout the mission is shown in Fig. 4. Overall, the variation of the NESR is below 25 %, except for the time period of January to May 2005, when the NESR increased due to strong ice contamination. The seasonal variation of the NESR as well as the overall small increase over the mission is very well correlated with the instrument temperature (see De Laurentis, 2012, p. 7).

Radiometric accuracy
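The radiometric calibration translates the measured intensities to radiometric units, i.e., to nW cm⁻² sr⁻¹ cm. Error sources that have an impact on the radiometric accuracy are noise in the gain measurements, temporal variation of the gain function, inaccuracies of the calibration blackbody, noise in the offset measurements, temporal variation of the instrument offset, uncertainty of the nonlinearity correction, microvibrations, and pointing jitter.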
As a requirement, the radiometric accuracy shall be better than or equal to the sum of 2 × NESR and 5 % of the source spectral radiance for the nonlinear bands (A, AB, and B), i.e., the spectral range between 685 and 1500 cm⁻¹. For the linear bands (C and D), the requirement is the sum of 2 × NESR and 2 % of the source spectral radiance at 1570 cm⁻¹ and the sum of 2 × NESR and 3 % at 2410 cm⁻¹, with a linear increase in this spectral range (Geßner and Fladt, 1995). In fact, a scaling accuracy of 1 % is desired for band A in order to guarantee an accurate temperature retrieval, but the requirement was relaxed to 5 % because of the expected uncertainties related to the nonlinearity correction.
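The requirement stated above can be written as a small function of wavenumber, NESR, and source radiance. This is only a sketch of the rule as worded here; the function name is ours, and wavenumbers outside the two stated ranges are rejected because the text gives no requirement for them.

```python
def radiometric_requirement(wavenumber, nesr, radiance):
    """Required radiometric accuracy, in the same units as nesr and radiance.

    685-1500 cm^-1 (bands A, AB, B): 2 x NESR + 5 % of the source radiance.
    1570-2410 cm^-1 (bands C, D): 2 x NESR + a fraction rising linearly
    from 2 % at 1570 cm^-1 to 3 % at 2410 cm^-1.
    """
    if 685.0 <= wavenumber <= 1500.0:
        fraction = 0.05
    elif 1570.0 <= wavenumber <= 2410.0:
        fraction = 0.02 + 0.01 * (wavenumber - 1570.0) / (2410.0 - 1570.0)
    else:
        raise ValueError("no requirement stated for this wavenumber")
    return 2.0 * nesr + fraction * radiance


# Hypothetical NESR and radiance values, for illustration only.
req_band_a = radiometric_requirement(1000.0, nesr=10.0, radiance=1000.0)
req_mid_cd = radiometric_requirement(1990.0, nesr=5.0, radiance=400.0)
print(req_band_a, req_mid_cd)
```

At 1990 cm⁻¹, halfway between the two anchor wavenumbers, the radiance-dependent part is 2.5 % of the source radiance.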
In this study, the radiometric error is separated into a scaling error and an offset error. The scaling error acts multiplicatively on the spectrum, while the offset error acts additively. For all error sources above, the scaling and offset contributions are quantified, and the spectral, temporal, and altitude dependency is given where appropriate. The various error contributions are listed in Table 3 in Sect. 9, where a summary of the overall level 1b data accuracy is given.

Noise in the gain measurements
Blackbody and deep space measurements that serve to calculate the gain function show a certain amount of measurement noise. In order to reduce the measurement noise, several consecutive blackbody and deep space measurements are co-added. During the commissioning phase, it was verified that these measurements do not contain any highly resolved spectral features. Therefore, the spectral resolution of these measurements is reduced in order to further reduce the noise level. The spectral reduction introduces a correlation of the noise between adjacent data points. The gain measurement approach is different for full resolution and optimized resolution measurements. The main characteristics are listed in Table 1.
The gain error due to noise is shown in Fig. 5. Note that this error has a statistical origin, but it acts like a systematic error on the calibrated spectra, since the same error due to noise is applied to all spectra of one scan (separately for forward and reverse sweep directions, though) and to a certain number of consecutive scans (usually 1 day). Furthermore, the noise is spectrally correlated due to the reduced spectral resolution. The 2σ value of the noise amplitude has been used to estimate this systematic error.
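The two noise reduction steps and the 2σ convention can be put into numbers as follows. The sketch assumes uncorrelated white noise, so that co-adding N spectra and reducing the spectral resolution by a factor R each lower the noise by the square root of the respective factor; the function name and all input values are hypothetical and are not taken from Table 1.

```python
import math


def gain_noise_2sigma(single_sigma, n_coadded, resolution_reduction):
    """2-sigma noise of a co-added, resolution-reduced calibration spectrum.

    Assumes uncorrelated white noise in the individual measurements.
    """
    if n_coadded < 1 or resolution_reduction < 1:
        raise ValueError("factors must be at least 1")
    sigma = single_sigma / math.sqrt(n_coadded * resolution_reduction)
    return 2.0 * sigma


# Hypothetical example: 100 co-added sweeps, resolution reduced 16-fold.
estimate = gain_noise_2sigma(40.0, n_coadded=100, resolution_reduction=16)
print(estimate)
```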

Temporal variation of the gain function
The gain is determined from a series of blackbody and deep space measurements on a daily basis. In case of measurement interruptions, the time gap may also be more than 1 day. Figure 6 shows the variation of the gain function in selected spectral regions (one for each band) over the mission. There is a regular increase in the gain function due to ice contamination, followed by a sudden decrease after decontamination. This effect is strongest in bands A and C because of the spectral signature of ice. There is one period with very strong ice contamination between January and May 2005, during which no decontamination was performed.
When looking only at the gain values directly after decontamination, one can observe a continuous increase of the gain function over time in the long-wavelength bands A, AB, and B. This is due to detector aging, which affects the photoconductive detectors A1, A2, B1, and B2. Band B and, to a lesser extent, band AB sometimes show unexplained jumps of up to 2 % in the gain function from one gain measurement to another, often shortly after a decontamination period.
During the level 1b processing, the same gain function is applied to all measurements of that day. If there are days with atmospheric measurements but without gain measurements, the next available gain function in time is applied. Care is taken that the instrument state has not changed between gain and atmospheric measurements, especially in terms of ice contamination.
The variation from one gain measurement to the next is taken as a measure for the uncertainty of the gain calibration. Figure 7 shows a histogram of the gain changes from measurement to measurement in the different bands. Decontamination events have been removed from these statistics.
The variation from one gain measurement to the next is below ±1 % in more than 98 % of the measurements (band A: 98.98 %, band AB: 99.51 %, band B: 98.32 %, band C: 99.67 %, band D: 99.30 %). Bands A and C show a slight shift to positive values, due to the regular increase of the gain function because of ice contamination. In contrast, bands AB and B show enhanced values down to −0.5 % (band AB) and to −2 % (band B), respectively, due to the unexplained gain behavior shortly after decontamination. The FWHM (full width at half maximum) of the distribution increases with wavenumber because of the higher relative measurement noise at higher wavenumbers.
In order to quantify a typical value for the gain variation from measurement to measurement, the value comprising 95 % of the data is chosen. This leads to a typical gain variation of 0.4 % in bands A, B, and C, 0.3 % in band AB, and 0.6 % in band D. The error varies slowly with wavenumber, is uncorrelated between bands, fully correlated in altitude, and fully correlated in time between two gain measurements (usually 1 day), but completely uncorrelated from one gain measurement to the next (i.e., on timescales larger than 1 day or a few days in some situations).
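One way to obtain "the value comprising 95 % of the data" from a list of measurement-to-measurement gain changes is sketched below. It assumes that the magnitude of the change is what matters, which the text does not state explicitly; the function name and the synthetic input are ours.

```python
def value_comprising(changes, percent=95):
    """Smallest magnitude that is not exceeded by `percent` % of the changes.

    Assumption (not from the source): the sign of the change is ignored.
    """
    magnitudes = sorted(abs(c) for c in changes)
    if not magnitudes:
        raise ValueError("no data")
    # Integer ceiling of len * percent / 100 avoids floating-point rounding.
    rank = (len(magnitudes) * percent + 99) // 100
    return magnitudes[max(rank, 1) - 1]


# Synthetic gain changes in percent: magnitudes 0.05, 0.10, ..., 1.00
# with alternating sign.
synthetic_changes = [(-1) ** i * 0.05 * i for i in range(1, 21)]
typical = value_comprising(synthetic_changes)
print(typical)
```

For the 20 synthetic values, 19 of them (95 %) have a magnitude of at most 0.95, which is the value returned.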

Inaccuracies of the calibration blackbody
The accuracy of the calibration blackbody is limited by the knowledge of the temperature of the cavity, temperature nonuniformities, the quality of the emissivity characterization, and the temperature knowledge of the environment. From the on-ground characterization, the resulting error is estimated to be less than 0.5 % (Châteauneuf et al., 2001). A possible degradation of the blackbody over the mission can be detected by a change in the gain function over all bands. Band D, which is not affected by detector aging, shows a constant gain over the mission. This allows us to conclude that the quality of the blackbody is preserved over the instrument's lifetime.

Noise in the offset measurements
The offset, which is governed by the instrument self-emission, is determined several times per orbit. The repetition rate as well as the number of co-added spectra and the spectral resolution are given in Table 1. The error due to noise in the offset measurements is shown in Fig. 8. As for the noise in the gain measurements, the error is of statistical origin, but it is systematic in time between subsequent offset measurements, and it is spectrally correlated corresponding to the spectral resolution of the offset measurements. Furthermore, the error is constant with altitude (within one limb scan) because the same offset is subtracted from all atmospheric measurements of one scan.
In summary, the offset error due to noise in the offset measurements is spectrally correlated within the spectral resolution of the offset, it is constant in time between subsequent offset measurements (i.e., several minutes), and it is vertically constant.

Temporal variation of the instrument offset
The instrument self-emission varies slightly along the orbit. This is well captured by the regular offset measurements. Figure 9 shows the offset variation along the orbit for selected wavenumbers in the different spectral bands in November 2003 (FR mode). The position of the offset measurements within the orbit is represented in terms of latitude: 0° represents the ascending equator crossing, 90° the north pole, 180° the descending equator crossing, and 270° the south pole. Each point in the plot represents one offset measurement. In order to reduce the noise level, the offset spectra of 15 orbits have been co-added for each latitude position (i.e., 90 spectra per measurement point, since six sweeps, three forward and three reverse, are taken per offset measurement). The variation between two subsequent offset measurements (i.e., between two data points in Fig. 9) is below 2 nW cm⁻² sr⁻¹ cm in band A and even lower in the other spectral bands. In OR mode, where the time span between two offset calibration measurements is larger, the variation is below about 4 nW cm⁻² sr⁻¹ cm and still below the offset error due to noise.
The offset error due to variations in the instrument temperature is spectrally correlated over all bands, it is correlated (but not constant) in time between two offset measurements, and it is strongly vertically correlated, although not constant, because the different altitudes are measured at different times and thus at different instrument temperatures.

Uncertainty of the nonlinearity correction
Initially, it was planned to monitor the nonlinearity by dedicated characterization measurements in flight, where the onboard calibration blackbody temperature was varied (so-called IF4 measurements). Unfortunately, the achievable temperature range was too small for a reliable characterization. Therefore, the parameters from the on-ground characterization have been applied to the data of the whole mission up to data version 5. In order to reveal possible changes in the nonlinearity over the mission and to improve the nonlinearity characterization, an alternative characterization method, the so-called DC zero method, has been developed using out-of-band artifacts caused by the detector nonlinearity (Birk and Wagner, 2010; Kleinert et al., 2015). The out-of-band data are usually suppressed by the onboard filtering and decimation. They are only available in a special raw data mode (so-called IF16 measurements) where the filtering and decimation are switched off. Thirty IF16 measurements were acquired throughout the mission, mostly combined with decontamination events. These measurements cover the blackbody, deep space, and the atmosphere.
Using these measurements, it is possible to determine the detector response curve (output as a function of incident photon flux) and to derive the required scaling factors for the nonlinearity correction as a function of the interferogram peak-to-peak value ADC_max−min. It turned out that the detector curve changes over time due to detector aging; furthermore, it depends on the instrument temperature and the degree of ice contamination. Therefore, instrument temperature, ice contamination load, and orbit number (i.e., time) serve as further input to calculate the appropriate detector curve.
The DC zero method utilizes the fact that the DC zero point for all interferograms in the linear domain is the same for 100 % modulation efficiency (Birk and Wagner, 2010). When the modulation efficiency is known, the nonlinearity information can be derived from the out-of-band artifacts utilizing scene and calibration IF16 spectra with different integral radiance. The method was tested for the Bruker IFS 125HR spectrometer at DLR (Deutsches Zentrum für Luft- und Raumfahrt), where DC values are available. The agreement of both methods (with and without using the DC values) is within the uncertainty. In principle, the modulation efficiency can be obtained by taking into account the IF4 blackbody measurements, but it turned out that, especially for channels with less nonlinearity (B1, B2), the derived modulation efficiency results were not reliable. Therefore, the modulation efficiency is estimated from the optical specifications and instrument properties to be 91 % in all nonlinear channels (Kleinert et al., 2015). This is based on the assumption that the instrument is well aligned and the modulation efficiency is rather wavenumber independent in the relevant spectral range of 685 to 1500 cm⁻¹.
A multidimensional regression in orbit number (equivalent to time), temperature, and ice has been applied to the data. There are three main sources of uncertainty for the determination of the nonlinearity: (1) the assumption that the detector curve is characterized by a third-order polynomial for channels A1 and A2 and by a second-order polynomial for channels B1 and B2, (2) the estimate of the modulation efficiency, and (3) the regression error.
The uncertainty of the resulting scaling factors is estimated to be better than 2 % (Birk and Wagner, 2010). Since the nonlinearity correction is applied to blackbody, deep space, and atmospheric measurements, this error leads to both a multiplicative and an additive error in the calibrated spectra. The multiplicative error can be estimated to be less than 2 %, since the errors in the scaling factors of blackbody, deep space, and atmospheric spectra are correlated and partly compensate. This compensation effect is best for large atmospheric radiance levels and thus for low tangent altitudes.
For the offset error, the situation is different. The radiance level of atmospheric measurements at high tangent altitudes is close to that of the deep space spectrum, leading to similar ADC_max−min values. Therefore, the scaling factors applied during the nonlinearity correction are similar, and the resulting offset error is to a large extent compensated. The offset error increases with increasing radiance level, i.e., towards lower tangent altitudes. It is below 5 nW cm⁻² sr⁻¹ cm in the stratosphere and below 10 nW cm⁻² sr⁻¹ cm in the troposphere in band A. In band AB, it is below 1 and 2 nW cm⁻² sr⁻¹ cm, respectively, and in band B it is below 0.5 nW cm⁻² sr⁻¹ cm and thus well below the NESR level.
A further error source due to nonlinearity is the impact of the cubic artifact on the spectra. The nonlinearity not only leads to a different (mean) response depending on the incident photon flux, which is corrected by the appropriate scaling of the interferograms, but it also leads to a distortion of the interferogram peak, leading to artifacts in the spectrum. Quadratic terms of the nonlinearity curve lead to out-of-band artifacts and do not distort the signal of interest, whereas cubic terms lead to artifacts inside the nominal spectral range. These artifacts act as an additive contribution to the uncalibrated spectra and with this, they alter the gain function, leading to a scaling error in the calibrated spectrum. The cubic artifact in the atmospheric spectrum leads to an offset error. Both scaling and offset errors vary spectrally. Figure 10a shows the estimated offset error due to the cubic artifact for channel A2 at tangent altitudes of 52 and 15 km. The error for A1 is smaller, due to the smaller spectral range (see Fig. 1) and thus the smaller photon load. The offset error is well below the NESR level. The estimated gain error for A1 and A2 is shown in Fig. 10b. It is largest for small wavenumbers and is up to 1.8 % for channel A2 at the beginning of the mission. Due to the detector aging, the error decreases over time. Since the two channels A1 and A2 are combined into one spectral band A, the error in the level 1b data is between the errors of A1 and A2. It is estimated to be about 1.5 % at 685 cm⁻¹ and less than 1 % above 700 cm⁻¹. For the channels B1 and B2, cubic artifacts are negligible.
The analysis of in-flight measurements with varying blackbody temperatures (IF4 measurements) also revealed a small nonlinearity for band C. The blackbody measurements taken at different temperatures have been radiometrically calibrated and compared to the expected Planck function. While the values are within 0.1 % for band D, they show deviations of up to 0.4 % for band C. An error in the blackbody temperature would have a larger effect on band D than on band C; therefore, the deviation in band C is attributed to a small nonlinearity effect.
Since the first-order effect of the nonlinearity error is a scaling error of the uncalibrated spectra, the error is rather wavenumber independent within one band. The error may vary from one band to another because each detector is characterized independently. Only the neglect of the cubic artifact in band A has a spectral dependency, as illustrated in Fig. 10. The error is altitude dependent; the offset error is larger for low tangent altitudes, while the gain error is larger for high tangent altitudes. The error also varies in time, since the detector properties change over time and the relation between total photon load and ADC_max−min is also not constant under all circumstances. These variations are not well captured by the sparse characterization measurements. Furthermore, most of the IF16 measurements were taken while the satellite was close to the Kiruna ground station to enable a sufficiently fast downlink for the raw data mode. Also, these measurements were mostly taken shortly before and after the passive decontamination. Before decontamination, the ice load on the detectors was at its maximum, while after decontamination the thermal equilibrium may not have been fully established. Thus, the characterization measurements may not be fully representative of the standard measurement situation. The timescales of the nonlinearity error can only be estimated from the underlying physical effects, namely detector aging, ice contamination, and temperature variations. These effects vary on a timescale of weeks (ice, temperature) to years (aging).

Microvibrations
Microvibrations (introduced by the satellite bus and the detector mechanical cooler) introduce phase modulations in the interferometer. For individual spectra, an offset error of up to 1 % of the unperturbed spectral intensity occurs close to the low-wavenumber boundary. The error changes periodically from spectrum to spectrum. Since many spectra are co-added for the gains, microvibrations are canceled out in the gains but are present in the scene spectra. The expected ghost lines are well below the NESR and thus not detectable in the calibrated spectra. Since the phase of the ghost lines changes from spectrum to spectrum, they cancel out when co-adding several spectra, e.g., for monthly means.

Pointing jitter
Pointing jitter can be observed in raw data IF16 measurements. Pointing jitter leads to an amplitude modulation of the interferogram, which is strongest in the presence of strong atmospheric gradients. The frequency of the pointing jitter is 135 Hz, and the amplitude is on the order of 100 m for most of the mission, with amplitudes up to 250 m between 2003 and mid-2005. Pointing jitter can cause ghost lines in the spectra and leads to a small widening of the effective field of view. As for the microvibrations, the phase of the pointing jitter varies from interferogram to interferogram, such that possible ghost lines cancel out when averaging over a larger dataset. Simulations have shown that the expected ghost signatures are within the 1σ NESR levels and thus not easy to detect in calibrated spectra. No obvious impact related to pointing jitter was found in the retrieval results.

Estimate of the radiometric error from calibrated spectra
In the previous section, the radiometric error was estimated based on the analysis of the underlying physical effects. In this section, the radiometric error is estimated directly from calibrated spectra. The gain error can be estimated from the comparison of calibrated spectra of different channels in overlapping regions. The offset error is estimated from spectral regions where no atmospheric signal is expected. The quality of this error estimation is limited. Though the comparison of spectra of different channels in overlapping regions cannot give an absolute error, it is a good consistency check. Any differences found should be within the error estimated in the previous section.

Estimate of gain error
As shown in Fig. 1, the spectral channels of the different regions show a certain overlap before digital filtering, decimation, and channel combination. Since these steps are usually already performed onboard, the overlapping regions are only available in IF16 measurements, where the raw interferograms are directly sent to ground. When calibrating these measurements, it is possible to deduce a scaling error by determining the correlation between the data from different channels; the radiances of one channel are plotted vs. the corresponding radiances of the other channel. A straight line is fitted to this scatter plot, resulting in a slope and offset which should ideally be 1 and 0, respectively. The deviations from the ideal values are used for error assessment. The slope was determined for all overlapping channels and all available IF16 orbits using all available scene spectra. Differences between channels point towards a radiometric error in at least one of the channels. This method does not allow for an absolute error quantification, but it is a valuable check of the self-consistency of the data.
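The scatter-plot fit described above can be sketched as follows. This is a minimal illustration, not the processing code used for MIPAS: the function name `channel_scaling`, the use of an ordinary least-squares fit via `numpy.polyfit`, and the synthetic radiances are all assumptions made for the example.

```python
import numpy as np

def channel_scaling(radiance_ref, radiance_other):
    """Fit radiance_other = slope * radiance_ref + offset (ordinary least squares).

    For perfectly consistent channels the result would be slope = 1, offset = 0.
    """
    x = np.asarray(radiance_ref, dtype=float)
    y = np.asarray(radiance_other, dtype=float)
    slope, offset = np.polyfit(x, y, 1)  # degree-1 polynomial: [slope, offset]
    return float(slope), float(offset)

# Synthetic example (illustrative values only): the second channel reads
# 1 % high and carries a small additive offset plus random noise.
rng = np.random.default_rng(0)
ref = np.linspace(50.0, 500.0, 200)
other = 1.01 * ref + 0.5 + rng.normal(0.0, 1.0, ref.size)

slope, offset = channel_scaling(ref, other)
print(f"slope deviation: {100.0 * (slope - 1.0):.2f} %, offset: {offset:.2f}")
```

Ordinary least squares treats the reference channel as noise free; since both channels are noisy in practice, a symmetric fit might be preferable, but which fit was actually used is not stated in the text.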
Unfortunately, the number of IF16 measurements over the mission is sparse (only 30), and the number of altitude scans is limited (one to four per orbit). Scaling ratios have been determined for each available sweep using the following overlapping spectral ranges (all numbers in cm⁻¹):
A2 / A1: 700-800
B1 / A2: 1000-1070
B2 / B1: 1200-1500
C2 / C1: 1550-1750
D2 / D1: 1850-2400
The values show a large scatter, but no systematic forward-reverse differences have been found, and the altitude dependency is rather small. Therefore, the median value for each orbit has been used as an indicator for a scaling difference between overlapping channels. The median, instead of the mean, has been chosen in order to be more resistant to outliers. The results are shown in Fig. 11. A linear fit to the data has been added in order to reveal a possible trend. The data for B2 / A1 has been calculated from the ratios B2 / B1, B1 / A2, and A2 / A1. While the ratios for the linear channels C and D are very close to 1, the nonlinear channels show systematic differences up to 1 %. Since these differences are all positive, they add up to an inconsistency between channel A1 and channel B2 of about 2 % at the beginning of the mission. The linear fit shows a small trend towards smaller differences at the end of the mission. The values for the individual orbits, however, show a rather large scatter of sometimes more than 1 %. Overall, the differences found between the different channels can be explained with the estimated errors for the temporal gain variation (Sect. 5.2) and the nonlinearity correction (Sect. 5.6).
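The per-orbit median and the chaining of ratios across channels can be illustrated with a short sketch. The function names and all numerical values below are hypothetical; they only show why the median is robust against a single outlying sweep and how individually small positive differences accumulate.

```python
import numpy as np

def orbit_ratio(sweep_ratios):
    """Median of the per-sweep scaling ratios of one orbit (robust to outliers)."""
    return float(np.median(sweep_ratios))

def chained_ratio(*ratios):
    """Combine adjacent channel ratios, e.g. B2/A1 = (B2/B1) * (B1/A2) * (A2/A1)."""
    return float(np.prod(ratios))

# Illustrative sweep ratios for one orbit; the last value is an outlier.
sweeps = [1.004, 1.006, 1.005, 1.050]
print(orbit_ratio(sweeps))      # median is barely affected by the outlier
print(float(np.mean(sweeps)))   # mean is pulled upwards

# Illustrative chain: three positive differences of 1 %, 0.5 %, and 0.5 %
# combine to an inconsistency of about 2 % between the outermost channels.
b2_a1 = chained_ratio(1.01, 1.005, 1.005)
print(f"B2 / A1 = {b2_a1:.4f}")
```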
The consistency between the channels A1 and A2 can also be deduced from nominal data. Because of the nonlinearity correction, which is different for A1 and A2 and is performed on ground, the combination of channels A1 and A2 to band A is also performed on ground. It is thus possible to process A1 and A2 separately and compare the results. This has been done for 14 orbits throughout the mission: 2 in FR mode and 12 in OR mode. Unlike the IF16 measurements, where only one to four scans per orbit were available, the nominal mode provides data over the full orbit. This allowed for calculating a mean scaling difference over the orbit for each of the 27 tangent altitude levels. It was not possible to determine a scaling difference for the uppermost tangent altitude because the atmospheric signal was too weak. In FR mode, the altitude range was covered by only 17 instead of 27 tangent altitudes; therefore, the FR data has been interpolated to 27 altitude levels to allow for a better comparison. The scaling difference is shown in Fig. 12. The agreement is mostly within 0.5 % to 1.5 %, well in line with the ratios deduced from the IF16 measurements. The differences are generally larger for higher altitudes, which points towards an error in the nonlinearity correction and rules out other error sources, such as a slightly different field of view. If the field of view were the cause, relative differences should be larger for lower tangent altitudes, where the gradient of the atmospheric signal is much stronger. The difference slightly decreases towards the end of the mission, which is also in line with the IF16 data. There is, however, a certain variation in time, e.g., the difference for orbit 37 580 is larger than that for the neighboring orbits.

Estimate of offset error
The offset error can be estimated directly from calibrated spectra from spectral regions where no atmospheric signal is expected. This works especially well for high tangent altitudes, but in band A the offset can be determined down to about 30 km in the atmospheric window. Above 65 km, mean radiances of the uppermost tangent altitude of different measurement modes have been calculated for selected spectral intervals where no atmospheric signal is expected. In order to reduce the noise level, orbital mean values have been calculated in the following spectral regions (all numbers in cm⁻¹):
A: 840-870
AB: 1140-1170
B: 1215-1235
C: 1724-1729
D: 1985-2015
In these spectral regions, the atmospheric contribution is estimated from forward calculations to be below 0.05 nW cm⁻² sr⁻¹ cm above 60 km. The offset, i.e., the mean spectral radiance, has been calculated for the uppermost tangent altitude in different measurement modes: nominal mode (NOM, about 70 km), middle atmosphere mode (MA, about 100 km), and upper atmosphere mode (UA, about 170 km). The data used (226 185 spectra in total) have been separated into FR and OR mode; furthermore, they have been analyzed separately for day and night and for forward and reverse sweep direction. The offset values are summarized in Table 2, together with the 1σ standard deviation and the NESR for comparison. There is a systematic positive offset in the data, which has also been observed by López-Puertas et al. (2009) and Günther et al. (2018). The offset decreases with increasing altitude and wavenumber. The data also reveal a systematic day-night difference with higher values at daytime. Furthermore, a systematic forward-reverse difference can be observed in full resolution mode. This difference disappears in optimized resolution mode (see Fig. 13). The offset is about 1 order of magnitude below the NESR and is therefore not visible in single spectra.
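The reason why an offset one order of magnitude below the NESR becomes measurable after co-adding can be shown with a small sketch. The function `window_offset`, the array layout (one row per spectrum), and the synthetic noise level are assumptions for illustration and are not taken from the MIPAS processor.

```python
import numpy as np

def window_offset(wavenumber, spectra, lo, hi):
    """Mean radiance inside [lo, hi] cm^-1 and its standard error.

    spectra has shape (n_spectra, n_points); at least two spectra are needed
    for the standard error of the mean.
    """
    wn = np.asarray(wavenumber, dtype=float)
    sp = np.atleast_2d(np.asarray(spectra, dtype=float))
    in_window = (wn >= lo) & (wn <= hi)
    per_spectrum = sp[:, in_window].mean(axis=1)   # one window mean per spectrum
    offset = float(per_spectrum.mean())
    sem = float(per_spectrum.std(ddof=1) / np.sqrt(per_spectrum.size))
    return offset, sem

# Synthetic data (illustrative): a constant offset of 2.5 buried in noise
# with a standard deviation of 30 (same arbitrary radiance units).
rng = np.random.default_rng(1)
wn = np.arange(800.0, 900.0, 0.5)
spectra = 2.5 + rng.normal(0.0, 30.0, size=(2000, wn.size))

offset, sem = window_offset(wn, spectra, 840.0, 870.0)
print(f"offset = {offset:.2f} +/- {sem:.2f}")
```

For uncorrelated noise, the uncertainty of the mean shrinks with the square root of the number of averaged samples, both across the spectral window and across spectra.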
In order to reveal offset variations over time and/or latitude, spectra from upper atmosphere measurements in the altitude range of 100 to 170 km were analyzed for six latitude bands (see Fig. 14). For each latitude band, measurements of typically 1 day over the whole altitude range were co-added, separately for day and night. This leads to about 1000 co-added spectra per data point. The result is shown in Fig. 14 for band A. Upper atmosphere measurements were rather sparse at the beginning of the mission but were regularly acquired about every 10 days from November 2007 onwards. The figure shows a seasonal variation of about 1.5 nW cm⁻² sr⁻¹ cm at high latitudes. At southern latitudes, this variation is anticorrelated between day and night, while it is correlated at northern latitudes. Depending on the season, there is a latitudinal variation of the offset of up to 2 nW cm⁻² sr⁻¹ cm. The variation of the offset is similar in the other bands, with a smaller amplitude, corresponding to the generally smaller offset. The latitudinal variation of the offset is similar for the whole altitude range investigated (Manuel López-Puertas, personal communication, 2008).
In band A, the offset in calibrated spectra can also be estimated for lower tangent altitudes, because no broadband at-  At high altitudes, the offset is around 2.5 nW cm⁻² sr⁻¹ cm, in line with the values found for the uppermost tangent altitude in nominal mode. When going further down, the offset is systematically increasing. At 33 km, the offset is about 8 nW cm⁻² sr⁻¹ cm. Since an increasing offset with decreasing tangent altitudes has been observed in all spectral bands between 150 and 68 km, it is expected that the increase below 68 km is also similar in all bands. Therefore, the offset error at 33 km is estimated to be 3.1, 1.9, 0.3, and 0.15 nW cm⁻² sr⁻¹ cm in bands AB, B, C, and D, respectively.
The forward-reverse difference can be attributed to a calibration error. It is only present during the FR part of the mission, and it is constant over time and independent of tangent altitude. This error cancels out when averaging over time because of the odd number of sweeps in one limb scan. The data is automatically averaged over forward and reverse measurements. The offset variation with altitude cannot be completely explained with instrument effects. Part of this offset could be related to the cubic nonlinearity artifact (see Fig. 10a), but the offset error introduced by this artifact is too small to explain the whole offset observed. Therefore it is assumed that there is a certain straylight contribution from Earth or clouds. Also the day-night variation as well as the seasonal latitude dependent variation of the offset cannot be explained with known instrument effects, but the observed offset variation gives an impression of the expected offset error and its variation.

Spectral accuracy
The spectral axis is scaled according to the wavelength of the reference laser. The spectral calibration factor (SCF) is determined on a daily basis, and the SCF is updated together with the gain function. Figure 16 shows in red the variation of the SCF over the mission as determined by the spectral calibration. The variation from one SCF determination to the next is depicted in blue on the right axis. It is dominated by the noise of the SCF determination. This variation is used as an estimate for the spectral calibration accuracy. The accuracy is mostly within 0.14 ppm in the FR period and within 0.27 ppm in the OR period, corresponding to a spectral shift of 0.0004 and 0.00065 cm⁻¹, respectively, at 2410 cm⁻¹. This is well within the requirement of 0.001 cm⁻¹. This error increases linearly with wavenumber. It is fully vertically correlated and fully correlated in time until a new SCF is applied (usually after 1 day).
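The conversion from a relative SCF accuracy in ppm to an absolute spectral shift is a simple proportionality, sketched below. The function name is hypothetical; the input values are those quoted above.

```python
def spectral_shift(accuracy_ppm, wavenumber):
    """Absolute spectral shift (cm^-1) for a relative accuracy given in ppm."""
    return accuracy_ppm * 1e-6 * wavenumber

REQUIREMENT = 0.001  # cm^-1, requirement quoted in the text

for label, ppm in (("FR", 0.14), ("OR", 0.27)):
    shift = spectral_shift(ppm, 2410.0)
    print(f"{label}: {shift:.5f} cm^-1, within requirement: {shift < REQUIREMENT}")
```

Evaluated at 2410 cm⁻¹, this gives about 0.00034 cm⁻¹ for 0.14 ppm, slightly below the rounded 0.0004 cm⁻¹ quoted in the text, and about 0.00065 cm⁻¹ for 0.27 ppm. Because the shift scales with wavenumber, the value at 2410 cm⁻¹ is the largest across the bands listed in this paper.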

Line of sight accuracy
Achieving a good LOS accuracy at the tangent point for a limb sounder is very challenging. For example, in rearward viewing, an error of 0.01° in the pointing angle corresponds to 0.5 km at the tangent point. Dedicated LOS calibration measurements have been acquired on a weekly basis in a mode where the instrument is pointed at stars. The pointing errors were calculated from the expected and actual time of the star passing through the IFOV. Figure 17 presents the pointing errors determined along the mission. At the beginning of the mission, the random variation is due to a bug in the onboard satellite attitude control software, which was corrected in December 2003. Toward the end of the mission, the calibration was no longer possible due to a detector noise increase. From the data, an annual cycle and a negative trend have been deduced. This behavior was also observed with other instruments onboard the satellite as well as in a validation campaign of MIPAS-retrieved ozone against ozone measured at ground stations (Hubert et al., 2016, p. 36). A model has been fitted to the data and is used to correct the altitude in level 1b processor version 8.
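The sensitivity of the tangent altitude to the pointing angle follows from simple limb geometry: a small angular error shifts the tangent point by roughly the angle (in radians) times the slant distance from the satellite to the tangent point. The sketch below assumes a spherical Earth of radius 6371 km, a satellite altitude of about 800 km, and no refraction; none of these values is given in the text, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0        # assumption: mean radius, spherical Earth
SATELLITE_ALTITUDE_KM = 800.0   # assumption: approximate ENVISAT orbit altitude

def tangent_height_error_km(angle_error_deg, tangent_altitude_km,
                            sat_altitude_km=SATELLITE_ALTITUDE_KM,
                            earth_radius_km=EARTH_RADIUS_KM):
    """Tangent altitude error caused by a small pointing angle error.

    Small-angle approximation without refraction:
    error = slant distance to the tangent point * angle error in radians.
    """
    r_sat = earth_radius_km + sat_altitude_km
    r_tan = earth_radius_km + tangent_altitude_km
    slant_km = math.sqrt(r_sat**2 - r_tan**2)
    return slant_km * math.radians(angle_error_deg)

err = tangent_height_error_km(0.01, 20.0)
print(f"0.01 deg pointing error -> {err:.2f} km at a 20 km tangent altitude")
```

With these assumptions the slant distance is a little over 3000 km, and the result is of the order of 0.5 km, consistent with the figure quoted above.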
The engineering tangent altitudes reported in the level 1b product have been validated against an independent temperature and LOS retrieval (von Clarmann et al., 2003). For the version 8 data, the retrieved tangent altitudes are generally higher than the engineering tangent altitudes. The overall offset is in the order of 0 to 400 m over the mission. At low tangent altitudes, differences of up to 700 m have been observed with typical differences of 300 to 500 m (Michael Kiefer, personal communication, 2017). The higher error in the troposphere is related to atmospheric refraction. The level 1b processor uses a standard atmosphere in the calculation, and the difference between the actual atmospheric state and the standard model leads to an additional error. The overall error is well below the requirement of ±1800 m. The accuracy of the latitude and longitude is estimated to be ±0.021° and ±0.004°, respectively.

Summary of the level 1b data accuracy
The various sources of uncertainties are summarized in Table 3. The different sources for scaling and additive error are summed up quadratically to give an overall scaling and additive error estimate. For each error source, the spectral, spatial (vertical), and temporal correlation is characterized. In some cases, two values are given: a typical value and an upper limit in brackets. This upper limit refers to either only a small spectral range of the band or short time periods during the mission. Details about the individual errors are given in the respective sections above.
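The quadratic summation mentioned above is a root-sum-square of the individual contributions, which is appropriate when the error sources are independent of each other. A minimal sketch follows; the function name and the component values are purely illustrative and are not the entries of Table 3.

```python
import math

def quadrature_sum(errors):
    """Root-sum-square combination of independent error contributions."""
    return math.sqrt(sum(e * e for e in errors))

# Illustrative scaling error contributions in percent (not taken from Table 3).
components = [0.5, 0.8, 0.3]
total = quadrature_sum(components)
print(f"combined scaling error: {total:.2f} %")
```

The combined value is dominated by the largest contribution and is always smaller than the plain sum of the components.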

Conclusions
We have quantified the MIPAS level 1b error in terms of radiometric, spectral, and line of sight accuracy. The thorough characterization of the instrument and level 1b processing has led to several improvements in the latest level 1b processing version 8 compared to earlier processing versions. The radiometric error has been separated into a multiplicative gain error and an additive offset error, and the different types of error have been characterized in terms of spectral and vertical correlation lengths and in terms of evolution in time. The error correlation is important for its impact on the retrieved species, e.g., errors with short correlation lengths in time cancel out when averaging over a longer time span.
The estimated accuracy has been cross-checked by analyzing the self-consistency of calibrated spectra. From special measurements, it could be shown that scaling differences between the data acquired by different detectors are within the estimated gain errors. The offset error is deduced from calibrated spectra using spectral regions and altitude ranges where no atmospheric signal is expected. At high tangent altitudes, this error tends to be below the error estimated from the characterization, but it increases systematically with decreasing altitude, which is not expected from instrument characterization. Therefore it is assumed that this effect is related to straylight rather than an instrumental offset. The errors are well within specifications, and the achieved accuracy allows for the retrieval of atmospheric parameters from the measurements with high quality. It should be noted, however, that the analysis of trends is very sensitive to long-term drifts of instrument properties, namely changes in the nonlinearity of the photoconductive detectors.
Footnotes to Table 3: a according to the spectral resolution of the calibration measurements; b depending on spectral emissivity; c increasing with altitude; d highly correlated but not constant within one band; e decreasing with time; f decreasing with altitude.
The experience with the MIPAS instrument has shown that thorough characterization work is extremely important for good data quality throughout the mission. Regular characterization measurements are indispensable in order to reveal instrument changes, e.g., due to aging, and the regular transmission of raw, unprocessed data is very valuable to understand the instrument and identify possible issues. Flexibility must be allowed in the operation mode and the calibration process to cope with changing situations in long-term missions. Last but not least, an exhaustive on-ground characterization of parameters which cannot be determined during flight is very valuable for understanding the data measured in flight and also improves the data quality. These aspects should be considered for any future satellite mission.
Competing interests. The authors declare that they have no conflict of interest.
Special issue statement. This article is part of the special issue "Towards Unified Error Reporting (TUNER)". It does not belong to a conference.

Figure 1. Spectral channels and resulting spectral bands of MIPAS. The curves show unfiltered spectra of blackbody measurements in relative units. The maximum of each channel is scaled to 1. Band A is composed of channels A1 and A2, AB of B1, B of B2, and C and D of C1 and C2 and D1 and D2, respectively. The colored boxes indicate the spectral coverage of the spectral bands.

Figure 3. NESR 0 values on ground and in flight for selected orbits along the mission.

Figure 4. NESR values in the middle of each band throughout the mission. Again, the values measured in FR mode (2002 to 2004) have been scaled to the spectral resolution of the OR mode.

Figure 5. Relative gain error due to noise for FR and OR mode in percent. Please note the logarithmic scale.

Figure 6. Relative gain difference with respect to orbit 2552 of 26 August 2002 in selected spectral regions over the mission.

Figure 7. Histogram of the relative gain change in the five spectral bands over the mission.

Figure 8. Offset error due to noise in the offset measurements.

Figure 9. Variation of the instrument offset along the orbit for selected wavenumbers in November 2003 (FR mode).

Figure 10. (a) Estimated offset error for A2 due to the neglect of the cubic artifact for a tangent altitude of 15 km (black) and 52 km (gray). (b) Estimated gain error due to the neglect of the cubic artifact in the blackbody and deep space spectra for channel A1 (orange) and A2 (red) for orbit 1680 at the beginning of the mission.

Figure 11. Scaling ratios between overlapping channels deduced from IF16 measurements over the mission. A linear fit to the data has been added.

Figure 12. Ratio of calibrated spectra of channels A2 and A1. Altitude 26 corresponds to about 67 km, altitude 1 to about 7 km.

Figure 13. Difference between spectra of forward and reverse sweep direction for FR mode (black) and OR mode (red) at about 70 km tangent altitude. 35 000 and 3800 spectra have been co-added per sweep direction for FR and OR mode, respectively.

Figure 14. Mean offset over 100 to 170 km tangent altitude in band A, separated into six latitude bands and shown separately for day (red) and night (blue).

Figure 15. Offset determined from calibrated spectra in an altitude range of 33 to 63 km around 832 cm⁻¹.

Figure 16. Spectral calibration factor (SCF) as determined from atmospheric measurements over the mission (red). The difference between subsequent SCF values is depicted in blue on the right axis.

Figure 17. MIPAS pointing errors along the mission (grey) and fitted error model (red).

Table 1. Radiometric calibration measurements.
spectral resolution is named full resolution (FR) mode, while the measurement period from January 2005 to April 2012 is named optimized resolution (OR) mode.

Table 2. Offset values determined from calibrated spectra. f-r is forward and reverse. All values are in nW cm⁻² sr⁻¹ cm.

Table 3. Summary of the level 1b data accuracy. NL is nonlinearity. For details, see text.