Research article 10 Feb 2020
A channel selection method for hyperspectral atmospheric infrared sounders based on layering
 ^{1}College of Meteorology and Oceanography, National University of Defense Technology, Nanjing, China
 ^{2}Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Nanjing University of Information Science and Technology, Nanjing, China
 ^{3}South China Sea Institute for Marine Meteorology, Guangdong Ocean University, Zhanjiang, China
Correspondence: Zheng Sheng (19994035@sina.com)
This study introduces an effective channel selection method for hyperspectral infrared sounders. The method is illustrated for the Atmospheric InfraRed Sounder (AIRS) instrument. The results are as follows. (1) Using the improved channel selection (ICS), the atmospheric retrievable index is more stable, with the value reaching 0.54. The coverage of the weighting functions is more evenly distributed over height with this method. (2) Statistical inversion comparison experiments show that the accuracy of the retrieval temperature, using the improved channel selection method in this paper, is consistent with that of 1DVar channel selection. In the stratosphere and mesosphere especially, from 10 to 0.02 hPa, the accuracy of the retrieval temperature of our improved channel selection method is improved by about 1 K. The accuracy of the retrieval temperature of ICS is also improved at lower heights. (3) Statistical inversion comparison experiments for four different regions illustrate latitudinal and seasonal variations and better performance of ICS compared to the numerical weather prediction (NWP) channel selection (NCS) and primary channel selection (PCS) methods. The ICS method shows potential for future applications.
Since the successful launch of the first meteorological satellite, TIROS, in the 1960s, satellite observation technology has developed rapidly. Meteorological satellites observe the Earth's atmosphere from space and are able to record data from regions that are otherwise difficult to observe. Satellite data greatly enrich the content and range of meteorological observations, and, consequently, atmospheric exploration technology and meteorological observations have taken us to a new stage in our understanding of weather systems and related phenomena (Fang, 2014; Zhao et al., 2019). From the perspective of vertical atmospheric observation, satellite instruments are developing rapidly. In their infancy, the traditional infrared measurement instruments for detecting atmospheric temperature and moisture profiles, such as the TIROS Operational Vertical Sounder (TOVS) (Smith et al., 1991) or High Resolution Infrared Sounder (HIRS) in the Advanced TIROS Operational Vertical Sounder (ATOVS) (Chahine, 1972; Li et al., 2000; Liu, 2007), usually employed filter spectrometry. Even though such instruments have played an important role in improving weather prediction, their low spectral resolution limits further improvements in observation accuracy and vertical resolution. Filter-based spectroscopic instruments of this kind therefore struggle to meet today's needs in numerical weather prediction (Eyre et al., 1993; Prunet et al., 2010; Menzel et al., 2018). To meet this challenge, a series of plans for the creation of high-spectral-resolution atmospheric measurement instruments has been executed in the United States and in Europe in recent years. One example is the AIRS (Atmospheric InfraRed Sounder) on the Earth Observing System satellite “Aqua”, launched by the United States on 4 May 2002. AIRS has 2378 spectral channels, providing sensitivity from the ground to up to about 65 km in altitude (Aumann et al., 2003; Hoffmann and Alexander, 2009; Gong et al., 2012). The United States and Europe, in 2010 and in 2012, also installed the CrIS (Cross-track Infrared Sounder) and the IASI (Infrared Atmospheric Sounding Interferometer) on polar-orbiting satellites, respectively.
China also places great importance on the development of such advanced sounding technologies. In the early 1990s, the National Satellite Meteorological Center began to investigate the principles and techniques of hyperspectral-resolution atmospheric observations. China's development of interferometric atmospheric vertical detectors eventually led to the launch of Fengyun no. 3 on 27 May 2008 and Fengyun no. 4 on 11 December 2016, both of which were equipped with infrared atmospheric instruments. How best to use the hyperspectral-resolution observation data obtained from these instruments, to obtain reliable atmospheric temperature and humidity profiles, is an active area of study in atmospheric inversion theory.
Due to technical limitations, at first only a limited number of channels could be built into the typical satellite instruments. In this case, channel selection generally involved controlling the channel weighting function by utilizing the spectral response characteristics of the channel (such as center frequency and bandwidth). With the development of measurement technology, increasing numbers of hyperspectral detectors were carried on meteorological satellites. Due to the large number of channels and data supported by such instruments today (such as AIRS with 2378 channels and IASI with 8461 channels), it has proven extremely cumbersome to store, transmit and process such data. Moreover, channels are often closely correlated with one another, which makes the inversion ill-posed and can compromise the accuracy of retrieval products based on hyperspectral-resolution data.
However, hyperspectral detectors have many channels and provide real-time model prediction systems with vast quantities of data, which can significantly improve prediction accuracy. But if all the channels are used to retrieve data, the retrieval time increases considerably. Even more problematic are the glut of information produced and the unsuitability of the calculations for real-time forecasting. At the same time, the available computing power must be sufficient to simulate all the channels simultaneously within the forecast time. In order to improve the calculation efficiency and retrieval quality, it is very important to properly select a set of channels that can provide as much information as possible.
Many researchers have studied channel selection algorithms. Menke (1984) first chose channels using a data precision matrix method. Aires et al. (1999) made the selection using the Jacobian matrix, which has been widely used since then (Aires et al., 2002; Rabier et al., 2010). Rodgers (2000) indicated that there are two useful quantities in measuring the information provided by the observation data: Shannon information content and degrees of freedom. The concept of information capacity then became widely used in satellite channel selection. Xu (2007) compared the Shannon information content with the relative entropy, analyzing the information loss and information redundancy. Du et al. (2008) introduced the concept of the atmospheric retrievable index (ARI) as a criterion for channel selection, and Wakita et al. (2010) produced a scheme for calculating the information content of the various atmospheric parameters in remote sensing using Bayesian estimation theory. Kuai et al. (2010) analyzed both the Shannon information content and degrees of freedom in channel selection when retrieving CO_{2} concentrations using thermal infrared remote sensing and indicated that 40 channels could contain 75 % of the information from the total channels. Cyril et al. (2003) proposed the optimal sensitivity profile method based on the sensitivity of different atmospheric components. Lupu et al. (2012) used degrees of freedom for signals (DFS) to estimate the amount of information contained in observations in the context of observing system experiments. In addition, the singular value decomposition method has also been widely used for channel selection (Prunet et al., 2010; Zhang et al., 2011; Wang et al., 2014). Chang et al. (2017) selected a new set of Infrared Atmospheric Sounding Interferometer (IASI) channels using the channel score index (CSI). Richardson et al. (2018) selected 75 from 853 channels based on the high-spectral-resolution oxygen A-band instrument on NASA's Orbiting Carbon Observatory-2 (OCO-2), using information content analysis to retrieve the cloud optical depth, cloud properties and position.
Today's main methods for channel selection use only the weighting function to study appropriate numerical methods, such as the data precision matrix method (Menke, 1984), the singular value decomposition method (Prunet et al., 2010; Zhang et al., 2011; Wang et al., 2014) and the Jacobian method (Aires et al., 1999; Rabier et al., 2010). These methods allow sensitive channels to be selected. The above-mentioned studies take into account the sensitivity of each channel to atmospheric parameters during channel selection but ignore some factors that affect the retrieval results. The accuracy of retrieval results depends not only on the channel weighting function but also on the channel noise, background field and the retrieval algorithm.
Channel selection mostly uses the information content and delivers the largest amount of information for the selected channel combination during the retrieval (Rodgers, 1996; Du et al., 2008; He et al., 2012; Richardson et al., 2018).
This method has made great breakthroughs in both theory and practice, and the concept of information content itself does consider all the height dependencies of the kernel matrix K (Rodgers, 2000). However, earlier works have neglected the height dependencies of K for simplicity. This paper uses the atmospheric retrievable index (ARI) as the index, which is based on information content (Du et al., 2008; Richardson et al., 2018). Channel selection is made at different heights, and an effective channel selection scheme is proposed that fully considers various factors, including the influence of different channels on the retrieval results at different heights. This ensures the best accuracy of the retrieval product when using the selected channel. In addition, statistical inversion comparison experiments are used to verify the effectiveness of the method.
2.1 Channel selection indicator
According to the concept of information content, the information content contained in a selected channel of a hyperspectral instrument can be described as H (Rodgers, 1996; Rabier et al., 2010). The final expression of H is as follows:
where S_{a} is the error covariance matrix of the background or the estimated value of atmospheric profile, S_{ε} represents the observation error covariance matrix of each hyperspectral detector channel, $\widehat{\mathbf{S}}=\mathbf{S}_{\mathrm{a}}-\mathbf{S}_{\mathrm{a}}\mathbf{K}^{T}\left(\mathbf{K}\mathbf{S}_{\mathrm{a}}\mathbf{K}^{T}+\mathbf{S}_{\epsilon}\right)^{-1}\mathbf{K}\mathbf{S}_{\mathrm{a}}$ denotes the covariance matrix after retrieval and K is the weighting function matrix.
In order to describe the accuracy of the retrieval results visually and quantitatively, the atmospheric retrievable index (ARI), p, (Du et al., 2008) is defined as follows:
Assuming that the ratio of the root-mean-square error of each element in the atmospheric state vector after the retrieval to that before the retrieval is 1−p, then $\left|\widehat{\mathbf{S}}\mathbf{S}_{\mathrm{a}}^{-1}\right|=(1-p)^{2n}$ is derived. By inverting this equation, the ARI, p, can be obtained as in Eq. (2); it indicates the relative portion of the error that is eliminated by retrieval. In practice, the ratio of the root-mean-square error before and after retrieval cannot be exactly 1−p for every element. Therefore, p defined by Eq. (2) is actually an overall evaluation of the retrieval result.
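For reference, the two quantities defined in this subsection can be written out as below. This is a sketch consistent with the determinant relation above and with the standard information-content definition of Rodgers (2000), not a quotation of Eqs. (1) and (2); n is assumed to be the number of elements in the atmospheric state vector.

```latex
\begin{align}
H &= \tfrac{1}{2}\ln\left|\mathbf{S}_{\mathrm{a}}\right|
   - \tfrac{1}{2}\ln\left|\widehat{\mathbf{S}}\right|
   = -\tfrac{1}{2}\ln\left|\widehat{\mathbf{S}}\mathbf{S}_{\mathrm{a}}^{-1}\right|, \\
p &= 1 - \left|\widehat{\mathbf{S}}\mathbf{S}_{\mathrm{a}}^{-1}\right|^{1/(2n)}
   = 1 - \exp\left(-\frac{H}{n}\right).
\end{align}
```

Under these forms p grows monotonically with H, so maximizing either quantity selects the same channels.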
2.2 Channel selection scheme
The principle of channel selection is to find the optimum channel combination after numbering the channels. This combination makes the information content, H, or the ARI defined in this paper as large as possible, in order to maintain the highest possible accuracy in the retrieval results.
There are M layers in the vertical direction of the atmosphere and N satellite channels. Selecting n from N channels gives ${C}_{N}^{n}$ combinations in each layer, which requires ${C}_{N}^{n}$ calculations to obtain ${C}_{N}^{n}$ values of p. Since there are M layers, the entire atmosphere requires $M\cdot {C}_{N}^{n}$ calculations. This number is so large that calculating p for all possible combinations is impractical. Therefore, an effective calculation scheme is needed. The scheme used here is an iterative channel selection method called the “sequential absorption method” (Dudhia et al., 2002; Du et al., 2008). The method's main function is to select (“absorb”) channels one by one, taking the channel with the maximum value of p. Through n iterations, n channels can be selected as the final channel combination. The steps are as follows:
(I) The expression of information content in a single channel.
First, we use only one channel for retrieval. A row vector, k, in the weighting function matrix, K, is a weighting function corresponding to the channel. After observation in this channel, the error covariance matrix is as follows:
It should be noted that (s_{ε}+kS_{a}k^{T}) is a scalar value in Eq. (3), thus Eq. (3) can be converted to the following equation:
Substituting Eq. (4) into Eq. (2) gives the following equation:
(II) Simplification of Eq. (5) for calculating the p value.
Since S_{a} and S_{ε} are positive definite symmetric matrices, they can be decomposed into $\mathbf{S}_{\mathrm{a}}=(\mathbf{S}_{\mathrm{a}}^{1/2})^{T}(\mathbf{S}_{\mathrm{a}}^{1/2})$ and $\mathbf{S}_{\epsilon}=(\mathbf{S}_{\epsilon}^{1/2})^{T}(\mathbf{S}_{\epsilon}^{1/2})$.
This can be defined using the following equation:
The matrix R can then be regarded as a weighting function matrix, normalized by the observed error and a priori uncertainty. A row vector of R, $r={s}_{\epsilon}^{-1/2}k\mathbf{S}_{\mathrm{a}}^{1/2}$, represents the normalized weighting function of a single channel. Substituting r into Eq. (5) gives the following equation:
For arbitrary row vectors, a and b, using the matrix property $\mathrm{det}\left(I+{a}^{T}b\right)=\mathrm{1}+b{a}^{T}$, the new expression for p is as follows:
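Written out, this step can be sketched as follows. The intermediate expressions are reconstructed from the single-channel form of $\widehat{\mathbf{S}}$ and the definition of r in this section, assuming a symmetric square root of S_{a}; they are not a quotation of the numbered equations.

```latex
\begin{align}
\left|\widehat{\mathbf{S}}\mathbf{S}_{\mathrm{a}}^{-1}\right|
  &= \left|\mathbf{I}
     - \frac{\mathbf{S}_{\mathrm{a}}k^{T}k}{s_{\epsilon}+k\mathbf{S}_{\mathrm{a}}k^{T}}\right|
   = 1 - \frac{k\mathbf{S}_{\mathrm{a}}k^{T}}{s_{\epsilon}+k\mathbf{S}_{\mathrm{a}}k^{T}}
   = \frac{1}{1+rr^{T}}, \\
p &= 1 - \left(1+rr^{T}\right)^{-1/(2n)} .
\end{align}
```

The second equality applies the determinant property with $a^{T}=-\mathbf{S}_{\mathrm{a}}k^{T}/(s_{\epsilon}+k\mathbf{S}_{\mathrm{a}}k^{T})$ and $b=k$. Because $rr^{T}$ is a scalar, evaluating p for one candidate channel needs only a quadratic form and no determinant.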
(III) Iteration in a single layer.
First, the iteration in a single layer requires the calculation of R. Using S_{a}, S_{ε} , K and Eq. (6), R can be calculated. Second, using Eq. (8), p of each candidate channel can be calculated. Moreover, the channel corresponding to maximum p is the selected channel for this iteration. After a channel has been selected, according to Eq. (3) we can use $\widehat{\mathbf{S}}$ to get S_{a} for the next iteration. Finally, channels which are not selected during this iteration are used as the candidate channels for the next iteration.
When selecting n from N channels, it is necessary to calculate $(N-n/2)\,n\approx Nn$ values of p, which is much smaller than ${C}_{N}^{n}$. In addition to its high computational efficiency, another advantage of this method is that all channels are recorded in the order in which they are selected. In actual applications, if n^{′} channels are needed and ${n}^{\prime}<n$, there is no need to run the selection again; the first n^{′} recorded channels can be used directly.
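The single-layer iteration of step (III) can be sketched in a few lines. This is a minimal illustration, assuming a diagonal observation error covariance (one variance `s_eps[i]` per channel); the function names `ari` and `sequential_absorption` and the toy inputs are hypothetical. Since the single-channel p increases monotonically with $k\mathbf{S}_{\mathrm{a}}k^{T}/s_{\epsilon}$, the sketch ranks candidates by that scalar instead of evaluating p itself.

```python
import numpy as np

def ari(S_hat, S_a):
    """Atmospheric retrievable index p = 1 - |S_hat S_a^{-1}|^(1/(2n))."""
    n = S_a.shape[0]
    _, logdet_hat = np.linalg.slogdet(S_hat)
    _, logdet_a = np.linalg.slogdet(S_a)
    return 1.0 - np.exp((logdet_hat - logdet_a) / (2 * n))

def sequential_absorption(K, S_a, s_eps, n_select):
    """Greedily pick n_select channels.

    K: (N, n) weighting functions, S_a: (n, n) background covariance,
    s_eps: (N,) channel noise variances (diagonal S_eps is an assumption).
    Returns the selection order, the cumulative ARI after each pick and
    the final retrieval covariance.
    """
    K = np.asarray(K, dtype=float)
    s_eps = np.asarray(s_eps, dtype=float)
    S = np.array(S_a, dtype=float)
    candidates = list(range(K.shape[0]))
    order, ari_history = [], []
    for _ in range(n_select):
        Kc = K[candidates]
        # k S k^T / s_eps for every remaining channel (monotonic in p)
        gain = np.einsum("ij,jk,ik->i", Kc, S, Kc) / s_eps[candidates]
        ch = candidates.pop(int(np.argmax(gain)))
        k = K[ch]
        Sk = S @ k
        # Rank-one update: the posterior covariance becomes the next prior
        S = S - np.outer(Sk, Sk) / (s_eps[ch] + k @ Sk)
        order.append(ch)
        ari_history.append(ari(S, S_a))
    return order, ari_history, S

# Toy usage with random inputs (not AIRS data)
rng = np.random.default_rng(1)
K_demo = rng.normal(size=(30, 8))
S_a_demo = np.eye(8)
order_demo, ari_demo, _ = sequential_absorption(K_demo, S_a_demo, np.full(30, 0.5), 5)
```

Because the order is stored, taking the first n′ entries of `order_demo` gives the n′-channel selection without rerunning the loop.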
(IV) Iteration for different altitudes.
Because satellite channel sensitivity varies with height, repeating the iterative process of step (III) selects the optimum channels at different heights. Assuming there are M layers in the atmosphere and selecting n from N channels, it is necessary to calculate $M\cdot (N-n/2)\,n\approx M\cdot Nn$ values of p, a much smaller number than $M\cdot {C}_{N}^{n}$. In this way, a different channel set can be used for each height in the retrieved profile.
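To give a feel for the difference between the two counts, the arithmetic can be checked directly. The sketch below uses the exact sum $\sum_{i=0}^{n-1}(N-i)$, which the text approximates as $(N-n/2)\,n$. The function names are hypothetical, and pairing N = 2047, n = 324 and M = 137 (numbers quoted elsewhere in this paper) is only an illustration.

```python
import math

def count_exhaustive(N, n, M=1):
    """Number of p evaluations if every n-channel combination is tried in M layers."""
    return M * math.comb(N, n)

def count_sequential(N, n, M=1):
    """Number of p evaluations for sequential absorption in M layers.

    Iteration i scores the N - i remaining channels, so one layer costs
    sum_{i=0}^{n-1} (N - i) = n*N - n*(n-1)/2.
    """
    return M * (n * N - n * (n - 1) // 2)

sequential_total = count_sequential(2047, 324, M=137)
exhaustive_digits = len(str(count_exhaustive(2047, 324, M=137)))
```

Here `sequential_total` is of order 10^8, whereas the exhaustive count has several hundred decimal digits (`exhaustive_digits`).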
2.3 Statistical inversion method
The inversion methods for the atmospheric temperature profiles can be summarized in two categories: statistical inversion and physical inversion. Statistical inversion is essentially a linear regression model, which uses a large number of satellite measurements and atmospheric parameters to match samples and calculate their correlation coefficient. Then, based on the correlation coefficient, the required parameters of the independent measurements obtained by the satellite are retrieved. Because the method does not directly solve the radiation transfer equation, it has the advantage of fast calculation speed. In addition, the solution is numerically stable, which makes it one of the highest-precision methods (Chedin et al., 1985). Therefore, the statistical inversion method will be used for our channel selection experiment and a regression equation will be established.
According to an empirical orthogonal function, the atmospheric temperature (or humidity), T, and the brightness temperature, T_{b}, are expanded as follows:
where T^{∗} and ${\mathbf{T}}_{\mathbf{b}}^{\ast}$ are the eigenvectors of the covariance matrix of temperature (or humidity) and brightness temperature, respectively. A and B stand for the corresponding expansion coefficient vectors of temperature (humidity) and brightness temperature.
Using the least-squares method and the orthogonal property, the coefficient conversion matrix, V, is introduced:
where
Using the orthogonality, we get the following equation:
For convenience, the anomalies of the state vector (atmospheric temperature), T, and the observation vector (brightness temperature), T_{b}, are taken as follows:
where $\widehat{\mathbf{T}}$ stands for the retrieved atmospheric temperature. $\overline{\mathbf{T}}$ and $\overline{\mathbf{T}_{\mathbf{b}}}$ are the corresponding average values of the elements, respectively. ${\widehat{\mathbf{T}}}^{\prime}$ and ${\mathbf{T}_{\mathbf{b}}}^{\prime}$ represent the corresponding anomalies of the elements, respectively.
Assuming there are k sets of observations, a sample anomaly matrix with k vectors can be constructed:
Define the inversion error matrix as follows:
The retrieval error covariance matrix is as follows:
where
S_{e} stands for the sample covariance matrix of T, S_{y} denotes the sample covariance matrix of T_{b}, and S_{xy} represents the covariance matrix of T and T_{b}. The elements on the diagonal of the error covariance matrix, S_{δ}, represent the retrieval error variance of T. The matrix G that minimizes the overall error variance is the least-squares coefficient matrix of the regression Eq. (15), which meets the following criterion:
Taking a derivative of Eq. (21) with respect to G, $\frac{\partial}{\partial \mathbf{G}}\mathrm{tr}\left(\mathbf{S}_{\delta}\right)=0=\left(-2\mathbf{S}_{xy}+2\mathbf{G}\mathbf{S}_{y}\right)$, which means that
Substituting Eq. (22) into Eq. (15) finally gives the least-squares solution as follows:
It should be noted that the least-squares solution obtained here aims to minimize the sum of the error variance for each element in the atmospheric state vector after retrieval for several different times. At present, statistical multiple regression is widely used in the retrieval of atmospheric profiles based on atmospheric remote sensing data. As long as there are enough data, S_{xy} and S_{y} can be determined.
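The regression can be sketched numerically as follows. Setting the derivative above to zero gives $\mathbf{G}=\mathbf{S}_{xy}\mathbf{S}_{y}^{-1}$, and the retrieval is then the mean profile plus G applied to the brightness temperature anomaly. The function names and array layout (one column per sample) are assumptions made for this illustration.

```python
import numpy as np

def fit_regression(T, Tb):
    """Estimate G = S_xy S_y^{-1} from matched samples.

    T:  (n_levels, k) temperature profiles, one column per sample.
    Tb: (n_channels, k) brightness temperatures for the same samples.
    """
    T_mean = T.mean(axis=1, keepdims=True)
    Tb_mean = Tb.mean(axis=1, keepdims=True)
    T_anom, Tb_anom = T - T_mean, Tb - Tb_mean
    k = T.shape[1]
    S_xy = T_anom @ Tb_anom.T / (k - 1)   # covariance of T and Tb
    S_y = Tb_anom @ Tb_anom.T / (k - 1)   # covariance of Tb
    # Solve S_y G^T = S_xy^T rather than forming the inverse explicitly
    G = np.linalg.solve(S_y, S_xy.T).T
    return G, T_mean, Tb_mean

def retrieve(G, T_mean, Tb_mean, Tb_new):
    """Least-squares retrieval: mean profile plus G times the Tb anomaly."""
    return T_mean + G @ (Tb_new - Tb_mean)
```

If the channels are strongly correlated, S_{y} can be close to singular; in that case a least-squares solver or some form of regularization would be needed in place of the plain solve.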
3.1 Data and model
The Atmospheric Infrared Sounder (AIRS) is primarily designed to measure the Earth's atmospheric water vapor and temperature profiles on a global scale (Aumann et al., 2003; Susskind et al., 2003). AIRS is a continuously operating cross-track-scanning sounder, consisting of a telescope that feeds an echelle spectrometer. The AIRS infrared spectrometer acquires 2378 spectral samples at a resolution λ∕Δλ, ranging from 1086 to 1570, in three bands: 3.74 to 4.61, 6.20 to 8.22 and 8.8 to 15.4 µm. The footprint size is 13.5 km. The spectral range includes 4.3 and 15.5 µm for important temperature observation and CO_{2}, 6.3 µm for water vapor, and 9.6 µm for ozone absorption bands (Menzel et al., 2018). The root-mean-square error (RMSE) of the measured radiation is better than 0.2 K (Susskind et al., 2003). Moreover, global atmospheric profiles can be detected every day. Due to radiometer noise and faults, there are currently only 2047 effective channels. However, compared with previous infrared detectors, AIRS boasts a significant improvement in both the number of channels and spectral resolution (Aumann, 1994; Huang et al., 2005; Li et al., 2005).
The root-mean-square error of each AIRS infrared channel is shown in Fig. 1. The measurement error is not below 0.2 K for all the instrument channels. A few channels have extremely large measurement errors, which reduce the accuracy of prediction to some extent (Susskind et al., 2003). At present, more than 300 channels have not been used because their errors exceed 1 K. If data from these channels were to be used for retrieval, the accuracy of the retrieval could be reduced. Therefore, it is necessary to select a group of channels to improve the calculation efficiency and retrieval quality. In this paper we study channel selection for temperature profile retrieval by AIRS.
For the calculation of radiative transfer and the weighting function matrix, K, the RTTOV (Radiative Transfer for TIROS Operational Vertical Sounder) v12 fast radiative transfer model is used. Although initially developed for the TOVS (TIROS Operational Vertical Sounder) radiometers, RTTOV can now simulate around 90 different satellite sensors measuring in the MW (microwave), IR (infrared) and VIS (visible) regions of the spectrum (Saunders et al., 2018). The model allows rapid simulations (1 ms for 40 channel Advanced TOVS, ATOVS, on a desktop PC) of radiances for satellite visible, infrared, or microwave nadir-scanning radiometers given atmospheric profiles of temperature and trace gas concentrations and cloud and surface properties. The only mandatory gas included as a variable for RTTOV v12 is water vapor. Optionally, ozone, carbon dioxide, nitrous oxide, methane, carbon monoxide and sulfur dioxide can be included, with all other constituents assumed to be constant. RTTOV can accept input profiles on any defined set of pressure levels. The majority of RTTOV coefficient files are based on the 54 levels (see Table A1 in Appendix A), in the range from 1050 to 0.01 hPa, though coefficients for some hyperspectral sounders are also available on 101 levels.
In order to correspond to the selected profiles, the atmosphere is divided into 137 layers, each of which contains corresponding atmospheric characteristics, such as temperature, pressure and the humidity distribution. Each element in the weighting function matrix can be written as $\partial y_{i}/\partial x_{j}$. The subscript i is used to identify the satellite channel, and the subscript j is used to identify the atmospheric variable. Therefore, $\partial y_{i}/\partial x_{j}$ indicates the variation in brightness temperature in a given satellite channel, when a given atmospheric variable in a given layer changes. We are thus able to establish which layer of the satellite channel is particularly sensitive to which atmospheric characteristic (temperature, various gas contents) in the vertical atmosphere. The RTTOV_K (the K mode) is used to calculate the matrix H(X0) (Eq. 1) for a given atmospheric profile characteristic.
3.2 Channel selection comparison experiment and results
In order to verify the effectiveness of the method, three sets of comparison experiments were conducted. First, 324 channels used by the EUMETSAT Satellite Application Facility on Numerical Weather Prediction (NWPSAF) were selected. NCS is short for NWP channel selection in this paper. The NCS channels were released with the NWPSAF 1DVar (one-dimensional variational analysis) scheme, in accordance with the requirements of the NWPSAF (Saunders et al., 2018). Second, 324 channels were selected using the information capacity method. This method was adopted by Du et al. (2008) without the consideration of layering. PCS is short for primary channel selection in this paper.
Third, 324×M channels were selected using the information capacity method for the M layer atmosphere. ICS is short for improved channel selection in this paper. In order to verify the retrieval effectiveness after channel selection, statistical inversion comparison experiments were performed using 5000 temperature profiles provided by the ECMWF dataset, which will be introduced in Sect. 4.
The observation error covariance matrix, S_{ε}, in the experiment is provided by NWPSAF 1DVar. In general, it can be converted to a diagonal matrix whose elements are the observation error variances of the individual hyperspectral detector channels, i.e., the squares of the per-channel root-mean-square errors. The root-mean-square error of the AIRS channels is shown in Fig. 1. The error covariance matrix of the background, S_{a}, is calculated using 5000 samples of the IFS137 data provided by the ECMWF dataset (detailed information is given in Sect. 4). The data were accessed on 26 April 2019 (download address: https://www.nwpsaf.eu/site/update137levelnwpprofiledataset/, last access: 11 January 2020). The covariance matrix of temperature is shown in Fig. 2. The results are consistent with the previous study by Du et al. (2008).
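The construction of the two covariance matrices described above can be illustrated with a short sketch. It assumes that the per-channel RMSE values and the sample profiles are already available as arrays; the function names and the array layout (one row per profile) are hypothetical, and no real AIRS or IFS137 data are used here.

```python
import numpy as np

def observation_covariance(rmse):
    """Diagonal S_eps whose entries are the squared per-channel RMSE (variances)."""
    return np.diag(np.asarray(rmse, dtype=float) ** 2)

def background_covariance(profiles):
    """Sample covariance S_a from profiles shaped (n_samples, n_levels)."""
    return np.cov(np.asarray(profiles, dtype=float), rowvar=False)
```

With 5000 profiles on 137 levels, `background_covariance` would return a 137 × 137 matrix.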
The reference atmospheric profiles are from the IFS137 database, and the temperature weighting function matrix is calculated using the RTTOV_K mode, as shown in Fig. 3; the results are consistent with those of the previous study by Du et al. (2008). For the air-based passive atmospheric remote sensing studied in this paper, when the same channel detects the atmosphere from different observation angles, the value of the weighting function matrix K changes due to the limb effect. This section focuses on the channel selection methods themselves; therefore, the biases produced by different observation angles are ignored.
In order to verify the effectiveness of the method, the distribution of 324 channels in the AIRS brightness temperature spectrum, without considering layering, is indicated in Fig. 4. The background brightness temperature is the simulated AIRS brightness temperature obtained by running RTTOV on the input atmospheric profile. Figure 4a shows the 324 channels selected by PCS, while Fig. 4b shows the 324 channels selected by NCS.
Without considering layering, the main differences between the 324 channels selected by PCS and NCS are as follows. (1) Near the 10 µm band, fewer channels are selected by PCS because the retrieval of ground temperature is considered by NCS. (2) Near the 9 µm band, no channels are selected by PCS because the retrieval of O_{3} is not considered in this paper. (3) The spectral range from 6 to 7 µm corresponds to water vapor absorption bands, but fewer channels are selected by NCS. (4) The region near 5 µm includes the 4.2 µm N_{2}O and 4.3 µm CO_{2} absorption bands. As is shown in Fig. 4, fewer channels are selected by PCS in those bands. PCS is favorable for atmospheric temperature observation. Because the 4.2 and 4.3 µm bands are sensitive to high temperature, a better observation can be obtained for higher temperatures. (5) Near the 4 µm band, a small number of channels are selected by NCS, but no channels are selected by PCS.
Overall, the information content considered in this study only takes the temperature profile retrieval into consideration, thus the channel combination of PCS is inferior to that of NCS for the retrieval of surface temperature and the O_{3} profile. The advantages of the channel selection method based on information content in this paper are mainly reflected in the following ways. (1) The stratosphere and mesosphere are less affected by the ground surface, thus the retrieval result of PCS is better than that of NCS. (2) With the method used in this paper there are more channels in the 4.2 µm N_{2}O and 4.3 µm CO_{2} absorption bands, so the channel combination of PCS is better than that of NCS for atmospheric temperature observation at higher temperature.
By comparing channel selection without considering layering, we note the general advantages and disadvantages of PCS and NCS for the retrieval of temperature and can improve the channel selection scheme. First, the retrieval of the temperature profile for 324 channels selected by PCS is obtained. The relationship between the number of iterations and the ARI is shown in Fig. 5.
The ARI for PCS approaches 0.38 but does not converge, thus the PCS method needs to be improved. In this paper, the atmosphere is divided into 137 layers and, based on the information content and iteration, 324 channels are selected for each layer. Then, the temperature profile of each layer can be retrieved based on statistical inversion (see Sect. 4). The relationship between the number of iterations and the ARI for ICS is shown in Fig. 5b. When the number of iterations approaches 100, the ARI of ICS becomes stable and reaches 0.54. Thus, in terms of the ARI and convergence, the ICS method is better than PCS.
Furthermore, because an iterative method is used to select channels, the order of each selected channel is determined by the contribution from the ARI. The weighting function matrix of the top 324 selected channels, according to channel order, is shown in Fig. 6.
As illustrated in Fig. 6, in the first 100 iterations, the distribution of the temperature weighting function for PCS is relatively scattered; it does not reflect continuity between the adjacent layers of the atmosphere. Besides, the ICS result is better than that of PCS, showing that (1) the distribution of the temperature weighting function is more continuous and reflects the continuity between adjacent layers of the atmosphere and (2) regardless of the number of iterations, the maximum value of the weighting function is stable near 300–400 and 600–700 hPa, without scattering, which is closer to the situation in real atmosphere.
4.1 Temperature profile database
A new database including a representative collection of 25 000 atmospheric profiles from the European Centre for Medium-Range Weather Forecasts (ECMWF) was used for the statistical inversion experiments. The profiles were given in a 137-level vertical grid extending from the surface up to 0.01 hPa. The database was divided into five subsets focusing on diverse sampling characteristics, such as temperature, specific humidity, ozone mixing ratio, cloud condensates and precipitation. In contrast with earlier releases of the ECMWF diverse profile database, the 137-level database places greater emphasis on preserving the statistical properties of sampled distributions produced by the Integrated Forecasting System (IFS) (Eresmaa and McNally, 2014; Brath et al., 2018). IFS137 spans the period from 1 September 2013 to 31 August 2014. There are two operational analyses each day (at 00:00 and 12:00 Z), and approximately 13 000 atmospheric profiles over the ocean. The pressure levels adopted for IFS137 are shown in Table A2 in Appendix A.
The locations of selected profiles of temperature, specific humidity and cloud condensate subsets of the IFS91 and IFS137 databases are plotted on the map in Fig. 7. In the IFS91 database, the sampling is fully determined by the selection algorithm, which makes the geographical distributions very inhomogeneous. Selected profiles represent those regions where gradients of the sampled variable are the strongest: in the case of temperature, midlatitudes and high latitudes dominate, while humidity and cloud condensate subsets concentrate at low latitudes. However, the IFS137 database shows a much more homogeneous spatial distribution in all the sampling subsets, which is a consequence of the randomized selection.
The temporal distribution of the selected profiles is illustrated in Fig. 8. The coverage of the IFS137 dataset is more homogeneous than that of the IFS91 dataset. Moreover, the IFS137 database supplies the model with input parameters, such as detection angle, 2 m temperature and cloud information. Therefore, it is feasible to use the selected samples in a statistical multiple-regression experiment.
4.2 Experimental scheme
In order to verify the retrieval effectiveness of ICS, 5000 temperature profiles provided by the IFS137 were used for statistical inversion comparison experiments. The steps are as follows.

(1) A total of 5000 profiles and their corresponding surface factors, including surface air pressure, surface temperature, 2 m temperature, 2 m specific humidity and 10 m wind speed, are input into the RTTOV model to obtain the simulated AIRS spectra.

(2) The retrieval of temperature is carried out in accordance with Eq. (23). The 5000 profiles are divided into two groups. The first group of 2500 profiles is used to obtain the regression coefficients, and the second group of 2500 is used to test the result.

(3) The results are then verified; the test is based on the standard deviation between the retrieved value and the true value.
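The train/test procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the RTTOV-based experiment itself: the spectra are generated from a random linear forward model instead of RTTOV, a plain least-squares fit with an intercept stands in for Eq. (23), and the "standard deviation" is taken to be the root-mean-square difference per level. All function names and array sizes are hypothetical.

```python
import numpy as np


def fit_regression(radiances, temperatures):
    """Least-squares coefficients mapping radiances (plus intercept) to profiles."""
    design = np.hstack([radiances, np.ones((radiances.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, temperatures, rcond=None)
    return coef


def retrieve(radiances, coef):
    """Apply previously fitted regression coefficients to new spectra."""
    design = np.hstack([radiances, np.ones((radiances.shape[0], 1))])
    return design @ coef


def level_std(retrieved, truth):
    """Root-mean-square difference between retrieval and truth at each level."""
    return np.sqrt(np.mean((retrieved - truth) ** 2, axis=0))


# Synthetic stand-in for the 5000 simulated profiles and spectra.
rng = np.random.default_rng(42)
n_profiles, n_levels, n_channels = 5000, 12, 40
truth = 250.0 + 10.0 * rng.standard_normal((n_profiles, n_levels))
weights = rng.uniform(0.0, 1.0, (n_channels, n_levels))   # toy forward model
spectra = truth @ weights.T + 0.2 * rng.standard_normal((n_profiles, n_channels))

# First 2500 profiles train the regression; the remaining 2500 test it.
train, test = slice(0, 2500), slice(2500, 5000)
coef = fit_regression(spectra[train], truth[train])
sd = level_std(retrieve(spectra[test], coef), truth[test])
print(sd.round(3))
```

Keeping the test group strictly separate from the training group means the per-level values measure how well the coefficients generalize, not how well they fit the profiles they were derived from.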
4.3 Results and discussion
For the statistical inversion comparison experiments, the standard deviation of temperature retrieval is shown in Fig. 9. First, because PCS does not take channel sensitivity as a function of height into consideration, the retrieval result of PCS is inferior to that of ICS. Second, comparing the results of ICS and NCS shows that below 100 hPa, since the method used in this paper treats the near-surface layer as less of an influencing factor, the channel combination of ICS is slightly inferior to that of NCS, but the difference is small.
From 100 to 10 hPa, the retrieval temperature of ICS is consistent with, and slightly better than, that of NCS. From 10 to 0.02 hPa, in the near-space layer, the retrieval temperature of ICS is better than that of NCS. In terms of the standard deviation, the channel combination of ICS is slightly better than that of PCS from 100 to 10 hPa. From 10 to 0.02 hPa, the standard deviation of ICS is lower than that of NCS by about 1 K, meaning that the retrieval result of ICS is better than that of NCS.
To further illustrate the effectiveness of ICS, Table 1 lists the mean improvement of ICS over PCS and NCS, in kelvin and as a percentage, at different heights. Compared with PCS, which does not take channel sensitivity as a function of height into consideration, the retrieval accuracy of ICS is improved overall; from 100 to 0.01 hPa in particular, the mean improvement exceeds 0.5 K, which corresponds to an accuracy improvement of more than 11 %. Compared with NCS, ICS is slightly inferior below 100 hPa, because the method used in this paper treats the near-surface layer as less of an influencing factor, but the difference is small. From 100 to 0.01 hPa, the mean improvement over NCS exceeds 0.36 K, which corresponds to an accuracy improvement of more than 9.6 %.
This is because, as shown in Fig. 4, (1) the stratosphere and mesosphere are less affected by the ground surface, thus the retrieval result of PCS is better than that of NCS, and (2) the method selected in this paper yields more channels in the 4.2 µm N_{2}O and 4.3 µm CO_{2} absorption bands, so the channel combination of PCS is superior to that of NCS for atmospheric temperature observation in the high-temperature zone. Moreover, ICS takes channel sensitivity as a function of height into consideration, thus its retrieval result is improved.
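A mean improvement and its percentage, of the kind reported in Table 1, can be computed as in the short sketch below. The definition used here (mean difference in standard deviation over a height range, divided by the mean standard deviation of the reference method) is an assumption, since the exact formula is not stated in the text, and the input values are made up for illustration and are not taken from Table 1.

```python
def mean_improvement(sd_reference, sd_ics):
    """Mean improvement of ICS over a reference method and its percentage.

    sd_reference, sd_ics : per-level standard deviations (K) over the same
    height range. The percentage is relative to the mean reference value
    (assumed definition).
    """
    diffs = [ref - ics for ref, ics in zip(sd_reference, sd_ics)]
    mean_diff = sum(diffs) / len(diffs)
    mean_ref = sum(sd_reference) / len(sd_reference)
    return mean_diff, 100.0 * mean_diff / mean_ref


# Hypothetical standard deviations at three levels, in kelvin.
sd_ref_example = [4.0, 5.0, 6.0]
sd_ics_example = [3.5, 4.5, 5.0]
improvement_k, improvement_pct = mean_improvement(sd_ref_example, sd_ics_example)
print(round(improvement_k, 3), round(improvement_pct, 2))
```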
The accuracy of the retrieval temperature varies from place to place and changes with atmospheric conditions. Therefore, in order to further compare the inversion accuracy under different atmospheric conditions, the atmospheric profiles from the IFS137 database introduced in Sect. 4 are divided into four regions: the equatorial zone, subtropical regions, midlatitude regions and the Arctic. The average temperature profiles in these four regions are shown in Fig. 10. The temperature standard deviations of ICS in the four typical regions are compared in Sect. 5.2.
5.1 Experimental scheme
To further illustrate how the accuracy of the temperature retrieved with our improved channel selection method varies under different atmospheric conditions, the profiles in four typical regions were used for statistical inversion comparison experiments. The experimental steps are as follows.

(1) The 2500 profiles used for training in Sect. 4 are used to derive the regression coefficients.

(2) The atmospheric profiles of the four typical regions, i.e., the equatorial zone, subtropical regions, midlatitude regions and the Arctic, are used for statistical inversion comparison experiments and to test the result.

(3) The results are then verified; the test is based on the standard deviation between the retrieved value and the true value.
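The regional evaluation can be sketched as grouping test profiles by latitude and computing a per-level standard deviation for each group. The latitude limits in `region_of` are hypothetical, because the text does not give the boundaries used for the four regions, and the "standard deviation" is again taken as the root-mean-square difference between retrieval and truth.

```python
import math


def region_of(lat):
    """Assign a latitude (degrees north) to a region; limits are assumptions."""
    if lat >= 66.5:
        return "arctic"
    a = abs(lat)
    if a <= 10.0:
        return "equatorial"
    if 20.0 <= a <= 35.0:
        return "subtropical"
    if 35.0 < a <= 60.0:
        return "midlatitude"
    return None  # profile falls outside the four typical regions


def regional_std(latitudes, retrieved, truth):
    """Per-level RMS retrieval error for each region that has profiles."""
    sums, counts = {}, {}
    for lat, ret, tru in zip(latitudes, retrieved, truth):
        region = region_of(lat)
        if region is None:
            continue
        sq = sums.setdefault(region, [0.0] * len(tru))
        for k, (r, t) in enumerate(zip(ret, tru)):
            sq[k] += (r - t) ** 2
        counts[region] = counts.get(region, 0) + 1
    return {reg: [math.sqrt(s / counts[reg]) for s in sq]
            for reg, sq in sums.items()}


# Five made-up profiles on two levels (temperatures in kelvin).
lats = [2.0, -5.0, 30.0, 75.0, 15.0]
ret = [[251.0, 220.0], [249.0, 221.0], [260.0, 230.0], [240.0, 210.0], [255.0, 225.0]]
tru = [[250.0, 220.0], [250.0, 220.0], [262.0, 230.0], [240.0, 211.0], [255.0, 225.0]]
result = regional_std(lats, ret, tru)
print(result)
```

Because the regression coefficients come from the global training set, differences between the regional values reflect how well one set of coefficients transfers to each climate regime.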
5.2 Results and discussion
For the statistical inversion comparison experiments in four typical regions, the standard deviation of temperature retrieval is shown in Fig. 11. Generally, the temperature retrieved with ICS is better than that retrieved with NCS and PCS. In particular, above 1 hPa (the stratosphere and mesosphere), the standard deviation of atmospheric temperature can be improved by 1 K relative to PCS and NCS. Thus, ICS shows a great improvement. The results are consistent with those in Sect. 4.
In order to further compare the regional differences in inversion accuracy, the temperature standard deviations of ICS in the four typical regions are compared in Fig. 12.
The temperature standard deviations of ICS in the four typical regions are large (Fig. 12). Below 100 hPa, owing to the high temperature in the equatorial zone, the channel combination of ICS is better than that of PCS and NCS for atmospheric temperature observation at higher temperature, and the standard deviation is 0.5 K. This is because the method selected in this paper yields more channels in the 4.2 µm N_{2}O and 4.3 µm CO_{2} absorption bands, as described in Sect. 3. Near the tropopause, the standard deviation of the equatorial zone increases sharply, owing to the sharp drop in temperature there. However, the standard deviation of the Arctic remains around 0.5 K. From 100 to 1 hPa, the standard deviation of ICS is 0.5 to 2 K. With increasing latitude, the effectiveness increases considerably. According to Fig. 11, the retrieval result of ICS is better because it takes channel sensitivity as a function of height into consideration.
Although the improvements of ICS in the four typical regions are different, in general, the accuracy of the retrieval temperature of ICS is improved. Because PCS does not take channel sensitivity as a function of height into consideration, the retrieval result of PCS is inferior to that of ICS.
In recent years, the atmospheric layer in the altitude range of about 20–100 km has been named “the near-space layer” by the aeronautical and astronautical communities. It lies between the space-based satellite platform and the aerospace vehicle platform and is the transition zone between aviation and aerospace. Its unique resources have attracted a lot of attention from many countries. Research on and exploration of the near-space layer are therefore of great importance. A new channel selection scheme and method for hyperspectral atmospheric infrared sounder AIRS data based on layering is proposed. The retrieval results of ICS for the near-space atmosphere are particularly good. Thus, ICS aims to provide a new and effective channel selection method for the study of the near-space atmosphere using the hyperspectral atmospheric infrared sounder.
In this paper, an improved and robust channel selection scheme and method based on information content are proposed, and a series of channel selection comparison experiments are conducted. The results are as follows.

(1) Since PCS does not take channel sensitivity as a function of height into consideration, its ARI reaches only about 0.38 and does not converge. By contrast, as the 100th iteration is approached, the ARI of ICS stabilizes at 0.54, while the distribution of the temperature weighting function is more continuous and closer to that of the actual atmosphere. Thus, in terms of the ARI, convergence and the distribution of the temperature weighting function, ICS is better than PCS.

(2) Statistical inversion comparison experiments show that the retrieval temperature of ICS is consistent with that of NCS. In particular, from 10 to 0.02 hPa (the stratosphere and mesosphere), the retrieval temperature of ICS is better than that of NCS by about 1 K. In general, the accuracy of the retrieval temperature of ICS is improved; from 100 to 0.01 hPa especially, the accuracy of ICS can be improved by more than 11 %. The reason is that the stratosphere and mesosphere are less affected by the ground surface, thus the retrieval result of ICS is better than that of NCS. Additionally, the method selected in this paper yields more channels in the 4.2 µm N_{2}O and 4.3 µm CO_{2} absorption bands, and the channel combination of ICS is better than that of NCS for atmospheric temperature observation at higher temperature.

(3) Statistical inversion comparison experiments in four typical regions indicate that ICS is significantly better than NCS and PCS in different regions and shows latitudinal variations, indicating potential for future applications.
The data used in this paper are available from the corresponding author upon request.
ZS contributed the central idea. SC, ZS and HD conceived the method, developed the retrieval algorithm and discussed the results. SC analyzed the data, prepared the figures and wrote the paper. WG contributed to refining the ideas and carrying out additional analyses. All coauthors reviewed the paper.
The authors declare that they have no conflict of interest.
The study was supported by the National Key Research Program of China: Development of high-resolution data assimilation technology and atmospheric reanalysis dataset in East Asia (Research on remote sensing telemetry data assimilation technology, grant no. 2017YFC1501802). The study was also supported by the National Natural Science Foundation of China (grant no. 41875045) and Hunan Provincial Innovation Foundation for Postgraduates (grant nos. CX2018B033 and CX2018B034).
This research has been supported by the National Natural Science Foundation of China (grant no. 41875045), the National Key Research Program of China: Development of high-resolution data assimilation technology and atmospheric reanalysis dataset in East Asia (grant no. 2017YFC1501802), and the Hunan Provincial Innovation Foundation for Postgraduates (grant nos. CX2018B033, CX2018B034).
This paper was edited by Lars Hoffmann and reviewed by four anonymous referees.
Aires, F., Schmitt, M., Chedin, A., and Scott, N.: The “weighting smoothing” regularization of MLP for Jacobian stabilization, IEEE. T. Neural. Networks., 10, 1502–1510, https://doi.org/10.1109/72.809096, 1999.
Aires, F., Chédin, A., Scott, N. A., and Rossow, W. B.: A regularized neural net approach for retrieval of atmospheric and surface temperatures with the IASI instrument, J. Appl. Meteorol., 41, 144–159, https://doi.org/10.1175/15200450(2002)041<0144:ARNNAF>2.0.CO;2, 2002.
Aumann, H. H.: Atmospheric infrared sounder on the earth observing system, Optl. Engr., 33, 776–784, https://doi.org/10.1117/12.159325, 1994.
Aumann, H. H., Chahine, M. T., Gautier, C., and Goldberg, M.: AIRS/AMSU/HSB on the Aqua mission: design, science objective, data products, and processing systems, IEEE. Trans. GRS., 41, 253–264, https://doi.org/10.1109/TGRS.2002.808356, 2003.
Brath, M., Fox, S., Eriksson, P., Harlow, R. C., Burgdorf, M., and Buehler, S. A.: Retrieval of an ice water path over the ocean from ISMAR and MARSS millimeter and submillimeter brightness temperatures, Atmos. Meas. Tech., 11, 611–632, https://doi.org/10.5194/amt116112018, 2018.
Chahine, M. I.: A general relaxation method for inverse solution of the full radiative transfer equation, J. Atmos. Sci., 29, 741–747, https://doi.org/10.1175/15200469(1972)029<0741:AGRMFI>2.0.CO;2, 1972.
Chang, K. W., L'Ecuyer, T. S., Kahn, B. H., and Natraj, V.: Information content of visible and mid-infrared radiances for retrieving tropical ice cloud properties, J. Geophys. Res., 122, 4944–4966, https://doi.org/10.1002/2016JD026357, 2017.
Chedin, A., Scott, N. A., Wahiche, C., and Moulinier, P.: The improved initialization inversion method: a high resolution physical method for temperature retrievals from satellites of the TIROS-N series, J. Appl. Meteor., 24, 128–143, https://doi.org/10.1175/15200450(1985)024<0128:TIIIMA>2.0.CO;2, 1985.
Cyril, C., Alain, C., and Scott, N. A.: Airs channel selection for CO_{2} and other tracegas retrievals, Q. J. Roy. Meteor. Soc., 129, 2719–2740, https://doi.org/10.1256/qj.02.180, 2003.
Du, H. D., Huang, S. X., and Shi, H. Q.: Method and experiment of channel selection for high spectral resolution data, Acta. Physica. Sinica., 57, 7685–7692, 2008.
Dudhia, A., Jay, V. L., and Rodgers, C. D.: Microwindow selection for high-spectral-resolution sounders, Appl. Opt., 41, 3665–3673, https://doi.org/10.1364/AO.41.003665, 2002.
Eresmaa, R. and McNally, A. P.: Diverse profile datasets from the ECMWF 137-level short-range forecasts, Tech. rep., ECMWF, 2014.
Eyre, J. R., Andersson, E., and McNally, A. P.: Direct use of satellite sounding radiances in numerical weather prediction, High Spectral Resolution Infrared Remote Sensing for Earth's Weather and Climate Studies, Springer, Berlin, Heidelberg, https://doi.org/10.1007/9783642845994_25, 1993.
Fang, Z. Y.: The evolution of meteorological satellites and the insight from it, Adv. Meteorol. Sci. Technol., 4, 27–34, 2014.
Gong, J., Wu, D. L., and Eckermann, S. D.: Gravity wave variances and propagation derived from AIRS radiances, Atmos. Chem. Phys., 12, 1701–1720, https://doi.org/10.5194/acp1217012012, 2012.
He, M. Y., Du, H. D., Long, Z. Y., and Huang, S. X.: Selection of regularization parameters using an atmospheric retrievable index in a retrieval of atmospheric profile, Acta. Physica Sinica., 61, 2012.
Hoffmann, L. and Alexander, M. J.: Retrieval of stratospheric temperatures from atmospheric infrared sounder radiance measurements for gravity wave studies, J. Geophys. Res.-Atmos., 114, D07105, https://doi.org/10.1029/2008JD011241, 2009.
Huang, H. L., Li, J., Baggett, K., Smith, W. L., and Guan, L.: Evaluation of cloud-cleared radiances for numerical weather prediction and cloud-contaminated sounding applications, Atmospheric and Environmental Remote Sensing Data Processing and Utilization: Numerical Atmospheric Prediction and Environmental Monitoring, I. S. O. Photonics., https://doi.org/10.1117/12.613027, 2005.
Kuai, L., Natraj, V., Shia, R. L., Miller, C., and Yung, Y. L.: Channel selection using information content analysis: a case study of CO_{2} retrieval from near infrared measurements, J. Quant. Spectrosc. Ra., 111, 1296–1304, https://doi.org/10.1016/j.jqsrt.2010.02.011, 2010.
Li, J., Wolf, W. W., Menzel, W. P., Zhang, W. J., Huang, H. L., and Achtor, T. H.: Global soundings of the atmosphere from ATOVS measurements: the algorithm and validation, J. Appl. Meteor., 39, 1248–1268, https://doi.org/10.1175/15200450(2000)039<1248:GSOTAF>2.0.CO;2, 2000.
Li, J., Liu, C. Y., Huang, H. L., Schmit, T. J., Wu, X., Menzel, W. P., and Gurka, J. J.: Optimal cloud-clearing for AIRS radiances using MODIS, IEEE. Trans. GRS., 43, 1266–1278, https://doi.org/10.1109/tgrs.2005.847795, 2005.
Liu, Z. Q.: A regional ATOVS radiance-bias correction scheme for radiance assimilation, Acta. Meteorologica. Sinica., 65, 113–123, 2007.
Lupu, C., Gauthier, P., and Laroche, S.: Assessment of the impact of observations on analyses derived from observing system experiments, Mon. Weather. Rev., 140, 245–257, https://doi.org/10.1175/MWRD1005010.1, 2012.
Menke, W.: Geophysical Data Analysis: Discrete Inverse Theory, Acad. Press., Columbia University, New York, https://doi.org/10.1016/B9780123971609.000199, 1984.
Menzel, W. P., Schmit, T. J., Zhang, P., and Li, J.: Satellite-based atmospheric infrared sounder development and applications, B. Am. Meteorol. Soc., 99, 583–603, https://doi.org/10.1175/BAMSD160293.1, 2018.
Prunet, P., Thépaut, J. N., and Cass, V.: The information content of clear sky IASI radiances and their potential for numerical weather prediction, Q. J. Roy. Meteorol. Soc., 124, 211–241, https://doi.org/10.1002/qj.49712454510, 2010.
Rabier, F., Fourrié, N., and Chafäi, D.: Channel selection methods for infrared atmospheric sounding interferometer radiances, Q. J. Roy. Meteorol. Soc., 128, 1011–1027, https://doi.org/10.1256/0035900021643638, 2010.
Richardson, M. and Stephens, G. L.: Information content of OCO-2 oxygen A-band channels for retrieving marine liquid cloud properties, Atmos. Meas. Tech., 11, 1515–1528, https://doi.org/10.5194/amt1115152018, 2018.
Rodgers, C. D.: Information content and optimisation of high spectral resolution remote measurements, Adv. Sp. Res., 21, 136–147, https://doi.org/10.1016/S02731177(97)009150, 1996.
Rodgers, C. D.: Inverse Methods for Atmospheric Sounding, World Scientific, Singapore, https://doi.org/10.1142/3171, 2000.
Saunders, R., Hocking, J., Turner, E., Rayer, P., Rundle, D., Brunel, P., Vidot, J., Roquet, P., Matricardi, M., Geer, A., Bormann, N., and Lupu, C.: An update on the RTTOV fast radiative transfer model (currently at version 12), Geosci. Model Dev., 11, 2717–2737, https://doi.org/10.5194/gmd1127172018, 2018.
Susskind, J., Barnet, C. D., and Blaisdell, J. M.: Retrieval of atmospheric and surface parameters from AIRS/AMSU/HSB data in the presence of clouds, IEEE T. Geosci. Remote, 41, 390–409, https://doi.org/10.1109/TGRS.2002.808236, 2003.
Smith, W. L., Woolf, H. M., and Revercomb, H. E.: Linear simultaneous solution for temperature and absorbing constituent profiles from radiance spectra, Appl. Optics., 30, 1117, https://doi.org/10.1364/AO.30.001117, 1991.
Wakita, H., Tokura, Y., Furukawa, F., and Takigawa, M.: Study of the information content contained in remote sensing data of atmosphere, Acta. Physica. Sinica., 59, 683–691, 2010.
Wang, G., Lu, Q. F., Zhang, J. W., and Wang, H. Y.: Study on method and experiment of hyperspectral atmospheric infrared sounder channel selection, Remote Sens. Technol. Appl., 29, 795–802, 2014.
Xu, Q.: Measuring information content from observations for data assimilation: relative entropy versus shannon entropy difference, Tellus A., 59, 198–209, https://doi.org/10.1111/j.16000870.2006.00222.x, 2007.
Zhang, J. W., Wang, G., Zhang, H., Huang, J., Chen, J., and Wu, L. L.: Experiment on hyperspectral atmospheric infrared sounder channel selection based on the cumulative effect coefficient of principal component, J. Nanjing Inst. Meteorol., 1, 36–42, https://doi.org/10.3969/j.issn.16747097.2011.01.005, 2011.
Zhao, X. R., Sheng, Z., Li, J. W., Yu, H., and Wei, K. J.: Determination of the “wave turbopause” using a numerical differentiation method, J. Geophys. Res.-Atmos., 124, 10592–10607, https://doi.org/10.1029/2019JD030754, 2019.