Bias assessment of lower and middle tropospheric CO 2 concentrations of GOSAT/TANSO-FTS TIR Version 1 product

CO 2 observations in the free troposphere can be useful for constraining CO 2 source and sink estimates at the surface since they represent CO 2 concentrations away from point source emissions. The thermal infrared (TIR) band of the Thermal and Near Infrared Sensor for Carbon Observation (TANSO) − Fourier Transform Spectrometer (FTS) on board the Greenhouse Gases Observing Satellite (GOSAT) has been observing global CO 2 concentrations in the free troposphere for about 8 years, and thus could provide a dataset with which to evaluate the vertical transport of CO 2 from the surface to the upper atmosphere. This study evaluated biases in the TIR version 1 (V1) CO 2 product in the lower troposphere (LT) and the middle troposphere (MT) (736 − 287 hPa), on the basis of comparisons with CO 2 profiles obtained over airports using Continuous CO 2 Measuring Equipment (CME) in the Comprehensive Observation Network for Trace gases by AIrLiner (CONTRAIL) project. Bias-correction values are presented for TIR CO 2 data for each pressure layer in the LT and MT


Introduction
CO 2 in the atmosphere is the most influential greenhouse gas (IPCC, 2013 and references therein). Many studies have been conducted to estimate the sources and sinks of atmospheric CO 2 using both observational data and transport models (e.g., Gurney et al., 2002, 2004). In CO 2 inversion studies, accurate atmospheric CO 2 observations with spatial representativeness are desirable, which can be obtained from elevated sites such as tall towers and mountains or over the ocean. Patra et al.
CONTRAIL is a project to observe atmospheric trace gases, such as CO 2 and CH 4 , using two types of instruments installed on commercial aircraft operated by Japan Airlines (JAL) starting in 2005. Of the two instruments, CME can observe CO 2 concentrations more frequently over a wide area (Machida et al., 2008). See Machida et al. (2008) and Machida et al. (2011) for details about CME CO 2 observations. This study used CO 2 data obtained with CME during the ascent and descent flights over several airports from 2010 to 2012. Figure 1 shows the locations of the airports used here, which fall in the latitude range of 40°S to 60°N.

NICAM-TM CO 2 data
We used atmospheric CO 2 data simulated by NICAM-TM (Niwa et al., 2011b) for global comparison with TANSO-FTS TIR CO 2 data. NICAM has quasi-homogeneous grids, with horizontal grids generated by recursively dividing an icosahedron. The NICAM simulations used in this study were performed with a horizontal resolution of around 240 km, which corresponds to the horizontal resolution when an icosahedron is divided five times ("glevel-5"). See Tomita and Satoh (2004) and Satoh et al. (2008, 2014) for details of NICAM. The transport model version of NICAM, NICAM-TM, has been developed and used for atmospheric transport and source/sink inversion studies of long-lived species such as CO 2 (Niwa et al., 2011a, b, 2012, 2017). In this study, the NICAM-TM simulation used inter-annually varying flux data of fossil fuel emissions (Andres et al., 2013) and biomass burning (van der Werf et al., 2010), and the residual natural fluxes from the inversion of Niwa et al. (2012), which mostly represent fluxes from the terrestrial biosphere and oceans. The inversion analysis of Niwa et al. (2012) was performed for 2006−2008 and the three-year-mean fluxes were used in this study. In the inversion analysis, CONTRAIL CO 2 data obtained during ascending, descending, and cruise level flights were categorized into four vertical bins: 575-625, 475-525, 375-425, and 225-275 hPa, and the binned CONTRAIL CO 2 data were then incorporated into the inverse model, in addition to surface CO 2 data (Niwa et al., 2012). Niwa et al. (2012) showed that incorporating the CONTRAIL CO 2 data into the surface flux inversion model improved CO 2 concentration simulation compared with a simulation using surface CO 2 data only.
They also demonstrated that the simulated CO 2 concentrations based on CONTRAIL CO 2 data showed better agreement with independent upper atmospheric CO 2 data obtained in the Civil Aircraft for the Regular Investigation of the atmosphere Based on an Instrument Container (CARIBIC) project (Brenninkmeijer et al., 2007). Furthermore, the CO 2 forward simulation of NICAM-TM for 2010−2012 showed good agreement with in-situ CO 2 observations not only in seasonal cycles but also in trends, in spite of using the fluxes optimized for 2006−2008; the simulated growth rate at the Minamitorishima station (e.g., Wada et al., 2011), which is one of the global stations of the Global Atmosphere Watch (GAW), was 2.4 ppm/yr for 2010−2012, while the growth rate based on in-situ observations was 2.2 ppm/yr.
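As a rough cross-check of the glevel-5 resolution quoted above, the following sketch counts the grid points produced by recursively dividing an icosahedron and converts the mean cell area into an approximate spacing. The Earth radius and the "square root of mean cell area" definition of spacing are assumptions made here for illustration; the quoted value of around 240 km depends on how the spacing is defined, so only order-of-magnitude agreement should be expected.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def icosahedral_grid_points(glevel: int) -> int:
    """Vertices of an icosahedron whose faces are recursively divided `glevel` times."""
    return 10 * 4 ** glevel + 2


def mean_grid_spacing_km(glevel: int) -> float:
    """Square root of the mean cell area, used here as a rough spacing estimate."""
    sphere_area = 4.0 * math.pi * EARTH_RADIUS_KM ** 2
    return math.sqrt(sphere_area / icosahedral_grid_points(glevel))


n_points = icosahedral_grid_points(5)   # 10242 grid points for glevel-5
spacing_km = mean_grid_spacing_km(5)    # a little over 200 km with this definition
print(n_points, round(spacing_km))
```
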

Bias assessment of TIR CO 2 data using CME observations
Vertical distributions of CO 2 concentrations can be obtained by CME during the ascent flights from departure airports and the descent flights to destination airports. Figure 2 shows the flight tracks of CME ascending and descending observations over Narita airport, Japan (35.8°N, 140.4°E) in 2010. CME CO 2 data were regarded as part of CO 2 vertical profiles, with maximum altitudes around 12 km, and were obtained within 3−4° of latitude and longitude of the airport. Therefore, we set the threshold for selecting coincident pairs of TANSO-FTS TIR and CME CO 2 profiles for comparison to be a 300-km distance from each of the airports shown in Figure 1.
For each of the coincident pairs, we calculated the weighted average of discrete CME CO 2 data in a vertical layer, "CME_raw", represented by black circles in Figure 3(a), with respect to the center pressure levels of each of the 28 vertical grid layers of TIR CO 2 data. When there were no corresponding CME CO 2 data in lower retrieval grid layers, CO 2 concentration at the lowest altitude observed by CME was assumed to be constant down to the lowest retrieval grid layer.
Similarly, the uppermost CO 2 concentration observed was assumed to be constant up to the center pressure level of the retrieval grid layer including the tropopause, identified based on temperature lapse rates of Global Spectral Model Grid Point Values from the Japan Meteorological Agency (JMA-GPV) interpolated to the location of CME measurement. In retrieval grid layers above the tropopause, CO 2 concentrations were determined based on CO 2 concentration gradients calculated from NICAM-TM CO 2 data near a CME measurement location. We collected eight NICAM-TM CO 2 data points from four model grids adjacent to a CME measurement location at times before and after CME measurement, and linearly interpolated them to the CME measurement location and time. The red line in Figure 3(a) shows a CO 2 vertical profile determined in this manner. This CO 2 vertical profile was designated the "CME_obs." profile. Observations by satellite-borne nadir-viewing sensors like TANSO-FTS have much lower vertical resolution than aircraft observations. Therefore, we smoothed the CME_obs. profile to match the vertical resolution of the corresponding TIR CO 2 profile by applying TIR CO 2 averaging kernel functions (AK) to the CME_obs. profile, as follows (Rodgers and Connor, 2003):
$$x_{\mathrm{CME\_AK}} = x_{a\,priori} + A\,\left(x_{\mathrm{CME\_obs.}} - x_{a\,priori}\right) \qquad (1)$$
Here, x CME_obs. and x a priori are the CME_obs. and a priori CO 2 profiles, respectively, and A is the TIR CO 2 averaging kernel matrix. The CME_obs. profile smoothed with TIR CO 2 averaging kernels was designated as "CME_AK", as indicated by the blue line in Figure 3(a).
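The averaging kernel smoothing of Rodgers and Connor (2003) described above, in its usual form x_a + A (x_obs − x_a), can be sketched in a few lines. This is a minimal illustration rather than the processing code used for the product; the function name, the three-layer profiles, and the toy kernel matrix are all hypothetical (the actual TIR retrieval grid has 28 layers).

```python
import numpy as np


def apply_averaging_kernel(x_obs, x_apriori, ak):
    """Smooth a high-resolution profile to the retrieval's vertical resolution.

    x_obs     : profile on the retrieval grid (e.g. CME_obs.), shape (n,)
    x_apriori : a priori profile used in the retrieval, shape (n,)
    ak        : averaging kernel matrix, shape (n, n)
    """
    x_obs = np.asarray(x_obs, dtype=float)
    x_apriori = np.asarray(x_apriori, dtype=float)
    ak = np.asarray(ak, dtype=float)
    return x_apriori + ak @ (x_obs - x_apriori)


# Hypothetical three-layer example (values in ppm, kernel chosen arbitrarily).
x_obs = np.array([392.0, 390.0, 389.0])
x_apriori = np.array([388.0, 388.0, 388.0])
ak = np.array([[0.5, 0.2, 0.0],
               [0.2, 0.5, 0.2],
               [0.0, 0.2, 0.5]])
x_smoothed = apply_averaging_kernel(x_obs, x_apriori, ak)
```

Two limiting cases are a useful sanity check: an identity kernel returns the observed profile unchanged, and a zero kernel returns the a priori profile.
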
We set two different criteria for the time difference between TANSO-FTS TIR and CME CO 2 profiles used for selection of coincident pairs: a 24-h difference and a 72-h difference. Figure 4 shows a comparison of the results over Narita airport for coincident pairs with a 24- or 72-h time difference. Both averages and 1-σ standard deviations of differences between TIR and CME CO 2 data selected using the 24- and 72-h thresholds were comparable, as shown in Figure 4, which means that the use of these two time difference criteria does not alter any conclusions drawn from comparisons of TIR and CME CO 2 data. The same was generally true for comparisons over the other airports shown in Figure 1. Hence, we adopted a 72-h time difference between TIR and CME CO 2 measurement times for selecting coincident pairs to increase the number of pairs available.
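The coincidence test described above (within 300 km of an airport and within 72 h of the CME measurement) can be sketched as follows. The haversine great-circle distance, the Earth radius, the dictionary layout, and the example sounding locations and times are assumptions for illustration; the text does not specify how distances were computed.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_coincident(tir, airport, cme_time, max_km=300.0, max_hours=72.0):
    """True if a TIR sounding satisfies both the distance and time thresholds."""
    dist = great_circle_km(tir["lat"], tir["lon"], airport["lat"], airport["lon"])
    dt = abs(tir["time"] - cme_time)
    return dist <= max_km and dt <= timedelta(hours=max_hours)


narita = {"lat": 35.8, "lon": 140.4}
cme_time = datetime(2010, 4, 12, 1, 0)
# Hypothetical TIR soundings: one near Narita, one more than 4 degrees to the north.
near = {"lat": 36.5, "lon": 141.0, "time": datetime(2010, 4, 10, 3, 0)}
far = {"lat": 40.0, "lon": 140.4, "time": datetime(2010, 4, 10, 3, 0)}
print(is_coincident(near, narita, cme_time), is_coincident(far, narita, cme_time))
```

Passing `max_hours=24.0` reproduces the stricter of the two time criteria compared in Figure 4.
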
We selected coincident pairs of TIR and CME_AK CO 2 profiles by applying the thresholds of a 300-km distance and a 72-h time difference and calculated the difference in CO 2 concentrations (TIR minus CME_AK) for each retrieval grid layer. All the airports we used were then divided into four latitude bands (40°S−20°S, 20°S−20°N, 20°N−40°N, and 40°N−60°N), and average differences were calculated for each latitude band, retrieval layer, and season (northern spring, MAM; northern summer, JJA; northern fall, SON; and northern winter, DJF). The signs of the calculated average differences were flipped and defined as "bias-correction values" for the 28 retrieval grid layers, four latitude bands, and four seasons.
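The grouping and sign flip described above can be sketched as follows. The record layout, the field names, and the treatment of airports lying exactly on a band boundary are hypothetical; a year key could be added to the grouping in the same way if values are wanted per year.

```python
from collections import defaultdict

LAT_BANDS = [(-40, -20), (-20, 20), (20, 40), (40, 60)]
SEASONS = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
           6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}


def latitude_band(lat):
    """Return the (south, north) band containing `lat`, or None if outside 40S-60N.

    Assigning a boundary latitude to the band north of it is an assumption.
    """
    for south, north in LAT_BANDS:
        if south <= lat < north:
            return (south, north)
    return None


def bias_correction_values(pairs):
    """Mean of (TIR minus CME_AK) per band, season, and layer, with the sign flipped."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for p in pairs:
        band = latitude_band(p["airport_lat"])
        if band is None:
            continue
        key = (band, SEASONS[p["month"]], p["layer"])
        sums[key] += p["tir_ppm"] - p["cme_ak_ppm"]
        counts[key] += 1
    return {key: -sums[key] / counts[key] for key in sums}


# Hypothetical coincident pairs for retrieval layer 5 over a 20N-40N airport.
pairs = [
    {"airport_lat": 35.8, "month": 4, "layer": 5, "tir_ppm": 385.0, "cme_ak_ppm": 389.0},
    {"airport_lat": 35.8, "month": 5, "layer": 5, "tir_ppm": 386.0, "cme_ak_ppm": 388.0},
]
corrections = bias_correction_values(pairs)
print(corrections[((20, 40), "MAM", 5)])  # 3.0
```

Adding the resulting value to a TIR concentration in the same band, season, and layer removes the mean negative bias found against CME_AK.
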

Comparison of TIR CO 2 data with NICAM-TM CO 2 data
In this study, we compared monthly averaged TANSO-FTS TIR and NICAM-TM CO 2 data. We used 2.5° grid data from NICAM-TM glevel-5 CO 2 simulations, and calculated monthly averaged TIR and NICAM-TM CO 2 data for each of these 2.5° grids. Here, we interpolated the NICAM-TM CO 2 data from 40 vertical levels into CO 2 concentrations at the 28 retrieval grid layers of TIR CO 2 data. Besides TIR CO 2 data, a priori CO 2 data and TIR CO 2 averaging kernel functions data were also averaged for each month and each 2.5° grid. For each of the 2.5° grids, we applied the monthly averaged TIR CO 2 averaging kernel functions to the corresponding monthly averaged NICAM-TM CO 2 profiles using expression (1) with the corresponding monthly averaged a priori CO 2 profiles. We then calculated differences in CO 2 concentrations between monthly averaged TIR data and monthly averaged NICAM-TM data with TIR averaging kernel functions for each grid. Here, two types of differences were calculated between TIR CO 2 data and NICAM-TM CO 2 data with TIR CO 2 averaging kernel functions: (1) the difference with respect to the original TIR CO 2 data and (2) the difference with respect to bias-corrected TIR CO 2 data to which the bias-correction values described above were applied. TIR CO 2 averaging kernel functions depend on TIR measurement spectral noise, a priori CO 2 profile variability, and CO 2 Jacobians. Of these three parameters, covariance matrices of the TIR measurement noise and a priori CO 2 profile were set in the same manner for all TIR V1 L2 CO 2 data (Saitoh et al., 2016). The CO 2 Jacobians depend on temperature and CO 2 profiles, and therefore change with location and time. However, TIR CO 2 averaging kernel functions had nearly identical structures when collected for each 2.5° grid in one month, which means that applying the monthly averaged TIR CO 2 averaging kernel functions did not affect the conclusions of this study.
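The monthly averaging onto 2.5° grids described above can be sketched with NumPy as follows. The function names, the grid origin (90°S, 0°E), and the input layout are assumptions for illustration; the same routine would be applied separately to TIR, a priori, and averaging kernel data for one month before forming the differences.

```python
import numpy as np

GRID_DEG = 2.5
N_LAT, N_LON = int(180 / GRID_DEG), int(360 / GRID_DEG)  # 72 x 144 cells


def grid_index(lat, lon):
    """Row and column of the 2.5-degree cell containing each (lat, lon) in degrees."""
    i = np.floor((np.asarray(lat, dtype=float) + 90.0) / GRID_DEG).astype(int)
    j = np.floor((np.asarray(lon, dtype=float) % 360.0) / GRID_DEG).astype(int)
    # Clip so that lat = 90 or a rounding artefact at lon = 360 stays inside the grid.
    return np.clip(i, 0, N_LAT - 1), np.clip(j, 0, N_LON - 1)


def monthly_grid_mean(lat, lon, values):
    """Average per-sounding profiles, shape (n_obs, n_layers), in each grid cell.

    Returns an array of shape (N_LAT, N_LON, n_layers); empty cells are NaN.
    """
    values = np.asarray(values, dtype=float)
    i, j = grid_index(lat, lon)
    total = np.zeros((N_LAT, N_LON, values.shape[1]))
    count = np.zeros((N_LAT, N_LON))
    np.add.at(total, (i, j), values)  # unbuffered add handles repeated cells
    np.add.at(count, (i, j), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count[..., None]


# Hypothetical soundings from one month: two in the same cell, one elsewhere.
lat = [35.8, 36.0, -10.0]
lon = [140.4, 140.5, 100.0]
profiles = [[390.0, 388.0], [392.0, 390.0], [385.0, 384.0]]
mean = monthly_grid_mean(lat, lon, profiles)
print(mean[50, 56], mean[32, 40])
```
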
Figure 5 presents a comparison between TANSO-FTS TIR V1 and CME_AK CO 2 profiles over Narita airport in each season in 2010. In all seasons, TIR CO 2 data in the LT and MT regions had negative biases against CME_AK CO 2 data. The largest negative biases in TIR CO 2 data were found in the MT region centered at 500−400 hPa. The peak of the negative biases in spring and summer occurred at ~400 hPa, at a slightly higher altitude than the peak in fall and winter (~500 hPa), which corresponds to the pressure level at which the TIR CO 2 averaging kernels exhibited their highest sensitivity in each season. Saitoh et al. (2016) showed that TIR V1 CO 2 data agreed well with CME level flight CO 2 data in the UT region (287−196 hPa). As indicated by the solid black lines in Figure 5, the negative biases in TIR CO 2 data against CME ascending and descending flight CO 2 data decreased as altitude increased, which is consistent with the results of Saitoh et al. (2016). Figure 6 shows differences between TANSO-FTS TIR V1 and CME_AK CO 2 data in the LT and MT regions for each latitude band and each season. TIR CO 2 data had consistent negative biases of 1−1.5% against CME_AK CO 2 data in all retrieval layers from 736 to 287 hPa, with the largest negative biases at 541−398 hPa (retrieval layers 5−6) for all latitude bands and seasons, except for 40°S−20°S in the DJF seasons of 2011 and 2012. Here, we have omitted a detailed discussion of TIR CO 2 data at pressure levels below 736 hPa (retrieval layers 1−2), because TIR measurements have relatively low sensitivity to CO 2 concentrations in these layers, as shown in Figure 3(b). The largest negative biases, up to 7.3 ppm, existed in low latitudes during the JJA season, as indicated by the red line in the upper panel of Figure 6(b), while there were no coincident pairs of TIR and CME CO 2 data in the same season of 2011 and 2012.
As presented in Table 2, the negative biases in TIR CO 2 data were larger in spring (MAM) and summer (JJA) than in fall (SON) and winter (DJF) in northern middle latitudes (20°N−40°N), as was the case for UT comparisons presented in Saitoh et al. (2016). On a global scale, the seasonality of negative biases was not clear, given the relatively large 1-σ standard deviations (horizontal bars in the top panels of Figure 6), although these biases tended to be larger in the spring hemisphere than in the fall hemisphere within each latitude band. Comparing results among the three years, the negative biases in TIR CO 2 data slightly increased over time in some latitude bands and seasons, but not as sharply as in the UT CO 2 comparisons discussed in Saitoh et al. (2017).

Bias of TIR LT and MT CO 2 concentrations
Note that the number of comparison pairs used in Figure 6 varied among latitude bands; the largest number occurred at 20°N−40°N, and the number of coincident profiles decreased in low latitudes and the Southern Hemisphere, where there are fewer airports.

Validity of bias correction based on CME data
Negative biases in TANSO-FTS TIR V1 CO 2 data in the LT and MT regions did not exhibit evident dependence on season or year, as shown in Figure 6. However, it is difficult to discern whether bias assessment using TIR CO 2 data over airports reflects the typical features of each latitude band due to the limited airport locations. Therefore, we validated the applicability of the bias-correction values based on comparisons with CME_AK CO 2 data over the entire area of each latitude band by comparing TIR CO 2 data to NICAM-TM CO 2 data to which TIR CO 2 averaging kernel functions were applied on a global scale. Figure 7 shows the frequency distributions of differences in monthly averaged CO 2 concentrations between TIR and NICAM-TM CO 2 data in all retrieval layers from 736 to 287 hPa in all 2.5° grids over the latitude range of 40°S to 60°N. As shown by the dashed lines in Figure 7, the mode values of the frequency distributions generally corresponded to the median values, indicating that TIR CO 2 data did not have locally distorted biases against NICAM-TM CO 2 data. In addition, negative biases of TIR CO 2 data against NICAM-TM CO 2 data in all seasons slightly increased over time, judging from the mode values presented in the top left boxes of Table 3, although the increase in negative biases was not as evident as in the comparisons over airports shown in Figure 6; this may be partly because of the slightly higher growth rate of the NICAM-TM simulations (2.4 ppm/yr) compared to in-situ observations (2.2 ppm/yr).
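The comparison of mode and median values of a frequency distribution, as used above to check for locally distorted biases, can be sketched as follows. The 0.5-ppm bin width and the sample differences are assumptions for illustration; the bin width used for Figure 7 is not stated in the text.

```python
import numpy as np


def mode_and_median(differences, bin_width=0.5):
    """Histogram mode (center of the most populated bin) and median of the differences."""
    d = np.asarray(differences, dtype=float)
    d = d[np.isfinite(d)]  # drop grid cells without data
    # Bin edges aligned to whole ppm; the extra 1.5 bin widths guarantee
    # at least one bin that covers the maximum value.
    edges = np.arange(np.floor(d.min()), np.ceil(d.max()) + 1.5 * bin_width, bin_width)
    counts, edges = np.histogram(d, bins=edges)
    k = int(np.argmax(counts))
    mode = 0.5 * (edges[k] + edges[k + 1])
    return float(mode), float(np.median(d))


# Hypothetical TIR minus NICAM-TM differences in ppm (NaN marks an empty cell).
diffs = [-3.1, -3.2, -3.3, -2.9, -1.0, -5.0, np.nan]
mode, median = mode_and_median(diffs)
print(mode, median)
```

A mode close to the median suggests a roughly symmetric, single-peaked distribution, whereas a clear gap between the two would point to a skewed or bimodal one.
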
The solid lines in Figure 7 show frequency distributions of differences between NICAM-TM CO 2 data and bias-corrected TIR CO 2 data to which the bias-correction values defined for each retrieval layer, latitude band, and season were applied.
The mode values presented in the top right boxes of Table 3, which were nearly identical to the median values, were closer to zero in all three years. In addition, variability in the differences, as indicated by the width of the distribution, between bias-corrected TIR and NICAM-TM CO 2 data was comparable to or smaller than that between the original TIR and NICAM-TM CO 2 data; this can be seen by comparing the frequencies at the mode values before and after applying the bias-correction values, presented in Table 3.
We divided the frequency distribution in the JJA season of 2010 into three categories based on the retrieval layers: 736−541 hPa (retrieval layers 3−4), 541−398 hPa (retrieval layers 5−6), and 398−287 hPa (retrieval layers 7−8), as shown in Figure 8.
A frequency distribution with a mode of 4 ppm was obtained from bias-corrected TIR CO 2 data in the MT region above 541 hPa, especially at 398−287 hPa. That is, TIR CO 2 data at 398−287 hPa in the JJA season of 2010 were clearly overcorrected when applying the bias-correction values defined in this study. In the retrieval layers of 736−541 hPa, the mode value of the frequency distribution after bias-correction was close to zero and the width of the distribution narrowed, demonstrating the validity of the corresponding bias-correction value. For the JJA seasons of 2011 and 2012, bias-correction values could not be determined because there were no coincident pairs between TIR and CME CO 2 data over airports; therefore, we substituted the bias-correction value for the same season of 2010. The frequency distribution of the differences between NICAM-TM and TIR CO 2 data after bias-correction in the JJA season of 2011 had a somewhat bimodal shape, while that in the JJA season of 2012 did not have any bimodal structure, as shown in Figure 7(c). The negative bias of the original TIR CO 2 data against NICAM-TM CO 2 data in the JJA season of 2012 was larger than that in the JJA season of 2010; thus, applying the bias-correction value for 2010 to the 2012 TIR CO 2 data did not lead to any evident overcorrection.
Next, we divided the frequency distribution in the retrieval layers of 398−287 hPa in the JJA season of 2010, shown in Figure 8, into four latitude bands. Judging from the results presented in Figure 9, overcorrection of the negative biases in TIR CO 2 data against NICAM-TM CO 2 data occurred at 20°S−20°N and 40°N−60°N; TIR CO 2 data were markedly overcorrected by the bias-correction value based on comparisons of CME CO 2 data over airports, especially in the latitude band of 20°S−20°N. As shown in the upper panel of Figure 6, negative biases in TIR CO 2 data against CME CO 2 data over airports in low latitudes during the JJA season were clearly larger than the biases found in other latitudes and seasons.
Judging from comparisons of global NICAM-TM CO 2 data, however, applying bias-correction values based on the negative biases observed over airports to TIR CO 2 data over the entire area of 20°S−20°N led to overcorrections in most cases.

Discussion
Any uncertainties in a priori data can affect retrieval results. A priori CO 2 data taken from the NIES-TM05 model (Saeki et al., 2013b) were used in the TANSO-FTS TIR V1 CO 2 retrieval processing, and exhibited consistent negative biases against CME CO 2 data in the troposphere and the lower stratosphere. As discussed in Saitoh et al. (2016), the negative biases in a priori CO 2 data were one likely reason for negative biases in retrieved CO 2 concentrations in the UTLS region. The same pattern holds for negative biases in TIR CO 2 data in the LT and MT regions. However, negative biases in retrieved TIR CO 2 data were larger than those of a priori CO 2 data in the LT and MT regions, as shown in Figure 5. Furthermore, the vertical and latitudinal structures of the negative biases in TIR CO 2 data did not always correspond to those in a priori CO 2 data.
Although negative biases in a priori CO 2 data surely contribute to negative biases in TIR V1 CO 2 data in the LT and MT regions, there are likely other considerable sources of TIR CO 2 negative biases.
Uncertainty in atmospheric temperature data could affect CO 2 retrievals. As shown in Figure 7(a) of Saitoh et al. (2009), uncertainties in retrieved CO 2 concentrations due to uncertainties in atmospheric temperature were largest in the UT, upper MT, and LT regions; a bias of 1 K in atmospheric temperature can yield up to ~10% uncertainty in retrieved CO 2 concentrations in the MT and LT regions. However, simultaneous retrieval of atmospheric temperature in the V1 CO 2 retrieval algorithm could decrease the effect on CO 2 retrieval results. In addition, no evidence has been reported that the JMA-GPV temperature data used as initial values (equal to a priori values) in the TIR V1 CO 2 retrieval processing have biases over latitudinal areas as wide as those considered in this study. Thus, uncertainty in atmospheric temperature is not a primary cause of negative biases in TIR CO 2 data in the LT and MT regions. Although the effect of uncertainty in H 2 O data on CO 2 retrieval results could also be decreased by simultaneous retrieval of H 2 O with CO 2 in the TIR V1 algorithm, water vapor is abundant in the tropics, so we cannot rule out its effect on CO 2 retrieval results. Similarly, errors in the judgement of cloud contamination in low latitudes, where clouds occur frequently, may affect CO 2 retrieval results.
As shown in Figure 6, the largest negative biases in TIR V1 CO 2 data existed in the MT region in low latitudes (20°S−20°N) during the JJA season. Degrees of freedom (DF) of TIR V1 CO 2 data were highest in low latitudes, exceeding 2.2 in all seasons, which means retrieved CO 2 concentrations there contained more information coming from TANSO-FTS TIR L1B spectra and thus were relatively less constrained to a priori concentrations. Kataoka et al. (2014) reported biases in TANSO-FTS TIR V130.131 L1B radiance spectra, which were a previous version of the V161 L1B data used in TIR V1 L2 CO 2 retrieval, on the basis of a double difference method. Similar analysis for the V161 L1B spectra is in progress. Kuze et al.
In the TIR V1 CO 2 retrieval algorithm, we simultaneously retrieved surface temperature and surface emissivity with CO 2 concentration as correction parameters for radiance biases in the V161 spectra, as explained in Saitoh et al. (2016). In the CO 2 retrieval, these surface parameters were retrieved to correct the radiance biases separately in the three spectral regions of the 15-μm (690−715 cm -1 , 715−750 cm -1 , and 790−795 cm -1 ), 10-μm (930−990 cm -1 ), and 9-μm bands (1040−1090 cm -1 ). As reported in Saitoh et al. (2016), the simultaneous retrieval of surface parameters for correction of radiance biases increased the number of normally retrieved CO 2 data (by roughly 1.5 times over Narita airport). This demonstrates a certain level of validity for the correction of radiance biases through simultaneous retrieval of surface parameters for the V161 spectra.
However, we note that retrieving surface parameters for radiance bias correction at each wavelength band may affect retrieved CO 2 concentrations, and remaining radiance biases after correction at each wavelength band may also affect retrieved CO 2 concentrations.
To examine the effect of the simultaneous retrieval of surface parameters at each of the three wavelength bands on retrieved CO 2 concentrations, we performed test retrievals of CO 2 concentrations using V161 spectra in four cases: using all three of these bands, in the same manner as the V1 algorithm; using two bands, 15-μm and 10-μm; using two bands, 15-μm and 9-μm; and using the 15-μm band only. Figure 10 shows the CO 2 retrieval results for two TANSO-FTS observations over Narita airport in April 2010. As shown in Figure 10(a), negative biases in TIR CO 2 concentrations against nearby CME CO 2 concentrations in the LT and MT regions became notably smaller when using the 15-μm and 9-μm bands (black dashed line) and the 15-μm band only (black dashed-dotted line), the two cases that did not use the 10-μm band. It is clear that using the 9-μm band did not contribute to negative biases in retrieved CO 2 concentrations, judging from the minor difference in CO 2 concentrations between the use of all three bands (solid line) and the use of the 15-μm and 10-μm bands (dotted line). In addition, there were no major differences in retrieved CO 2 concentrations among the four retrieval cases when the original V1 CO 2 profile did not have distinct negative biases, as shown in Figure 10(b). According to theoretical calculations shown in Figure 13 in Kuze et al. (2016), there were no distinct radiance biases in the 10-μm band in the latest version of the TANSO-FTS TIR spectra. If this holds for observed TIR radiances, our test retrievals imply that simultaneous retrieval of surface parameters for TIR spectra at the 10-μm band, which has less radiance bias, worsened CO 2 retrieval results. The test retrieval results demonstrate that using the 10-μm band in conjunction with the 15-μm and 9-μm bands in the V1 retrieval algorithm is a probable cause of the negative biases in retrieved CO 2 concentrations in the LT and MT regions, although this cannot fully explain the biases.
CO 2 absorption at 15 μm is considerably larger than that at 9 or 10 μm. However, measurements in the 9-μm and 10-μm bands are most sensitive to CO 2 concentrations in the LT and MT regions; the peak sensitivity of the 9-μm and 10-μm bands occurred at 736−541 hPa and 541−398 hPa, respectively, judging from CO 2 Jacobian values. Therefore, using the 9-μm and 10-μm bands in conjunction with the 15-μm band should be useful for retrieving CO 2 vertical profiles. In fact, in the case of the retrieval result shown in Figure 10(a), the degrees of freedom of the CO 2 retrieval were 1.93 when using the 15-μm band only, and increased to 1.94, 1.95, and 1.96 when adding the 9-μm band, the 10-μm band, and both the 9-μm and 10-μm bands, respectively. In the next update of the CO 2 retrieval algorithm for TANSO-FTS TIR spectra, we should consider an improved method for correcting radiance biases in CO 2 retrieval processing or adopting the correction of TIR L1B spectra themselves proposed by Kuze et al. (2016).
Bias-correction values determined based on comparisons of CME CO 2 data over airports overcorrected negative biases in TIR CO 2 data in the upper MT region from 398 to 287 hPa in low latitudes (20°S−20°N) during the JJA season, as shown in Figure 9. The CME data that determined the bias-correction values of the 20°S−20°N latitude band were concentrated in Southeast Asia, as illustrated in Figure 1: BKK (Bangkok), SIN (Singapore), and CGK (Jakarta). In addition, the bias-correction values for the 20°S−20°N latitude band after the SON season of 2010 were determined from comparisons of CME data at 0°−20°N, because no data were collected at 20°S−0° after September 2010, as mentioned above. Figure 11 shows differences between TIR CO 2 data with no bias correction and NICAM-TM CO 2 data with TIR CO 2 averaging kernel functions at 682 hPa and 314 hPa in July 2010. As shown in the lower panel of Figure 11, TIR CO 2 data at 314 hPa had negative biases against NICAM-TM CO 2 data in most areas at 0°−20°N, and the negative biases were largest near airport locations in Southeast Asia. At 20°S−0°, on the other hand, TIR CO 2 data at 314 hPa were closer to NICAM-TM CO 2 data than at 0°−20°N. Relying on NICAM-TM CO 2 data, which incorporates CONTRAIL CO 2 data in the inversion, application of bias-correction values determined mainly from comparisons of CME CO 2 data in the MT region at 0°−20°N to TIR CO 2 data over the entire area of low latitudes including 20°S−0° produced widespread overcorrection.
In general, there are few areas where we can obtain reliable in situ CO 2 data for validation analysis. In particular, there are very few in situ CO 2 data in the free troposphere where TIR observations are most sensitive, compared to the surface. In low latitudes, there are relatively strong updrafts, and thus there are larger uncertainties among models than in other areas due to differences in the parameterization of vertical transport. Therefore, a priori CO 2 concentrations taken from the NIES-TM05 model (Saeki et al., 2013b) probably have larger uncertainties in the MT region in low latitudes. As retrieved TIR CO 2 concentrations were to some extent constrained by a priori concentrations, they possibly had larger biases attributable to the a priori uncertainties in the MT region in low latitudes. More in-situ CO 2 data in the upper atmosphere in low latitudes are needed to validate both satellite data and model results. Although HIAPER Pole-to-Pole Observations (HIPPO) data (Wofsy et al., 2011) are not suitable for a comprehensive validation study as in this study due to their limited observation periods, HIPPO CO 2 data are useful to validate CO 2 vertical profiles observed by satellite-borne sensors and simulated in models (Kulawik et al., 2013). In addition, there may also be large biases in retrieved CO 2 data in local source and sink regions, where model data are more variable depending on the surface flux dataset. In such areas, it is difficult to determine bias-correction values that are applicable over a vast area; this is the case for 40°N−60°N. In conclusion, comprehensive validation analysis of satellite data is still needed to evaluate accuracy both in background regions and in regions with high CO 2 variability. Reconsideration of the setting of retrieval grid layers is also needed so that measurement information is reflected more prominently in TIR CO 2 retrieval results.
Overall, the bias-correction values evaluated in each retrieval layer, latitude band, and season (Figure 6) can be applied to corresponding TIR CO 2 data, except at 20°S−20°N during the JJA seasons of 2011 and 2012, when bias-correction values were not determined due to a lack of coincident CME CO 2 data. In these two cases, we recommend applying bias-correction values 0.5 ppm and 1.0 ppm larger than the corresponding bias-correction value for 2010 to TIR CO 2 data for 2011 and 2012, respectively, judging from comparison results between the original TIR and NICAM-TM CO 2 data.

Summary
We evaluated biases of the GOSAT/TANSO-FTS TIR V1 L2 CO 2 product in the LT and MT regions (736−287 hPa) by comparing the TIR CO 2 profiles with coincident CONTRAIL CME CO 2 profiles over airports from 2010 to 2012.
Coincidence criteria of a 300-km distance and a 72-h time difference yielded a sufficient number of coincident pairs, except in low latitudes (20°S−20°N) during the JJA seasons of 2011 and 2012. Comparisons between TIR CO 2 profiles and CME CO 2 profiles to which TIR CO 2 averaging kernel functions were applied showed that the TIR V1 CO 2 data had consistent negative biases of 1−1.5% against CME CO 2 data in the LT and MT regions; the negative biases were largest at 541−398 hPa (retrieval layers 5−6), and were larger in spring and summer than in fall and winter in northern middle latitudes, as is the case in the UT region (287−196 hPa). Our test retrieval simulations showed that using the 10-μm CO 2 absorption band (930−990 cm -1 ), in addition to the 15-μm (690−750 cm -1 and 790−795 cm -1 ) and 9-μm (1040−1090 cm -1 ) bands, increased negative biases in retrieved CO 2 concentrations in the LT and MT regions, suggesting that simultaneous retrieval of surface parameters for radiance bias correction at the 10-μm band worsened CO 2 retrieval results.
We then performed global comparisons between TIR V1 CO 2 data and NICAM-TM CO 2 data, taking TIR CO 2 averaging kernel functions into account, to confirm the validity of the bias assessment over airports. Differences in CO 2 concentrations between TIR and NICAM-TM data approached an average of zero after application of the bias-correction values to TIR CO 2 data, demonstrating that the bias-correction values evaluated over airports in limited areas are applicable to TIR CO 2 data for
Table row for 2012 (column headers not recoverable): -2.2/-3.1, -2.9/-3.4, -3.9/-3.9, -5.6/-5.7, -3.9/-3.8, -5.8/-5.9, -4.3/-4.6, -5.3/-5.5, -4.9/-4.9, -5.3/-5.5, −, -5.9/-5.7, -5.8/-6.3, -5.2/-4.9, -6.4/-6.5, -6.4/-6.7
Figure caption: Bias profiles of GOSAT/TANSO-FTS TIR CO 2 data and a priori CO 2 data against CME_AK CO 2 data over Narita airport and the 1-σ standard deviations for each retrieval layer and season in 2010. The CME_AK CO 2 data are CME CO 2 data to which TIR CO 2 averaging kernel functions are applied. Solid black and gray lines indicate the biases of TIR and a priori CO 2 data, respectively, and dotted black and gray lines show their 1-σ standard deviations. Cross symbols indicate the center pressure level of each retrieval layer: (a) JF, (b) MAM, (c) JJA, and (d) SON.
Figure caption (fragment): The upper and lower panels show the results at 682 hPa (retrieval layer 3) and 314 hPa (retrieval layer 8), respectively. There are no GOSAT/TANSO-FTS TIR CO 2 data in gray-shaded areas.