Deriving column-integrated thermospheric temperature with the N2 Lyman–Birge–Hopfield (2,0) band

This paper presents a new technique to derive thermospheric temperature from space-based disk observations of far ultraviolet airglow. The technique, guided by findings from principal component analysis of synthetic daytime Lyman–Birge–Hopfield (LBH) disk emissions, uses a ratio of the emissions in two spectral 10 channels that together span the LBH (2,0) band to determine the change in band shape with respect to a change in the rotational temperature of N2. The two-channel ratio approach limits representativeness and measurement error by only requiring measurement of the relative magnitudes between two spectral channels and not radiometrically calibrated intensities, simplifying the forward model from a full radiative transfer model to only a vibrationalrotational band model. It is shown that the derived temperature should be interpreted as a column-integrated 15 property as opposed to a temperature at a specified altitude without utilization of a priori information of the thermospheric temperature profile. The two-channel ratio approach is demonstrated using NASA GOLD Level 1C disk emission data for the period of 2–8 November 2018 during which a moderate geomagnetic storm has occurred. Due to the lack of independent thermospheric temperature observations, the efficacy of the approach is validated through comparisons of the column-integrated temperature derived from GOLD Level 1C data with the GOLD 20 Level 2 temperature product as well as temperatures from first principle and empirical models. The storm-time thermospheric response manifested in the column-integrated temperature is also shown to corroborate well with hemispherically integrated Joule heating rates, ESA SWARM mass density at 460 km, and GOLD Level 2 column O/N2 ratio. 25

showed that disk observations of LBH bands could be used for global monitoring of thermospheric temperature.
These authors fit LBH laboratory spectra to observed emissions using an optimal estimation routine with varying 40 parameters such as the N2 rotational temperature, population rates of each vibrational band, the N I 149.3 nm line emission intensity, O2 photoabsorption, and background emission rates. GOLD became the first mission to provide a Level 2 data product of thermospheric temperature ( !"#$ ) using LBH disk emissions between ~132-162 nm with a similar retrieval implementation (Eastes et al., 2017). Thermospheric temperatures have also been derived from TIMED GUVI observations (Zhang et al., 2019) using an intensity ratio between the (0,0) band and (1,0) band that 45 the authors found to be quasi-linearly dependent on the N2 rotational temperature. The authors attributed the temperatures to the altitude at the peak of the LBH contribution function (~155 km) based on radiative transfer calculations.
This paper presents a new technique to derive thermospheric temperature from spectrographic measurements of FUV airglow. The technique, unlike in past work, uses the ratio of two spectral channels that span 50 a single LBH band to determine the change in band shape with respect to a change in the rotational temperature of N2. Section 2 provides background and exploration of the LBH temperature signal with principal component analysis (PCA) to motivate the new technique. Section 3 details the techniques implementation and provides a discussion on the error sources and a rationale behind our interpretation of the derived temperature as a columnintegrated property that we refer to as column-integrated thermospheric temperature, %& . Section 4 presents the 55 demonstrative results of applying the technique to GOLD Level 1C radiance data for the period of 2-8 November 2018; during which a moderate geomagnetic storm event has occurred. The derived thermospheric temperatures are compared to GOLD Level 2 version 3 !"#$ data product over the same period. Due to the lack of independent remotely sensed or in situ temperature measurements in the lower to middle thermosphere, the derived columnintegrated temperatures are also compared to (1) synthetically generated column-integrated temperatures from 60 model simulations by NOAA's Whole Atmosphere Model (WAM) (Akmaev et al., 2011), (2) Naval Research Lab Mass Spectrometer and Incoherent Scatter Radar Extended (NRLMSISE-00) empirical model temperatures (Picone et al., 2002), and (3) observations of other thermospheric states, including GOLD Level 2 SO/N2 data product (Correira et al., 2018) and mass density by ESA's SWARM constellation (Astafyeva et al., 2017), as well as, hemispherically integrated Joule heating rates estimated from SuperDARN and ground-based magnetometer data by using the Assimilative Mapping of Geospace Observations (AMGeO) (AMGeO Collaboration, 2019;Matsuo, 2020).

Thermospheric Temperature Signal in LBH Emissions
The thermospheric temperature signal exists in the rotational structure of the N2 LBH bands. In the case of 70 N2, the rotational temperature is equivalent to the ambient neutral temperature (Aksnes et al., 2006). To motivate the new approach for extracting this signal from the N2 LBH (2,0) band, this section presents results from PCA performed on simulated LBH emissions. Synthetic LBH emission data are generated by forward modeling WAM simulation results for the period of 2-8 November 2018. WAM simulation experiments are executed with solar and geomagnetic forcing conditions specified according to the actual values of the F10.7, Kp, and hemispheric power indices, solar wind velocities and densities, and interplanetary magnetic fields. Section 2.1 discusses forward modeling of LBH emissions and Section 2.2 presents the PCA results.

Forward Modeling of LBH Emissions
The forward model used to produce synthetic LBH emissions is built with the Global Airglow Model 80 (GLOW) and a radiative transfer model (Solomon, 2017). GLOW computes LBH volume emission rates as a function of altitude that are input into the radiative transfer model to produce line-of-sight emissions of the LBH band system. The most important component of the forward model for the purposes of deriving thermospheric temperatures is the LBH vibrational-rotational band model (Budzien et al., 2001). The band model is a look-up table of laboratory spectra that specifies, for a given temperature, a unique spectrum for the upper vibrational states v'=0-85 9 of N2. In the current implementation of the forward model, the v'=0-9 vibrational population rates are those provided in Ajello et al. (2020) that are based on GOLD observations and are held constant. The population rate distribution can vary with the energy distribution of the electron flux in addition to variation in excitation sources other than direct excitation such as radiative cascade and collision-induced electronic transition (Ajello et al., 2020, Eastes et al., 2000aAjello et al., 1985). Ajello et al. (1985) states that excitation thresholding should be included 90 in airglow models to accurately reproduce LBH band intensity. However, as discussed in the following section, absolute band intensity is not needed to extract the N2 rotational temperature.

PCA of Simulated LBH Emissions
PCA is a data reduction technique that is useful for identifying the dominant orthogonal modes of 95 variability from data. PCA is applied here using eigenvalue decomposition of a sample covariance matrix, , of simulated LBH emissions, ()* + , at wavelengths, , computed from aggregated data sets of simulated emissions of the LBH band system during 2-8 November 2018 for a total of N = 8.1´10 4 samples. ( , r / , t & ) = c 2 (r / , t / ) 2 ( ) + c 3 (r / , t / ) 3 ( ) + … + c 4 (r / , t / ) 4 ( ) + , ( , r / , t / ) where , ( , , t) is the residual after subtracting the mean and the sum of n weighted modes from ()* ! + . The total variance of c matches σ 3 for that mode. Figure 1 shows the mean of simulated LBH radiance, ,,,,,,, , between 138-162 nm generated with a spectral pixel size of 0.04 nm and a spectral resolution of 0.19 nm full width at half maximum (FWHM). The first two leading modes of variability in the spectrum, vB and vT, scaled by their eigenvalues or total standard deviations, 9 and : , are also shown. The leading mode vB is identified as the overall scaling of the LBH intensity. The value of 9 3 suggests that this mode accounts for 98.3% of the total variability in the simulated LBH spectra. The second leading mode vT is identified as the temperature signal. According to the value of : 3 this secondary mode accounts for 1.6% of the total variability in the simulated LBH spectra. The correlation coefficient, R, between time-115 dependent coefficients for this temperature mode cT and the simulated WAM temperatures at 155 km altitude over the course of 2-8 November 2018 is 0.71. Together these two principal components account for 99.9% of the variability in the simulated LBH spectra suggesting the LBH system is highly compressible. vT. This observation substantiates an approach of binning the LBH (2,0) band into two channels using 138.58 nm as a boundary to preserve how the temperature signal manifests in the LBH emission's morphological shape. Channel A is defined as the sum of all wavelengths negatively correlated with temperature (∑ .

.
), and channel B contains all wavelengths positively correlated with temperature (∑ .

.
). The two channel ratio, B A ⁄ , is a function of temperature. A similar two-channel ratio approach was adopted in Cantrall et al. (2019) for testing the 130 feasibility of assimilating GOLD Level 1C data into the WAM but a justification of such an approach was not provided.

Determination of Column-integrated Temperature from the LBH (2,0) Band
This section details the derivation of column-integrated thermospheric temperature, %& , from the N2 rotational 135 structure observed in top-of-atmosphere LBH emissions using the ratio of two channels that together span the LBH (2,0) band as motivated in Section 2. Section 3.1 explains the step-by-step procedure, followed by a discussion on potential error sources of %& in Section 3.2 and analysis in Section 3.3 that supports the interpretation of %& as a column-integrated temperature rather than a temperature attributed to a specific altitude.

Procedure
The procedure to determine %& using the two-channel ratio consists of four steps as follows: 1. Generate a set of synthetic LBH (2,0) bands at the instrument's spectral pixel size for a range of temperature using the vibrational-rotational band model (Budzien et al., 2001).
2. Apply an instrument model on each synthetic band to account for the instrument's spectral 145 resolution and spectral registration.
3. Bin each band into channels A and B and fit the ratio, B A ⁄ , to temperature by least-squares.
4. Compute the ratio, B A ⁄ , from the observed LBH (2,0) band and determine %& by regressing the observed ratio on the predetermined relationship between the B/A ratio and temperature.
The two-channel ratio has a number of benefits, most importantly, it can limit the impact of the following 150 uncertainties: (1) uncertainty associated with LBH excitation and extinction processes that affect the absolute intensity of each band, and (2) uncertainty associated with instrument performance variations across the LBH band system. This technique to derive %& only requires measurement of the relative magnitudes between two spectral channels (spectral resolution ~0.5 nm FWHM) and a vibrational-rotational band model to map temperature to measurements. Measurement of a fully resolved, radiometrically calibrated LBH band system is not required nor is a 155 forward model to produce absolute LBH intensity.

Sources of Error
There are two categories of error associated with determining physical parameters from observations: measurement error and representativeness error. Measurement error is the error associated with the measuring 160 device while representativeness error is the difference between the observation and the physical model's representation of the observation (Rodgers, 2000). There are two dominant sources of systematic measurement error in %& stemming from variations in the instrument's spectral registration and resolution. Figure 3 shows the error in %& as a function of the error in the modeled spectral registration and the error in the spectral resolution. It is apparent in Fig. 3 that a significant temperature error of about 50 K (5-10%) can occur if the errors exceed a 165 hundredth of a nanometer level for the spectral registration and a tenth of a nanometer level for spectral resolution.
A discussion on mitigating these two sources of systematic measurement error when deriving %& from GOLD data is provided in Section 4.1.
The predominant source of random measurement error that determines the precision in %& is shot noise.
The %& random measurement error due to shot noise is quantified using Monte Carlo samples of simulated %& 170 derivations considering the instrument performance (McClintock et al., 2020a,b). Particle background counts is at times an additional random noise source. For the case study with GOLD data, the particle backgrounds were low as indicated by the "High_Background" flag in the Level 1C data and therefore this error source is not considered. The statistics of background counts and the associated temperature errors should be quantified for the general application of this technique to any period.

175
Sources of representativeness error in deriving %& are those that cause relative differences in the channel intensity other than temperature that are not captured in the vibrational-rotational band model. Photoabsorption by O2 is one source to consider. There is only a 1.5% difference in the mean O2 absorption cross section between the two channels that corresponds to a negligible difference in transmittance along the line-of-sight considering the O2 absorption cross section variation with temperature. Another source of representativeness error associated with the (2,0) band is due to the overlap of the bright (2,0) transition and the weak (5,2) transition. Inaccurate specification of the v'=2 and v'=5 vibrational population rates cause a slight change in shape of the band with respect to the observations that could be interpreted as a change in the rotational temperature. Figure 8 in Ajello et al. (2020) provides the v'=0-6 population rates and their uncertainties. These uncertainties are used to determine the associated error in the derived temperatures using the (2,0) band due to inaccurate specification of the v'=2 and v'=5 185 population rates. It is important to note that this representativeness error does not exist if the (1,1) or (2,3) bands are used in the derivation instead of the (2,0) band because the (1,1) and (2,3) bands are isolated from other LBH bands.
However, these bands are also much weaker and suffer from significantly larger random error due to shot noise. Figure 4 shows the total random measurement error and representativeness error in %& using the (2,0) band. The representativeness error is a function of temperature and can range from 15 K at %& = 400 K to 48 K at %& = 1200 190 K. Random measurement error from shot noise is a function of the (2,0) band intensity with values of 20 and 50 K for a photon counts of 2500 and 500, respectively.

Interpretation of Column-integrated Temperature
Interpretation of column-integrated temperature, %& , is addressed using synthetic LBH disk emission 195 observations generated by forward modeling WAM simulation results. The column-integrated temperature computed from synthetic observations is hereafter denoted as %& F to contrast to %& G computed from GOLD LBH disk emission data that is introduced later. To examine if %& F can be attributed to a certain pressure we compare the WAM pressure level with the temperature that most closely matches %& F , denoted as : "# $ , to the pressure level at the peak of the LBH contribution function, H12 , where the LBH optical depth, , is unity. The contribution function acts as an averaging kernel for temperature over these large vertical widths that tends to reduce the SZA effect. The net result is derived temperatures that are generally hotter than temperatures at p I12 (p -%! & < p I12 ) for low SZA and temperatures that are generally cooler than temperatures at p I12 (p -%! & > p I12 ) for 210 high SZA. Figure 6 also shows variability in p -%! & (up to 1.5 × 10 JK hPa or ~10 km for the simulation conditions) at a given SZA that reflects considerable variability in the vertical temperature structure within the width of the contribution function given varying forcing conditions. Figure 6 reinforces that %& derived from the procedural steps specified in Section 3.1 is a columnintegrated quantity, containing information from a larger altitude range of the lower-middle thermosphere than just at H12 . Perhaps, %& can be justified to be attributed to H12 when measurement and representativeness errors exceed the gap between %& and the temperatures at H12 at a given SZA and OZA. In general, specific pressure or altitude attribution of %& requires additional a priori knowledge of the thermospheric temperature profile.

220
The two-channel ratio approach to derive the column-integrated temperature is demonstrated using NASA  Table 1 defines each of the variable 235 symbols introduced above.

GOLD LBH Disk Emission Data
GOLD observes the daytime FUV airglow from ~134-162 nm on Earth's disk between 6 and 23 UT from geostationary orbit at 47.5ºW longitude (Eastes, 2020). GOLD produces a full disk image every ~30 minutes at a 240 spatial resolution of 125´125 km by alternating between scans of the Northern and Southern hemisphere. The GOLD Level 1C radiance data with a spectral sampling of 0.04 nm are used to derive %& G in this study. The GOLD Level 1C data is spatially binned by 2´2 (250´250 km spatial resolution) to improve the SNR by a factor of 2. Prior to deriving %& G , efforts were made to reduce the impact of systematic biases that are present in version 3 of the GOLD Level 1C data product. Variations in spectral resolution along the GOLD detector are identified with the 245 FWHM of the OI 135.6 doublet through fitting a 2-gaussian distribution. Variations in the spectral registration are identified by differencing the modeled peak wavelength given the fitted OI 135.6 doublet FWHM by the peak wavelength determined by fitting a log-normal distribution to the (2,0) band. Note that the degradation of the detector due to the strength of the OI 135.6 doublet can cause errors in the spectral resolution estimate, but significant degradation had not occurred by 2-8 November 2018. Corrections for spectral registration and resolution 250 are incorporated into Step 2 of the %& algorithm (see Section 3).  (1-3%) increasing to a maximum of ~40 K (4-8%) near the disk edge. The slope of %& G − !"#$ with respect to latitude and longitude indicates %& G has a stronger south-north and west-east temperature gradient than !"#$ . There 265 is also agreement in the temperature morphology over the disk between %& G and %& F prior to the storm, but the stormtime response simulated by WAM, as manifest in %& F , shows considerably higher temperatures in the mid-and highlatitudes and a longer post-storm recovery time in comparison to %& G and !"#$ . %& G − %& F displays a similar west-east slope to %& G − !"#$ except for the region just west of the sub-solar point (-80º−-50º longitude) where %& F is ~25 K cooler than %& G . L#"# and %& G show agreement in the temperature morphology over the disk but L#"# displays 270 cooler temperatures, particularly just west of the subsolar point at low-and mid-latitudes (up to 60 K), and a stronger west-east temperature gradient.

Comparing
The %& G and !"#$ comparison is expanded in Fig. 9 to include all times in the range 7-22 UT for the period of 2-8 November 2018. Figure 9 shows that %& G and !"#$ have different behavior with the viewing conditions determined by SZA and OZA. %& G increases with both SZA and OZA with a stronger trend for SZA. !"#$ increases 275 with OZA but remains relatively uniform with SZA even decreasing slightly for SZA > 25º. There are two likely explanations for the dependence of the derived temperatures on viewing conditions: (1) The derived temperatures reflect real temperature changes with viewing conditions because of the contribution function peaking at different pressures (Fig. 5). (2) The derived temperatures reflect temperature biases with viewing conditions because of changes in the LBH emission intensity. Intensity decreases with increasing SZA due to reduced LBH excitation but period is suggestive that %& G is more sensitive to real temperature changes as the probed pressures change with viewing conditions and less susceptible to biases due to a change in LBH intensity with viewing conditions. This is attributed to the fact that %& G derivation does not require measurement of a fully resolved, radiometrically calibrated 290 LBH band system nor a forward model to produce absolute LBH intensity. There is likely still biases in %& G with LBH intensity as indicated by the weak-moderate correlation (R = -0.32), particularly at low intensities (high SZA) where shot noise can lead to positive biases up to 15 K in the two-channel ratio approach.

Storm Time Response
295 Figure   to be on the order of 2-3 days.

310
A new technique to derive thermospheric temperature from space-based disk observations of FUV airglow is presented. The technique uses a ratio of the emissions in two spectral channels that together span the Lyman-Birge-Hopfield (LBH) (2,0) band to determine the change in band shape with respect to a change in the rotational temperature of N2. While this study focused on the LBH (2,0) band to derive thermospheric temperature, the described technique can be applied to any LBH band or combination of bands. The derived temperature from this 315 technique is shown to be a column-integrated property referred to as column-integrated thermospheric temperature, %& . %& should not be attributed to the peak of the LBH contribution function without consideration of the viewing conditions and %& derivation uncertainty. The definition of column-integrated thermospheric temperatures and other parameters used for comparison in the paper is given in Table 1. Specific findings of this work are as follows.
The LBH spectrum quantified with PCA of synthetic daytime LBH disk emission data is found to be highly 320 compressible (two principal components explain 99.9% of the variability). Analysis of the secondary principal component mode, that characterizes how the LBH temperature signal manifests as the change in band shape, substantiates the approach to bin a LBH spectral band into two channels such that the temperature-induced band shape change is best preserved. The study has shown that thermospheric temperatures can be derived from an observed two-channel ratio by using a precomputed relationship of the ratio to temperature from an LBH vibrational-rotational band model. In this two-channel ratio approach, representativeness errors originating from forward modeling are reduced because radiometrically calibrated LBH band intensities are not required in the derivation procedure, and negative impact of systematic measurement errors, stemming from variations across the band system in the instrument's spectral registration and resolution, are reduced because a fully resolved LBH band system is not required.

330
The derived temperature from the two-channel approach can have significant systematic biases of about 50 K (5-10%) if the spectral registration and resolution are not known to the hundredth of a nanometer level and tenth of a nanometer level, respectively, as shown in Fig. 3. In addition to these known sources of systematic biases, there is intrinsic random error in %& due primarily to shot noise and representativeness error due to misspecification of the v'=2 and v'=5 population rates in the vibrational-rotational band model. Random measurement error is estimated to 335 be 20-60 K (3-9%) and representativeness error is estimated to be 15-30 K (2-5%) for the case study with GOLD L1C data.
For the period of 2-8 November 2018 during which a moderate geomagnetic storm has occurred, the temperatures derived from observations (i.e., %& G and !"#$ ) exhibit globally similar temperatures. %& F is in good agreement with %& G and !"#$ at low latitudes but exhibits considerably higher temperatures at mid-and high-340 latitudes during the storm response. L#"# exhibits globally cooler temperatures to the observations. However, there are clear differences between %& G and !"#$ with respect to viewing conditions. There is stronger correlation between %& G and p I12 (R=-0.86) compared to !"#$ and p I12 (R=-0.14) and weaker correlation between %& G and LBH intensity (R=-0.32) compared to !"#$ and LBH intensity (R=0.72) over the analysis period. These differences highlight a potential benefit of the two-channel ratio approach to reduce representativeness error by measurement of the relative 345 intensities between two channels that only requires a vibrational-rotational band model for the forward model instead of a full radiative transfer model. The temporal evolution of global %& corroborates well with temporal changes of hemispherically integrated Joule heating rates MN , SWARM mass density at 460 km #OPQL RST UV , and GOLD SO/N 3 G , which is consistent with known storm time responses of thermospheric variables.

Data Availability
The lookup