Intercomparison of thermal–optical carbon measurements by Sunset and Desert Research Institute (DRI) analyzers using the IMPROVE_A protocol

Thermal–optical analysis (TOA) is a class of methods widely used for determining organic carbon (OC) and elemental carbon (EC) in atmospheric aerosols collected on filters. Results from TOA vary not only with differences in operating protocols for the analysis, but also with details of the instrumentation with which a given protocol is carried out. Three models of TOA carbon analyzers have been used for the IMPROVE_A protocol in the past decade within the Chemical Speciation Network (CSN). This study presents results from intercomparisons of these three analyzer models using two sets of CSN quartz filter samples, all analyzed using the IMPROVE_A protocol with reflectance charring correction. One comparison was between the Sunset model 5L (Sunset) analyzers and the Desert Research Institute (DRI) model 2015 (DRI-2015) analyzers using 4073 CSN samples collected in 2017. The other comparison was between the Sunset and the DRI model 2001 (DRI-2001) analyzers using 303 CSN samples collected in 2007. Both comparisons showed a high degree of inter-model consistency in total carbon (TC) and the major carbon fractions, OC and EC, with a mean bias within 5 % for TC and OC and within 12 % for EC. Relatively larger and diverse inter-model discrepancies (mean biases of 5 %–140 %) were found for thermal subfractions of OC and EC (i.e., OC1–OC4 and EC1–EC3), with better agreement observed for subfractions with higher mass loadings and smaller within-model uncertainties. Optical charring correction proved critical in bringing OC and EC measurements by different TOA analyzer models into agreement. 
Appreciable inter-model differences in EC between Sunset and DRI-2015 (mean bias ± SD of 21.7 % ± 12.2 %) remained for ∼ 5 % of the 2017 CSN samples; examination of these analysis thermograms revealed that the optical measurement (i.e., filter reflectance and transmittance) saturated in the presence of strongly absorbing materials on the filter (e.g., EC), leaving an insufficient dynamic range for the detection of carbon pyrolysis and thus no optical charring correction. Differences in instrument parameters and configuration, possibly related to disagreement in OC and EC subfractions, are also discussed. Our results provide a basis for future studies of uncertainties associated with the TOA analyzer model transition in assessing long-term trends of CSN carbon data. Further investigations using these data are warranted, focusing on the demonstrated inter-model differences in OC and EC subfractions. The within- and inter-model uncertainties are useful for model performance evaluation.


X. Zhang et al.: Intercomparison of thermal-optical carbon measurements

EC split is sensitive to details of the heating sequence and atmosphere, as well as the optical correction procedure.
The Chemical Speciation Network (CSN) was created to support implementation of the 1997 PM2.5 National Ambient Air Quality Standards (NAAQS) (EPA, 1997). Within the network, 24 h PM2.5 samples are collected on different filter media (e.g., PTFE, nylon, and quartz) at approximately 160 sites across the US, most of which are located in urban areas, and analyzed for PM2.5 chemical components. Since its inception, CSN has been using the TOA method for carbon analysis on quartz filters but with evolving sample collection methods, thermal-optical analytical protocols, and instrumentation (Spada and Hyslop, 2018). Prior to 2007, CSN used varied sampler designs for collecting carbon samples on 47 mm diameter quartz filters, from which OC and EC were determined by Sunset analyzers that implemented the NIOSH thermal-optical transmittance (TOT) protocol (Birch and Cary, 1996). During the years 2007–2009, the network transitioned to using URG-3000N samplers to collect carbon samples on 25 mm diameter quartz filters, coinciding with the change in the analytical protocol from NIOSH TOT to IMPROVE_A thermal-optical reflectance (TOR), to be more consistent with the US Interagency Monitoring of PROtected Visual Environments (IMPROVE) network. Since late 2009, no change has occurred in the analytical protocol or sample collection, but there were two TOA analyzer model transitions. As shown in Fig. 1, at the beginning of 2016, TOA carbon analysis for CSN transitioned from using Desert Research Institute (DRI) model 2001 analyzers (termed "DRI-2001" hereinafter) to DRI model 2015 multiwavelength analyzers (termed "DRI-2015" hereinafter) as a result of an instrument upgrade. Again in October 2018, CSN TOA transitioned from using DRI-2015 analyzers to Sunset Laboratory model 5L analyzers (termed "Sunset" hereinafter) due to a change in the analytical laboratory (from DRI to UC Davis).
In addition to the abovementioned changes, the network started blank subtraction on carbon data in November 2015.
While measurement differences among thermal protocols (e.g., IMPROVE_A, NIOSH, EUSAAR) and between optical corrections (e.g., reflectance vs. transmittance) have been extensively studied and documented in the literature (e.g., Conny et al., 2003; Chow et al., 2004; Watson et al., 2005; Khan et al., 2012; Chan et al., 2019), less attention has so far been given to possible differences in OC-EC splits produced by nominally identical analytical protocols carried out on differently designed and manufactured instrument systems. Some comparisons were focused on examining variations between different units of the same model (e.g., Schauer et al., 2003; Ammerlaan et al., 2015). A previous study by Chow et al. (2015) compared results from the 2001 and 2015 models of the DRI analyzers using 67 urban (from Fresno Supersite) samples and 73 rural (from IMPROVE network) samples and concluded that no significant difference was found in EC or OC reported by the two models. Wu et al. (2012) compared a Sunset analyzer and a DRI-2001 analyzer using ∼ 100 ambient samples collected in the Pearl River Delta in China and reported similar consistency for OC and EC. While these studies provided insights on the inter-model comparison of different TOA analyzers, their sample sizes were limited.
The goal of this study is to characterize the consistency and differences in the results reported from the three TOA models successively deployed in the CSN running the same protocol, IMPROVE_A with reflectance charring correction. Two models, the DRI-2001 (manufactured by Atmoslytic, Inc.) and the DRI-2015 (manufactured by Magee Scientific), were designed by DRI. The third model, the Sunset model 5L (designed and manufactured by Sunset Laboratory, Inc.), is equipped with dual optical units and is capable of running multiple protocols, including the NIOSH and IMPROVE_A protocols. For each model type there have been multiple units dedicated for CSN carbon analysis in the past decade, including eight DRI-2001 units, 13 DRI-2015 units, and five Sunset units. In this study, two sets of 25 mm diameter quartz filter samples from the CSN were analyzed, each by a pair of models, for TC, OC, EC, and thermal subfractions (OC1–OC4, EC1–EC3, and OP). These samples, which were collected during September to October of 2007 (Set 1) and May to September of 2017 (Set 2), covered a great variety of emission sources and meteorological conditions given the wide spatial coverage of the CSN, ensuring statistically robust comparison among the three instrument models. Findings from these two comparisons provide a basis for accounting for TOA model transitions in future studies of CSN carbon long-term trends. Statistics such as within- and inter-model uncertainties between Sunset and DRI analyzers are presented and are useful for studies evaluating model predictions against CSN data (e.g., Emery et al., 2017), as well as source apportionment studies using speciated PM2.5 carbon data (e.g., Kim and Hopke, 2005; Liu et al., 2006).

Instrumentation in comparison
Table 1 lists the major differences among the three TOA carbon analyzer models used in the intercomparisons. The differences in laser source, carbon detection, and temperature calibration method are discussed in more detail as follows.
Laser source. The Sunset and DRI-2001 analyzers employ a single-wavelength laser source for measurement of filter reflectance and transmittance. The Sunset analyzer uses a diode laser at 658 nm, whereas DRI-2001 employs a helium-neon (He-Ne) laser at 633 nm. DRI-2015 employs seven diode lasers with differing wavelengths from 405 to 980 nm. For CSN samples analyzed by DRI-2015, the 635 nm EC data, reported as EC by reflectance, are considered equivalent to the 633 nm data reported by the DRI-2001 and are therefore used in this study for comparison with the Sunset measurements.
Carbon detection. Both DRI-2001 and Sunset analyzers use a flame ionization detector (FID) that quantifies CH4, whereas DRI-2015 uses a non-dispersive infrared (NDIR) detector to quantify CO2. These two types of detectors have distinct responses to interference and noise levels; thus, different signal integration methods are used (further discussed in Sect. 3.2.2).
Temperature calibration. Temperature calibration in TOA refers to the method used to adjust oven temperatures measured by the integrated thermocouple based on the response of an external temperature-indicating device. Sunset and DRI analyzers adopt fundamentally different methods to calibrate the temperature plateaus in the IMPROVE_A protocol. In a Sunset analyzer, a thermocouple, positioned ∼ 2 cm downstream of the sample filter holder, is used to monitor the sample temperature at each IMPROVE_A temperature set point during an analysis. The distance between the thermocouple and sample punch is accounted for in temperature calibration by placing another thermocouple at the sample punch position, measuring the difference between the readings from the two probes, and adjusting the settings in the thermal analytical protocol accordingly (i.e., temperature offsets). The temperature offsets in Sunset analyzers can vary greatly per temperature step depending on the heat dissipation inside the oven (Panteliadis et al., 2015; Phuah et al., 2009). On the other hand, DRI used Tempilaq° G, a type of quick-drying chemical, as temperature indicators in the temperature calibration for both analyzer models (DRI, 2016; Chow et al., 2005). Briefly, six Tempilaq° G liquids that change optical properties at 121, 184, 253, 510, 704, and 816 °C were used in calibrating the six IMPROVE_A temperature plateaus (140, 280, 480, 580, 740, and 840 °C). During the analysis of each Tempilaq° G sample, the oven temperature is slowly incremented to a narrow range near the temperature at which the specific Tempilaq° G changes color, while the laser reflectance and transmittance are monitored for a sharp rise in response to the change. The sample oven temperature values are regressed on the corresponding Tempilaq° G temperatures and are interpolated and/or extrapolated to the IMPROVE_A temperatures based on the linear regression slope and intercept.
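The regression step of the DRI calibration can be sketched as follows. This is a minimal illustration, not DRI's actual procedure: the indicator temperatures and IMPROVE_A plateaus come from the text, whereas the oven readings, the function names `fit_line` and `oven_setpoints`, and the use of ordinary least squares are assumptions made for the example.

```python
# Rated transition temperatures of the six Tempilaq° G indicators (°C), from the text.
TEMPILAQ_C = [121.0, 184.0, 253.0, 510.0, 704.0, 816.0]

# IMPROVE_A temperature plateaus (°C), from the text.
IMPROVE_A_C = [140.0, 280.0, 480.0, 580.0, 740.0, 840.0]

# HYPOTHETICAL oven thermocouple readings (°C) at which each indicator changed.
oven_reading_c = [112.0, 176.0, 246.0, 498.0, 690.0, 801.0]


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def oven_setpoints(indicator_c, oven_c, targets_c):
    """Regress oven readings on indicator temperatures, then evaluate the
    fitted line at the target (sample) temperatures."""
    slope, intercept = fit_line(indicator_c, oven_c)
    return [slope * t + intercept for t in targets_c]


# Oven readings expected to correspond to each IMPROVE_A sample temperature.
setpoints = oven_setpoints(TEMPILAQ_C, oven_reading_c, IMPROVE_A_C)
```

Note that 840 °C lies above the highest indicator temperature (816 °C), so the last set point is an extrapolation, as the text indicates.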
Two sets of CSN carbon samples collected on 25 mm diameter quartz filters were respectively analyzed in the two intermodel comparisons. Set 1 consists of 303 CSN filters sampled in September and October 2007 that were previously analyzed by DRI with the DRI-2001 models in the year 2008 (Fig. 1). These filters were retrieved from cold storage and reanalyzed by UC Davis using Sunset analyzers in 2017/18. Set 2 consists of 4073 CSN samples and 622 CSN field blanks collected between May and September 2017, which were sequentially analyzed by the Sunset analyzers at UC Davis and by the DRI-2015 analyzers at DRI within a year after sample collection. Both sets cover a variety of emission sources given the wide spatial coverage of the CSN.
Owing to the destructive nature of the TOA method and the limited sample deposit area of the filter (3.53 cm²), only a maximum of three 0.5–0.6 cm² circular punches can be taken from one filter sample. No replicate measurements were available in Set 1 due to sample unavailability. A subset of filters within Set 2 were replicated by both Sunset and DRI-2015 analyzers to evaluate the within-model uncertainty (detailed in Sect. 2.2.3).

Thermal-optical analysis with IMPROVE_A protocol
Thermal-optical carbon analysis with the IMPROVE_A protocol was carried out by placing a filter punch in the sample oven of a carbon analyzer. Following the thermal program set by IMPROVE_A, the filter punch is first heated in an inert (100 % He) atmosphere where various OC subfractions volatilize at 140 °C (OC1), 280 °C (OC2), 480 °C (OC3), and 580 °C (OC4). The system is then switched to an oxidizing atmosphere (He with a fixed amount of O2) where EC subfractions combust at 580 °C (EC1), 740 °C (EC2), and 840 °C (EC3). The liberated carbon compounds are converted to either carbon dioxide (CO2) or methane (CH4), followed by infrared absorption (CO2) or flame ionization (CH4) detection.
During the thermal analysis, a fraction of OC pyrolyzes or chars under the inert He atmosphere into EC-like substances and is accounted for using optical correction by reflectance. Specifically, sample filter reflectance is monitored throughout the analysis using a laser source (Table 1). The filter reflectance decreased in response to the formation of OP and then increased as the OP was combusted. The split between OC and EC is defined as the point when reflectance returned to its initial reading before the heating started.
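The reflectance-based split described above can be illustrated with a short sketch. It is a simplified illustration, not the Sunset or DRI software: the function name `oc_ec_split`, the per-time-step input lists, and the fallback when reflectance never recovers are all assumptions, and real thermograms would also need noise handling.

```python
def oc_ec_split(reflectance, carbon, oxidizing_start):
    """Split evolved carbon into OC and EC using the reflectance return point.

    reflectance     -- laser reflectance at each time step (arbitrary units)
    carbon          -- carbon evolved during each time step (e.g., µg C cm^-2)
    oxidizing_start -- index of the first time step in the oxidizing stage

    Returns (OC, EC, OP), where OP is the carbon evolved between the switch to
    the oxidizing stage and the split point.
    """
    initial = reflectance[0]

    # ASSUMPTION: if reflectance never returns to its initial value, fall back
    # to the switch to the oxidizing stage (i.e., no charring correction).
    split = oxidizing_start
    for i in range(oxidizing_start, len(reflectance)):
        if reflectance[i] >= initial:
            split = i
            break

    oc_uncorrected = sum(carbon[:oxidizing_start])   # carbon from the inert stage
    ec_uncorrected = sum(carbon[oxidizing_start:])   # carbon from the oxidizing stage
    op = sum(carbon[oxidizing_start:split])          # pyrolyzed OC
    return oc_uncorrected + op, ec_uncorrected - op, op


# HYPOTHETICAL thermogram: reflectance dips as char forms, then recovers.
normal = oc_ec_split([10, 8, 6, 5, 7, 10, 12], [2, 3, 1, 1, 1, 2, 1], 3)

# HYPOTHETICAL saturated sample: reflectance starts at the detector baseline,
# so the split falls at the switch to the oxidizing stage and OP is zero.
saturated = oc_ec_split([1, 1, 1, 1, 1, 2, 3], [2, 3, 1, 1, 1, 2, 1], 3)
```

In both cases OC + EC equals the total evolved carbon; only the allocation between the two fractions depends on the optical signal.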
Equations (1)–(4) show how carbon fractions are related with and without charring correction applied. The uncorrected OC, termed OC_{1+2+3+4}, is the sum of all volatilized carbon under the inert atmosphere, whereas the uncorrected EC, termed EC_{1+2+3}, is the sum of all oxidized carbon that includes both native EC and charred OC. Unless otherwise noted, the OC and EC data discussed below refer to those corrected by reflectance.

OC_{1+2+3+4} = OC1 + OC2 + OC3 + OC4 (1)
EC_{1+2+3} = EC1 + EC2 + EC3 (2)
OC = OC_{1+2+3+4} + OP (3)
EC = EC_{1+2+3} − OP (4)

Sunset and DRI-2015 data are reported in mass loadings (µg cm⁻²). Sunset raw data were processed using a custom R computing package developed by UC Davis (hereinafter referred to as "UCD-Sunset data processing"). The program slightly modifies the algorithms provided by the Sunset calculation software (version 423) in that (1) premature EC evolution was not considered and (2) no correction was made for the dependency of laser reflectance on temperature. DRI-2015 data were calculated by the program supplied by the manufacturer. DRI-2001 data from the archived 2007 CSN samples were downloaded from the EPA Air Quality System (AQS) database (https://aqs.epa.gov/aqsweb/documents/data_api.html, last access: 28 April 2021). The concentration data (µg m⁻³) were converted to mass loadings (µg cm⁻²) using a nominal sample volume (33 m³) and filter area (3.53 cm²) for direct comparison against the Sunset data. The use of nominal instead of the actual sample volume adds little uncertainty, given the stringency of CSN operational tolerances for flow rate and sample duration.
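The concentration-to-loading conversion is simple arithmetic and can be sketched as below. The nominal volume and filter area are the values given in the text; the function and constant names are illustrative only.

```python
NOMINAL_VOLUME_M3 = 33.0   # nominal sampled air volume, from the text
FILTER_AREA_CM2 = 3.53     # sample deposit area of the 25 mm filter, from the text


def concentration_to_loading(conc_ug_m3,
                             volume_m3=NOMINAL_VOLUME_M3,
                             area_cm2=FILTER_AREA_CM2):
    """Convert an ambient concentration (µg m^-3) to a filter mass loading
    (µg cm^-2): the total collected mass divided by the deposit area."""
    return conc_ug_m3 * volume_m3 / area_cm2


# 1 µg m^-3 corresponds to roughly 9.35 µg cm^-2 under the nominal values.
loading = concentration_to_loading(1.0)
```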

Quality control
Blank measurement
Table 2 summarizes the mean and standard deviation of the carbon mass loadings from measurements of 622 CSN field blanks by the Sunset and DRI-2015 analyzers. OC and EC levels on blank filters are minimal. The difference between analyzers is also trivial. Sunset and DRI-2015 mass loading data were not blank-subtracted to allow for direct comparison with the DRI-2001 data.

Calibration
The FID and NDIR detector responses are normalized to a known amount of CH4 gas (i.e., 5 % CH4 in helium gas by mixing ratio) that is injected at the end of each sample analysis. In addition, the detector linearity was verified and calibrated by a set of carbon-containing aqueous solutions. Specifically, sucrose (C12H22O11) standards with a concentration spanning from 2 to 210 µg C cm⁻² were used to calibrate the Sunset analyzers (UCD, 2019). The two DRI models were calibrated using 5 to 20 µL of 1800 ppm of sucrose and KHP (C8H5KO4) solution (DRI, 2012, 2016). The split between OC and EC cannot be calibrated or verified due to the lack of reference material for EC (Baumgardner et al., 2012).

Measurement uncertainty
The measurement uncertainty of the Sunset and DRI-2015 analyzers was estimated separately utilizing data from replicate analyses (i.e., two analyses on the same filter sample by the same analyzer model). The scaled relative difference (SRD), which equals the relative difference (RD) divided by √2, is chosen over RD because it is the normalized relative difference between two measurements, accounting for the presence of equal and independent errors in both original and replicate measurements (Hyslop and White, 2009). The mean value of SRD provides an estimate of the within-model replication bias, which was negligible, and the standard deviation (1σ) of SRD provides an estimate for the within-model measurement uncertainty (Unc). Figure 2 illustrates the relationship between SRD and mass loading for TC, OC, and EC measured by Sunset (Fig. 2a–c) and DRI-2015 (Fig. 2d–f). As expected, the within-model replication bias is close to zero for both Sunset and DRI-2015 because the replicate and original analyses are essentially identical. For all three components, and particularly for EC for which some measurements are near the method detection limit (MDL) (illustrated by the vertical dashed line at 0.2 µg cm⁻² in the plots), SRD decreases with increasing mass loading. While all analysis pairs are included in Fig. 2, those with a mean mass loading less than 3 times the MDL are excluded from the calculations of Unc to obtain a stable estimate of measurement uncertainty.
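The replicate statistics described above can be sketched as follows. Several details are assumptions rather than statements from the text: RD is taken as the difference divided by the pair mean, the sample standard deviation is used for 1σ, the 3 × MDL screen is applied to the bias as well as to Unc, and the function names are illustrative.

```python
import math
import statistics


def scaled_relative_difference(original, replicate):
    """SRD = RD / sqrt(2), with RD assumed to be the difference between the
    two measurements divided by their mean."""
    pair_mean = (original + replicate) / 2.0
    return (original - replicate) / pair_mean / math.sqrt(2.0)


def within_model_stats(pairs, mdl):
    """Return (bias, unc, n_used) for a list of (original, replicate) loadings.

    Pairs whose mean loading is below 3 x MDL are excluded, as in the text.
    bias is the mean SRD; unc is the standard deviation (1 sigma) of SRD.
    """
    kept = [(a, b) for a, b in pairs if (a + b) / 2.0 >= 3.0 * mdl]
    srds = [scaled_relative_difference(a, b) for a, b in kept]
    return statistics.mean(srds), statistics.stdev(srds), len(srds)


# HYPOTHETICAL replicate EC loadings (µg cm^-2); MDL of 0.2 as in the Fig. 2 plots.
pairs = [(1.00, 1.04), (2.50, 2.41), (0.90, 0.93), (0.10, 0.30)]
bias, unc, n_used = within_model_stats(pairs, mdl=0.2)
```

The last pair has a mean loading of 0.2 µg cm⁻², below 3 × MDL, so it is dropped before the statistics are computed.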
Assuming the within-model uncertainties are independent, the combined inter-model uncertainty (Unc_inter) can be predicted by Eq. (6a), where Unc_DRI and Unc_Sunset are the within-model uncertainties determined for DRI-2015 and Sunset analyzers using replicate analyses, respectively.
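Eq. (6a) itself is not reproduced in this text. Under the stated independence assumption, the standard way to combine two uncertainties is addition in quadrature, so a likely form, given here as a reconstruction rather than a quotation, is:

```latex
\mathrm{Unc}_{\mathrm{inter}} = \sqrt{\mathrm{Unc}_{\mathrm{DRI}}^{2} + \mathrm{Unc}_{\mathrm{Sunset}}^{2}}
```

For example, two purely hypothetical within-model uncertainties of 6 % and 8 % would combine to 10 %.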
The overall measurement bias and uncertainty for all carbon components are summarized in Table 3, which provides benchmarks for the inter-model comparison discussed in the following sections. For most components, uncertainties estimated for the Sunset and DRI-2015 analyzers were comparable, except for OP and OC1, for which DRI-2015 uncertainties were a factor of 2–4 larger.
3 Results and discussion

Inter-model comparison of carbon measurements
This section presents results from the two inter-model comparisons for bulk TC, OC, and EC, as well as for individual thermal subfractions (OC1, OC2, OC3, OC4, OP, EC1, EC2, and EC3). Arithmetic differences (ADs) (Eq. 7) and scaled relative differences (SRDs) (Eq. 8) are calculated between results from Sunset and DRI-2001 analyzers using Set 1 and between results from Sunset and DRI-2015 using Set 2. In calculating SRD, the underlying assumption is that the observed differences are equally allocated to measurements from the two models in comparison; because no standard reference materials are available for the TOA measurement technique, there is no way to allocate the errors to a particular laboratory or analyzer model. In both cases, a positive AD or SRD value occurs if the Sunset measurement is higher than the DRI measurement. Figure 3 shows the probability density curves of SRDs for Sunset vs. DRI-2001 (purple) and Sunset vs. DRI-2015 (orange). The location of the peak relative to the x-axis center (or as measured by the mean of the SRDs) indicates systematic inter-model bias that occurred for the majority of the  data points, while the spread of the curve (or as measured by the standard deviation of the SRDs) represents the variability and coherence of these biases. Also shown in Fig. 3 are the within-model uncertainties determined from replicate analyses (Table 3), although only available for Sunset vs. DRI-2015, to assist in the interpretation of the inter-model SRDs.
R² values are tabulated as an indicator of the degree of linear correlation between the two models. The means and standard deviations of ADs and SRDs are summarized in Table 4.
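The inter-model statistics (AD, SRD, and R²) can be sketched as below. Eqs. (7) and (8) are not reproduced in this text, so the forms used here are assumptions: AD is taken as Sunset minus DRI (positive when Sunset is higher, as stated above), SRD as AD divided by the pair mean and by √2, and R² as the squared Pearson correlation. The function name and the input values are illustrative.

```python
import math
import statistics


def compare_models(sunset, dri):
    """Summarize paired Sunset and DRI loadings measured on the same filters."""
    ad = [s - d for s, d in zip(sunset, dri)]
    srd = [(s - d) / ((s + d) / 2.0) / math.sqrt(2.0) for s, d in zip(sunset, dri)]

    # Squared Pearson correlation coefficient, computed explicitly.
    mean_s, mean_d = statistics.mean(sunset), statistics.mean(dri)
    cov = sum((s - mean_s) * (d - mean_d) for s, d in zip(sunset, dri))
    var_s = sum((s - mean_s) ** 2 for s in sunset)
    var_d = sum((d - mean_d) ** 2 for d in dri)

    return {
        "ad_mean": statistics.mean(ad),
        "ad_sd": statistics.stdev(ad),
        "srd_mean": statistics.mean(srd),
        "srd_sd": statistics.stdev(srd),
        "r2": cov * cov / (var_s * var_d),
    }


# HYPOTHETICAL EC loadings (µg cm^-2) for five filters.
sunset_ec = [0.82, 1.35, 2.10, 0.56, 3.40]
dri_ec = [0.75, 1.20, 1.95, 0.50, 2.90]
summary = compare_models(sunset_ec, dri_ec)
```

A positive `srd_mean` in this sketch corresponds to Sunset reading higher than DRI on average, matching the sign convention in the text.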

Bulk TC, OC, and EC
TC and the major carbon fractions, OC and EC, exhibited good agreement in both comparisons, with the smallest SRDs and highest R² values found for TC (SRDs = −1.6 ± 5.4 % and R² = 0.98 for Sunset vs. DRI-2001; SRDs = −0.9 ± 6.0 % and R² = 0.99 for Sunset vs. DRI-2015). Between Sunset and DRI-2015, the ADs of TC (e.g., −0.5 ± 2.0 µg cm⁻²) were comparable to the difference in TC measured from the blank filters (Table 2). The consistency in the TC measurements over a wide temporal range, indicated by the similar TC mass loadings from the original analysis by DRI-2001 and the reanalysis by Sunset 10 years after sample collection, suggests good measurement reproducibility for TC as well as sample stability in long-term cold storage for bulk carbon fractions.
Relative to TC, similar but slightly weaker inter-model correlations were found for OC (R² = 0.95 for Sunset vs. DRI-2001 and 0.98 for Sunset vs. DRI-2015) and EC (R² = 0.95 for Sunset vs. DRI-2001 and 0.90 for Sunset vs. DRI-2015) (Fig. 3b and c). Sunset OC was lower than those determined by the two DRI analyzers by similar amounts, with an average AD of ∼ 1.5 µg cm⁻² and SRD of ∼ 4 % (Table 4). Sunset EC was higher when compared to the two DRI analyzers, and the inter-model difference varied by a factor of 2 in terms of SRD (6.5 ± 8.3 % and 11 ± 15 % relative to DRI-2001 and DRI-2015, respectively). Mean SRDs, or the inter-model bias, of all three carbon components did not exceed the combined inter-model uncertainties for Sunset vs. DRI-2015; the mean SRD of EC (11 %) was the largest and closest to its inter-model uncertainty (12 %), suggesting the results are not statistically different. The consistently opposite inter-model biases of OC and EC from the two pairs of comparisons suggested disagreement in the OC-EC split by Sunset and DRI analyzers.

Thermal OC and EC subfractions
An examination of individual thermal OC and EC subfractions revealed large and diverse inter-model differences in these subfractions, a phenomenon referred to as "carbon migration" by some previous studies (e.g., Chow et al., 2007). In general, subfractions with higher mass loadings (e.g., OC2, OC3, and EC1) showed better inter-model agreement, with mean SRDs within ∼ 20 % and R² above ∼ 0.8 (Fig. 3); these subfractions also had smaller within-model uncertainties (Table 3). Relatively larger inter-model SRDs were observed for OC1, OC4, EC2, and EC3, coinciding with their lower mass loadings. EC3, the smallest subfraction in terms of mass loading (Table 4), showed the lowest degree of inter-model agreement among all OC and EC subfractions. DRI analyzers reported many more EC3 data points below the MDL than Sunset, leading to some SRD values far beyond 100 % (Fig. 3k). The most volatile subfraction, OC1, exhibited the largest inter-model SRDs among all four OC subfractions. Evaporative loss during handling and storage of the samples could artificially reduce the mass loading of OC1. Although good sample stability was demonstrated for bulk TC, it is possible that the 82 % bias of Sunset OC1 relative to DRI-2001 was primarily due to evaporation of OC1 during long-term storage. Systematic inter-model biases (as measured by the mean SRDs) diverged in terms of both magnitude and direction across different thermal subfractions. Relative to DRI analyzers, Sunset measured lower OC1, OC3, and OC4 and higher OC2. Despite the small average mass loadings of OC1 and OC4, they showed much higher ADs than OC2 and OC3 (Table 4). In contrast to the OC subfractions, all three EC subfractions were measured lower by the two DRI analyzers. The degree of inter-model difference varied greatly with subfraction and model pair, from 5.4 % for EC1 between Sunset and DRI-2001 up to 137 % for EC3 between Sunset and DRI-2015.

Understanding inter-model differences in TOA results
In this section, we further investigate the causes of the intermodel differences, with a focus on the role of optical charring correction in the final OC-EC split, as well as the instrument differences that are possibly related to the observed migration among carbon subfractions.

Optical charring correction
Optical correction is an essential component of the TOA method to remove measurement artifacts in OC and EC caused by charring of some OC components. As Eqs. (1)–(4) show, without correction, OP, the charred fraction of OC, would be reported as part of EC, leading to an overestimate of EC and an underestimate of OC by the same amount that equals the mass of OP. Shown in Fig. 4 are distributions of SRDs in uncorrected and corrected OC and EC between Sunset and DRI-2015 measurements as a function of their average mass loadings, binned into 20 groups (5-percentile bins). Charring correction brought results into better agreement with reduced SRDs across their whole range of mass loadings for both OC and EC, which is not surprising given the large ADs in OP that were equivalent to 67 % and 76 % of ADs of OC_{1+2+3+4} and EC_{1+2+3}, respectively. The remaining inter-model differences in EC, which are larger than those of OC, and the varying EC SRDs across its mass loading range are worth noting. In particular, the highest (95th percentile and above) EC mass loadings had a median SRD of 19 %, which is almost double the median SRDs in the lower mass loading percentiles. In investigating this anomaly, we found that EC SRDs were larger for samples with no instrumentally detected OP (i.e., OP = 0) by the Sunset and DRI-2015 analyzers (Fig. 5a), with a median value of 20.9 %, which far exceeds the inter-model uncertainty for EC determined from the replicate analyses (Table 3). Figure 5b further revealed that the percentage of samples with OP equaling zero generally increases with increasing EC mass loadings. In the highest EC mass loading bin, approximately 30 % of the samples have no reflectance charring correction on the final reported mass loadings of EC or OC, driving the average EC bias high within that bin. In total, out of the 4073 CSN samples analyzed by Sunset and DRI-2015, 179 samples had no reflectance charring correction determined by either analyzer, with an additional 324 samples having no reflectance charring correction determined by only the DRI-2015 analyzers. As shown in Fig. 5c, for the 179 samples with no charring correction from both models, considerable correlation was found between the inter-model differences of EC and OC_{1+2+3+4}. This suggests that, in the absence of charring correction, much of the observed bias in EC between the two models is essentially coming from the inconsistency in the quantified OC subfractions by the two models. In contrast, samples with charring correction (i.e., OP > 0) showed little correlation between the inter-model biases of EC and OC_{1+2+3+4}.

Figure 4. Distribution of scaled relative difference between Sunset and DRI model 2015 in uncorrected OC and EC (i.e., OC_{1+2+3+4} and EC_{1+2+3}, gray boxes) and corrected OC and EC (green boxes). SRDs are sorted by the average mass loading between Sunset and DRI-2015 measurements of each parameter and are plotted for each 5th percentile bin. The thick horizontal lines indicate the median, and the upper and lower limits of the boxes represent the 75th and 25th percentile, respectively. The whiskers extend to 1.5 × IQR (IQR is the interquartile range, or the distance between the 25th and the 75th percentiles). Outliers are shown as black dots.

The prevalence of CSN samples with no instrumentally detected OP, especially samples with high EC loadings, is intriguing and was investigated by a close examination of thermograms of all the 2017 CSN samples analyzed by Sunset. Figure 6 illustrates typical thermograms that contain laser reflectance and FID profiles from a Sunset analyzer for a sample with no charring correction (i.e., OP = 0) and a normal sample with charring correction (i.e., OP > 0), along with a blank sample. The blank thermogram shows a constant high laser reflectance and minimal FID signal throughout the course of analysis, indicating the absence of light-absorbing materials on the blank filter.
The thermogram of the sample with correction shows a lower starting laser reflectance, indicative of the amount of native light-absorbing materials on the filter, and exhibits a U-shaped trend as OP was formed and accumulated in the inert stage and later liberated in the oxidizing stage; the split between OC and EC was determined as the point when the laser reflectance rose back to its initial level, suggesting complete oxidation of OP. At the end of the analysis, laser reflectance was at a level comparable to that of the blank filter, indicating fully evolved EC from the filter. By comparison, the thermogram of the sample without correction exhibits a number of different attributes. First, the initial reflectance is much lower, near the baseline level. As analysis time elapsed and the program advanced to higher temperature set points, the laser signal remained almost unchanged until it started to rise slightly at high oxidizing temperatures (740–840 °C). The much lower final laser reflectance level, along with the long tail of the EC3 peak, suggests that there is substantial unevolved EC remaining on the filter. Filters with this type of optical profile are black in color before analysis and remain gray-black after analysis. For the sample without correction, the OC-EC split was determined as the point when the system switched to the oxidizing stage. In these cases, the complete attenuation of the laser signal led to an insufficient dynamic range for it to respond to carbon pyrolysis, regardless of how much OP was formed.
The initial and final readings of laser reflectance are compared among the three groups of samples, i.e., blank (n = 512), OP > 0 (n = 3894), and OP = 0 (n = 179), in Fig. 7a and b. Despite the variations within each of the three groups due to uncontrollable factors (e.g., different units of the same TOA model), the aforementioned desirable attributes of the analysis thermograms of the OP > 0 and blank groups are statistically evident, including consistency between initial and final laser reflectance for the blank samples, as well as the closeness of the final laser reflectance to the blank levels for the OP > 0 group. Also evident were the distinctly different patterns of both initial and final laser reflectance distributions of the OP = 0 group compared to the OP > 0 group. Low initial and final reflectance readings were observed for the OP = 0 group, with the former close to the laser detector baseline and the latter remaining well below the blank levels.
These results led to the following conclusions. First, for ∼ 5 % of the CSN quartz filter samples, undetected OP and lack of charring correction resulted from complete attenuation of the laser signal, leading to large inter-model discrepancies in EC between Sunset and DRI-2015. Second, EC mass loadings from these samples were most likely underestimated by both models, as suggested by residual EC unevolved from the filters at 840 °C, the highest IMPROVE_A temperature plateau. The high occurrence of samples with OP = 0 in CSN likely results from high sampled air volume, small filter surface area, and the closeness of sampling sites to emission sources, leading to concentrated strongly absorbing materials (i.e., EC) on filter samples and posing a challenge for TOA.

Instrument differences causing carbon migration
The results presented in Sect. 3.1.2 show notable inter-model differences in the OC and EC subfractions, or carbon migration, caused by differences in instrument configurations between Sunset and DRI analyzers. Diagnosis and comparisons of these instrumental differences are beyond the scope of this work. In the following, we qualitatively discuss the roles of some possible factors to help formulate targeted experimental studies aimed at probing and reconciling such differences. Chow et al. (2015) reported similar inconsistencies when comparing the subfractions between the two DRI models and attributed such discrepancies to the variability (up to a factor of 2) in the trace oxygen levels in the oven of the DRI analyzers, as well as slight differences in the sample temperatures. In our study, when DRI and Sunset analyzers were compared, any difference in the sample temperatures likely resulted not only from the accuracy of the temperature calibration devices, which was typically ±1 %–2 % of the specified temperatures (Phuah et al., 2009), but also from the different temperature calibration methods used by these models. As detailed in Sect. 2.1, Sunset analyzers use an external thermocouple that measures filter temperature, and DRI analyzers use color-changing chemicals (i.e., Tempilaq° G) to adjust the oven temperature readings at the IMPROVE_A temperature set points. Although a previous study by Phuah et al. (2009) demonstrated good comparability between the two temperature calibrations, the external calibration thermocouple in the Sunset analyzer used in that study was modified from the commercially available temperature calibration kit (Sunset Laboratory, Inc., OR, US) used in the present study. Chow et al. (2005) found that lowering sample temperatures by 14 to 22 °C in the IMPROVE protocol reduced OC1–OC3 subfractions and increased OC4, OP, and EC subfractions.
In our results, the inter-model differences in OC1, OC3, and OC4 were in the same direction, opposite to the differences in OC2 and EC subfractions, suggesting that either the temperature differences between models at each set point were not in the same direction or temperature differences alone cannot fully explain the observed subfraction migration.
In addition, details in instrument configuration and operating parameters set by the analysis control program, often invisible and unalterable to end users, can be distinct among TOA models from different manufacturers. As Chow et al. (2007) explain: "Temperature is ramped to the next step when the FID (or NDIR) response returns to baseline or remains constant for more than 30 s; the residence time at each plateau is longer for more heavily loaded samples." Unremarked differences in implicit tolerances for temperature ramping rates, and for determining "return to baseline" or "constant", undoubtedly contribute some of the differences we observe in different models' reported results. Unfortunately, the time profiles of temperature and evolved carbon for individual samples are not routinely reported by DRI and were not available to us for systematic comparison with those from the Sunset instruments at UCD.
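To illustrate how an implicit tolerance can shift the moment a step ends, the quoted advance rule can be sketched in a few lines of Python. This is a minimal sketch and not the control logic of either manufacturer: the function name, the tolerance parameters, their default values, and the sample detector trace are all hypothetical, and only the 30 s hold time comes from the quoted rule.

```python
def plateau_end_index(signal, dt_s, baseline=0.0, baseline_tol=0.05,
                      flat_tol=0.05, hold_s=30.0):
    """Return the sample index at which the step would end, or None.

    signal       : detector readings sampled every dt_s seconds
    baseline     : detector baseline level (hypothetical)
    baseline_tol : distance from baseline that counts as "returned" (hypothetical)
    flat_tol     : maximum spread that counts as "constant" (hypothetical)
    hold_s       : duration of constancy required, 30 s in the quoted rule
    """
    # Number of samples needed to span hold_s seconds.
    window = max(2, int(round(hold_s / dt_s)) + 1)
    peaked = False
    for i, value in enumerate(signal):
        if abs(value - baseline) > baseline_tol:
            peaked = True          # carbon is evolving
        elif peaked:
            return i               # response returned to baseline after a peak
        if i + 1 >= window:
            recent = signal[i + 1 - window:i + 1]
            if max(recent) - min(recent) <= flat_tol:
                return i           # response constant over the hold window
    return None


# Made-up detector trace, one reading per second.
decay = [0.0, 1.0, 5.0, 3.0, 1.0, 0.5, 0.2, 0.1, 0.04, 0.0, 0.0]
print(plateau_end_index(decay, dt_s=1.0, baseline_tol=0.05))  # 8
print(plateau_end_index(decay, dt_s=1.0, baseline_tol=0.15))  # 7
```

With the looser baseline tolerance the step ends one sample earlier, so any carbon still evolving in the tail would be assigned to the next subfraction. This is the kind of migration that different implicit tolerances could produce.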

Conclusions and implications
A detailed study was performed to assess the inter-model differences among the three models of carbon analyzers used for CSN TOA carbon analysis during the past decade (2010–2019). Two sets of CSN quartz filter samples were used for comparison, each analyzed by a pair of the three analyzer models. Set 1 includes 4073 samples and 622 field blanks collected in 2017, sequentially analyzed by the Sunset and DRI-2015 analyzers within a year. Set 2 consists of 303 archived samples collected in 2007, originally analyzed by the DRI-2001 analyzers in 2008 and reanalyzed by the Sunset analyzers in 2017/18. By using the same IMPROVE_A protocol with reflectance charring correction, these two comparisons allow for a focused examination of instrumentation differences in the Sunset and DRI analyzers.
Our results provide quantitative evidence of desirable consistency in TC and the major carbon fractions (OC and EC), with mean scaled relative differences (SRDs) within 2 % for TC, 5 % for OC, and 12 % for EC, along with high correlation coefficients above 0.95 for TC and OC and above 0.90 for EC. Underlying the consistency in bulk carbon fractions were relatively larger and diverse inter-model differences in OC1–OC4, EC1–EC3, and OP subfractions. Better inter-model agreement was found for subfractions with relatively high mass loading and smaller within-model uncertainties (e.g., OC2, OC3, and EC1). Sunset EC subfractions were consistently higher, with SRDs varying from 5.4 % for EC1 between Sunset and DRI-2001 up to 137 % for EC3 between Sunset and DRI-2015. Pyrolyzed carbon (OP) formation from charring was found to be highly instrument-dependent, differing by 38 % and 66 % in mean SRD between Sunset and DRI-2001 and between Sunset and DRI-2015, respectively. The observed migration among the thermal subfractions is likely related to slight differences in the instrument thermal parameters and configurations, such as sample temperature, baseline selection, and residence time, between Sunset and DRI analyzers. It should also be noted that the IMPROVE_A protocol allows for some play in details such as temperature ramping rates and criteria for advancing to the next stage. A targeted study of such differences between Sunset and DRI analyzers in the future will further refine the understanding of their role in the differences in the analysis results.
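For readers who wish to compute comparable summary statistics on their own paired measurements, a minimal Python sketch follows. The SRD is not defined in this section; the form used here, the pair difference divided by √2 and by the pair mean, is an assumption and should be checked against the definition given earlier in the paper. The function names and the loading values are illustrative only and are not data from this study.

```python
import math


def scaled_relative_difference(a, b):
    """Assumed SRD form: (a - b) / sqrt(2), divided by the pair mean."""
    pair_mean = (a + b) / 2.0
    if pair_mean == 0:
        raise ValueError("SRD is undefined when the pair mean is zero")
    return (a - b) / math.sqrt(2) / pair_mean


def mean_srd_percent(pairs):
    """Mean SRD over (a, b) pairs, expressed in percent."""
    values = [scaled_relative_difference(a, b) for a, b in pairs]
    return 100.0 * sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up paired EC loadings from two analyzer models (arbitrary units).
model_a = [0.52, 1.10, 0.35, 2.40]
model_b = [0.48, 1.02, 0.33, 2.10]
print(mean_srd_percent(list(zip(model_a, model_b))))
print(pearson_r(model_a, model_b))
```

A positive mean SRD under this sign convention indicates that the first model in each pair reports higher values on average.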
Optical charring correction reduced the inter-model biases in OC and EC relative to those for uncorrected OC 1+2+3+4 and EC 1+2+3 by 56 %–67 % and 75 %–76 %, respectively. The remaining inter-model discrepancy in EC was found to be substantially larger for ∼ 5 % of the 2017 CSN samples that had no instrumentally detected OP. Examination of Sunset analysis thermograms suggested that complete laser signal attenuation was the cause; such samples occur more frequently at higher EC mass loadings and were often associated with residual EC that was resistant to the highest IMPROVE_A temperature plateau (840 °C), suggesting that both models might underestimate the true ambient EC concentrations for a subset of CSN samples. A previous study by Han et al. (2007) found that EC originating from diesel sources had a higher decomposing temperature than EC from biomass burning. Since the vast majority of CSN sites are located in urban areas (Solomon et al., 2014), where the sampled air is heavily impacted by anthropogenic emissions, it is possible that the samples with no instrumentally detected OP were heavily influenced by diesel fuel combustion. While data used in this study were primarily collected during the summer–fall season, future comparisons with data covering a longer sampling period will paint a fuller picture of all seasons.
Our work offers comprehensive information on TOA instrument uncertainty and inter-model differences necessary for future studies to consider in assessing long-term trends in CSN carbon data. Such information will also assist performance evaluation of chemical transport models using CSN data. Additionally, inter-model differences in thermal subfractions of OC and EC shown here suggest that source apportionment studies on multiyear trends that utilize TOA thermal subfractions as input data in source profiles (e.g., Kim and Hopke, 2005) need to take into consideration the consistency and comparability of data from different carbon analyzer models.
Data availability. Processed data collected with the DRI model 2001, DRI model 2015, and Sunset analyzers are being submitted to the Dryad repository and will be publicly available via https://doi.org/10.25338/B8204M (Zhang, 2021) once published. Before then, the data file is downloadable via https://datadryad.org/stash/share/u-Qo5ilgQPGbQx7u8flMsfunGEfP8UhCwHtuzXxOeSU (last access: 29 April 2021). Raw data (i.e., thermograms) from the Sunset analyzers are available upon request from the authors.
Author contributions. XZ and NPH designed the study. XZ collected the Sunset data with help from KT. SR developed the UCD data processing algorithm and wrote the code. XZ analyzed the data and wrote the paper with help from WW and NPH. All authors were involved in the discussion and commented on the paper.
Competing interests. The authors declare that they have no conflict of interest.
Disclaimer. The conclusions are those of the authors and do not necessarily reflect the views of the sponsoring agency.