Combining airborne gas and aerosol measurements with HYSPLIT: a visualization tool for simultaneous evaluation of air mass history and back trajectory consistency

The history of air masses is often investigated using backward trajectories to gain knowledge about processes along the air parcel path as well as possible source regions. Here, we describe a refined approach that incorporates airborne gas, aerosol, and environmental data into back trajectories and show how this technique allows for simultaneous evaluation of air mass history and back trajectory reliability without the need to calculate trajectory errors. We use the HYbrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model and add a simple semiautomated computing routine to facilitate high-frequency coverage of back trajectories initiated along free tropospheric (FT) flight tracks and profiles every 10 s. We integrate our in situ physiochemical data by color-coding each of these trajectories with its corresponding in situ tracer values measured at the back trajectory start points along the flight path. The unique color for each trajectory aids assessment of trajectory reliability through the visual clustering of air mass pathways of similar coloration. Moreover, marked changes in trajectories associated with marked changes evident in measured physiochemical or thermodynamic properties of an air mass add credence to trajectories. This is particularly true when these air mass properties are linked to trajectory features characteristic of recognized sources or processes. This visual clustering of air mass pathways is of particular value for large-scale 3-D flight tracks common to aircraft experiments where air mass features of interest are often spatially distributed and temporally separated. The cluster-visualization tool used here reveals that most FT back trajectories with pollution signatures measured in the central equatorial Pacific reach back to sources on the South American continent over 10 000 km away and 12 days back in time, e.g., the Amazonian basin. 
We also demonstrate the distinctions in air mass properties between these and trajectories that penetrate deep convection in the Inter-Tropical Convergence Zone. Additionally, for the first time we show consistency of modeled precipitation along back trajectories with scavenging signatures in the aerosol measured for these trajectories.


Introduction
Determination of the history of air masses sampled during airborne field campaigns has become an important part of current atmospheric research (e.g., Diab et al., 2003; Fuelberg et al., 2003; Lawrence et al., 2003; Methven et al., 2003; Weber et al., 2003; Bertschi et al., 2004; Tu et al., 2004; Taubman et al., 2006; Barletta et al., 2009; Ding et al., 2009; Tilmes et al., 2011; O'Shea et al., 2013), facilitated by the rapid improvement of transport models. However, the reliability of backward trajectories computed starting along flight tracks is often assumed without estimating their uncertainty because assessment of total trajectory errors is time-consuming and in some cases inconclusive.
Published by Copernicus Publications on behalf of the European Geosciences Union.
An earlier review of publications on trajectory reliability (Stohl, 1998) found trajectory deviations from the actual air mass path to accumulate to about 20 % of the traveled distance (total trajectory length) on average. Although this may support the utility of general trajectory analysis, a back trajectory could exhibit a deviation of up to 100 % if the actual measured air mass encounters atmospheric flow patterns not resolved by the transport model (e.g., Stohl et al., 2002). These include subgrid-scale processes such as turbulent mixing and convection but may also involve synoptic-scale features (i.e., wind shear). This is because the latter pattern could deform the measured air volume, while a single back trajectory represents the path of a particle or an infinitesimally small air parcel, which cannot be stretched. Consequently, the actual air mass path should rather be considered as an air mass corridor.
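As a rough illustration of what these percentages mean for the roughly 10 000 km transport paths considered in this study, the sketch below converts a relative trajectory error into a distance. The function name and the simple linear scaling are assumptions made for illustration; they are not part of HYSPLIT or of the cited review.

```python
def position_uncertainty_km(path_length_km, relative_error=0.20):
    """Estimate the accumulated position error of a back trajectory.

    relative_error is the fraction of the travelled distance: 0.20 is the
    average value quoted from the review, 1.0 the quoted worst case.
    The linear scaling is an illustrative assumption.
    """
    if path_length_km < 0 or relative_error < 0:
        raise ValueError("inputs must be non-negative")
    return path_length_km * relative_error


# A 10 000 km path, as for transport from South America to the central Pacific
average_case = position_uncertainty_km(10_000)       # 20 % of the path length
worst_case = position_uncertainty_km(10_000, 1.0)    # 100 % of the path length
```

Under these assumptions the average case corresponds to about 2000 km, which is why a single trajectory is better read as the centre line of a corridor than as an exact path.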
We note that Lagrangian particle dispersion models (LPDMs) are designed to handle turbulent mixing and convection stochastically and via parameterization (e.g., Stohl et al., 2002), respectively, and hence are considered superior to traditional back trajectory analysis but that these models still require larger computational resources and might hence not be appropriate for a rapid "off-the-shelf" air history analysis as discussed below.
Computing numerous back trajectories by slightly varying initial starting conditions in space and time can aid in assessing wind shear-induced deformation (Merrill et al., 1985) and increase the likelihood of a correct path if the average path is taken. This so-called ensemble method is typically used for back trajectories starting at a fixed point on Earth's surface (e.g., Cape et al., 2000; Wehner et al., 2008; Davis et al., 2010; Makra et al., 2010; Lu et al., 2012), but has also been applied to airborne measurements. For the latter, no additional position errors are introduced if back trajectories start directly along the flight track. Hence, one goal of this study is to further evaluate the utility of large numbers of trajectories initialized along flight tracks as a means to add credence to modeled air mass history.
Previous studies (e.g., Taubman et al., 2006) have utilized large numbers of trajectories starting along flight tracks. They combined similar trajectories with the help of numerical cluster techniques to create ensembles based on homogeneity of complete pathways over longitude, latitude, and altitude. These trajectory ensembles were then compared to in situ data. A similar method (e.g., Diab et al., 2003; Lewis et al., 2007) clusters airborne in situ measurements based on distinct features and models trajectories for each in situ data cluster. However, employing numerical clustering techniques is time-consuming and can induce additional uncertainties related to the chosen cluster method. A detailed review of these and analogous methods to establish links between back trajectories and in situ measurements can be found in Fleming et al. (2011).
An additional difficulty exists in choosing to cluster either in situ observations or back trajectories. Because no direct connection is established between measurements and trajectories during computation, resulting ensembles can overlap. Hence, these ensembles may not be as instructive as clusters derived from the combined data (Methven et al., 2003). The latter can be accomplished numerically but may not be suitable for an "off-the-shelf" air history analysis.
As an alternative, simultaneous consideration of both data sets can also be achieved by color-coding trajectories with in situ tracers to visually study variability in both data sets and as such air mass history (e.g., Parrish et al., 2000; Methven et al., 2003; O'Shea et al., 2013). Although these studies use trajectory ensembles to help assess possible distortions of an air parcel during transport, only the Methven et al. (2003) study considers high-frequency coverage by initializing back trajectories in 10 s intervals along flight tracks. However, both trajectory information and in situ observations were merged to air mass averages (e.g., stratospheric air) for interpretation using kernel density estimates. While this is suitable for research flights designed to measure the same air mass over an extended period of time, flight paths often only intersect an air mass of interest for short time periods, particularly during research flight profiles. Hence, both in situ fluctuations within an air mass and observations made in transition regions between two distinct air mass features would be masked. Such measurements, however, can become important over the narrow range of a research flight profile if changes evident in in situ tracers are reflected in trajectory features characteristic of recognized sources or processes; this will add credibility to the trajectory ensemble for that profile.
Our approach assesses trajectory consistency through visually identifying trajectory ensembles associated with any assimilated in situ physiochemical observation when color-coded onto the trajectory. The high number of back trajectories needed for this approach is calculated using NOAA's HYSPLIT (National Oceanic and Atmospheric Administration's HYbrid Single-Particle Lagrangian Integrated Trajectory) model (Draxler and Hess, 1997, 1998; Draxler, 1999).
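The color-coding step can be sketched as follows. This is a minimal illustration only: the record layout, the field names (`time_s`, `alt_m`, `co_ppbv`), the tracer values, and the linear blue-to-red scale are all assumptions introduced here and do not describe the routine actually used with HYSPLIT.

```python
def tracer_to_rgb(value, vmin, vmax):
    """Map a tracer value linearly onto a blue (low) to red (high) colour."""
    if vmax <= vmin:
        raise ValueError("vmax must exceed vmin")
    # Clamp to [0, 1] so that outliers saturate at the ends of the scale.
    frac = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    return (frac, 0.0, 1.0 - frac)


def color_code(start_points, tracer, vmin, vmax):
    """Attach a colour to every trajectory start point.

    Each start point is a dict holding the in situ value measured where the
    back trajectory was initialised (hypothetical layout).
    """
    return [dict(p, color=tracer_to_rgb(p[tracer], vmin, vmax))
            for p in start_points]


# Hypothetical start points spaced 10 s apart; values are illustrative only.
starts = [
    {"time_s": 0, "alt_m": 1500, "co_ppbv": 70.0},
    {"time_s": 10, "alt_m": 1550, "co_ppbv": 60.0},
    {"time_s": 20, "alt_m": 1600, "co_ppbv": 65.0},
]
colored = color_code(starts, "co_ppbv", vmin=60.0, vmax=70.0)
```

Because the color depends only on the value measured at the start point, every point along a given back trajectory would be drawn in the same color, which is what allows pathways of similar coloration to be grouped by eye.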
Back trajectories are initiated along the flight track every 10 s (Methven et al., 2003; Lewis et al., 2007) to study synoptic-scale features possibly deforming the measured air volume as discussed above but also to account for the vertical extent of tropospheric layers as observed during the Pacific Atmospheric Sulfur Experiment (PASE) in the central equatorial Pacific. Such tropospheric layers have distinct aerosol and gas signatures but are often only a few hundred meters thick (Stoller et al., 1999; Newell et al., 1999). Hence, because PASE flight climb and descent rates were around 300 m min⁻¹, the resulting 50 m vertical resolution provides the minimum needed to resolve such layers with several trajectories for our cluster-visualization approach. Newell et al. (1999) examined numerous flight profiles from around the world and showed that these tropospheric layers are quasi-horizontal and almost ubiquitous in space and time. The same group of researchers (Stoller et al., 1999) also identified various types of such layers (e.g., pollution and deep convective signatures) over the Pacific by studying the behavior of ozone (O₃), carbon monoxide (CO), and water vapor mixing ratio (MR). Their findings are of particular interest for our study since the measurements employed here were sampled in the central equatorial Pacific. Hence, we will briefly summarize results from the literature below regarding general characteristics associated with the formation of these layers as observed in this region (Stoller et al., 1999; Clarke et al., 2013). This will help to understand the behavior of the aerosol and gas tracers to be selected later for these features.
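The 50 m figure follows directly from the climb rate and the 10 s start interval. The short sketch below reproduces that arithmetic; the function names and the 300 m example layer thickness are assumptions chosen for illustration.

```python
def vertical_spacing_m(rate_m_per_min, interval_s):
    """Vertical distance between consecutive trajectory start points."""
    return rate_m_per_min * interval_s / 60.0


def starts_in_layer(thickness_m, rate_m_per_min, interval_s):
    """Number of whole start intervals that fit into a layer of given thickness."""
    return int(thickness_m // vertical_spacing_m(rate_m_per_min, interval_s))


spacing = vertical_spacing_m(300.0, 10.0)      # 300 m/min sampled every 10 s
n_starts = starts_in_layer(300.0, 300.0, 10.0)  # assumed 300 m thick layer
```

With these inputs the spacing is 50 m, so a layer a few hundred meters thick is sampled by several start points rather than by one or two.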
Large convective systems, as found in the Inter-Tropical Convergence Zone (ITCZ), can be sources of volatile aerosol number concentration through new particle formation in cloud outflow. This is commonly associated with the cloud anvil as earlier measurements over the Pacific have revealed (Clarke et al., 1998, 1999). Here, conditions for heterogeneous nucleation are favorable because pre-existing aerosol surface area concentration is low due to effective aerosol scavenging by precipitation in these clouds (Clarke et al., 2013). High humidity and enhanced ultraviolet (UV) fluxes in the anvil region further favor formation of sulfuric acid and nucleation of sulfate particles linked to ocean-released dimethylsulfide (DMS) and sulfur dioxide (SO₂) lofted in the clouds from the marine boundary layer (MBL) (Hegg, 1990; Clarke, 1993; Clarke et al., 1999; Hoppel et al., 1994; Perry and Hobbs, 1994). DMS is a typical precursor of MBL SO₂, which is converted to sulfate (SO₄²⁻) via aqueous phase reactions in cloud droplets (e.g., Faloona et al., 2009, and references therein).
Since the latter reaction can consume O₃, this provides an explanation for lower O₃ concentrations frequently observed in recent convective cloud outflow signatures along with pumping of lower O₃-containing MBL air into the anvils of these clouds. For the same reason, fresh outflow layers typically contain higher MR in comparison to adjacent free troposphere (FT) air. CO, on the other hand, does not change during cloud passage because it has a lifetime of almost 2 months in the central Pacific (Staudt et al., 2001; Allen et al., 2004). However, CO values in cloud outflow layers could drop in comparison to adjacent FT air if remote MBL air with low CO is lofted by clouds.
Combustion layers typically include large amounts of non-volatile aerosol (e.g., soot and non-volatile organics) in addition to volatile particles (Clarke and Kapustin, 2010; Thornberry et al., 2010), whereas trace gases such as O₃ and CO commonly trend with aerosol signatures in the absence of cloud passage (and precipitation scavenging). Such aerosol and gas fluctuations over the Pacific have been previously associated with biomass-burning events over South America (Kim and Newchurch, 1996), which are known to be a major source of atmospheric trace gases such as O₃ and CO, and of volatile and non-volatile particles (Crutzen and Andreae, 1990; Martin et al., 2010; Gunthe et al., 2009; Artaxo et al., 1998; Andreae and Merlet, 2001). However, Andreae et al. (2001) showed that Amazonian biomass burning emissions, when lifted to higher altitudes by deep convection, up to 10 km and beyond, could lose 80-95 % of their accumulation mode aerosol concentration due to precipitation scavenging. Hence, continental pollution when observed over the Pacific typically exhibits much lower aerosol concentrations relative to CO compared to those measured at the source due to precipitation scavenging but also due to dilution through mixing with cleaner air during transport.
Precipitation scavenging of polluted air masses by continental deep convection can also yield recently nucleated particles, which makes it difficult to discriminate such features from recent cloud outflow from the ITCZ based on volatility alone. However, for observations in the central equatorial Pacific, pollution features are expected to generally retain higher O₃ and CO values compared to lower concentrations of these gas tracers present in clean remote MBL air lofted in the deep ITCZ convection. Additionally, observations show (Stoller et al., 1999) that MR values within pollution layers are generally lower in comparison to the adjacent FT environment because these features originate from the continental boundary layer whereas increased MR for fresh convective outflow is linked to the MBL.
The next section provides an overview of the measurement campaign and prevailing synoptic patterns observed. We will also describe instrumentation aboard the aircraft and in situ aerosol and gas data utilized here. Section 3 explains our approach and outlines the preparation of aircraft data for incorporation into HYSPLIT followed by an evaluation of HYSPLIT performance during PASE. This includes a review of trajectory errors and their behavior for distinct synoptic flow situations observed during PASE. Results from this evaluation are utilized in Sect. 4 to study both the clustering of trajectories and in situ data in order to test the significance of air mass distinctions visually derived from the combined data. Our cluster-visualization approach is then demonstrated and discussed for two case studies in Sect. 5 followed by a summary and conclusions in Sect. 6.

The PASE campaign
PASE took place in August/September 2007 (Bandy et al., 2012) over the equatorial Pacific near Christmas Island (2° N, 157° W). During the five-week campaign, sulfur chemistry (Conley et al., 2009; Faloona et al., 2009) and its role in aerosol formation and evolution were measured, including potential influences on cloud condensation nuclei (Clarke et al., 2013). These data were obtained during a series of 14 nine-hour research flights in the National Center for Atmospheric Research (NCAR) C-130 aircraft.
Most of the time was spent in the MBL to determine ocean-released DMS fluxes and the presence of resultant reduced sulfur components (Faloona et al., 2009), but a few flight legs and profiles into the FT up to 6000 m were also included to complete sulfur flux calculations and to provide a larger-scale context to the MBL measurements. These FT excursions offered means to improve our understanding of aerosol entrained into the MBL and their potential role in influencing MBL cloud condensation nuclei (CCN) concentration (Clarke et al., 2013). Earlier surface measurements on Christmas Island (Clarke et al., 1996) and other aircraft studies (Clarke et al., 1998, 1999) over the Pacific suggested entrainment was a major source of aerosol number in the MBL and that many of these aerosols originated through convective cloud outflow. Preliminary exploration of HYSPLIT trajectories extending back 12 days, however, suggested a possible link between measured CO and sources of combustion aerosol over South America located approximately 10 000 km upwind. Trajectories over this extended time are often considered suspect, but because of the potential significance of such transport over the equatorial zone and potential links to boundary layer aerosol and CCN, a closer examination of HYSPLIT trajectory performance was undertaken.

Meteorological environment
The study area around Christmas Island is typically embedded in southern hemispheric southeast trade wind flow crossing the Equator because the ITCZ is located to the north of the island around 7° N throughout most of the year. This low-level trade wind regime and its northeastern equivalent in the Northern Hemisphere (Trenberth, 1991) are associated with transport from the descending branch of the Hadley circulation in the subtropics. Although descent is strongest in the subtropics, subsiding air is also encountered above Christmas Island. Gage et al. (1991) frequently observed negative vertical velocities around 0.7 cm s⁻¹ in the FT between 4 and 15 km during wind profiler measurements on Christmas Island. This leads to a weak trade wind inversion (TWI) observed during PASE between 1100 and 1700 m (Faloona et al., 2009) and attributed to smaller convective clouds with lower outflow altitudes present both north and south of the deepest ITCZ convection.
During August and September the ITCZ reaches its most northerly position, resulting in a minimum in cloudiness and precipitation near Christmas Island. This was desired during PASE to better study sulfur chemistry in the remote MBL. In the absence of clouds, actinic fluxes needed to drive DMS production in the ocean increase and subsequently provide favorable conditions for studying the marine sulfur cycle. At the same time, decreased precipitation reduces scavenging of aerosol in the MBL. In addition, the limited variation in the overlying flow field above the inversion supported the choice of Christmas Island as base for the campaign because Rossby-gravity waves traveling eastward at approximately 8-10 m s⁻¹ along the Equator generally control clouds and precipitation away from the ITCZ (Holton, 2004). This facilitates choosing a cloud-free environment in the ridge portion of the wave (Petersen et al., 2003) for flight planning.
The prevalence of the southeast trade wind regime and the overlying flow field at higher altitudes provide an ideal system for investigating air mass back trajectories. Nonetheless, during the end of August 2007 a tropical depression associated with widespread deep convection developed west of the study area, which affected FT trajectories starting along flight profiles during research flight 12 (2/3 September). Although HYSPLIT generally captures the development of the depression, increased tropical wave activity represents a limitation to air mass history analysis during PASE. The model performance for this research flight is discussed in Sect. 3.2.
We note that trajectories below the TWI are not considered here because entrainment of FT air into the MBL is not resolved by the grid resolution of the underlying meteorological model. Moreover, the in situ tracers used for this analysis become mixed with other MBL air once entrained and hence their "signature" is lost. Consequently, any entrained FT air would obscure in situ tracer visualizations on MBL trajectories, even when these are correctly modeled.

Instrumentation
Basic mission support instrumentation (at 1 Hz), such as the unheated Rosemount temperature sensor aboard the C-130, was provided by NCAR. We also utilize their thermoelectric dew point sensors, but because responses can be delayed for rapid profiles (around 300 m min⁻¹), such as descents from colder and drier altitudes into the MBL and transitions from the warm and moist marine boundary layer into dry air aloft, measurements can be higher upon descent below the inversion and lower when ascending into the FT. Such dew point data, along with uncertain wind gust measurements below 4 m s⁻¹ during profiles, are excluded from the final data set.
CO and O₃ mixing ratios were also sampled by NCAR to provide high-frequency information (25 Hz) about two important trace gases (see discussion in Sect. 2.3). The CO measurements are based on a resonance fluorescence sampling technique (commercial Aero-Laser VUV CO monitor), which is described in detail by Gerbig et al. (1996). Ozone was sampled with a Thermo Environmental Instruments unit utilizing a fast chemiluminescence method by measuring emitted light that emerges from the reaction of nitric oxide with ambient O₃ (Ridley et al., 1992). A second ozone monitor (TECO model 49PS, UV-based method) was operated as reference and backup instrument for data evaluation during flights 3, 10 and 11, when the fast ozone unit failed.
In addition to NCAR data, aerosol physiochemistry measurements provided by the Hawaiian Group for Environmental Aerosol Research (HiGEAR) are used for our approach and discussed here. Aerosol was brought into the aircraft through the University of Hawaii solid diffuser inlet. It has been shown previously (McNaughton et al., 2007)   C. The CN counters can detect particle concentrations up to 10 000 cm⁻³ with a lower detection limit of about 10 nm. The first CN counter was operated to sample ambient CN (CNcold), while the second instrument was prepared with a heater in front to heat the aerosol up to 360 °C. The heating removes such volatile CN (CNvol) components as sulfuric acid, some nitrates, ammonium sulfate, and organic carbon (Clarke, 1990) and samples CN concentrations defined here as non-volatile or heated CN (CNhot).
When heated up to 360 °C, residual aerosol mostly includes light-absorbing soot, sodium sulfate, and non-volatile organics below 1 µm, while the coarse components typically consist of dust or sea salt (Clarke et al., 1997). However, residual particles from larger sulfates that do not fully volatilize during the heating of CNcold to 360 °C can also contribute. For example, a 200 nm particle that loses 99.9 % of its mass upon heating can still be detected as a 20 nm particle.
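The 200 nm to 20 nm example follows from the cube-root dependence of diameter on mass. The sketch below reproduces it under the assumption of a spherical particle whose density does not change on heating; the function name is hypothetical.

```python
def residual_diameter_nm(initial_diameter_nm, mass_fraction_lost):
    """Diameter of the residual particle after partial volatilisation.

    Assumes a sphere of constant density, so mass scales with diameter
    cubed and the diameter scales with the cube root of the remaining mass.
    """
    if not 0.0 <= mass_fraction_lost <= 1.0:
        raise ValueError("mass_fraction_lost must lie between 0 and 1")
    remaining = 1.0 - mass_fraction_lost
    return initial_diameter_nm * remaining ** (1.0 / 3.0)


# A 200 nm particle losing 99.9 % of its mass keeps 0.1 % of it.
residual = residual_diameter_nm(200.0, 0.999)
```

Since the cube root of 0.001 is 0.1, the residual is about 20 nm, which is still above the roughly 10 nm lower detection limit of the CN counters.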
Both CN counters performed well during the campaign except for research flight seven on 23 August 2007 (instrument failure) and during cloud passage or precipitation events, because cloud and rain droplets tend to break into smaller particles when striking the sampling inlet (Weber et al., 1998). Along with aerosol size distribution measurements discussed elsewhere (Clarke et al., 2013) and other relevant parameters, these extensive in situ data enable us to describe size-resolved aerosol volatility, chemistry, and mixing state to explore variability in observed aerosol properties over the PASE profiles and flight legs.

Classification of in situ tracers
We have selected observable gas and aerosol features in our measurements that provide independent tools to challenge the reliability of trajectories and gauge their performance. For instance, trajectories starting along portions of PASE flight tracks revealing pollution signatures could be expected to approach a continent, while those back trajectories starting along flight portions revealing recent cloud outflow may be expected to reach back to deep convection as observed in the ITCZ. Before we can test this assumption, criteria for trace gas and aerosol behavior relevant to pollution and outflow layers must be defined. We will also include stratospheric features in our discussion since these were, although rarely, observed during PASE.
Trace gas and aerosol behavior for different layers are discussed using an example profile from research flight 2 on 10 August around 19:00 UTC (Fig. 1). A convective cloud outflow layer is marked in cyan shading between approximately 2400 and 3000 m and two pollution layers between 1500 and 2000 m are shaded in orange, while the TWI height located at about 1385 m for this research flight (Faloona et al., 2009) is shown as a solid grey line. Similar pollution and cloud outflow layers were commonly observed in the FT during PASE and are discussed elsewhere (Clarke et al., 2013).
Our five observed aerosol and gas tracers of choice for this research are CNhot, CNvol, CO, O₃, and MR. Other choices are possible depending upon measurements available and/or objectives. We further extend this analysis by discussing expected precipitation occurrence since we will examine later whether precipitation along trajectories (provided by the chosen meteorological data set) is generally consistent with variations in measurements and hence could be utilized to gain confidence in modeled air mass pathways.
We begin our discussion with observations made between roughly 2400 and 3000 m. Figure 1a shows total CN concentration or CNcold number (blue line) peaks at 450 cm⁻³ near 2900 m, while CNhot concentration (red line) decreases to 10 cm⁻³ at this elevation. Hence, much of the CNcold number is CNvol (green line) mainly consisting of sulfuric acid or sulfate, which decompose before reaching 360 °C (Sect. 2.2, CNvol = CNcold - CNhot). The observed increase in CNvol accompanied by decreasing CNhot is an indicator of cloud outflow in FT layers because upper portions of deep convective clouds in the ITCZ favor formation of fresh sulfate particles (see Introduction). One reason for this is effective aerosol scavenging by precipitation from these clouds, which is reflected in low CNhot concentrations (Clarke et al., 2013).
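The volatile number concentration is the difference between the unheated and heated counts. The sketch below applies that difference to the values quoted near 2900 m; the function names and the volatile-fraction helper are assumptions added for illustration.

```python
def cn_volatile(cn_cold, cn_hot):
    """Volatile CN (cm^-3) as the difference CNcold - CNhot."""
    if cn_hot > cn_cold:
        raise ValueError("CNhot should not exceed CNcold")
    return cn_cold - cn_hot


def volatile_fraction(cn_cold, cn_hot):
    """Share of the total particle number that is removed by heating."""
    return cn_volatile(cn_cold, cn_hot) / cn_cold


# Values quoted for the outflow layer near 2900 m (cm^-3)
cn_vol = cn_volatile(450.0, 10.0)
frac_vol = volatile_fraction(450.0, 10.0)
```

For 450 cm⁻³ total and 10 cm⁻³ non-volatile particles this gives 440 cm⁻³ volatile particles, so nearly all of the particle number in this layer is volatile.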
Additionally, compared to adjacent FT air between 2000 and 2400 m, reduction of O₃ by 2-3 ppbv (Fig. 1b) for most of this layer is consistent with cloud pumping of lower O₃-containing MBL air into the FT. O₃ can also be consumed via aqueous phase reactions in droplets converting SO₂ to SO₄²⁻. Cloud outflow layers with high CNvol concentration also typically contain higher MR values (blue line in Fig. 1b) in comparison to neighboring FT air due to pumping of moist MBL air through clouds. However, relative changes in MR vary in response to the strength of precipitation during deep convective passage, which may explain the observed decrease of MR between 2400 and 2800 m. On the other hand, MR increases above 2800 m until the top of the profile concurrent with CNvol behavior, which reaches its maximum concentration of around 440 cm⁻³ at this altitude. Tendencies of the above-described aerosol and gas tracers in cloud outflow regimes are summarized in Table 1, indicating increase/decrease of tracer concentrations in comparison to adjacent FT air above or below.
Here, we also include CO, although CO has a lifetime of almost 2 months in the equatorial atmosphere (Staudt et al., 2001; Allen et al., 2004) and hence does not change during cloud passage. However, CO values in cloud outflow could drop if clean remote MBL air is lifted aloft by clouds. This appears consistent with the cloud outflow layer in Fig. 1b where CO values (magenta line) are similar to those found below the TWI. Note that CO measurements are missing below roughly 800 m for this profile due to instrument calibration.
Modeled precipitation is also indicated in Table 1 and naturally expected to be larger for trajectories that include flight paths with aerosol and gas features representative of cloud outflow regimes. Whether the meteorological model successfully predicts precipitation tendencies for cloud outflow features will be examined in Sect. 5.
We continue the discussion with two observed layers between approximately 1500 and 2000 m (orange shading). Figure 1a shows CNhot peaks in both layers around 70 cm⁻³, which is almost five times as much as observed outside these layers. We mentioned earlier that dust and combustion aerosol are typical components of FT CNhot. In this case, however, we attribute increased CNhot to continental pollution sources because aerosol size distributions (not shown) indicate few larger particles and hence dust contributions to total number will be small (Clarke et al., 2004). Moreover, CNhot clearly trends with trace gases as commonly observed for pollution events (Crutzen and Andreae, 1990; Martin et al., 2010; Gunthe et al., 2009; Artaxo et al., 1998; Andreae and Merlet, 2001). This is illustrated in Fig. 1b showing CO and O₃ increases by 10 and 15 ppbv compared to outside these layers, respectively.
The two layers also exhibit elevated CNvol, a typical component of pollution, although its contribution to the total CN number is usually smaller in such layers. Hence, because CNhot concentrations are also lower than expected (Clarke et al., 2004, 2007), part of the CNvol concentration could be a result of continental cloud passage, to be discussed later, and subsequent cloud outflow.
Although pollution aerosol may have traversed clouds during transport, such air is expected to generally retain higher O3 and CO values compared to recent outflow from ITCZ deep convection. This is due to lower concentrations of these gas tracers present in clean remote MBL air lofted in ITCZ clouds as discussed earlier. In fact, a comparison of trace gas concentrations (Fig. 1b) between outflow aloft and the pollution layers below reveals that O3 and CO are 60 and 14 % lower in the fresh outflow feature, respectively. Additionally, MR values drop by almost 60 % in both pollution layers compared to neighboring FT air consistent with continental boundary layer origin.
Similar pollution and cloud outflow layers were commonly observed when profiling into the FT, while stratospheric features were rare. The latter air mass is typically well aged and highly scavenged. As such, it often only contains small amounts of fine volatile and non-volatile aerosol, very low MR, and low CO (see Table 1), while ozone is elevated. Nonetheless, in the absence of volatile organic compound (VOC) measurements, such as methane, no allocation with regard to air mass origin can be made with absolute certainty since O3 can also be produced in the FT via VOC or CO oxidation.

Methodology and model performance
Here, we explain our approach and data preparation. We further describe common error sources of back trajectories in Sect. 3.2 and discuss their characteristics in distinct synoptic flow situations.

Data preparation
We employ the HYSPLIT modeling system (version 4.9, January 2010) in its Windows-based PC version with meteorological output fields from the National Centers for Environmental Prediction (NCEP) Global Data Assimilation System (GDAS; http://www.ready.noaa.gov/archives.php). These data are provided on a 1° latitude-longitude grid (360° by 181°) and are used here to calculate 3-D back trajectories (horizontal wind field plus vertical wind). The archived GDAS data are provided every 3 h (analysis plus forecast runs) and are converted by NCEP from 64 sigma levels to 23 pressure levels starting at 1000 hPa (1000-900 hPa in 25 hPa increments, 900-50 hPa in 50 hPa increments) with the top level at 20 hPa.
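The level description above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming that the 50 hPa increments start at 850 hPa (so that 900 hPa is not counted twice); the variable names are illustrative and are not part of HYSPLIT or GDAS.

```python
# Rebuild the pressure levels (hPa) from the description in the text.
# Assumption: the 50 hPa steps begin at 850 hPa because 900 hPa already
# belongs to the 25 hPa block.
lower_levels = list(range(1000, 899, -25))  # 1000, 975, 950, 925, 900
upper_levels = list(range(850, 49, -50))    # 850, 800, ..., 100, 50
top_level = [20]                            # top level at 20 hPa

levels_hpa = lower_levels + upper_levels + top_level
print(len(levels_hpa), "pressure levels")
```

Under this reading the list contains 5 + 17 + 1 = 23 entries, which matches the 23 pressure levels stated above.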
In order to handle the resulting large amount of data for trajectories initiated every 10 s along flight tracks, a simple semi-automated routine was added to accelerate input to and output from HYSPLIT. The complete MATLAB (MathWorks Inc.) code plus selected flight data to illustrate example trajectories can be found in the supplementary material to this paper.
HYSPLIT text files contain back trajectory information for a predefined length in time (e.g., 5/10/12/15/20 days backward) on an hourly basis together with computed meteorological information along each trajectory (e.g., rain, humidity, potential temperature). These output files were processed with MATLAB and combined with our measured in situ dataset initially provided at 1 Hz but averaged to 10 s resolution. The resulting merged files link our in situ observations to the trajectories allowing variations in measurements to be visually coordinated with variations in trajectory features.
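The averaging and merging step can be illustrated with a minimal Python sketch. It is not the MATLAB routine from the supplementary material: the function names, the dictionary layout of a trajectory, and the `start_time_s` key are hypothetical choices made for this example, and the CO values are made up.

```python
from statistics import mean

def average_to_10s(times_s, values, window_s=10):
    """Average 1 Hz samples into consecutive windows of window_s seconds.

    Returns a dict mapping each window start time to the window mean.
    Missing samples (None) are skipped.
    """
    bins = {}
    for t, v in zip(times_s, values):
        if v is None:
            continue
        start = (int(t) // window_s) * window_s
        bins.setdefault(start, []).append(v)
    return {start: mean(vals) for start, vals in sorted(bins.items())}

def merge_with_trajectories(trajectories, tracer_10s):
    """Attach the averaged tracer value to each trajectory by start time."""
    merged = []
    for traj in trajectories:
        record = dict(traj)  # copy so the input is not modified
        record["tracer"] = tracer_10s.get(traj["start_time_s"])
        merged.append(record)
    return merged

# Made-up 1 Hz CO samples (ppbv) covering two 10 s windows.
times = list(range(20))
co = [60.0] * 10 + [70.0] * 10
co_10s = average_to_10s(times, co)

# Hypothetical trajectory records, one per 10 s start point.
trajs = [{"start_time_s": 0, "path": []}, {"start_time_s": 10, "path": []}]
merged = merge_with_trajectories(trajs, co_10s)
print(merged)
```

Each merged record then carries the tracer value measured at its start point, which is the quantity later used to color the trajectory.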

General trajectory model performance
Computed trajectories typically deviate from the actual air mass corridor that describes the history of an air volume. How much depends on numerical errors resulting from interpolation/truncation in space and time and uncertainties in wind field measurements carried over from employed meteorological data into HYSPLIT (in this case GDAS data). These wind field uncertainties, hereafter referred to as meteorological errors, are typically the largest source of error and commonly originate from missing satellite and/or sparse radiosonde information (e.g., over the Pacific) or generally from the coarse resolution of the chosen wind field. Additionally, deformation as well as subgrid-scale processes such as turbulent mixing and convection (see introduction) add to the total deviation from the actual air mass corridor since these processes are not represented by the trajectory model. The next paragraphs discuss the general characteristics of these error sources during PASE and aspects relevant to approximate a reasonable length of PASE trajectories (between 5 and 20 days).
Numerical errors can be calculated by performing a so-called forward/backward (F/B) test (Fuelberg et al., 1996). This test compares start points of backward trajectories (along research flight tracks) and ends of forward trajectories, computed from the end of the backward trajectories. Thus, each F/B calculation yields a horizontal displacement arising from interpolation/truncation in space and time over the complete length of back plus forward trajectory. Since we wish to obtain the numerical error for the back trajectory exclusively, this horizontal distance has to be divided by two.
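The F/B displacement can be sketched in Python as follows. The great-circle (haversine) distance and the mean Earth radius of 6371 km are assumptions of this example, since the text does not state how the displacement was measured; halving the vertical displacement in the same way as the horizontal one is likewise an assumption. The function names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def fb_numerical_error(start, forward_end):
    """Numerical error attributed to the back trajectory alone.

    start:       (lat, lon, alt_m) where the back trajectory was initiated
    forward_end: (lat, lon, alt_m) where the forward trajectory ended
    The displacement accumulates over back plus forward trajectory,
    so it is divided by two (assumed here for the vertical as well).
    """
    horizontal_km = great_circle_km(start[0], start[1], forward_end[0], forward_end[1]) / 2.0
    vertical_m = abs(forward_end[2] - start[2]) / 2.0
    return horizontal_km, vertical_m

# Made-up example: the forward run ends 1 degree north and 10 m above the start.
print(fb_numerical_error((2.0, -157.0, 2000.0), (3.0, -157.0, 2010.0)))
```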
Results from F/B computations are presented in Table 2, column 3, for various back trajectory timescales between 5 and 20 days. For each timescale, we present 95 and 50 % (median) percentiles of these numerical errors in the horizontal plane for our set of 6084 FT back trajectories, with all results rounded for clarity. We use percentiles to illustrate errors in Table 2 because distributions of trajectory errors typically contain a large number of extreme values (distribution skewed to the right) attributed to deviations from the median trajectory distance. We also note that trajectory errors are typically discussed in relation to trajectory length defined as the complete extent of the curved path in the horizontal plane since this eases comparison of trajectory errors to those derived in different studies.
Numerical errors in the horizontal plane are about two orders of magnitude smaller for 5-12 day trajectories for both percentile ranges compared to corresponding trajectory length (column 5). This relation changes for 15 and 20 day trajectories. Here, we observe that the 50 % percentile ranges still give a numerical error an order of magnitude smaller than the total trajectory distance, while the 95 % percentiles reveal numerical errors of about 3 % and 8 % of the travel distance for 15 and 20 day trajectories, respectively. While this shows horizontal numerical errors are negligible compared to travel distance for the majority of trajectories up to 20 days, our set of FT trajectories is computed starting mostly along research flight profiles. Hence, we must discuss numerical error behavior in the vertical plane as this affects our ability to resolve relations between trajectories and in situ measurements.
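To illustrate why percentiles are preferred over a mean for right-skewed error distributions, and how an error is expressed relative to trajectory length, consider this small Python sketch. The percentile definition (linear interpolation between order statistics) is an assumption, as the text does not say which definition was used, and the error values are made up.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation between order statistics."""
    data = sorted(values)
    if not data:
        raise ValueError("percentile() needs at least one value")
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)                      # pos is non-negative, so int() floors
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

def relative_error_percent(error_km, trajectory_length_km):
    """Numerical error as a percentage of the curved trajectory length."""
    return 100.0 * error_km / trajectory_length_km

# Made-up, right-skewed horizontal errors (km): one extreme value dominates the mean.
errors_km = [1.0, 2.0, 3.0, 4.0, 100.0]
print("mean  :", sum(errors_km) / len(errors_km))
print("median:", percentile(errors_km, 50))
print("95th  :", percentile(errors_km, 95))
```

In this made-up sample the single extreme value pulls the mean far above the median, which is the behavior that motivates reporting the 50 and 95 % percentiles.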
Results from F/B tests in the vertical for various trajectory lengths are summarized in Table 2, column 4. These calculations reveal that about 50 % of all 12 day trajectories have numerical errors in the vertical plane smaller than 5 m, while 95 % of the computed pathways are accurate to within 170 m. This reflects the fact that vertical motion is generally limited by the abundance of stable and quasi-horizontal tropospheric layers (Newell et al., 1999). However, vertical numerical errors for trajectories approaching 15 days increase beyond those values considered acceptable to reliably resolve observed pollution and recent cloud outflow features (compare with typical layer extent in Fig. 1) because these are often only a few hundred meters thick (Newell et al., 1999; Stoller et al., 1999). Hence, without accounting for other error sources we limit our maximum trajectory length to around 12 days.
Numerical error analysis also provides insights into trajectory performance in distinct synoptic flow regimes, as Fig. 2 suggests. Here, we illustrate relationships between horizontal travel distances of the back trajectories accumulated over 12 days and corresponding numerical errors in the vertical plane. Each small dot represents a result from a single trajectory computation, while the green dot in each plot provides the median value. We use our computed FT back trajectories and sort out two groups based on the following criteria. The first group (black dots, plot a) consists of back trajectories reaching South America and not traversing any ITCZ deep convection, while the second group (red dots, plot b) includes only those trajectories traversing ITCZ convection but not reaching the continent. Here, we define the ITCZ as a rectangle between 100-150° W and 5-15° N (compare with Fig. 5) attributed to the region of heaviest deep convection during PASE as observed utilizing Geostationary Operational Environmental Satellite (GOES) infrared (IR) imagery. Partitioning trajectories into these two groups allows us to study the general influence of the ITCZ flow pattern on trajectory performance.
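The grouping criteria can be sketched in Python as below. The ITCZ rectangle follows the definition in the text; the South American bounding box is purely hypothetical, because the text does not state how "reaching South America" was tested, and the function and label names are illustrative.

```python
# ITCZ rectangle from the text: 100-150 deg W, 5-15 deg N (west longitudes negative).
ITCZ_BOX = {"lon_min": -150.0, "lon_max": -100.0, "lat_min": 5.0, "lat_max": 15.0}

# HYPOTHETICAL box standing in for "South America"; not taken from the paper.
SOUTH_AMERICA_BOX = {"lon_min": -82.0, "lon_max": -34.0, "lat_min": -56.0, "lat_max": 13.0}

def in_box(lat, lon, box):
    """True if the point lies inside the latitude-longitude rectangle."""
    return box["lat_min"] <= lat <= box["lat_max"] and box["lon_min"] <= lon <= box["lon_max"]

def classify(path):
    """Assign a back trajectory to one of the two groups, or to neither.

    path: list of (lat, lon) points along the trajectory.
    """
    hits_itcz = any(in_box(lat, lon, ITCZ_BOX) for lat, lon in path)
    hits_continent = any(in_box(lat, lon, SOUTH_AMERICA_BOX) for lat, lon in path)
    if hits_continent and not hits_itcz:
        return "south_america"
    if hits_itcz and not hits_continent:
        return "itcz"
    return "other"

# Made-up paths: one stays south of the ITCZ box and reaches the continent,
# the other passes through the ITCZ box and ends over the ocean.
continental = [(2.0, -157.0), (3.0, -120.0), (0.0, -90.0), (-5.0, -70.0)]
convective = [(2.0, -157.0), (8.0, -130.0), (10.0, -105.0)]
print(classify(continental), classify(convective))
```

Trajectories that meet both or neither criterion fall into a third category here, which mirrors the fact that the two groups in the text do not cover the full trajectory set.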
Our results indicate ITCZ trajectories have a median numerical error of around 9 m in the vertical, three times larger than the median numerical error of trajectories reaching back to South America in 12 days without ITCZ passage. Although both errors are negligible, this is a notable finding knowing that the numerical error increases with travel distance (via interpolation/truncation) but that the median total travel distance of the ITCZ trajectory group is actually 30 % smaller compared to the South American trajectory set, which reveals a median travel distance of 11 000 km. We hypothesize that this difference is due to amplifications of numerical errors as expected when linear interpolation of wind vectors is performed between grid points in strongly curved and vertically oriented flow in the ITCZ. Consequently, this analysis could provide an indirect measure of trajectories susceptible to turbulent mixing and convection since these subgrid-scale processes, which cannot be modeled with HYSPLIT, are typically associated with ITCZ synoptic pattern.
Research flights over large horizontal and vertical scales also offer a direct tool to investigate the potential importance of subgrid-scale processes during air mass transport by comparing in situ meteorological data with GDAS features. This is shown in Fig. 3a  The GDAS model generally represents atmospheric conditions above 3000 m very well as MR and θ comparisons reveal. Modeled θ values above that altitude are within 1 K of observed θ behavior, while modeled MR values accurately represent the observed MR bifurcation. However, below 3000 m down to about 1500 m modeled MR is higher than measured values by up to 5 g kg−1, while modeled θ is 2-3 K lower than observations. Below 1500 m, modeled MR agrees with observations, while modeled θ is 1-2 K higher than measurements. Based on these θ observations, the GDAS model generally seems to have trouble reproducing the height and strength of the TWI, which was located between 1100 and 1700 m (Faloona et al., 2009) with an average strength of 7 K during PASE. In comparison, modeled θ profiles illustrate the average modeled TWI height at around 500 m with a mean strength of 3 K likely related to the old convection and MBL scheme employed before 2010, which tended to underestimate turbulent diffusion in cumulus clouds (Han and Pan, 2011). Consequently, this provides another reason for exclusion of MBL back trajectories.
As mentioned in Sect. 2.1, computed back trajectories starting along FT profiles during research flight 12 (2/3 September) involved strong deep convection and cyclonic circulation associated with a tropical depression. While MR and θ comparisons for these FT profiles are consistent with those discussed above, larger discrepancies are observed for wind field relationships. The apparent offset during this flight, which is highlighted by red dots in Fig. 3c and d, could have been caused by misplacement of the center of the depression and as such the route for these trajectories since measured wind direction indicates southerly winds, while the model predicts northeasterlies.
For all other flights during PASE, wind field data scatter closely around the 1:1 line (grey dashed lines in Fig. 3c and d). Model winds exhibit somewhat higher accuracy at lower altitudes as revealed by root mean square errors (RMSE) of modeled wind speed in the MBL of 1.13 m s−1 and in the FT of 2.61 m s−1. The wind field evaluation also shows modeled wind direction is less variable than measurements. For instance, horizontal lines of wind direction data on a random flight leg can be seen to change between 90 and 110°, while modeled wind direction is constantly 100°. This behavior is likely due to the coarse horizontal resolution of the GDAS model and subsequently its inability to represent subgrid-scale wind field fluctuations. Even so, overall consistency between modeled wind fields, θ and MR implies the combined impact of the meteorological error and subgrid-scale phenomena is small in this study region and as such for short-range back trajectories. Larger deviations from the actual air mass corridor, however, should be expected for our 12 day back trajectories, and also in more complex midlatitude flow compared with the generally homogeneous equatorial flow observed during PASE.
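A model-versus-measurement comparison of this kind can be sketched in Python as follows. The RMSE is the standard definition; the wrap-around handling of wind direction is an addition of this example (the text does not describe how direction differences were computed), and all sample values are made up.

```python
import math

def rmse(modeled, measured):
    """Root mean square error between paired modeled and measured values."""
    pairs = list(zip(modeled, measured))
    if not pairs:
        raise ValueError("rmse() needs at least one pair")
    return math.sqrt(sum((m - o) ** 2 for m, o in pairs) / len(pairs))

def direction_difference_deg(modeled, measured):
    """Smallest signed difference between two wind directions, in (-180, 180]."""
    d = (modeled - measured + 180.0) % 360.0 - 180.0
    return 180.0 if d == -180.0 else d

# Made-up wind speeds (m/s) and directions (degrees).
model_speed = [5.0, 7.0]
measured_speed = [4.0, 9.0]
print("RMSE:", rmse(model_speed, measured_speed))

# A constant modeled direction of 100 degrees against measurements of 90 and 110.
print([direction_difference_deg(100.0, obs) for obs in (90.0, 110.0)])
```

Taking the difference on the circle avoids treating 350° and 10° as 340° apart, which matters whenever winds are near northerly.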
While the presented analyses (Figs. 2, 3) provide a sense of how susceptible back trajectories are to deviations from the actual air mass corridor due to meteorological errors and subgrid-scale phenomena such as turbulent mixing and convection, they cannot replace quantitative estimates. Such estimates can be obtained by validating the trajectory model in its entirety, for instance, by comparing multiple back trajectory runs each utilizing a different meteorological model input (Gebhart et al., 2005; Harris et al., 2005). Applying this method, Harris et al. (2005) showed that transport models are highly sensitive to the choice of meteorological input and that back trajectory end points for different data sets differ by up to 40 % of the averaged travel distance after 96 h. Note, however, that this analysis only describes the sensitivity to meteorological errors but does not provide actual estimates of these and that the investigated meteorological models were of a much coarser resolution (2.5° grid) than used here. Riddle et al. (2006) estimated the impact of meteorological errors and subgrid-scale phenomena by measuring transport model deviations from the actual air mass corridor with the help of air mass tracking altitude-controlled balloons. They found average deviations between 26 and 36 % of the trajectory travel distance depending on the meteorological model employed. While these deviations are above the commonly used average of 20 % of the traveled distance (Stohl, 1998), it should be noted that the Riddle et al. (2006) study was conducted in more complex midlatitude flow and that trajectories involved interactions with an approaching cold front and a low-level jet at times. Nonetheless, this study again nicely points out how the magnitude of the meteorological error varies greatly depending on whether the general flow depends on synoptic-scale or subgrid-scale features.
Since our analyses only provide qualitative estimates of the total deviation from the actual air mass corridor, we will assume a total error of 20 % based on previous literature as a "rule-of-thumb" measure for all FT back trajectories starting along profiles during PASE as shown in Table 2, column 6. For instance, 50 % of the 5 day back trajectories travel 4200 km or less and hence would have a maximum total deviation of about 800 km in the horizontal plane (20 % of total distance). At the same time, 45 % of 5 day trajectories travel between 4200 and 5500 km and hence have a maximum "rule-of-thumb" total error of 1100 km. Further, since travel distance and total error are coupled, the latter exhibits an upper limit of 2500 km for 95 % of the 12 day trajectories, which traveled about 12 500 km. Hence, if we wish to investigate the source of measured pollution aerosol with associated back trajectories ending over a continent approximately 12 500 km away and 12 days back in time (e.g., South America), a nominal 2500 km radius of source uncertainty would be implied in the horizontal plane. Even for this uncertainty over these distances, this should allow for reasonable separation of cloud outflow from either ITCZ or continental deep convection.
We have established criteria in this section assisting us in choosing a feasible length of back trajectories (12 days) and in interpreting trajectory errors in distinct synoptic flow regimes. We further provided rough error estimates for PASE trajectories. Based on these findings, we will discuss advantages in examining trajectories in groups over single trajectory evaluation next.
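The "rule-of-thumb" arithmetic can be reproduced with a short Python sketch. The function name is illustrative; the 20 % fraction and the travel distances are those quoted in the text.

```python
def rule_of_thumb_error_km(travel_distance_km, fraction=0.20):
    """Assumed total horizontal deviation as a fixed fraction of travel distance."""
    return fraction * travel_distance_km

# Travel distances quoted in the text (km).
for distance_km in (4200, 5500, 12500):
    error_km = rule_of_thumb_error_km(distance_km)
    print(f"{distance_km:>6} km traveled: about {round(error_km)} km total error")
```

Note that 20 % of 4200 km is 840 km, which the text rounds to 800 km; the other two values come out at 1100 and 2500 km.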

Cluster approach
To test the general utility of data clusters, we may employ our set of 12 day FT back trajectories (referred to here as data set T) to sort out two groups based on the pathway criteria defined in Sect. 3.2 for Fig. 2. There, we classified one group as a bundle of all back trajectories traversing the ITCZ deep convection but not reaching South America (referred to here as group T1; 3321 trajectories), while the second group included only those trajectories reaching South America and not traversing any ITCZ convection (group T2; 1117 trajectories). Note, the ITCZ was defined as a rectangle between 100-150° W and 5-15° N (rectangle in Fig. 5). Since our set of back trajectories is merged with in situ data, we can immediately sort any PASE measurements based on these group criteria. This is exemplified for CO and CNhot observations illustrated in Fig. 4a and c, respectively, for the defined groups T1 (red histograms) and T2 (black). These histograms and their corresponding cumulative distribution functions (CDFs; dashed lines) reveal distinct differences. While about 65 % of the CO measurements associated with group T1 are below 60 ppbv, group T2 exhibits about 65 % above that value. Similarly, the median CNhot concentration around 75 cm−3 is lower for group T1 compared to T2 exhibiting a median around 115 cm−3. Because CO measurements in MBL air for PASE research flight 4 in the vicinity of the ITCZ are below 60-63 ppbv (Clarke et al., 2013), we conclude that many of the CO measurements from group T1 are consistent with the ITCZ trajectory behavior (pristine MBL air with low CO gets lifted in deep convection and subsides into the PASE study area; see Sect. 2.3). At the same time, elevated CO and CNhot concentrations for continental trajectories (group T2) appear reasonable because of the variety of CO and CNhot sources present there (Sect. 2.3). However, both CO and CNhot histograms show some overlap suggesting some of the trajectories in both groups contain in situ values not representative of the primary trajectory cluster.
In order to relate these results, solely based on trajectory path classifications, to observed air mass features, we may also sort distinct groups based on our in situ measurements (data set M) as discussed in Sect. 2.3. There, we defined distinct air mass layers based on the behavior of our selected tracers such as CO, CNhot, CNvol, O3 or MR in comparison to adjacent FT environments (Table 1). Using these criteria we found several features including convective cloud outflow from the ITCZ (referred to here as group M1) and continental pollution (group M2). To ensure two distinct groups for our analysis, we select cases with features clearly identified relative to the adjacent FT environment as exemplified in the profile illustrated in Fig. 1 (e.g., core of pollution features marked in orange shading). Hence, data volumes are reduced to 440 and 990 data points for groups M1 and M2, respectively, but are still considered representative as both groups include profile portions from all but four research flights (4, 5, 7, and 12).
In situ observations of CO and CNhot related to groups M1 (red histograms) and M2 (black) are illustrated in Fig. 4b and d, respectively. Over 90 % of the CO values associated with cloud outflow are below 62 ppbv, whereas over 90 % of the CO relating to continental pollution is found above that value. Hence, 62 ppbv represents an approximate threshold for distinction between the two air masses during PASE. Its equivalent for CNhot can also be estimated to around 110 cm−3. As such, both thresholds correspond closely to those estimated from data set T (60 ppbv; 95 cm−3), although the latter exhibit greater uncertainty. The general agreement in these classifications where trajectory and measured data sets, T1 and M1, both describe cloud outflow, while T2 and M2 illustrate continental pollution, argues for overall trajectory reliability during PASE.
Back trajectories related to the in situ data groups M1 and M2 are illustrated in Fig. 5a and b, respectively.As expected, most of the modeled pathways associated with group M1 (blue lines in Fig. 5a; 92 % of all trajectories) traverse the ITCZ region, identified here by a magenta rectangle marking the region of heaviest deep convection during PASE.However, only 53 % of the back trajectories (orange lines in Fig. 5b) related to pollution group M2 reach back to South America in 12 days without passing through the "ITCZ" box, while the remainder traverses the ITCZ region (blue lines in Fig. 5b).Consequently, the latter trajectories may not be reliable because pollution aerosol is expected to be scavenged in ITCZ deep convection (Sect.2.3).Nevertheless, overall consistency of M2 trajectories is suggested in the tight band (orange cluster) formed by the majority of the trajectories for over almost 10 000 km.
As discussed in the introduction, air history analysis may be more revealing if air mass clusters are established simultaneously from measurements and back trajectories (Methven et al., 2003).This was not considered in the above evaluations intended to reveal general consistency between trajectory features and measurements but will be illustrated in the next section.

Cluster-visualization approach
In this section, we discuss our trajectory-visualization approach for two research flight profiles representative of the generally homogeneous equatorial flow observed during PASE.We will examine its applicability for rapid air mass history analysis under commonly observed research flight conditions by addressing the following questions.Do trajectories exhibiting similar aerosol physiochemistry reveal similar characteristics of recognized sources or processes?Are back trajectories consistent over the temporal extent of a research flight when the flight is conducted such that similar air mass features are encountered multiple times during the flight?Are transition areas between two distinct aerosol regimes represented adequately by trajectories?

Ascent around 19:00 UTC, 10 August
This ascent flown during research flight two was discussed in Sect.2.3.There, we identified one cloud outflow feature between approximately 2400 and 3000 m (cyan shading in Fig. 1) and two pollution layers between 1500 and 2000 m (orange shading in Fig. 1).As described earlier, we will focus on FT back trajectories and only examine air masses above the TWI located at 1385 m for this flight (solid grey line in Fig. 1).Our combined trajectory and in situ data set allows us to color-code these FT back trajectories with any measured aerosol and gas tracers, but we will focus on those essential to distinguish between cloud outflow and pollution features as illustrated in Table 1.Additionally, we will investigate modeled precipitation (from GDAS data) along each trajectory to evaluate any consistency with observed features.Note that these precipitation amounts are not exact but may be utilized to determine precipitation tendencies (i.e., does it occur and if so where).
The 12 day FT back trajectories for this profile are illustrated in Fig. 6a-c  Ecuador and Columbia where rising motion is evident.As the modeled air parcels travel west they gradually subside (Fig. 6c) from this altitude into the study region.Hence, measured pollution aerosol could originate from Amazon  biomass-burning that can be readily observed in Moderate Resolution Imaging Spectroradiometer (MODIS; http:// earthdata.nasa.gov/data/nrt-data/rapid-response)imagery for this period (not shown).
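The color scaling and the 12 day precipitation accumulation described for Fig. 6 can be sketched as follows. This is a minimal illustration and not the routine used to produce the figures: the function names, the sample values, and the assumption of hourly precipitation values along each trajectory are hypothetical.

```python
def color_fraction(value, vmin=1385.0, vmax=3000.0):
    """Map a value (here: start altitude in m) to a 0..1 position on a
    color scale, where 0 is the dark-blue end and 1 the dark-red end.
    Values outside [vmin, vmax] are clipped to the ends of the scale."""
    if vmax <= vmin:
        raise ValueError("vmax must exceed vmin")
    frac = (value - vmin) / (vmax - vmin)
    return min(1.0, max(0.0, frac))


def accumulated_precip(precip_mm):
    """Sum modeled precipitation (mm per time step) along one trajectory,
    skipping missing values given as None."""
    return sum(p for p in precip_mm if p is not None)


# Hypothetical example: three trajectory start altitudes (m).
start_altitudes = [1500.0, 2000.0, 2700.0]
fractions = [color_fraction(z) for z in start_altitudes]

# Hypothetical precipitation series: 288 hourly steps (12 days),
# dry except for 8 h at 5 mm per hour.
hourly = [0.0] * 280 + [5.0] * 8
total = accumulated_precip(hourly)
```

The same `color_fraction` mapping can be applied to any tracer (e.g., CO or CNhot) by passing that tracer's own `vmin` and `vmax`, so that every trajectory inherits one color from the value measured at its start point.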
In contrast to pollution trajectories, convective cloud outflow pathways in these plots behave differently. Trajectories above 2400 m (colored yellow to dark red) show two common outflow signatures. Plots b and c reveal that the modeled air was lifted by roughly 2000 (dark red) and 4000 m (orange) near 120°W, 9°N. In plot b, we also observe that lifting was associated with looping of these trajectories. This suggests measurements are consistent with modeled paths, although those portions of back trajectories representing the lower-level inflow into the ITCZ should be taken with caution because various error sources (see discussion in Sect. 3.2) complicate accurate representation of (backward) transport through deep convective clouds. Consequently, it is not certain whether these cloud outflow air masses, which likely passed ITCZ convection, were originally advected with southeasterly MBL winds from the southeast Pacific as shown in plots a and b. We will now challenge trajectories and modeled precipitation with our in situ tracers.
It was recognized in Sect. 2.3 that the increase in CNvol (Figs. 6g, 1a) with simultaneous decrease in CNhot (plot e) in FT layers is indicative of cloud outflow as observed between approximately 2400 and 3000 m. Further, it was argued that simultaneous reductions of CO (plot f) and O₃ (plot h) as illustrated here suggest deep convective cloud outflow from the ITCZ because clean remote MBL air is lifted aloft by these clouds. Hence, trajectories between 2400 and 3000 m are consistent with in situ data. Moreover, this consistency is reinforced since modeled precipitation accumulated along each trajectory path is highest at this elevation (70-80 mm/12 days).
In Fig. 6e, f and h showing CNhot, CO, and O₃, respectively, we can see that layers near 1500 and 2000 m reveal similarly elevated values for these tracers (compare with Fig. 1), suggesting pollution of similar origin for both (Table 1). These layers are divided by a smaller feature at approximately 1750 m with CNhot, CO, and O₃ decreased by 45 cm⁻³, 8 ppbv, and 13 ppbv, respectively, compared to pollution layer values. Since MR values increase by 6 g kg⁻¹ (Fig. 1b; coloration not shown) in this feature, compared to about 4 g kg⁻¹ for both pollution layers, this could suggest cloud passage. This is further supported by elevated CNvol concentration (Figs. 1a, 6g) for this feature. Because elevated CNvol concentrations are observed in the lower pollution layer as well, both pollution layers and the small outflow feature at 1750 m appear to be coupled. In fact, modeled precipitation tendencies (Fig. 6d) indicate elevated amounts for the small scavenged pollution feature near 1750 m (55 mm/12 days) in comparison to the pollution layers (30 mm/12 days). Additionally, at the end of our discussion in Sect. 2.3 we concluded that scavenged pollution air masses would generally retain higher CO and O₃ values compared to lower tracer amounts present in clean air lofted from the MBL in ITCZ convection over the Pacific. This can be clearly seen in Fig. 6f and h.
We further examine temporally resolved GDAS precipitation along each path as shown in Fig. 7. Here, back trajectories are color-coded with time beginning along the flight track (dark red), while increased precipitation is expressed in increased marker size. The increasing sizes of blue circles in the lower left corner indicate modeled precipitation intensities of 1, 5, and 10 mm per hour, respectively. The plot reveals one center of heavier precipitation within the ITCZ associated with the outflow layer between 2400 and 3000 m (compare with Fig. 6b, d). Its location near 120°W, 9°N agrees remarkably well with observed deep convection as evident in the underlying GOES IR satellite image from 6 August 2007 23:59 UTC, corresponding to when the trajectories were in this location.
The second precipitation field associated with pollution trajectories is less intense and scattered between 100°W and the Andes. This may conflict with our initial assumption that air masses from the Amazon basin have to pass the Andes, often associated with deep convection forming on their eastern slopes as a result of lifting (Andreae and Merlet, 2001). However, careful evaluation shows that this precipitation away from the Andes is associated with those trajectories starting along the transition region (2000-2400 m) between ITCZ outflow and pollution, as well as with cloud outflow trajectories, some of which show secondary lifting between 80 and 100°W (e.g., Fig. 6c).
This transition region needs to be examined more carefully given our estimated model altitude uncertainties after thousands of kilometers (Table 2). While the decrease in CNhot and O₃ values, as shown in Fig. 6e and h [...]creases (Fig. 6g). At the same time, CO values (Fig. 6f) are around 56 ppbv (cyan color), similar to those observed in the scavenged pollution layer near 1750 m. Hence, estimation of trajectory consistency within this layer is inconclusive and back trajectories should be interpreted with caution.
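The qualitative tracer signatures used in this discussion can be written down as a small decision rule. The sketch below is a hypothetical simplification for illustration only: it encodes the signatures as stated in the text (it does not reproduce Table 1 or any numerical thresholds), and the function name and the boolean "elevated" inputs are assumptions. For CO and O₃, "elevated" is taken here to mean elevated relative to clean air lofted from the remote MBL.

```python
def classify_layer(cnhot_elevated, cnvol_elevated, co_elevated, o3_elevated):
    """Assign an air mass type from qualitative tracer signatures.

    - Pollution: CNhot, CO, and O3 all elevated (CNvol may or may not be).
    - ITCZ cloud outflow: CNvol elevated while CNhot, CO, and O3 are reduced.
    - Scavenged pollution: CNvol elevated and CNhot reduced, but CO and O3
      remain higher than in clean MBL-derived outflow.
    - Anything else is returned as inconclusive.
    """
    if cnhot_elevated and co_elevated and o3_elevated:
        return "pollution"
    if cnvol_elevated and not cnhot_elevated:
        if not co_elevated and not o3_elevated:
            return "ITCZ cloud outflow"
        if co_elevated and o3_elevated:
            return "scavenged pollution"
    return "inconclusive"


# Signatures as described for the 10 August ascent (qualitative only).
outflow_2400_3000 = classify_layer(False, True, False, False)
pollution_1500 = classify_layer(True, True, True, True)
feature_1750 = classify_layer(False, True, True, True)
# Mixed signature (reduced CNhot and O3, but CO similar to the scavenged
# layer), shown here with CNvol assumed elevated for illustration.
mixed_signature = classify_layer(False, True, True, False)
```

A rule of this kind deliberately returns "inconclusive" for mixed signatures such as the transition region, where the tracers do not point to a single air mass type.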

Another independent data set, shown in Fig. 8, can further be employed to check trajectory consistency. This figure illustrates Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) satellite information (http://www-calipso.larc.nasa.gov/products/lidar/browseimages/production/). It shows a satellite overpass from 29 July 2007, chosen to correspond to the approximate arrival time and location (red oval) of back trajectories over South America. The associated CALIOP aerosol product (orange mask) reveals an enhanced aerosol layer extending up to 3 km and scattered aerosol associated with deep convection up to 8 km over the Amazon basin.
CALIOP data also reveal a layer spreading west near mountain chain tops at around 4 km associated with clouds (cyan mask). Note that the latter observation, which suggests pollution transport over the Andes and thus out over the Pacific, was made near 75°W, 13°S, while back trajectories end near 75°W, 5°N. Nevertheless, we expect similar deep convection near 75°W, 5°N, which is supported by additional overpasses around 29 July 2007 (not shown). Overall, CALIOP data and MODIS imagery (not shown) for this date indicate the source to be slash-and-burn aerosol from Amazonian rainforest, although our 12 day back trajectories do not extend far enough back in time to actually show a clear ascent of the modeled air parcels out of the Amazon basin (only suggested in Fig. 6c).
This discussion reveals that our cluster-visualization approach is applicable for this profile. We color-coded in situ tracers onto back trajectories and demonstrated that two distinct trajectory clusters with different source regions exhibit aerosol physiochemistry consistent with those sources and with supplementary satellite observations. We further compared this profile with another one conducted at the end of this research flight, and both show good agreement in reflecting similar HYSPLIT flow regimes for pollution and outflow layers (not shown). In spite of a temporal difference of about 8 h between the profiles, trajectory consistency was evident. Moreover, since marked changes in trajectories correspond to variations in the measurements over a single research profile and are linked to air mass origins, a single profile can characterize layers with different air mass history for this flight.

Descent around 21:30 UTC, 6 September
The profile was conducted during research flight 14, the last flight of the campaign. This descent around 21:30 UTC on 6 September seems similar to the profile discussed before, with modeled air parcels just above the inversion (near 1100 m) gradually subsiding toward the study region from South America (Fig. 9a-c), while some of the modeled air parcels arriving at higher altitudes (green-yellow) reveal lifting and looping near 110°W, 11°N. However, in situ tracers show a more structured view.
Our chosen tracers CNhot and CO, color-coded on trajectories in Fig. 9e and f, respectively, illustrate six layers of varying depth. The first layer, situated right above the inversion, extends up to 1200 m and contains the highest CNhot and CO values of the profile, 320 cm⁻³ and 94 ppbv (both in dark red colors), respectively. This small layer is followed by a feature extending up to 1700 m revealing decreased CNhot and CO of 40 cm⁻³ (dark blue) and 67 ppbv (light blue), respectively. Above that layer, between 1700 and 2400 m, another feature is evident, marked by increases in both tracers (both in yellow-green colors). However, both tracer values are about 30 % smaller compared to the layer right above the inversion. The fourth feature, with the largest depth in this profile, is located between around 2400 and 3200 m (green-yellow color in Fig. 9a-c) and reveals decreased CNhot and CO of around 40 cm⁻³ and 60 ppbv (both in dark blue), respectively. Above this, a feature located between 3200 and 3600 m exhibits increasing CNhot and CO values up to 150 cm⁻³ (light green) and 69 ppbv (cyan), respectively. The layer situated at the top of this profile, between 3600 and 3900 m, reveals a CNhot increase up to 220 cm⁻³ (yellow), while CO decreases down to 58 ppbv (dark blue).
The uppermost layer between 3600 and 3900 m with an inverse relation between CNhot and CO indicates a feature other than pollution or recent ITCZ cloud outflow (Table 1). Since 12 day back trajectories in Fig. 9a and b [...]
For the remaining five layers below the stratospheric feature, it is not immediately clear which of these layers is a pollution feature or an ITCZ cloud outflow layer because accumulated precipitation illustrated in plot d is increased for three of the five layers. Hence, we will try to use the conjunction of in situ measurements and trajectory information in our cluster-visualization approach. Instead of analyzing trajectories utilizing in situ data as done for the first case study, we assume the modeled pathways to be correct for now, try to interpret in situ observations based on this assumption, and then examine any inconsistencies over the profile.

We begin our discussion assuming the largest layer between 2400 and 3200 m to be recent cloud outflow from the ITCZ because most back trajectories in green to yellow coloration in Fig. 9a-c show a change in altitude by almost 6 km and looping near 110°W, 11°N. At the same time, modeled precipitation is around 70-120 mm. Since CNhot and CO reduction (see above) are connected with CNvol increasing to 620 cm⁻³, while O₃ drops to 22 ppbv, consistent with recent ITCZ cloud outflow (Table 1), we conclude back trajectories are reliable for this layer.
Illustrations of 12 day trajectories for this profile in Fig. 10, overlaid on a GOES IR image from 23:59 UTC on 28 August 2007, also strengthen this case. These back trajectories are color-coded with time (light blue color corresponds to record time of IR image), while marker size increases with increasing intensity of precipitation along the path. As in the earlier profile (compare with Fig. 7), we observe consistency of observed deep convection with the modeled center of heaviest precipitation and lifting in the ITCZ near 110°W, 11°N.
We continue our assessment for those trajectories reaching back to the Amazon basin. These are the three lower layers marked in dark blue to cyan coloration (Fig. 9a-c). Modeled pathways for the upper two layers (blue-cyan) end over the Amazon basin at altitudes between 4 and 8 km, while the lowermost trajectories (dark blue) arrive near 70°W, 5°N. At this location marked changes of 2-4 km are shown, corresponding to lifting on the eastern slopes of the Andes.
We discussed CO and CNhot concentrations for these layers at the beginning of this section and found these to be elevated between 1100 and 1200 m and between 1700 and 2400 m. Similar behavior is observed for O₃, suggesting a pollution source consistent with the trajectories. The layer embedded in between these pollution layers reveals decreasing CNhot and O₃ to 40 cm⁻³ and 22 ppbv, respectively, while CNvol increases to 670 cm⁻³. Although this marks the highest CNvol concentration of the profile, this layer is indicative of precipitation scavenging of pollution rather than ITCZ outflow. Scavenged pollution air masses generally retain higher CO values compared to lower values of this tracer present in air lofted from the remote MBL in ITCZ convection over the Pacific. However, although tracers for these features suggest consistency with trajectories, modeled precipitation values appear too low for scavenged pollution (around 10 mm), while modeled precipitation seems too high (near 65 mm) in the lowermost pollution layer with the strongest pollution signal.
These data can be compared with CALIOP satellite information corresponding to the approximate arrival time (compare with Fig. 10) and location (red oval) of back trajectories over South America (28 August 2007) as illustrated in Fig. 11. The associated aerosol product (orange) and cloud mask (cyan) reveal heavy deep convection extending up to 15 km connected to aerosol at higher altitudes, which suggests lifting of slash-and-burn aerosol from Amazonian rainforest as previously shown (Andreae et al., 2001; Andreae and Merlet, 2001). Additionally, the vertical scale of observed heavy deep convection corresponds with the arrival altitudes of these pollution trajectories (also compare with lower cloud tops in Fig. 8 and lower arrival altitudes of back trajectories in Fig. 6c), although in this case precipitation tendencies are weakly modeled for the lower pollution features.
The highest apparent pollution layer (3200-3600 m), evident between recent ITCZ outflow and the previously mentioned stratospheric feature, is marked in red in Fig. 9a-c. The above discussion reveals that the cluster-visualization approach can be used to describe potential sources for in situ observations made during this profile. Although precipitation tendencies are weakly modeled for the lower pollution features, we showed overall back trajectory consistency for six layers over a vertical range of 2800 m representing four distinct air mass features (stratospheric air, ITCZ outflow, pollution, scavenged pollution). Comparison of this profile with two other profiles from the same research flight indicates similar HYSPLIT flow regimes for these layers (not shown) in spite of temporal differences of a couple of hours and a spatial separation of about 100 km between the three profiles.

Summary and conclusions
The work presented here discusses the visualization of in situ tracers superimposed upon FT back trajectories initiated along flight tracks and profiles every 10 s for an "off-the-shelf" air history analysis. A simultaneous evaluation of air mass history and back trajectory consistency can be carried out by interpreting the conjunction of these high-frequency trajectory and tracer data sets. This procedure can be of particular value for rapid assessment of data being collected during intense airborne campaigns and is summarized below.
First, the high number of back trajectories used enhances assessment of trajectory reliability because clusters of air mass pathways with spatially consistent patterns of in situ tracer values are statistically and visually more representative than a single trajectory. As a result, a maximum length of back trajectories can be estimated by examining whether the majority of trajectories form a tight band. This interpretation can be more suspect at points where the clusters disintegrate into single trajectories or become irregular. This is illustrated in Fig. 5b, where the tight band of trajectories (orange lines) with similar pollution signatures breaks into smaller ensembles near South America, of which three stronger bands reach the Amazon basin and a weaker one reaches Panama and Colombia. Further, illustrating these trajectories over 15 days (not shown) reveals complete disintegration of these bands. Nonetheless, 12-day trajectories suggest the Amazon to be the most likely source region because three tighter clusters end there.
Second, when marked changes in trajectories are associated with marked visual changes in measured physical, chemical, or thermodynamic properties, trajectory reliability is reinforced. This is particularly true when these properties are associated with trajectory paths characteristic of recognized sources or processes. As demonstrated in Sect. 5, such combined information can be a useful tool for estimation of likely origins of pollution and cloud outflow features for portions of the flight track.
Third, utilizing the high density of meteorological model data, we demonstrated that modeled precipitation tendencies can be employed to better interpret measurements. We showed that precipitation events are linked to scavenging of aerosol in both ITCZ and continental clouds.
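The first point above, estimating a maximum useful trajectory length from whether a cluster still forms a tight band, can be illustrated with a minimal sketch. It is not the routine used in this study: the spread metric (mean great-circle distance from a simple centroid), the function names, and the threshold `max_spread_km` are all assumptions introduced for illustration, and the plain longitude average assumes the cluster does not cross the date line.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cluster_spread_km(points):
    """Mean distance of (lat, lon) points from their arithmetic-mean centroid.

    Assumption: longitudes do not wrap across the date line.
    """
    lat_c = sum(p[0] for p in points) / len(points)
    lon_c = sum(p[1] for p in points) / len(points)
    return sum(haversine_km(lat, lon, lat_c, lon_c) for lat, lon in points) / len(points)


def coherent_length(trajectories, max_spread_km):
    """Number of leading time steps for which the cluster stays a tight band.

    trajectories: list of equal-length lists of (lat, lon); index 0 is the
    start point on the flight track, later indices lie further back in time.
    max_spread_km: hypothetical threshold chosen by the analyst.
    """
    n_steps = len(trajectories[0])
    for step in range(n_steps):
        pts = [traj[step] for traj in trajectories]
        if cluster_spread_km(pts) > max_spread_km:
            return step
    return n_steps


# Three synthetic trajectories that stay together for two steps, then diverge.
traj_a = [(0.0, -150.0), (0.0, -140.0), (0.0, -130.0)]
traj_b = [(0.5, -150.0), (0.5, -140.0), (5.0, -130.0)]
traj_c = [(-0.5, -150.0), (-0.5, -140.0), (-5.0, -130.0)]
tight_steps = coherent_length([traj_a, traj_b, traj_c], max_spread_km=100.0)
```

With hourly trajectory endpoints, the returned step count converts directly into hours back in time. The threshold is a judgment call and would need to be tuned against the visual impression of the plotted clusters.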
Although back trajectories are a valuable tool for an "off-the-shelf" air history analysis, it is important to be familiar with the limitations of trajectory modeling as reviewed in this study. Despite the fact that back trajectories typically show good agreement with the actual air mass corridor in synoptic-scale flow and in the free troposphere, modeled pathways will likely disagree if the measured air mass history is governed by turbulent mixing and/or convection as commonly observed in the boundary layer. Nevertheless, the cloud parameterization technique and resolution of the GDAS meteorological model utilized for this analysis were found to be sufficient to detect transport through deep convection (e.g., ITCZ). However, because this technique is likely to fail for localized cellular convection, MBL trajectories are not considered in this study.
This work also discussed several methods that could provide a sense of how susceptible back trajectories are to deviations from the actual air mass corridor due to meteorological errors and also subgrid-scale phenomena such as turbulent mixing and convection. This was done by evaluating the behavior of numerical errors in distinct synoptic flow regimes (Table 2, Fig. 2). We further investigated the contribution of wind field uncertainties to the trajectory deviation from the actual air mass corridor by comparing modeled data with our in situ observations (Fig. 3). Even so, these analyses cannot replace quantitative estimates obtained by validating the trajectory model in its entirety, for instance, with the help of air mass tracking balloons.
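As a rough illustration of how a numerical error could be derived from a forward/backward (F/B) test, the sketch below compares the original start point of a back trajectory with the point reached after running forward again from the back trajectory's end. The function names and the input layout are hypothetical. The full start-to-return offset is used as the error measure here, which is an assumption: whether to halve this offset is a convention choice that the text above does not specify. The 1 % reporting threshold follows the Table 2 caption.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def fb_error(start, returned, travel_distance_km):
    """Offsets between a start point and its F/B return point.

    start, returned: (lat_deg, lon_deg, alt_m) tuples (hypothetical layout).
    travel_distance_km: total distance accumulated along the back trajectory.
    """
    horizontal_km = haversine_km(start[0], start[1], returned[0], returned[1])
    vertical_m = abs(returned[2] - start[2])
    relative_pct = 100.0 * horizontal_km / travel_distance_km
    return {
        "horizontal_km": horizontal_km,
        "vertical_m": vertical_m,
        "relative_pct": relative_pct,
        # Table 2 lists relative errors only where they reach 1 % or more.
        "report_relative": relative_pct >= 1.0,
    }


# Synthetic example: one degree of longitude offset at the equator.
example = fb_error((0.0, -150.0, 3000.0), (0.0, -149.0, 3100.0), 10000.0)
```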
As such, researchers often have to rely on quantitative approximations from previous studies. While such approximations emerge as a useful guideline, they may not always be appropriate since meteorological errors and subgrid-scale phenomena vary greatly both geographically and seasonally. Hence, we suggest combining the aforementioned analyses together with the visualization of in situ tracers superimposed upon back trajectories. Such information can then facilitate subsequent flight planning for targeted objectives. In addition, time and space limitations only allowed us to include a few gas and aerosol indicators as example "tracers." Additional constraints and understanding will be enhanced through similar visualizations of supplementary gas, aerosol, thermodynamic, and microphysical measurements.
Following the outlined procedure, we demonstrated consistency between our measurements and HYSPLIT trajectories that extend back to sources over 10 000 km away on the South American continent. We attribute this to the well-constrained flow and low synoptic variability in this equatorial region during PASE. Similar evaluations may yield different conclusions for other more varied regions.

Fig. 2. Relationships between total travel distances accumulated over 12 days and corresponding numerical errors in the vertical plane from F/B tests for (a) a selected set of back trajectories approaching South America and (b) a group of trajectories traversing the ITCZ. Median values are marked as green dots.


Fig. 3. Modeled data vs. in situ measurements along FT flight tracks for (a) MR, (b) θ, (c) wind direction and (d) wind velocity for all research flights. The grey dashed lines in (c) and (d) mark the 1:1 relationship. Anomalous FT data from research flight 12 are highlighted in red in the lower row. See Sect. 3 for discussion.

and b for MR and potential temperature (θ) over altitude, respectively. Measurements for all fourteen research flights are illustrated by the dashed black lines, while modeled data are presented in red dashed lines. Comparisons of wind field relations in terms of direction (plot c) and velocity (in d) are also shown and are split into MBL and FT data marked in black and blue dots, respectively. We include MBL data here, although MBL back trajectories are not investigated (see Sect. 2.1), for clarity and significance of these comparisons. All in situ data (averaged to 10 s resolution; see Sect. 3.1) are smoothed for these comparisons utilizing a 31 pt Gaussian filter to account for much lower resolution of modeled data in time and space. Based on PASE flight climb and descent rates (around 300 m min⁻¹), this 31 pt Gaussian filter corresponds to effective smoothing (2σ) over roughly 750 m.
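The arithmetic behind the 750 m figure can be checked with a short sketch. At 300 m min⁻¹ and 10 s sampling, each data point covers 50 m in the vertical, so 750 m corresponds to 15 points. One reading consistent with these numbers, and an assumption here rather than something stated in the text, is a standard deviation of 7.5 points, so that the 31-point window spans roughly ±2σ. The kernel and smoothing functions below are illustrative and are not the filter implementation used for the analysis.

```python
import math


def gaussian_kernel(n_points, sigma_points):
    """Normalized Gaussian weights for an odd-length, centered window."""
    half = n_points // 2
    weights = [math.exp(-0.5 * (k / sigma_points) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(values, kernel):
    """Centered weighted average; the window is truncated and renormalized at the edges."""
    half = len(kernel) // 2
    out = []
    for i in range(len(values)):
        acc = 0.0
        wsum = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(values):
                acc += w * values[j]
                wsum += w
        out.append(acc / wsum)
    return out


CLIMB_RATE_M_PER_MIN = 300.0   # from the text
SAMPLE_INTERVAL_S = 10.0       # from the text
metres_per_sample = CLIMB_RATE_M_PER_MIN / 60.0 * SAMPLE_INTERVAL_S

SIGMA_POINTS = 7.5             # assumption: 31-point window spans about +/- 2 sigma
two_sigma_m = 2 * SIGMA_POINTS * metres_per_sample

kernel = gaussian_kernel(31, SIGMA_POINTS)
```

Calling `smooth(profile, kernel)` on a 10 s resolution profile then yields a series whose vertical resolution is closer to that of the modeled data.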

Fig. 4. Histograms and corresponding CDFs of measured CO (upper row) and CNhot (lower row) as obtained by classifying all back trajectories (data set T, column 1) and in situ observations (data set M, column 2) into two distinct air mass groups. ITCZ and South American air masses are marked in red and black color, respectively. See Sect. 4 for additional information.

Fig. 5. Back trajectories associated with distinct air mass features observed during PASE for (a) convective cloud outflow and (b) pollution. The magenta rectangle identifies the region of heaviest deep convection (ITCZ) during PASE. Trajectories marked in blue pass through the magenta rectangle, while those colored orange bypass the "ITCZ" box.
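The box-based grouping described in the Fig. 5 caption can be sketched as a simple point-in-rectangle test applied to every endpoint of a trajectory. The box bounds below are placeholders, not the coordinates of the magenta rectangle, and the function names are hypothetical.

```python
# Hypothetical bounds for the "ITCZ" box; the actual rectangle is defined in Fig. 5.
ITCZ_BOX = {"lat_min": 5.0, "lat_max": 12.0, "lon_min": -130.0, "lon_max": -100.0}


def passes_through(trajectory, box):
    """True if any (lat, lon) endpoint of the trajectory lies inside the box."""
    return any(
        box["lat_min"] <= lat <= box["lat_max"] and box["lon_min"] <= lon <= box["lon_max"]
        for lat, lon in trajectory
    )


def color_for(trajectory, box):
    """Blue for trajectories crossing the box, orange for those bypassing it."""
    return "blue" if passes_through(trajectory, box) else "orange"


# Two synthetic trajectories: one crossing the box, one staying near the equator.
crossing = [(0.0, -150.0), (6.0, -125.0), (9.0, -110.0)]
bypassing = [(0.0, -150.0), (-1.0, -120.0), (-3.0, -80.0)]
colors = [color_for(t, ITCZ_BOX) for t in (crossing, bypassing)]
```

Testing only the stored endpoints can miss a trajectory that clips a corner of the box between two time steps; with hourly endpoints and a box several degrees wide this is unlikely to matter, but it is a limitation of the sketch.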

5
In this section, we discuss our trajectory-visualization approach for two research flight profiles representative of the generally homogeneous equatorial flow observed during PASE. We will examine its applicability for rapid air mass history analysis under commonly observed research flight conditions by addressing the following questions: Do trajectories exhibiting similar aerosol physiochemistry reveal similar characteristics of recognized sources or processes? Are back trajectories consistent over the temporal extent of a research flight when the flight is conducted such that similar air mass features are encountered multiple times during the flight? Are transition areas between two distinct aerosol regimes represented adequately by trajectories?

5.1 Ascent around 19:00 UTC, 10 August


tion along each path as shown in Fig. 7. Here, back trajectories are color-coded with time beginning along the flight track (dark red), while increased precipitation is expressed in increased marker size. The increasing sizes of blue circles in the lower left corner indicate modeled precipitation intensities of 1, 5, and 10 mm per hour, respectively. The plot reveals one center of heavier precipitation within the ITCZ associated with the outflow layer between 2400 and 3000 m (compare with Figs. 6b, d). Its location near 120° W, 9° N agrees remarkably well with observed deep convection as evident in the underlying GOES IR satellite image from 6 August 2007 23:59 UTC, corresponding to when the trajectories were in this location.
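The encoding described above, with marker size growing with modeled precipitation and color following time along the trajectory, can be sketched with two small mapping functions. The linear size scaling and its parameters (`base`, `gain`) are assumptions for illustration; the text does not state how marker sizes were derived. The 288 h total corresponds to the 12-day trajectories used here.

```python
def marker_size(precip_mm_per_h, base=4.0, gain=6.0):
    """Hypothetical linear mapping from precipitation rate to marker size."""
    return base + gain * max(precip_mm_per_h, 0.0)


def time_fraction(hours_back, total_hours):
    """Position along the trajectory in [0, 1], usable as a colormap index."""
    return min(max(hours_back / total_hours, 0.0), 1.0)


# Legend entries for the three reference intensities named in the text.
legend_sizes = {rate: marker_size(rate) for rate in (1, 5, 10)}

# Color index halfway along a 12-day (288 h) trajectory.
midpoint = time_fraction(144.0, 288.0)
```

A plotting library would take `legend_sizes` for the reference circles and `time_fraction` as the value passed to a continuous colormap.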

Fig. 7. Back trajectories from case study I overlaid on a GOES IR image from 6 August, 23:59 UTC. Trajectories are color-coded with time, while marker size reflects modeled precipitation along the path. The blue circles in the lower left corner exemplify precipitation intensities of 1/5/10 mm.

Fig. 8. CALIOP feature mask for case study I from 29 July, 06:30 UTC, highlighting pollution and cloud features in orange and cyan, respectively. The region of pollution trajectory ends is circled red.

file can characterize layers with different air mass history for this flight.

5.2 Descent around 21:30 UTC, 6 September

(dark red) reveal this layer subsided from 10 km into the study area, this suggests stratospheric air mixed into the troposphere by deep convection indicated via looping of these trajectories near 120° W, 5° N. This is further supported by very low MR values dropping to 1 g/kg (not shown) and O₃ increasing to around 47 ppbv (Fig. 9h, red coloration).


Fig. 10. As Fig. 7 but for case study II. Trajectories are overlaid on a GOES IR image from 28 August, 23:54 UTC.

Illustrations of 12-day trajectories for this profile in Fig. 10 overlaid on a GOES IR image from 28 August 2007 23:59 UTC also strengthen this case. These back trajectories are color-coded with time (light blue color corresponds to record time of IR image), while marker size increases with increasing intensity of precipitation along the path. As in the earlier profile (compare with Fig. 7), we observe consistency of observed deep convection with the modeled center of heaviest precipitation and lifting in the ITCZ near 110° W, 11° N.

the lowermost trajectories (dark blue) arrive near 70° W, 5° N. At this location, marked changes of 2-4 km are shown, corresponding to lifting on the eastern slopes of the Andes.

Fig. 11. As Fig. 8 but for case study II showing a CALIOP mask from 28 August, 06:45 UTC.

Figure 9b shows its location 12 days earlier near Panama (80° W, 9° N). At this location, plot c shows lifting from 3 to 6 km, while plot d illustrates elevated modeled precipitation peaking around 50 mm. Since Fig. 10 suggests the majority of precipitation to occur over the ocean near Panama, this indicates potential scavenging of pollution near the source. This is supported by increased CNvol concentrations around 320 cm⁻³ and also elevated CNhot, CO, and O₃, which are above those found in ITCZ outflow between 2400 and 3200 m. Even so, considering our nominal total error estimate of 2500 km for 12-day back trajectories (Table 2) needed to reliably separate convective cloud outflow from continental pollution, no distinction can be made here between the Amazonian biomass and coastal pollution

www.atmos-meas-tech.net/7

inlet and the used tubing setup measure particles with an efficiency close to 100 % in the submicron range (< 1 µm) and as such provide effective sampling of aerosol properties discussed in Sect. 2.3. All measurements are synchronized for variable transport times between the inlet and instruments.

Table 2. The 95th and 50th percentiles of numerical (columns 3, 4) and total errors (column 6) from 6084 FT back trajectories for various trajectory timescales between 5 and 20 days. Relative numerical errors are provided in parentheses where these are equal to or higher than 1 % of the total trajectory travel distance (column 5).