A new method for long-term source apportionment with time-dependent factor proﬁles and uncertainty assessment using SoFi Pro: application to 1 year of organic aerosol data

. A new methodology for performing long-term source apportionment (SA) using positive matrix factorization (PMF) is presented. The method is implemented within the SoFi Pro software package and uses the multilinear engine (ME-2) as a PMF solver. The technique is applied to a 1-year aerosol chemical speciation monitor (ACSM) dataset from downtown Zurich, Switzerland. The measured organic aerosol mass spectra were analyzed PMF a small (14 d) mer (0.6 µgm − 3 , 12.0 %), slightly higher mean concentrations during spring and fall (1.0 and 1.5 µgm − 3 , or 15.6 % and 18.6 %, respectively), and the highest mean concentrations during winter (1.9 µgm − 3 , 25.0 %). In summer, OOA is separated into SV-OOA and LV-OOA, with mean concentrations of 1.4 µgm − 3 (26.5 %) and 2.2 µgm − 3 (40.3 %), re-spectively. For the remaining seasons the seasonal concentrations of SV-OOA, LV-OOA and OOA range from 0.3 to 1.1 µgm − 3 (3.4 %–15.9 %), from 0.6 to 2.2 µgm − 3 (7.7 %– 33.7 %) and from 0.9 to 3.1 µgm − 3 (13.7 %–39.9 %), respec-tively. The relative PMF errors modeled for this study for HOA, COA, BBOA, LV-OOA, SV-OOA and OOA are on average ± 34%, ± 27%, ± 30%, ± 11%, ± 25% and ± 12%, respectively.


Abstract.
A new methodology for performing long-term source apportionment (SA) using positive matrix factorization (PMF) is presented. The method is implemented within the SoFi Pro software package and uses the multilinear engine (ME-2) as a PMF solver. The technique is applied to a 1-year aerosol chemical speciation monitor (ACSM) dataset from downtown Zurich, Switzerland.
The measured organic aerosol mass spectra were analyzed by PMF using a small (14 d) and rolling PMF window to account for the temporal evolution of the sources. The rotational ambiguity is explored and the uncertainties of the PMF solutions were estimated. Factor-tracer correlations for averaged seasonal results from the rolling window analysis are higher than those retrieved from conventional PMF analyses of individual seasons, highlighting the improved performance of the rolling window algorithm for long-term data.
In this study four to five factors were tested for every PMF window. Factor profiles for primary organic aerosol from traffic (HOA), cooking (COA) and biomass burning (BBOA) were constrained. Secondary organic aerosol was represented by either the combination of semi-volatile and low-volatility organic aerosol (SV-OOA and LV-OOA, re-spectively) or by a single OOA when this separation was not robust. This scheme led to roughly 40 000 PMF runs. Full visual inspection of all these PMF runs is unrealistic and is replaced by predefined user-selected criteria, which allow factor sorting and PMF run acceptance/rejection. The selected criteria for traffic (HOA) and BBOA were the correlation with equivalent black carbon from traffic (eBC tr ) and the explained variation of m/z 60, respectively. COA was assessed by the prominence of a lunchtime concentration peak within the diurnal cycle. SV-OOA and LV-OOA were evaluated based on the fractions of m/z 43 and 44 in their respective factor profiles. Seasonal pre-tests revealed a noncontinuous separation of OOA into SV-OOA and LV-OOA, in particular during the warm seasons. Therefore, a differentiation between four-factor solutions (HOA, COA, BBOA and OOA) and five-factor solutions (HOA, COA, BBOA, SV-OOA and LV-OOA) was also conducted based on the criterion for SV-OOA.

Introduction
Atmospheric aerosols are at the center of scientific and political air quality discussions due to their highly uncertain direct and indirect climate effects (IPCC, 2013) and negative impact on human health (e.g., Peng et al., 2005). Regulatory policies addressing these effects require characterization and understanding of aerosol physicochemical properties, sources and formation processes. During the past years, the study of submicron particulate matter (PM 1 ) has gained interest (Hallquist et al., 2009), in particular the organic fraction comprising 20 %-90 % of the total submicron aerosol mass . Atmospheric aerosols are typically classified as primary or secondary aerosols, where primary aerosols are directly emitted into the atmosphere and secondary aerosols are formed by reaction of precursor gases. Aerodyne aerosol mass spectrometers (AMSs) and aerosol chemical speciation monitors (ACSMs) have become important and widely used instruments for the online chemical characterization of non-refractory submicron aerosol (NR-PM 1 ) (Canagaratna et al., 2007;Ng et al., 2011b;Fröhlich et al., 2013). The resulting aerosol data can be utilized to study seasonal trends of PM 1 sources to support emission reduction strategies. This is highly relevant for very polluted areas like China and India but also for Europe, where particulate matter concentrations substantially decreased during the last 2 decades but still frequently exceed legal thresholds (Barmpadimos et al., 2011(Barmpadimos et al., , 2012European Environment Agency, 2019).
Source apportionment of organic aerosol (OA) measured with an AMS and/or ACSM is typically performed using the positive matrix factorization algorithm (PMF, Paatero and Tapper, 1994). PMF solutions describe the complex, timedependent organic aerosol composition as a linear combination of static factor profiles (for AMS/ACSM data, mass spectra) and their time-dependent contributions. Factors can represent a primary organic aerosol emission (POA) or secondary organic aerosol (SOA).
Many organic source apportionment studies with AMS (see review by Zhang et al., 2011) and ACSM data (e.g., Aurela et al., 2015;Budisulistiorini et al., 2013;Canonaco et al., 2013;Fröhlich et al., 2015;Li et al., 2017;Minguillón et al., 2015;Reyes-Villegas et al., 2016;Ripoll et al., 2015;Schlag et al., 2016;Sun et al., 2013Sun et al., , 2018Tiitta et al., 2014;Zhang et al., 2019;Zhu et al., 2018) have successfully employed the PMF algorithm. PMF results suffer from rotational ambiguity (Paatero et al., 2002); i.e., several PMF results exist with a similar goodness of fit. An approximate method for the quantification of the rotational uncertainty, i.e., the amount of rotational ambiguity (Paatero et al., 2014), can be obtained using the global f peak tool, which allows exploration of a single one-dimensional transect through the multidimensional solution space and is discussed for AMS data in Ulbrich et al. (2009). This approach only leads to a rough estimate of the rotational uncertainty, as it allows investigation of only a single transect whose selection is uncontrollable, while other rotations remain entirely inaccessible. An improved method for both uncertainty estimation and factor resolution was demonstrated by Canonaco et al. (2013), where intelligent exploration of rotations was implemented introducing a priori information in the form of factor profiles in the multilinear engine (ME-2, Paatero, 1999). Moreover, Ulbrich et al. (2009) also estimated the statistical uncertainty via the resampling bootstrap technique (Efron, 1979). This method generates a set of new input matrices for analysis from random resampling of the original input data. This resampling perturbs the input data by including replicates of some points while excluding others, with the main assumption that the overall properties of the analyzed data (fingerprints of the factors, contributions of the factors) are not systematically changed; i.e., changes are purely statistical. If a sufficient number of resamples has been carried out, the variation within the identified factors across all bootstrap runs is regarded as representing their statistical uncertainty.
A crucial limitation of the traditional PMF approach is that the time-dependent variability of the composition of the organic aerosol sources cannot be properly modeled using static profiles in a year-long PMF model. Both POA and SOA may have time-dependent composition. For example, vehicles utilize different fuel blends in winter and summer for traffic (Agrola, 2017), while biomass burning may be dominated by different burning types and/or materials in different seasons, e.g., domestic heating in winter, agricultural waste/residue burning in spring/fall, and wildfires in summer. SOA sources may likewise be affected by seasonal changes in either precursor emissions (e.g., monoterpene emissions increase exponentially with temperature) or physicochemical processes (e.g., gas-particle partitioning, oxidant concentrations) (Hallquist et al., 2009). Amongst others, , Daellenbach et al. (2017) and Sun et al. (2018) showed that ACSM SOA mass spectra possess distinct seasonal trends which need to be considered during the PMF analysis. For Zurich, Stefenelli et al. (2019) and Qi et al. (2019) were able to demonstrate this seasonal variability of SOA characteristics by molecular analysis, with terpenerelated SOA being dominant in summer and aged wood burning organic aerosol being dominant in winter.
Technically, modeling seasonally dependent mass spectra from a given source family, e.g., traffic, biomass burning, or SOA, can be achieved in two ways. PMF can be applied to a multi-season dataset, with time-dependent source composition modeling of a single factor per source or source class, similar to typical representations of SOA in short-term field campaigns by two factors with different degrees of oxygenation . However, multi-factor representations of seasonal changes are likely to significantly increase the complexity of the PMF solution, primarily due to a rapid increase in the number of factors and thus leading to difficulties in interpretation. Another possibility is to perform PMF over a small, moving time frame such that the factor profiles evolve with time while maintaining a single factor per source family. This is likely the best choice for long-term data, due to both the relative simplicity of the solution and important savings in computational and evaluation time. The latter is also more compatible with a continuously growing dataset, e.g., for online source apportionment studies, where the entire dataset does not have to be completely reanalyzed when new data are included, in contrast to classical batch analyses. Parworth et al. (2015) have already shown the effectiveness of such an approach, i.e., employing a small and moving PMF window for analyzing remote long-term ACSM data with only a few unconstrained aerosol sources/components. However, a rotational and statistical uncertainty exploration was not conducted.
This study presents the analysis of ACSM data measured in Zurich between February 2011 and February 2012. The dataset includes several sources that were difficult to separate using unconstrained PMF, which are constrained using known POA sources in ME-2 for a small and rolling time window. This strategy allows us to adequately account for time-dependent variation of the POA and SOA factor profiles. The applied constraining technique allows for a more comprehensive and quantitative assessment of the rotational uncertainty than the global f peak tool could achieve. The statistical uncertainties of PMF solutions are estimated using a bootstrap resampling technique. In this study, the size of the rolling window, tightness of constraints, and several other parameters, e.g., number of PMF repeats per rolling window, are discussed and validated.
2 Instruments and methods

Instrumentation and sampling site
An ACSM (Aerodyne Research, Inc., Billerica, MA, USA) was deployed at the Kaserne station, an urban background station in the city center of Zurich (Switzerland), between February 2011 and February 2012 (Lanz et al., 2007Canonaco et al., 2013). The ACSM is an instrument based on Aerodyne aerosol mass spectrometer (AMS) technology but optimized for long-term measurements with minimal maintenance requirements. The ACSM measures the real-time composition of non-refractory submicron particulate matter, customarily referred to as NR-PM 1 . The instrument is described in detail in Ng et al. (2011b) (see also Jayne et al., 2000;Jimenez et al., 2003;Allan et al., 2003Allan et al., , 2004and Canagaratna et al., 2007, for a more detailed description of the AMS technique). Technical problems in the ACSM inlet system during the last third of the campaign resulted in a total of 2-3 weeks of missing data.
The ACSM in Zurich was operated with a scan rate of 1 s amu −1 between m/z 10 and 140 and produced averaged scans every 15 min. The data were re-averaged to 30 min to obtain higher signal-to-noise ratios for ME-2 analysis. To obtain quantitative mass concentrations for ACSM data, a collection efficiency parameter (CE) needs to be applied to account for the incomplete detection of aerosol species due to particle bounce at the instrument vaporizer (Middlebrook et al., 2012). The effects of the nitrate mass fraction and particle acidity on CE have been parameterized for ambient data (Middlebrook et al., 2012). As discussed previously , CE = 1 for the current study is assumed because of otherwise systematic overestimation compared to the PM 10 measurements by a tapered oscillating microbalance (TEOM, FDMS 8500, Thermo Scientific) calibrated by gravimetric measurements of offline PM 10 filters.
The meteorological data (temperature, relative humidity, solar radiation, precipitation, wind speed and wind direction) and trace gases (CO, NO x , O 3 , total hydrocarbons) were measured by the Swiss National Air Pollution Monitoring Network, NABEL (Empa, 2010). Equivalent black carbon (eBC) was measured with an Aethalometer AE 31 (Magee Scientific Inc., Berkeley, CA, USA). The data were corrected for loading effects and multiple scattering using the method of Weingartner et al. (2003). Mass absorption cross sections as determined by Herich et al. (2011) were used to convert b abs (λ = 880 nm) to eBC. The measured absorption coefficients at wavelengths 470 and 880 nm using the alpha values based on Zotter et al. (2017) were used to estimate the contributions to eBC from traffic (eBC tr ) and biomass burning (eBC wb ).
Seasonal PMF runs performed on the ACSM data in earlier studies  showed three primary OA factors and one to two secondary OA factors contributing throughout the measurement year. Among the primary OA factors a traffic-related hydrocarbon-like organic aerosol (HOA) factor was found, which correlated with NO x and eBC tr , as well as a biomass burning organic aerosol (BBOA) factor, which correlated with eBC wb , as also shown in other studies (Lanz et al., 2007Ulbrich et al., 2009;Zhang et al., 2011;Canonaco et al., 2013). Given that in summer the daily values of m/z 60 were always higher than the threshold for biomass burning impact identified in Cubison et al. (2011), BBOA was also modeled during the warm seasons. The third primary OA factor was assigned to cooking-related organic aerosol (COA) and exhibited enhanced concentrations during mealtimes, similarly to previous studies (Allan et al., 2010;He et al., 2010;Slowik et al., 2010;Sun et al., 2011;Mohr et al., 2012;Crippa et al., 2013;Elser et al., 2016). For warm days during the first winter and in spring, summer and fall the variability of the bulk OOA (oxygenated organic aerosol) was captured by two distinct factors, i.e., SV-OOA (semi-volatile oxygenated organic aerosol) and LV-OOA (low-volatility oxygenated organic aerosol). For the remaining colder period only one OOA factor accounted for the variation of the bulk OOA.

The multilinear engine (ME-2)
ME-2 (Paatero, 1999) is a powerful engine for solving the positive matrix factorization algorithm (PMF, Paatero and Tapper, 1994). Model configuration and post-analysis are performed by Source Finder (SoFi Pro 6.8, Datalystica Ltd., Villigen, Switzerland) within the Igor Pro software environment (Wavemetrics, Inc., Portland, OR, USA) as described in Canonaco et al. (2013). In its bilinear mode, PMF describes the measured data matrix X as a product of two matrices, G and F, and the residual matrix E. In element notation the equation is (1) In the measured matrix X the columns j are the m/zs, and each row i represents a single mass spectrum. p is defined as the number of factors of the selected model solution, i.e., the number of columns of G and the number of rows of F. Each column of the matrix G represents the time series of a factor, whereas each row of F represents the factor profile (i.e., mass spectrum); both are indexed by k. In an unconstrained PMF run in ME-2, the model is initialized with random entries in G and F ("seed") and the quantity Q is minimized with respect to all model variables by means of the conjugate gradient algorithm (Paatero, 1999): where e ij are the elements of the residual matrix E and σ ij represents the measurement uncertainty for the input point x ij . To compare Q values from various PMF runs with a different size and/or number of factors, Q is normally scaled by the remaining degrees of freedom (Q exp , which depends on the size of the input data and the number of chosen factors): (3) PMF is subject to rotational ambiguity, in which different combinations of G and F yield similar Q values. Some of these combinations may contain mixed factors and/or environmentally unreasonable descriptions of the data. Previous work has shown that constraining expected factor profiles using the a-value approach for AMS/ACSM data is an efficient method for isolating the set of environmentally interpretable PMF runs Canonaco et al., 2013;Crippa et al., 2014). The a value determines the extent to which the m/z in the mass spectral profile, also referred to as anchor (f kj ), is allowed to vary during the model iteration according to The index j stands for the actual variable (m/z) of the kth factor, and the a value is its scalar product. For example, an a value of 0.1 allows for a variability of ±10 % during the iterative process. This implies that some variables might increase and some might decrease within this limit. Note that after renormalizing the solution, the extent to which the constrained values changed might be slightly outside the defined a-value range. For example, consider a case where the a value is set to 0.1 for all variables of a factor profile. The values of all variables but one could decrease by 10 %, while the value of this single variable might increase by 10 % during the iteration. After renormalizing the factor profile outside ME-2 by, e.g., the sum of the profile, the intensity of this single variable will exceed the boundaries set with the a values during the PMF iteration. Moreover, note that the avalue approach defines only the boundaries of a solution and does not imply any weighting within these boundaries.

PMF input preparation step
The organic data and error matrices  are computed using the ACSM local tool version 1.5.3.2 (Aerodyne Research, Inc., Billerica, MA, USA) in Igor Pro. Weak (signal-to-noise ratio between 2 and 0.2) and bad (signal-tonoise ratio below 0.2) variables were downweighted according to the recommendations in Paatero and Hopke (2003). The m/z 16, 17, 18 and 28 variables that are replicates of the variability of m/z 44 were removed for the PMF calculation and recalculated a posteriori as a function of the m/z 44 contribution attributed to each factor profile (Elser et al., 2016). This approach is preferable to downweighting (Ulbrich et al., 2009), as it maintains a direct mathematical relationship between m/z 44 and its dependent variables, which can otherwise be distorted by dynamic weighting of outliers in the PMF robust mode.

New rolling method using ME-2
The new method consists in performing PMF runs on a small and moving window that is translated across the entire dataset. At each step, many individual PMF runs are performed, and the resulting runs are accepted or rejected according to predefined criteria. The window is then moved to the next position, with the distance between window positions being significantly smaller than the window size itself. The set of all accepted PMF runs determines the final source apportionment solution and is also used to assess model uncertainties.
The novelty of this method compared to Parworth et al. (2015) lies in the application of ME-2 for enhanced control of the matrix rotations and in the automated application of user-defined criteria to determine the set of accepted PMF runs. Moving properties of the window (window runs) are discussed in Sect. 2.3.1, whereas the main settings of PMF within a window (PMF runs) are described in Sect. 2.3.2.

The rolling strategy
PMF analysis is conducted on a subset of data defined by a small window that is moved in 1 d increments across the entire dataset and as such allows capturing of seasonal variations of the factor profiles. Note that rolling windows containing less than 10 % of real data are automatically skipped by the rolling algorithm. This avoids performance of PMF runs over large gaps due to, e.g., calibrations or instrument failures. The window size (s win ) is a free parameter that requires optimization. The rolling window PMF analysis of Parworth et al. (2015) utilized a 2-week window, arguing that this length is representative of the average lifecycle of aerosols in the atmosphere. Even for (low-time-resolution) ACSM data, 2 weeks have been shown to provide enough temporal variability to distinguish sources with similar factor profiles such as HOA and COA (Fröhlich et al., 2015). In the present study, likewise a 14 d window is selected after additionally assessing the performance of 3, 7, 21, and 28 d windows.
The model performance in response to s win is assessed by monitoring the value of Q/Q exp (which decreases as the mathematical goodness of fit improves) and the number of non-modeled time points (n non-modeled ) as a percentage of the total number of measurements. n non-modeled is defined as any ACSM time point for which the user-defined criteria (see Sect. 2.3.3 and 2.3.4) are not met for any PMF runs that include this measurement (note that for most points this will include PMF runs from multiple overlapping windows). Figure 1a shows Q/Q exp and n non-modeled as a function of s win . The Q/Q exp values are minimized for a 7 d window and are approximately 15 % higher for the 3 and 14 d windows and 45 % higher for the 21 and 28 d windows. n non-modeled shows a minimum for 14 d with a slight increase for larger windows and a steep increase for smaller s win .
A 14 d window has been chosen for the current dataset, as this avoids significant increases in Q/Q exp without inducing unacceptably high n non-modeled . Moreover, because the 1 d step of the rolling window is smaller than the 14 d width, each time point is included in 14 different window runs (except for those within the first or last 14 d of the dataset). As discussed later, these repeats aid the uncertainty analysis.

Window settings
The rolling strategy described above defines a new window after every window shift. Within this new window, a PMF run, referred to as repeat in the text, is generated via ME-2, which initializes new seeds, a values, and bootstrap resampling of the PMF input. The seed initializes all model entries in G and F, and unconstrained information therein is randomly initialized. Additionally, a priori information on the factors from the seasonal pre-tests is used to confine the solution space and thus to decrease the rotational ambiguity of the solution.
In the current study, constraints are applied only to profiles of the POA factors, namely traffic (HOA), cooking (COA) and biomass burning (BBOA). The HOA and COA profiles are taken from Crippa et al. (2013), while BBOA is the averaged mass spectrum reported by Ng et al. (2011a). These anchor profiles were also successfully used for the seasonal analysis of the Zurich-Kaserne data .
Every constrained factor profile applied in a PMF run requires a sensitivity analysis of the a value to identify the range of reasonable solutions Crippa et al., 2014;Elser et al., 2016). Typically, variation of the a value of one or more constrained factor profile(s) allows exploration of a region in the solution space that includes environmentally reasonable solutions. In the present analysis, the goal is to consider all PMF runs (not just the best one) that are mathematically and environmentally reasonable. Recent studies have systematically investigated the entire solution space allowed by the a values, e.g., by conducting PMF runs covering every combination of a values over the range 0 to 1 with a step of 0.1 (Elser et al., 2016;Bozzetti et al., 2017;Daellenbach et al., 2017). However, this approach is not computationally practical for moving window analysis. For instance, given that three factors are constrained in this present study, the above a-value exploration strategy would require 11 3 = 1331 PMF runs for a-value exploration per window run. Also, each combination of a values would require a minimum of 100 PMF runs for bootstrap analysis . Furthermore, the seasonal pre-tests indicated that both four-and five-factor solutions should be assessed (corresponding to one or two OOA factors). In total, this would require 1331 × 100 × 2 ∼ 2.66 × 10 5 PMF runs per window. Moreover, the daily shift of the rolling window will initialize the window runs 351 times (1 year minus the s win ), resulting in 1331×100×2×351 ∼ 9.35×10 7 PMF runs for a system-  Table 1. Overview of the rolling mechanism and the repeats of the PMF analysis.
Rolling mechanism: -a 14 d time window is defined -window is shifted by 1 d over the entire dataset PMF analysis: -for each window a four-and five-factor (HOA, COA, BBOA and one up to two OOAs) PMF run is performed, where HOA, COA and BBOA are constrained within the a-value approach.
-PMF runs are initialized 50 times from random starting points for the unconstrained information in G and F (seeds). The a values for the constrained factor profiles are randomly and independently varied from a = 0 to a = 0.4 with a resolution of a = 0.1 (a-value exploration). In each run the PMF input is resampled within the bootstrap method. atic analysis. This will require several months of computation even on modern PCs with multicore processors. To overcome these computational issues, two strategies were considered for reducing the number of runs required for a-value exploration. In both cases, a systematic exploration of the a-value space is replaced by randomly generated a values between zero and an upper limit (a max ). For the first strategy, the a max limit was fixed at one, and the number of repeats (x PMF ) was adjusted until the same criteria described above for s win optimization were satisfactory. However, this approach was rejected, as executing the full set of PMF runs required computational times on the order of months (see Part A of the Supplement) and therefore was impractical on regular PCs. The second strategy, which is used here, exploits the a priori information of the sources. If some factor profiles are known to be present and their source profiles are known to some extent, there is no need to explore regions in the solution space, for which these factor profiles may drastically depart from their realistic anchors.
Therefore, a max undergoes a systematic scan from zero upwards, with model performance assessed by Q/Q exp and n non-modeled , as described above for the s win estimation. The current strategy counts as a local-minimum algorithm, as the full parameter space (s win , a max , x PMF ) is not fully investigated. Moreover, pre-tests based on literature data, i.e., a 14 d PMF window for s win (Parworth et al., 2015) and an upper a value of 0.3 a max (Crippa et al., 2014), represented the starting condition for the parameter optimization discussed in Fig. 1. Figure 1b shows an almost flat Q/Q exp , while that of the n non-modeled behaves as a quadratic function with a minimum at a = 0.4. For a values below 0.4 the constrained fingerprints cannot optimally adapt to the current input. Given only 50 random a-value explorations out of 1331 (see above) of the entire a-value space for every PMF window, outcomes for higher a max may be purely stochastic and lead to a high degree of mixing and consequently rejection of the PMF runs (high n non-modeled ). a = 0.4 represents the optimum a max and is set as a free parameter for the a-value exploration.
The random resampling of the PMF input uses the bootstrap approach for every repeat. A window comprising 14 d with at most 48 (number of scans per day) × 14 (d) = 672 time points will create resamples containing again 672 new time points, where some time points may occur multiple times and others may be absent. As above, Q/Q exp and the percentage of n non-modeled are monitored as a function of the x PMF . Figure 1c reveals a constant Q/Q exp , whereas the number of n non-modeled decreases and stabilizes from 50 repeats onwards. We conclude that 50 repeats per window are sufficiently high for the bootstrap strategy. Note that the final number of PMF runs per time point may be higher than x PMF due to the overlapping PMF runs resulting from the rolling strategy. The total number of PMF runs for this study equals 50 (x PMF ) × 351 (number of days) × 2 (four-and five-factor) = 35 100 runs and required approximately 3 d on a modern multicore PC.

The post-PMF analysis
Manual inspection of all generated PMF runs is impractical and is replaced by an automated procedure based on pre-defined user criteria that (1) identifies and sorts unconstrained factors and (2) determines whether each PMF run should be accepted or discarded. Examples of user-defined criteria could include the factor correlation with an external tracer in terms of either the overall time series or diurnal pattern or characteristic temporal features, e.g., a prominent lunch peak for a cooking factor. Modeled PMF factors for which no factor criteria are satisfied, i.e., very poor score values due to factor mixing/swapping or sampling of transient sources not accounted for, typically yield n non-modeled .
In addition to determining whether an individual PMF run should be accepted or rejected, the criteria are used to determine the identity of unconstrained factors. While the positions of constrained factors within the F and G matrices are pre-defined for constrained factors, the same is not true of unconstrained factors, and these must be correctly identified prior to further data analysis. Consequently, all possible combinations for sorting unconstrained factor positions are evaluated (factor identification) and their scores combined together. As criteria with various score ranges are potentially possible, e.g., correlation coefficient, lunch peak ratio, the explained variation (EV; see Eq. 5) of m/z 60 and variable fractions, these score values must be corrected before being added up. z-score transformation as a linear correction is applied, where at the end the score distribution of each criterion possesses a mean value of 0 and a standard deviation of 1. Finally, the z-score transformed combination with the highest values is chosen to represent the PMF result for a specific PMF run. This is essential in the case of the two unconstrained factors SV-OOA and LV-OOA in this study. Note that this requires criteria to be defined for a minimum of all factors but one (i.e., p − 1 factors).
Considering the large amount of PMF runs by the rolling window algorithm, the main advantage of this criteria-based inspection is that the complexities of a factor profile and time series are reduced to single values ("scores"). Based on the score plots, potentially promising PMF runs can be further investigated and validated. This significantly improves the efficiency of PMF analysis by discarding PMF runs where the score for any criterion falls below the user-defined thresh-old ("bad PMF runs"). In contrast to conventional analyses, where a single PMF run often represents an optimal description of the dataset, the entire set of PMF runs classified as environmentally reasonable is used for the analysis and presentation. This provides a more comprehensive and robust representation of the dataset and supports uncertainty assessment.
To determine whether an individual PMF run is accepted or rejected, acceptance thresholds are defined for each of the selected criteria. These thresholds are free parameters and must be defined for each criterion separately. A threshold is inferred either from previous studies or from significance tests or similar statistical analyses (see the discussion for the HOA and COA thresholds in Sect. 2.3.4 for such an example).
The computational time required for criteria application subsequent averaging is typically on the order of minutes to hours with a modern multicore PC, depending on the number of accepted PMF runs. Thereafter, the results can be inspected in real time, allowing the user to efficiently investigate the set of PMF runs and, if needed, test various criteria.

Chosen criteria in this study
In this study one criterion per factor was defined, although it is possible to apply multiple criteria to the same factor, as each criterion is assessed individually on an accept/reject basis. Figure 2 shows the criterion scores calculated for each PMF run, with each plot representing an individual factor. The grey points show the score values for all PMF runs, the blue points denote PMF runs where criterion thresholds are satisfied, and the green points represent PMF runs where criterion thresholds for all criteria are simultaneously fulfilled. These green points are then used to compute the final PMF solution. The criteria and their corresponding thresholds applied for each criterion (blue points in Fig. 2) are also reported in Table 2 (first value).
In the current study, the thresholds for the criteria of HOA and COA were determined based on statistical analyses with the help of the results from conventional (no rolling technique) seasonal PMF from previous studies . The contributions of HOA and its tracer eBC tr were bootstrapped together and the correlation coefficient (R Pearson ) was evaluated each time, leading to a distribution for R Pearson . Similarly, the time series of COA was bootstrapped and the lunch peak enhancement in COA evaluated each time (COA 11+12+13 h /COA 9+10+14+15 h ), leading to a distribution for the lunch peak concentration. Finally, the 10th percentile value was chosen as the threshold score value. These seasonal thresholds are also visible as steps in the score plots (blue points in Fig. 2a and b, respectively) and are also reported in Table 2 (second value in brackets). For spring 2011, summer 2011 and winter 2012, however, the resulting thresholds for HOA either caused too many miss- Figure 2. PMF runs sorted based on the scores (grey points), PMF runs fulfilling the criterion thresholds (blue points) and PMF runs fulfilling criterion thresholds in all criteria (green points). The five criteria are (a) diurnal correlation between HOA and eBCtr (seasonal thresholds from statistical analysis), (b) relative lunch peak for COA (seasonal thresholds from statistical analysis), (c) explained variation of m/z 60 for BBOA, (d) f 44 in the LV-OOA profile and (e) f 43 in the SV-OOA profile, respectively. Note that (e) contains three episodes with zero points, which represent four-factor solutions automatically selected by the algorithm, where no five-factor solution was manually selected (and the SV-OOA criterion is thus irrelevant).
ing time points (R Pearson = 0.8) or had rather non-significant correlation coefficients (R Pearson = 0.2, with a p value of 0.4, n = 24 as for the other seasons). Hence, these thresholds were systematically lowered for spring 2011 and increased for winter 2012 to achieve the highest possible correlation coefficient with maximal data coverage, i.e., the same n non-modeled when considering all PMF runs for these periods in these criteria. NO x is a typical tracer for HOA in urban areas. However, due to incomplete NO x measurement coverage in this campaign (especially during spring and fall), eBC tr is used as a traffic tracer and the R Pearson correlation coefficient is computed between the diurnal cycle of eBC tr and the HOA factor.
As is frequently the case, no chemical tracers for COA were available in this study. Previous measurements in Zurich  have demonstrated a strong diurnal pattern for COA, with an increased concentration during lunchtime. As a proxy for COA, the lunchtime COA enhancement is monitored ( Table 2).
The wood burning contribution to black carbon (eBC wb ) as determined by the eBC source apportionment (eBC-SA) method of Sandradewi et al. (2008) was considered as a Table 2. Criteria scheme employed in this study. The first value represents the applied threshold for the final PMF solution and the values in brackets for HOA and COA stand for the threshold value coming from the seasonal resampling analysis. f 44 for LV-OOA is used for factor sorting rather than as an acceptance/rejection threshold.

Factor
Criteria possible criterion for BBOA but then rejected. The eBC-SA analysis applies to air masses highly influenced by biomass burning and has been validated for winter data only. Uncertainties in eBC wb during warm seasons, when the biomass burning contribution is small, have been shown to be quite high (Harrison et al., 2013). Therefore, it was decided to use another metric for BBOA, exploiting the key spectral feature at m/z 60. For BBOA the explained variation (EV) (Paatero, 2010) for m/z 60 is monitored as follows: This threshold is chosen following the recommendation in Paatero (2010), where a variable modeled by its mean explains already ∼ 25 % of the variation. If the measured variability of a variable is explained by a specific factor, that factor must capture more than the mean value of the variable, and hence Paatero (2010) recommended 30 %-35 % as a minimum EV. However, using 30 % or 35 % as a threshold resulted in several weeks of non-modeled time points, in particular for spring and fall 2011. An a value of 25 % resulted in a reasonable compromise between EV and the number of non-modeled time points. Note that this approach requires the assumption that m/z 60 should be predominantly explained by BBOA, which is likely true when the fraction of the OA signal occurring at m/z 60 (f 60) is relatively high. However, for measurements where f 60 is low, m/z 60 is more likely to also have contributions from other sources. A rough guideline for utilizing this criterion is a threshold for biomass burning influence of f 60 = 0.003 as identified by Cubison et al. (2011). In the current dataset, ∼ 85 % of all measured time points exceeded this threshold. Every measured day was observed to comprise at least some time points (in winter, spring and fall almost all points but in summer mostly evening points) above this threshold, suggesting that the criterion is valid throughout the dataset.  Table 2) all score values are allowed here, whereas for SV-OOA (Fig. 2e, Table 2) the PMF runs meeting the thresholds for the five-factor solutions are selected. This threshold corresponds to the point where n non-modeled is minimal with respect to this criterion; i.e., considering all PMF runs in this criterion leads to the same n non-modeled at the highest possible f 43 for SV-OOA.
The criterion of SV-OOA is further used to differentiate between four-and five-factor solutions on the window runs. For the PMF windows where no five-factor solution with SV-OOA is selected, the set of four-factor solutions in the corresponding PMF window is automatically selected (green points at zero in Fig. 2e). Finally, the averaging procedure also controls and prevents that four-and five-factor solutions are simultaneously considered for the averaging of single time points by privileging five-factor solutions; i.e., any time point containing accepted PMF runs with both four-and fivefactor solutions retains only the five-factor solution.

Brief statistical analysis of the rolling result
The amount of n non-modeled resulting from the criteria and thresholds reported in Table 2 yields 99.31 % data coverage, corresponding to a total of only 3 non-modeled days. Overall, the selected criteria resulted in 1970 accepted PMF runs (∼ 5.6 % out of the 35 100 PMF runs). The Q/Q exp has an average value of 4.4 and a median of 4.8, and the first and third quartiles are 3.7 and 5.5, respectively. These values are reasonable, given that many previously conducted AMS studies reported values between 1 and 10 . On average, each data point has 43 replicates (median = 24, first and third quartiles 9 and 60, respectively), 932 F. Canonaco et al.: Long-term source apportionment with time-dependent factor profiles using SoFi Pro which are used to assess the statistical uncertainty of the PMF solution as discussed in Sect. 3.5.

Factor time series
3.2.1 Overview Figure 3a shows the time series of each factor for the entire dataset as a mean, averaged over all accepted PMF runs. The data from Fig. 3a are re-averaged to monthly and seasonal means and shown in Fig. 3b and c, respectively. For Fig. 3c, seasons are defined as follows: winter is December-February, spring is March-May, summer is June-August, and fall is September-November.
The time series of the primary OA factors HOA, COA and to some extent BBOA are rather spiky (Fig. 3a), underlining a strong influence of local sources. The COA spikes that are present from May 2011 through the end of September 2011 are likely due to local barbecuing events during the evening, as also observed in an earlier study at this site (Lanz et al., 2007). The highest COA concentrations are observed in early July 2011, where the NR-PM 1 mass concentrations reached 70 µg m −3 , and correspond to three consecutive evenings/nights of a yearly Latin American dance and grill festival (Caliente). During this festival, the courtyard containing the measurement site was filled with food and grill stands, explaining the dominant contribution of COA. Throughout the summer and spring and less frequently in fall/winter SV-OOA was modeled in addition to LV-OOA. This warm period was characterized by high daily temperatures and induced on the one hand variability in the condensed OOA allowing for separation of SV-OOA and LV-OOA and on the other hand increased emissions of biogenic SV-OOA precursors . Figure 4 summarizes the weekday (left) and weekend (right) daily cycles for the modeled factors. The daily cycle of HOA follows the averaged daily cycles of the estimated traffic of eBC (eBC tr ) and of NO x . The same is true for the daily cycle of BBOA following that of the biomass burning of eBC (eBC wb ). HOA, eBC tr and NO x exhibit a clear rush-hour peak on weekdays and none on the weekend. During the weekdays, a small lunch peak is visible for COA, underlying the meal activity during the working days and the presence of many restaurants in this area. There are no evident differences between the weekday and weekend daily cycles of LV-OOA, SV-OOA and OOA. LV-OOA and OOA show rather flat daily cycles, similarly to their inorganic aerosol tracers SO 2− 4 and NH + 4 , respectively. This is in line with their most-likely regional background, as already suggested earlier . Only the concentration of SV-OOA tends to decrease during the afternoon, suggesting its volatile nature, similarly to its inorganic aerosol tracer NO − 3 . The weekly cycle for HOA, COA, BBOA and the OOAs, including their tracers eBC tr , NO x , eBC wb , SO 2− 4 , NO − 3 and NH + 4 , respectively, are reported in Supplement B. Apart from OOA, the weekly cycles for HOA, BBOA, SV-OOA and LV-OOA are in good agreement with their tracers.

Comparison with external data
The analysis and further validation of the PMF runs using the criteria-based selection are performed on the PMF results of the rolling windows, and therefore correlations are performed over 14 d in this study. The performance of the rolling strategy can then be verified by the factor-tracer correlation, e.g., on average over the seasons (Table 3). Moreover, the same factor-to-tracer correlations are also evaluated for the seasonal pre-tests (PMF runs over the seasons with no rolling strategy) and are reported in brackets in Table 3. NO x data are available only in winter and fall 2011. Both NO x and eBC tr are correlated with HOA over the full year and within individual seasons. The correlation values with NO x , are lower compared to those found in Canonaco et al. (2013). However, in Canonaco et al. (2013) the data covered mostly the two winters including some parts of spring and fall. For the latter two seasons NO x data were not properly validated and was consequently removed from further analysis (no NO x data are available for spring and summer). Moreover, in Canonaco et al. (2013) the model validation was strongly based on the first winter period, and when performing the correlation between HOA and NO x data for that period only, the correlations were similar also in the current study (not shown in the table). BBOA shows substantial correlation with eBC wb in fall and winter, as also found in Canonaco et al. (2013), while the correlation is low in spring and very low in summer. These low correlations are expected, since the determination of eBC wb is highly uncertain when the eBC wb /eBC traffic ratio is low. Wood burning source apportionment of eBC data, as already stressed above, is not suited under warm conditions with low biomass burning contributions. However, the correlation is good over the full year, as the problematic data yield eBC wb concentrations near zero anyway, and the correlation is thus driven by the data with high signal-to-noise ratios.
High correlations between LV-OOA and SO 2− 4 are seen over the year as well as for spring and fall, whereas they are lower in summer, as shown in Table 3, in contrast to Lanz et al. (2007) (R Pearson = 0.5 between LV-OOA and SO 2− 4 during a summer AMS campaign). The correlation between SV-OOA and NO − 3 is higher for winter 2011 and summer but lower in spring and fall. This is understandable, as the spring and fall represent the transition between modeling SV-OOA and LV-OOA (summer) compared to one OOA only (winter). The correlation between SV-OOA and LV-OOA for winter 2012 is not shown due to the low number of time points for which both OOAs were modeled. OOA correlates well  Table 3. Correlation coefficients (R 2 Pearson ) with a significance level of p ≤ 0.01 between the factor contribution and expected tracers over the year and the meteorological seasons as defined above. The first value describes the correlation for the rolling result, whereas the value in brackets is for the seasonal PMF result (no rolling).

Factor
Year with NH + 4 throughout the year in accordance with summer and winter data reported previously (Lanz et al., 2007Canonaco et al., 2013). In contrast to the OOAs, few differences are observed for BBOA, HOA, or COA between the two winters. This supports the conclusion that the different OOA behavior in these two winters reflects actual meteorological and chemical differences rather than mixing and/or splitting between the POA and SOA factors.
Importantly, the rolling results show generally higher correlations with the external tracers than do the conventional seasonal PMF runs (values in brackets in Table 3). This demonstrates that the rolling approach generally outperforms the conventional seasonal PMF analysis.

Time-dependent factor profiles
The mean factor profiles of the six modeled sources/components over the entire year are presented in Fig. 5. Error bars show 1 standard deviation of profile variability across the entire measurement year. Note that this variability comprises both the time-dependent variation of the factor profiles and the PMF error (see Sect. 3.5 for more details on the discussion of the errors in this study).
A better understanding of the temporal variation of the factor profiles is gained when inspecting them over time. Figure  6 shows the fractional contributions of m/z 41, 43, 44, 55, 57 and 60 to each factor profile as a function of time. Each variable is normalized by its mean contribution. In general, the variation of the fractions for the primary OA factors (HOA, COA and BBOA) seems small compared to the variability of the oxygenated factors (LV-OOA, SV-OOA and OOA). The primary OA factors show low profile variability with almost no seasonal pattern. Note that minimum and maximum values of these variables for the primary OA factors (less pronounced for HOA and COA) reach ∼ 0.6 and 1.4, respectively, i.e., the boundaries given by a max . The 75th percentiles of the a values for HOA, COA and BBOA touches a max less than 0.9 % of the time and the 90th percentile hits a max 34 %, 24 % and 73 % of the time (see Supplement D Fig. S5). This suggests that the factor profiles are not limited by the constraining technique, but rather by the employed scheme of criteria. Allowing for higher a max and loosening the criteria threshold would most likely increase the variability in these ions but would also lead to mixed and environmentally unreasonable solutions. This is different for the oxygenated factors. LV-OOA, SV-OOA and OOA for example contain high m/z 60 for the colder season, likely indicating significant impact of biomass burning Heringa et al., 2011;Qi et al., 2019). In addition, m/z 57 shows a strong seasonal pattern, i.e., high in winter and low during summer for SV-OOA and LV-OOA. Strong peaks are also observed for m/z 43 in LV-OOA during summer. This is due to less oxygenated bulk LV-OOA compared to the winter in Zurich, when LV-OOA or OOA represent more oxygenated aerosol with higher m/z 44 and lower m/z 43, as already noted in Canonaco et al. (2015). SV-OOA also contains a very strong increase in m/z 55 during the Caliente episode. Most likely one COA factor alone is insufficient to capture all the variability of m/z 55. As a consequence, PMF uses an additional factor for modeling the variability of m/z 55; here, SV-OOA which may contain some characteristics of cooking SOA, as the latter has been shown to have non-negligible contribution at m/z 55 as well (Klein et al., 2016). Further evidence comes from Fig. 6e (and also Fig. S4 in the Supplement), where m/z 55 and m/z 43 peak around Caliente in SV-OOA and LV-OOA, respectively. Moreover, m/z 44 drops in LV-OOA. This implies that SV-OOA has some characteristics of cooking, while LV-OOA becomes more SV-OOA-like during Caliente. The period of influence of these peaks lasts until 8-10 d before and after Caliente, most likely as it is incorporated during the window runs 14 d before and after Caliente.
The time-dependent mass spectral matrix of the factors can be found in the Supplement Section C, although a detailed analysis is beyond the scope of the current study. When employing this type of analysis, future studies should investigate in more detail changes in the variables in the factor profiles. This information might provide new insights on seasonal or source-specific markers, essential for source apportionment analyses. Figure 7a and b show the scaled residuals as functions of m/z and time, respectively. The scaled residuals do not reveal any systematic over-or under-estimation. The data scatter around zero with the interquartile range almost always between ±3 throughout the entire year, evidencing the good quality of the PMF solution on average (±3 is the reasonable range for scaled residuals defined in Paatero and Hopke, 2003). The highest residuals occur during the Caliente festival (beginning of July), as shown by the dark red spike (interquartile range) in the time-series plot (Fig. 7b), when the PMF solution is strongly influenced by extremely local and short-term cooking and biomass burning sources that are not fully captured by the retrieved COA and BBOA factors.

Residual analysis
This results in a change in the factor profiles of COA and BBOA and SV-OOA (as already stressed in Sect. 3.3). However, the COA, BBOA and SV-OOA profiles roughly 8-10 d before and after Caliente are again consistent with those retrieved during the rest of the season, i.e., the unique fingerprint during the Caliente episode does not strongly influence the solution of the PMF windows around Caliente. A few other episodes in spring (May) and at the end of the summer (September) reach also higher scaled residuals. In the current dataset, these likely indicate PMF runs that have not fully captured profile responses to rapid meteorological changes (colder to warmer season and vice versa). This happens on a shorter timescale than the chosen PMF window and as a consequence cannot be fully captured by the 14 d PMF windows, causing PMF solutions with mixed factor profiles and higher scaled residuals. Note that during the last third of the measurement the scaled residual distribution tends to be broader. This is due to technical problems on the ACSM inlet system mainly related to the filter valve clogging, causing noisier signals and consequently noisier PMF results for the valve switching system employed at that time. This condition is not accounted for by the ACSM error model and increases the scaled residuals.

Uncertainty of the PMF solution
Within this study, each PMF run combines a random selection of a values for the three constrained POA factors with random (time-based) resampling of the input matrix. PMF runs satisfying the acceptance criteria are retained for the fi- nal result, leading to several repeats for each time point i. The variability among these repeats at each i can be used to infer the rotational and statistical uncertainty. These two types of uncertainties are discussed below and are collectively referred to as PMF error within this study. Additional contributions to the overall uncertainty of this analysis that are not assessed here include anchor profile selection as well as the error related to the criteria construction, such as the type of criterion (correlation, diurnal, profile characteristics, etc.), tracer selection, and its related threshold selection. The proposed relative PMF error in percentage in this study is given by the following formula: where σ is the standard deviation and avg is the mean value of all replicates of a time point i. The probability density function (pdf) of PMF error for each time point i( σ avg ) i is reported in Fig. 8. The relative PMF errors are given by the center of the lognormal fit (x 0 ) as visualized in Fig. 8 and are for HOA, COA, BBOA, LV-OOA, SV-OOA and OOA ±34 %, ±27 %, ±30 %, ±11 %, ±25 % and ±12 %, respectively.
The data reported in Fig. 8 were first log-transformed, as the untransformed distribution was skewed to the right, mostly due to time points with low signal-to-noise ratio that would have had a stronger impact on the final error calculation using an untransformed, i.e., linear representation.

Recommendations and current limitations
The techniques described in this study are relevant for longterm source apportionment (SA) studies, in particular for ACSM data. The stability of the primary profiles (HOA, COA and BBOA) suggests that they are rather independent of the season and that employing primary OA factors coming from other SA studies (here profiles from an AMS SA in Paris conducted years earlier) using, e.g., the a-value constraints works even for long-term SA. However, this outcome is not completely independent, as it results from the defined a max as well as the applied scheme of criteria with their corresponding criteria thresholds. Increasing these thresholds would most likely increase the variation in the POA factor profiles but would also favour more mixing between these factors. Significant seasonal changes in factor profiles were found for SV-OOA and LV-OOA. Hence, the rolling mechanism is essential when accurately apportioning the oxygenated organic aerosol fraction. The use of a 14 d window, as already proposed by two former studies (Fröhlich et al., 2015;Parworth et al., 2015), was shown to be appropriate for this long-term SA analysis and represents a promising starting point for future long-term SA studies, although detailed evaluation for datasets with other sources and temporal characteristics is needed.
In general, selection of the rolling window size (s win ) should consider both the fraction of non-modeled time points (see Fig. 1) and interactions between s win and solution ac- ceptance criteria. The latter point is illustrated by the use of the relative intensity of the COA lunchtime peak in this study. This peak was observed to be almost absent during the weekend. As a consequence, avoiding systematic biases in the fraction of non-modeled time points requires the s win to be larger than 7 d to guarantee the presence of weekdays in every window run. Employing a reliable tracer even during the weekends for the cooking source would have allowed for a better exploration of s win below 7 d, as similar Q/Q exp values resulted for 3, 7 and 4 d windows, as shown in Fig. 1.
The importance of defining the proper number of factors is strongly emphasized when analyzing transient events, e.g., the Caliente episode. This becomes even more important when performing automated source apportionment schemes, where the ability of factors to dynamically change and adapt to the current window run is limited, as is the case for the current rolling mechanism presented in this study. During Caliente the variability of m/z 55 required two cooking factors to achieve complete apportionment. With only one cooking factor allowed, other unconstrained factors (especially SV-OOA) took on some cooking characteristics. This resulted in mixed SV-OOA and LV-OOA factors, as m/z 55 and 43 were clearly peaking around Caliente for SV-OOA and LV-OOA, respectively. Relevant transient events that should still be part of the SA result would most likely require further attention with additional and separate PMF runs, where the user can better control the required number of factors and s win . Such problems are clearly evident from diagnostics such as increased residuals (Fig. 7b) and sudden changes in factor profiles (Figs. S3 and S4), facilitating their appropriate identification and treatment. A 14 d window is likely too large for transient events representing a small fraction of s win , where the latter strongly influences the contributions of the data for s win days around the event. Crippa et al. (2014) already demonstrated for 25 AMS datasets that an a max of 0.3 for the constrained information was often required for those SA studies. For the present algorithm and dataset, an a max of 0.4 was shown to be ideal. Smaller a max did not allow the constrained profiles to sufficiently adapt to the data, whereas higher values were subject to mixing of the profiles. a-value limits strongly depend on how well the fingerprint matches the PMF input. Fingerprints applied obtained by SA analyses of other locations or during other meteorological conditions might require a higher a-value limit compared to those extracted from, e.g., a preanalysis conducted on a subset of the PMF input.
The other remaining free parameters (x PMF and in particular the choice of the criteria and their corresponding thresholds) must be assessed by the user for any new SA study, as they may strongly depend on site/source characteristics and tracer availability. Moreover, investigation of various tracers as criteria candidates for one source is also very desirable, as it allows us to quantify errors when discussing factor-tracer interchangeability.
Unlike batch-style PMF (i.e., a single PMF run encompassing the entire dataset), here corrections or scaling factors affecting entire rows or columns of the input data matrix should be applied prior to SA analysis. For example, the collection efficiency (CE) parameter applied for ACSM data analysis is applied to all measured m/zs of a mass spectrum and does not alter the relative contributions obtained by a single PMF result. However, it does affect the overall source apportionment returned by the rolling window strategy presented within this study. This comes from the fact that the final source apportionment result is the aggregate of a set of accepted solutions whose criteria for acceptance may include goodness of correlation with an external tracer, and such correlations are affected by CE. Therefore, applying CE post-PMF will require the user to re-evaluate the score plots and to reassess the criteria thresholds.
It is likely that the PMF errors reported above can be further reduced by further refinements to the rolling window algorithm. One major limitation is the application of seasonspecific criteria thresholds. In the future, criteria thresholds with a higher temporal resolution are certainly desirable. Another major limitation is the continuous presence of the primary OA factors during the entire analysis. Similarly to the (de)activation of SV-OOA within this study, in the future one or more factors should be (de)activated during the evolution of the rolling approach to better cope with the complex and dynamic real atmospheric conditions.

Conclusions
A rolling-window PMF algorithm was applied to NR-PM 1 organic data measured with an ACSM between February 2011 and February 2012 in downtown Zurich, Switzerland. The rolling approach allows for a source apportionment of time-dependent factor profiles and has several advantages, e.g., very fast PMF runs of rather small PMF runs (few seconds for 14 d windows) compared to conventional batch analysis (several minutes, as the PMF run is always the entire dataset) or one factor per source compared to several factors in batch analysis to cope with time-varying factor profiles. Moreover, the rolling technique is particularly helpful for the analysis of automated and/or continuous analysis of both long-term and continuously growing datasets, where batch analysis is at best inefficient and probably not feasible. Factor-tracer correlations were shown to be higher for the averaged seasonal analysis (from the rolling window) than for the seasonal pre-tests (PMF runs with no rolling). This highlights the improved performance of the rolling PMF runs compared to conventional batch PMF analysis for long-term data.
PMF runs were conducted where the a values of the constrained factor profiles were randomly changed within the boundaries 0 to a max in conjunction with the bootstrap resampling strategy. The resulting PMF runs were selected and studied using the criteria scheme based on information on the sampling site from previous SA studies. This method has shown its usefulness when evaluating and studying hundreds of thousands of PMF runs. The criteria used here consisted of features in the diurnal patterns of HOA and COA, the amount of explained variation of m/z 60 attributed to BBOA, and representation of OOA by one or two factors depending on the difference between SV-OOA and LV-OOA in f 43 values.
The separation between the primary OA factors (HOA, COA and BBOA) and oxygenated organic aerosol (SV-OOA, LV-OOA and OOA) was rather robust throughout the year. HOA and COA were rather constant, whereas BBOA showed a very strong seasonality with the highest contribution in winter and lowest in summer. The model separated OOA into SV-OOA and LV-OOA mainly during the warm season (spring and summer), including a warm episode during the first winter. The strongest changes in the factor profiles were visible for the oxygenated species SV-OOA and LV-OOA, whereas the primary species HOA, COA and BBOA showed smaller variations. Hence, the rolling mechanism is certainly essential when properly apportioning the oxygenated organic aerosol fraction.
The model was still able to separate a semi-volatile fraction for the colder seasons based on the variation in m/z 43 and 44, where very little variation was present in nitrate, often used as a tracer of SV-OOA.
The rotational and statistical uncertainties were assessed via random a-value exploration and bootstrap resampling. The relative PMF errors (expressed by the standard deviation divided by the average concentration of all replicates per time point) are on average ±34 %, ±27 %, ±30 %, ±11 %, ±25 % and ±12 % for HOA, COA, BBOA, LV-OOA, SV-OOA and OOA, respectively.
Finally, the free parameters tested and validated in this study, i.e., the 14 d window length, 0.4 as upper limit for the a value of the constrained primary OA factor profiles, together with the scheme of criteria and the x PMF per window run, depend on the sources and meteorological conditions of downtown Zurich. When applying this new rolling strategy on datasets dissimilar to Zurich, some or all of these parameters might be subject to investigation to achieve a complete and quantitative source apportionment analysis.
Author contributions. All the authors made substantial contributions to the conception, design, analysis and interpretation of the data. All the authors participated in drafting the article or revised it critically for important intellectual content and all the authors gave final approval of the version to be submitted and of any revised version.

Competing interests. Francesco Canonaco, Carlo Bozzetti, Anna
Tobler and Yulia Sosedova have also been/are still employed by Datalystica Ltd. during the final development of the main SoFi Pro packages, and Datalystica Ltd. is the official distributor of the SoFi Pro licenses.
Financial support. This research has been supported by the COST action CA16109 Chemical On-Line cOmpoSition and Source Apportionment of fine aerosoLs (COLOSSAL), the SNF COST project SAMSAM IZCOZO_177063, the SNF project IZLCZ2_169986 Haze pollution in China: Sources and atmospheric evolution of particulate matter (HAZECHINA), the EU Horizon 2020 Framework Programme via the ERA-PLANET and transnational project SMURBS (grant agreement no. 689443), and the Swiss State Secretariat for Education, Research and Innovation (SERI; contract nos. 15.0159-1 and 15.0329-1).
Review statement. This paper was edited by Mingjin Tang and reviewed by two anonymous referees.