A hybrid method for reconstructing the historical evolution of aerosol optical depth from sunshine duration measurements

A novel method has been developed to estimate aerosol optical depth (AOD) from sunshine 10 duration (SD) measurements under cloud-free conditions. It is a physically-based method serving for the reconstruction of the historical evolution of AOD during the last century. In addition to sunshine duration data, it requires daily water vapor and ozone products as inputs taken from the ECMWF twentieth century reanalysis ERA-20C, available at global scale over the period 1900–2010. Surface synoptic cloud observations are used to identify cloud-free days. For sixteen sites over Europe, the accuracy of the 15 estimated daily AOD, and its seasonal variability, is similar or better than those from two earlier methods when compared to AErosol RObotic NETwork measurements. In addition, it also improves the detection of the signal from massive aerosol events such as important volcanic eruptions (e.g., Arenal and Fernandina Island in 1968, Chichón in 1982 and Pinatubo in 1992). Finally, the reconstructed AOD time series are in good agreement with the dimming/brightening phenomenon and also provides preliminary 20 evidence of the early-brightening phenomenon.


Introduction
Aerosols in the atmosphere are generally produced by natural and anthropogenic mechanisms, e.g. dust and sea salt triggered by wind-driven processes or carbonaceous aerosols (organic and black carbon) from combustion in urban/industrial processes or from biomass burning. They play a crucial role in the Earth's climate through their direct effects by scattering and absorbing solar radiation (Charlson et al., 5 1992;Hansen et al., 1997) and their indirect effects by acting as cloud condensation nuclei (Tang et al., 2016).
In the IPCC Fifth Assessment Report (IPCC, 2013), aerosols are mentioned as the largest contributor to the large uncertainties in the projections of climate change. A significant source in this uncertainty is linked to the limited knowledge of the historical evolution of aerosol load. In addition, the role played 10 by aerosols in the dimming and brightening phenomenon is not yet well-established (Wild, 2009(Wild, , 2016.
For that reason, it has become of great importance to the scientific community to estimate the historical evolution of aerosol load accurately.
Aerosol optical depth (AOD) has been widely used to represent the aerosol radiative impacts. It has been mainly measured using reference instruments, sun photometers for instance, from various ground-based 15 networks. Among them are the Aerosol Robotic Network (AERONET; Holben et al., 1998), the Global Atmospheric Watch Precision Filter Radiometer network (McArthur et al., 2003) and the SKYradiometer NETwork (Aoki et al., 2006). The most widely used ground-based network is AERONET. Although AERONET already contains over 700 stations globally with a fairly good spatial coverage over land compared to many other observational networks, it still lacks of temporal coverage. AERONET has 20 provided aerosol optical properties and AOD only since the 1990s at some sites, while the number of measurement sites started to substantially increase only in the early 2000s. The earliest records of satellite-based AOD are provided by TOMS (Total Ozone Mapping Spectrometer, e.g. Torres et al., 2002) and AVHRR (Advanced Very High Resolution Radiometer, Geogdzhayev et al., 2005Geogdzhayev et al., ) from 1979 and 1983 onwards, respectively. It is apparent that neither sun photometer nor satellite records of AOD 25 are available for a long time period, which would extend before the 1980s. The pyranometer measurements of surface solar irradiance (SSI) are also very valuable to infer AOD (Lindfors et al., 2013;Huttunen et al., 2016). However, this type of measurements started to become available mainly only after the 1950s, with the establishment of numerous radiation sites during the International Geophysical Year (IGY) 1957(IGY) -1958 To overcome this scarcity of information on the evolution of AOD, especially before 1950, some researchers have used proxy-approaches. Thus, some of them have used sunshine duration (SD) measurements to infer AOD under cloud-free conditions (Sanchez-Romero et al., 2016;Li et al., 2016;Dumitrescu et al., 2017), because SD measurements offer remarkably long time series going beyond 1950 and with a noticeable worldwide spatial coverage. SD for a given period, mostly a day, is defined 5 as the sum of the sub-periods for which broadband direct normal irradiance (DNI) exceeds the threshold value of 120 W m -2 (WMO, 2008). This value is assumed to be the "burning threshold". From this definition, the evidence of the link between SD and AOD can be summarized as follows: an increase in AOD would decrease DNI yielding then less sub-periods when DNI would be greater than the burning threshold and therefore results in a reduction in SD. For a review of the topic, we refer to Sanchez- The Campbell-Stokes sunshine recorder (CSSR) is one of the devices used to measure SD. It has been manufactured since the late 19 th century to record the duration of the sunbeam through a burned trace on an appropriate card (Sanchez-Lorenzo et al., 2013). The measurement of the length of the burned trace for a given card over the day gives the daily SD. Recently, automatic SD recorders have been developed 15 and they are becoming more and more spatially distributed (Wood et al., 2003;Kerr and Tabony, 2004;Matuszko et al., 2015). These instruments are much more accurate than CSSR because they measure the beam irradiance over the day and thus count efficiently the duration during which the beam irradiance exceeds the 120 W m -2 threshold. However, the measurement time series from the automatic SD recorders are too recent to provide long enough SD time series for our purpose. Thus, this study will 20 mostly deal with measurements operated with CSSR.
There are two different previous approaches that have been published on the retrieval of daily AOD from daily SD measurements. These are the methods described in Sanchez-Romero et al. (2016) and Li et al. (2016). In the first approach, the central assumption is the station-by-station fitted linear relationship between SD fraction (SDF), i.e. SD normalized by the length of day from sunrise to sunset, and AOD 25 measurements. Sanchez-Romero et al. (2016) applied the approach to SD stations throughout Spain and collocated AERONET stations. A similar approach was applied by Dumitrescu et al. (2017) over 57 Romanian stations. Both studies were limited to the summer season only to ensure a sufficient number of clear-sky days and thus a sufficient number of data pairs in the linear fit between SDF and AOD.
The second approach is a physically-based method, in essence similar to the direct sun methods applied with sunphotometer measurements to retrieve AOD, in which individual attenuators such as Rayleigh scattering, mixed gases, ozone and water vapor are removed from the overall attenuation. And the residual attenuation is then due to aerosols. This approach has the advantage that it can be applied at any time, i.e. does not depend on the season as the first approach, which needs enough collocated SDF and 5 AOD measurements for the linear fit. Li et al. (2016) applied their approach over China with corrected SD measurements. Nevertheless, there was an overestimation in the retrieved monthly AOD when compared to AERONET measurements. This overestimation was due to an inadequate constant value used for the burning threshold, as we discuss below in more detail.
We propose a new and more accurate method for estimating AOD from SD measurements under cloud-10 free days. In a sense, it combines the best aspects from both Li et al. (2016) and Sanchez-Romero et al. (2016) with further enhancements in some parts. Among the novelties, the proposed method exploits a more accurate broadband DNI model and takes into account local conditions affecting SD measurements.
Then, the proposed method is used to provide a realistic historical evolution of AOD over a few European stations. 15 The European Climate Assessment & Dataset (ECA&D) project has been selected for such long time series of SD measurements. About a hundred stations in Europe have collected SD measurements together with cloud cover data and other meteorological parameters such as air temperature and relative humidity. Few of them are collocated with AERONET stations providing AOD.
The article is organized as follows. In Section 2, a description of all data used in this study is given. Then, 20 the two published state-of-the-art methods and the new hybrid method are described in Section 3. In section 4, a detailed theoretical study is carried out on the relationship between AOD and SD with radiative transfer simulations and the influence of water vapor as well as ozone. The performance of the proposed method is compared to the two approaches. The results of different AOD estimates are seasonally inter-compared, analyzed and discussed in Section 5.

AERONET measurements
AERONET is a well-known and globally distributed network of ground-based stations which are equipped with sun photometers measuring direct sun radiances at eight wavelength bands centered at 5 340, 380, 440, 500, 670, 870, 940, 1020 nm. The measurements are made every 15 minutes between sunrise and sunset with an accuracy of 0.01-0.02 (Eck et al., 1999). They are performed only under cloud-free conditions . From the AERONET direct sun measurements, daily AOD and total water column (TWC) are provided . In this paper, level 2.0 data are selected for their assured quality. The data can be downloaded from the website 10 https://aeronet.gsfc.nasa.gov/new_web/download_all_v3_aod.html.

ECA&D database
Ground-based SD measurements and cloud information represented by total cloud cover (TCC) constitute useful data from this database.

ECMWF total water vapor and ozone column
Among other products, the European Centre for Medium-range Weather Forecasts (ECMWF) provides reanalysis variables of TWC and total ozone column (TOC). These variables take into account several analyses of ground-based observations used as inputs of ECMWF model. They are considered as representing the atmospheric state fairly well. They are inputs for the physically-based method in order to remove the attenuation due to water vapor and ozone from broadband DNI.
In this article, we select mean daily TWC and TOC from the ECMWF twentieth century reanalysis ERA-20C covering the period 1900-2010 (Poli et al.;2016)

MODIS MxD08_D3 products
The MODIS MxD08_D3 products, namely MOD08_D3 and MYD08_D3 data collected from the Terra and Aqua platforms, respectively, are daily level-3 satellite atmosphere datasets based on NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) observations. MODIS instruments are flying 25 onboard both Terra (morning overpass) and Aqua (afternoon overpass) satellites and for aerosols, they provide information for clear-sky pixels over both land and ocean. Despite of the coarser spatial resolution than the corresponding one for level-2 product, the level-3 product was used because it takes into account all overpasses of the instrument during the same day making it to be more representative of daily average aerosol properties than the level-2 instantaneous data products. In this level-3 product, the aerosol retrievals are spatiotemporally aggregated into a daily dataset with spatial resolution of 1° latitude x 1° longitude. These aerosol data cover the period of early 2000 (Terra) and mid-2002 (Aqua) onwards.
We use the AOD product at 550 nm that is a combined product of Collection 6.1 Dark Target (Levy et al., 2013) and  (Gelaro et al., 2017). This global dataset assimilates several satellite and ground-based measurements and regarding aerosol products, there are, for example, ground-based measurements from AERONET, MODIS observations from both Terra and Aqua satellites, Multi-angle Imaging SpectroRadiometer (MISR) data from Terra satellite, and Advanced Very-High-Resolution Radiometer (AVHRR) from NOAA Polar Operational Environmental satellites (Randles et al., 2017). MERRA-2 data cover the 20 period from 1980 to the present and the spatial resolution is 0.5° latitude x 0.625° longitude (Molod et al., 2015). In this study, we use the hourly data of total aerosol extinction AOD (TOTEXTTAU) at 550 nm and the total aerosol Ångström Exponent (470-870 nm, TOTANGSTR) available through the Goddard Earth Sciences Data and Information Services Center (GES DISC; http://disc.sci.gsfc.nasa.gov/mdisc/). All hourly data between sunset and sunrise are averaged in order to 25 retrieve the daily MERRA-2 estimates.

Linear regression method (LRM)
In order to retrieve AOD from SD observations, Sanchez-Romero et al. (2016) first computed the slope and the intercept of the linear regression between daily SDF and daily AOD at 440 nm under cloud-5 free conditions for each collocated AERONET and SD stations as follows: A day is considered as cloud-free if the average of three-daily observations is rounded to 0 okta. For a given station, the linear fit between SDF and AOD is accepted if the correlation coefficient is statistically significant with respect to a maximum p-value of 0.05.

Physically-based method (PBM)
The concept of this method is based on the simplified broadband DNI models. Let Gb denote the broadband DNI received on a plane normal to the Sun rays at ground level, it is generally defined as: where is the Sun-Earth distance correction factor depending on the day of the year, is extra- 15 terrestrial irradiance received on a plane normal to the Sun rays also known as solar constant which is 1367 W m -2 , , , , , are individual broadband transmittances of the main attenuators i.e.
Rayleigh scattering, uniformly mixed gases, ozone absorption, water vapor absorption and aerosol extinction, respectively. If and attenuator transmittances are known, the broadband aerosol transmittance can be computed as follows: The broadband aerosol transmittance is mathematically defined as: where is the optical aerosol mass and is broadband AOD. Several researchers have shown that BAOD can be considered as approximately equal to AOD at 750 nm (Qiu, 1998;Molineaux et al., 1998).
To retrieve AOD from SD observations, Li et al. (2016) assumed the diurnal uniformity of AOD, water vapor, ozone from dawn to dusk. They used the simplified broadband DNI model developed by Paulescu 5 and Schlett (2003) for estimating broadband transmittance of each attenuator with a common optical air mass adopted for all attenuators as defined by Kasten and Young (1989). In Eq. (2), is set to the burning threshold of 120 W m -2 according to WMO (2008). The implication of diurnal uniformity of SD is that SD can be converted to hour angle (ω) i.e. = 15( 2 ) in degrees. From an hour angle, in turn, when the latitude and the solar declination angle are known, the solar zenith angle (ϴs) can be computed. 10 Then, the solar zenith angle is used to compute transmittance and air mass. With this approach, AOD is estimated at the instant close to sunrise/sunset under a fully cloud-free day, at an instant when the burned trace becomes visible/extinct. For more detailed information on this method, please refer to Li and al. (2016).
In addition, Li et al. (2016) applied a correction to the SD data. They added a small constant of a few 15 percent to the reported SD data, due to the systematic bias found when compared retrieved AOD to MODIS-AOD. Nevertheless, an overestimation remained in the monthly mean comparisons between retrieved AOD and AERONET measurements. This overestimation was due to an unsuitable constant value for the burning threshold, as we discuss below in more detail.

New Hybrid method (NHM)
One clear advantage of the Li et al. (2016) method is that it does not require, in principle, ancillary AOD measurements for the training as the method by Sanchez-Romero et al. (2016). Moreover, as a physically-based approach, it is an attractive option to estimate AOD from SD measurements. However, we found two points where this approach can be improved. They can be summarized as follows: (1) we 25 make use of more accurate broadband transmittances for each atmospheric attenuator and (2) we establish seasonal station specific burning thresholds, since this threshold is clearly varying as a function of season, station and instrument conditions. AOD information is needed for the second improvement. Therefore, we exploit prior ground-based AOD measurements such as AERONET measurements. If not available, we can exploit satellite-based AOD from MERRA-2 because of its complete spatial coverage. Because, the method uses AERONET measurements as in Sanchez-Romero et al. (2016), we call our new method a hybrid method (NHM).

A more accurate broadband DNI model
Since for cloudless days the first/last burned traces occur close to sunrise/sunset, it is crucial to select a 5 broadband DNI model, which performs well at high SZAs. In the literature, several broadband DNI models can be found, which perform well when all SZAs are included, but very few of them are accurate at high SZAs.
The broadband model can be considered as accurate enough if the broadband transmittance of each relevant attenuator is also accurate enough. One limitation in the Li et al. (2016) approach is the use of a Transfer of Sunshine, Gueymard, 1995) serving as reference. Gueymard (2003a) demonstrated that the differences are significant at high SZA for ozone, water vapor and aerosol contents. 15 Gueymard (2003a) also compared broadband transmittances for each attenuator of these different broadband models against the corresponding broadband transmittance from SMARTS and found significant differences between models. From this study, the most accurate models for each attenuator can be identified as did by Gueymard (2003b  This demonstrates the accuracy of the new model and also the fact that the use of the Paulescu model has partially contributed to the errors found in the Li et al. (2016) approach when deriving AOD from SD measurements.

Seasonal variability of burning threshold
Despite of the recent progress made for accurate SD measurements by means of automatic SD recorder, most of the historical SD measurements have been recorded with traditional devices such as the CSSR 5 or Jordan instruments. Since our objective is to estimate AOD over a period as long as possible, using data from Europe in our validation, CSSR is the instrument of our interest. The observational errors with CSSR are various. These have been discussed in the literature (Jaenicke and Kasten, 1978;Helmes andJaenicke, 1984, Sanchez-Romero et al., 2015). In the following paragraph, a detailed discussion is provided.

10
Among the sources of errors, three situations can be listed as follows. First, if there is water deposit and dew on the glass sphere especially in the morning, more energy or irradiance is needed to warm the glass sphere to be able to focus the sunrays on the card for producing the burned trace (Painter et al., 1981).
Second, in case there is moisture in the card, it needs to be dried enough to make the burned trace visible, implying also the need for additional energy. Third, there is variability in the card types with different 15 properties, for instance, absorption of water by the card (Helmes and Jaenicke, 1984). The errors with the CSSR are also caused by the local effects of the temperature and relative humidity on the burn card (Ikeda et al, 1986).
A need to use seasonal burning threshold was previously reported (Bider, 1958;Jaenicke and Kasten, 1978;Baumgartner, 1979;Painter et al., 1981). Based on several analyses, it has been found that the 20 burning threshold could range from 70 to 385 W m -2 . Painter et al. (1981) mentioned that some daily thresholds could reach values as high as 400 W m -2 under some particular atmospheric conditions. The threshold is typically higher in wintertime than in summertime due to the variability in temperature and relative humidity. Therefore, there is a clear evidence for a seasonal variability of burning thresholds in CSSR measurements depending on the local conditions at the given station (Sanchez-Romero et al., . The concept of this proposed method is to determine local monthly averaged burning thresholds and then use these thresholds in the PBM to derive BAOD as follows, by combining equations (3) and (4) where is a month between January and December, ̅̅̅ ( ) is the computed effective monthly burning threshold. This latter takes into account as many factors as possible affecting SD measurements.
It is computed by using the new model with appropriate atmospheric inputs.

On the relationship between AOD and SDF
As explained above, the effective burning threshold typically exhibits a comprehensive variability from wintertime to summertime. We investigated the nature of relationship between AOD and SDF for various thresholds for a given site located, for instance, at a latitude of 35 N used for SZA computations ranging from 80° and 88°. In this case, a typical atmospheric state is considered here: TOC of 350 DU, elevation 10 of 500 m, Ångström exponent of 1.3, TWV of 3 cm. This given atmospheric state is then associated with each aerosol content from a set of 1000 AODs at 550 nm built following the Monte Carlo draws as described previously. This yields 1000 realistic atmospheric states.
For each atmospheric state, DNI is simulated from sunrise to sunset with a high temporal resolution. For a given burning threshold, the length of the period for which DNI is greater than the burning threshold is 15 derived. Then, it is converted to SDF. Figure 2 shows the curve of the change of AOD at 750 nm with SDF for three burning thresholds: 120 W m -2 , 250 W m -2 and 400 W m -2 . According to WMO (2008), the typical range of SDF under cloud-free conditions is between 0.7 and 1. Regardless of the burning threshold value, the relationship tends to be linear for AOD greater than 0.1 and exhibits a non-linearity at AOD values below 0.1 (Fig. 2).  Table 2 of Wandji Nyamsi et al. (2017). Figure 3a shows the influence of water vapor when deriving AOD from SDF. It shows that the spread of points slightly increases with 5 higher burning threshold. Therefore, this demonstrates the importance of taking water vapor content into account. Figure 3b shows similarly the influence of ozone. The results clearly indicate that ozone does not have a significant influence, regardless of the threshold. Thus, a monthly climatology of ozone content is enough to infer AOD from SDF.

Methodology
To ensure a reasonable set of ground-based stations as well as a good quality of measurements, four criteria have been applied on those ECA&D stations having both SD and TCC measurements within a calendar day. First, to include at least the brightening period, ECA&D stations having SD measurements 5 starting before the year 1985 were chosen. Second, ECA&D stations should be collocated with AERONET stations with a maximum distance of 50 km. Third, the overlapping period that both cloudfree SD and AERONET measurements are available should cover at least three years. Finally, as the fourth criteria, temporal homogeneity tests have been applied on SD measurements to select stations with homogeneous time series. For this last one, three main tests are used: the Standard normal homogeneity 10 test (Alexandersson, 1986), the Buishand range test (Buishand, 1982) and the Pettitt test (Pettitt, 1979).
If two of these three tests indicate that the time series is homogeneous with a confidence of 95%, then the station is included in the study. These filters result in a set of 16 stations over Europe. The station name, code, geographical coordinates and period from AERONET are indicated in Table 1. Table 2 reports similar details as in Table 1 but for SD and TCC measurements from the ECA&D database. The SD time series are divided into two periods, before and during the overlapping temporal coverage when both AOD and SD measurements are simultaneously available. The later period is called "learning/training period", while the earlier period with only SD measurements and no aerosol 5 information is called "reconstruction period". A cloud-free day is assumed when the mean daily value of TCC is rounded to 0 okta. To improve the selection of cloud-free days, an additional criteria was applied as follows: the presence of cloud can be assumed if SDF is less than 0.7 (WMO, 2008). The uncertainty of SD measurement is 0.1 h (WMO, 2008) and the one of AOD at 750 nm is 0.01 (Eck et al., 1999).
Because we assume that the effective wavelength in BAOD is close to AOD at 750 nm, any AOD value 10 is converted to the corresponding AOD at 750 nm by means of the Ångström law. Hereafter, AOD at 750 nm is called AOD for the sake of simplicity.   Table 3 reports values for statistically significant relationships and used for the historical reconstruction of AOD. A linear relationship for a season is assumed statistically significant when the computed p-value between SDF and AOD is lower than 0.05. 10 Then, from the appropriate data sources as described in Section 2, daily atmospheric parameters are used to compute daily burning thresholds for the NHM method by using Eq. (2). For a given month, the burning threshold is computed as the median of at least seven daily values. Then, an interpolation/extrapolation process to the nearest existing neighboring monthly threshold is used for the completion of data for cases of missing values. In such a case, an interpolated/extrapolated monthly value 15 is kept if there is at least one monthly value within a season. After investigating several ways, it was found that the described process yields reasonable and physically understandable results. The computed local monthly burning threshold is then used as shown in Eq. (5) for the historical reconstruction of AOD. All three methods are applied as described previously to show their individual performance to infer AOD 5 from SD measurements. For each method, the deviations are computed by comparing to AERONET measurements. They are synthesized by the mean difference (MD), the root mean square difference (RMSD) and the correlation coefficient (CC).
(7) 10 For each station, seasonal means were computed as the average of all available estimated daily AODs for cloud-free days over a given season. Then, seasonal anomalies were also computed as differences between seasonal means and long-term mean obtained as the average of all seasonal means over the 1960-2010 period including the dimming/brightening period. The anomalies represent the quantity of interest in this study, as uncertainties and systematic biases associated with aerosol estimates can be then  Table S1 in the supplement for computed monthly burning threshold for all stations. The seasonality of the burning threshold exhibits 10 an opposite seasonality as compared to temperature/humidity for the Northern hemisphere. In other words, the lower the temperature, the higher the burning threshold.

Performance of the three methods
To assess the performance of the methods, the available data of the most recent period are randomly divided into two datasets. One dataset is used to establish the slope and intercept for the LRM method.
Then, the second dataset is used for the validation of all methods. The process is repeated several times to ensure the consistency of the validation results. Only stations with more than 80 data pairs for 5 validation are retained to assess the performance of the three methods, resulting in the following 10 stations: Munich University, Ispra, Burjassot, Caceres, Badajoz, Murcia, Granada, El Arenosillo, Malaga and Izana. Figure 5 shows an example of scatterplot between measured daily AOD from AERONET (horizontal axis) and estimated AOD from the three methods (vertical axis) over the validation period in Granada. Depending on the statistical indicators, the best performance is seen either with the LRM method or with the NHM method. At all stations, the worst performance is seen with the PBM method, largely due to the errors induced by the use of 120 W m -2 as a burning threshold.  Table 4: Statistical indicators for each station. N is the number of data pairs. Mean is the average AERONET measurements. MD is the mean difference. RMSD is the root mean square difference. The first value is the PBM method, the second one is the LRM method and the third one is the NHM method. The best performance is in bold.  Table 4 reports the statistical indicators summarizing the errors of the three methods for each station. In general, the performance of LRM and NHM is quite similar and clearly better than the PBM. For all methods, the CC varies between 0.3 and 0.7, with best correlation observed mostly with the NHM method. The MD for PBM is 0.04 (in Munich) at the minimum and reaches up to 0.2 (in El Arenosillo).
In all stations, both LRM and NHM show biases mostly close to zero in terms of MD. The RMSD reaches up to 0.3 (in Izana as well as El Arenosillo) for PBM, while for both LRM and NHM, it reaches up to 0.1 denoting a more limited spread of points. Overall, the NHM performs slightly better than the LRM 5 as indicated by the averaged statistics. Since PBM shows the worst performance in all stations, in the following, we exclude this method in all further analysis.
A diagnostic method is used to quantify the uncertainties corresponding to sunshine duration based AOD retrievals (Sayer et al., 2020). AERONET observations are used to derive the diagnostic expected error ( ) envelope for the retrieved AOD. This type of EE envelope uncertainty estimate is similar to the 10 corresponding one of MODIS Dark Target satellite AOD retrievals (Levy et al., 2010). As derived using AERONET AOD as ground-truth, the diagnostic EE envelope includes all possible sources of uncertainties due to for example, changes of the card type, burning threshold, cloud contamination, changes in aerosol properties during the day, and uncertainties in sunshine duration measurements. To derive the EE estimates, we use a random subset of about 7000 AOD retrievals from the validation dataset 15 as described previously with all stations. We divide the data into 100 bins that each bin contains same amount of measurements. We compute the standard deviation of the retrieval error and the average retrieved AOD in each bin. We intentionally select the retrieved AOD as the variable to compare the uncertainty against so that it is possible to estimate the retrieval uncertainties also in cases in which accurate or measured AOD is not available. We fit a linear model to the uncertainty data and derive the 20 EE envelope estimate as follows: = ± (0.01 + 0.40 × ) where is the retrieved AOD from the NHM method.

6.3.
Reconstructed historical evolution of aerosol load 25 The slopes and intercepts of the LRM method in Table 3 are used to derive AOD from SDF for all cloudfree days over the period where SD and TCC are commonly available as reported in Table 2. Figure 6 shows an example of the historical evolution of seasonal mean AOD in Malaga from five data sources.
The pink, blue, green, orange and brown lines indicate the seasonal mean AOD time series from AERONET, LRM, NHM, MERRA-2 and MODIS respectively. There is no AOD estimate from the LRM method in wintertime because the CC of the linear fit was not statistically significant. However, the NHM method is able to provide AOD estimates, clearly showing the potential of the NHM method to operate in all seasons. In general, the estimated AOD values from respectively, are used to compute the seasonal means. Over those months, few high AODs were measured. Some deviations are seen between the AOD estimates such as those from methods. This is the case, for instance, between the LRM and NHM methods in summer. Therefore, further investigations were carried out in more details. Between the mid-1970s and the 2000s, the AOD estimates from LRM tend to be greater than those from NHM in Malaga. This disparity can be explained as follows: when applying the LRM method, a fixed AOD value derived from a given SDF can be lower or greater than the true AOD value. Figure 7 shows 5 the cloud of magenta dots corresponding to ground-based measurements between SDF and AOD in Malaga during the summer season. The linear fit obtained from the LRM method is in blue dashed line and the estimate from NHM is in green triangle. The period between the mid-1970s and the 2000s in summertime of Fig. 6 is typically characterized by SDF ranging between 0.8 and 0.9 delimited by the yellow area in Fig. 7. A visual inspection shows that the vast majority of the dots within the yellow area 10 is below the regression line meaning that the measured AOD values are lower than the estimated AOD values derived from the LRM method (blue dashed line). In other words, any estimates lower than those from LRM method could be considered as more realistic, which is the case with the NHM estimates. In It is worth mentioning that, in addition to ODR, we have tested several other regression methods such as ordinary least squares (OLS). We found that depending on the season, the reconstructed aerosol load from the OLS method can agree very well with the one from NHM resulting in a close to zero average 10 difference over the full period. However, this was not generally true and in some cases there were significant differences between different regression methods. This highlights the importance to select the most appropriate method that takes into account the measurement uncertainties (Mikkonen et al., 2019).
One of the longest time series for both SD and TCC measurements is from Izana. The results for this station are shown because there are several studies in the literature providing the historical evolution of 15 summer season AOD derived with different approaches, thus allowing now further comparisons and discussions. Figure 8 shows summer season means of AOD estimated from five data sources from 1955 onwards since a breakpoint was found in 1953 within the SD measurements.   Table 1. The cyan area refers to range between 1 st and 3 rd quartiles of the different series for each year. The light purple and misty rose bands indicate (early) brightening and dimming periods, respectively. Figure 9 shows the mean of seasonal anomalies (black line) of AOD for all stations given in Table 1. The lower and upper shading limits represent 1 st and 3 rd quartiles respectively. The peak values are mostly around the specific years when aerosol events are known to be widespread at global scale.

5
During the winter season, there are no clear periods of increasing and decreasing trends. However, in all the other seasons, both dimming and brightening periods can be seen and can be summarized by three main phases as follows: -From the 1940s to the late 1950s, the anomalies show a smooth decrease with the most pronounced decrease during the summer season. This behavior is known as early brightening with a peak around the late 1940s and agrees with earlier studies (Wild, 2009;Sanchez-Lorenzo et al., 2015).
-From the late 1950s to the late 1980s, a moderate increase of the anomalies is seen, known as dimming period. It is in agreement with the well-known period of reduction in SSI at global scale, widely stated in the literature (Stanhill and Cohen, 2001;Wild et al., 2005;Stjern et al., 2009;5 Wild, 2009).
-From the 1990s onwards, a decrease of the relative anomalies is seen. This is in agreement with previous findings of increasing SSI during this period, known as brightening period (Wild et al., 2005(Wild et al., , 2008Wild, 2009).

Conclusions
In this study, we have proposed a new method for estimating AOD from sunshine duration measurements. It is a physically-based method similar to the approach used with the sunphotometer measurements of AOD. The method is used for reconstructing the historical AOD under cloud-free conditions as far back to the past as possible using in addition to the sunshine duration measurements 15 also daily total ozone column and total water vapor from ECMWF twentieth century reanalysis ERA-20C, which are available at global scale from 1900 to 2010. Surface synoptic cloud observations are further required to identify cloud-free days. In addition, as an input, it uses the seasonal burning thresholds, which are measurement station dependent.
We applied the method to sixteen ground-based stations over Europe. As a result, the reconstructed AOD is the number of the day over the year starting from #1 for 1 st January is the total column content of ozone (DU) is the total column content of water vapor (cm) 15 is the aerosol optical depth at 1000 nm is the altitude of site above mean sea level (m)