Estimation of pollen counts from light scattering intensity when sampling multiple pollen taxa – establishment of an automated multi-taxa pollen counting estimation system (AME system)

Laser optics have long been used in pollen counting systems. To clarify the limitations and potential new applications of laser optics for automatic pollen counting and discrimination, we determined the light scattering patterns of various pollen types, tracked temporal changes in these distributions, and introduced a new theory for automatic pollen discrimination. Our experimental results indicate that different pollen types often have different light scattering characteristics, as previous research has suggested. Our results also show that light scattering distributions did not undergo significant temporal changes. Further, we show that the concentration of two different types of pollen could be estimated separately from the total number of pollen grains by fitting the light scattering data to a probability density curve. These findings should help realize a fast and simple automatic pollen monitoring system.


Introduction
Pollen counting is a time-consuming and labor-intensive task that requires professional skills. However, recent technological developments have made automatic pollen sampling and identification possible (Buters et al., 2018), for example, with recognition systems using microscopic images of pollen grains (Boucher et al., 2002; Ranzato et al., 2007; Oteros et al., 2015), pollen color patterns from pollen images (Landsmeer et al., 2009), fluorescence emission signals (Swanson and Huffman, 2018; Mitsumoto et al., 2009; Mitsumoto et al., 2010; Richardson et al., 2019), light scattering (Crouzy et al., 2016; Šaulienė et al., 2019), holographic images (Sauvageat et al., 2020), size and morphological characteristics (O'Connor et al., 2013), real-time PCR (Longhi et al., 2009), texture and infrared patterns of microscopic images of pollen (Marcos et al., 2015; Gottardini et al., 2007; Chen et al., 2006), or a combination of several of these. Many studies have applied machine learning algorithms to the problem (Punyasena et al., 2012; Tcheng et al., 2016; Crouzy et al., 2016; Gonçalves et al., 2016; Gallardo-Caballero et al., 2019; Šaulienė et al., 2019). These automated pollen identification methods have been applied not only to aerobiological research but also to palynological studies for the identification of fossilized pollen (France et al., 2000; Kaya et al., 2014; Li et al., 2004; Zhang et al., 2004; Rodríguez-Daminán et al., 2006). Analysis using light scattering patterns has received particular attention, with several methods being developed for establishing an automatic aerosol or bioaerosol counting system (Huffman et al., 2016). For example, polarization signals can be used to discriminate Cryptomeria japonica from polystyrene spherical particles (Iwai, 2013). Studies applying machine learning algorithms have shown that light scattering patterns can be used for automatic classification and counting of multiple pollen taxa simultaneously (Crouzy et al., 2016; Šaulienė et al., 2019).
Other studies have applied statistical techniques to compare the light scattering data and the number of pollen grains of multiple taxa (Kawashima et al., 2007; Matsuda and Kawashima, 2018). Surbek et al. (2011) also studied a discrimination method for hazel, birch, willow, ragweed, and pine pollen, showing that they have distinct [...] In the present study, light scattering patterns from various pollen taxa are investigated with a KH-3000-01 to verify whether they differ. A novel method, the automated multi-taxa pollen counting estimation system (AME system), is also proposed to discriminate between two taxa with similar scattering patterns.

Materials and methods
A protection cylinder (radius = 5 cm, height = 30 cm) was attached to the sampling tube of a KH-3000-01 laser-optics-based automatic pollen counter (Yamatronics, Japan). The KH-3000-01 is a widely used automatic pollen counting system (e.g., Wang et al., 2014; Takahashi et al., 2001; Miki et al., 2017, 2019; Kawashima et al., 2007, 2017; Matsuda and Kawashima, 2018). A laser irradiates particles that pass through the sampling system, and the forward and side scattering signals from each particle are recorded. In this study, pollen grains from known taxa were injected through an injection tube in the wall of the protection cylinder and sampled in the KH-3000-01 (Fig. 1). The side and forward scattering intensities were evaluated by converting the light intensity into a voltage. The relationship between the light intensity and the physical properties of the sampled particle, namely its size and surface roughness, was analyzed (Matsuda and Kawashima, 2018).

Temporal changes in light scattering patterns
Alnus pollen grains were sampled directly from catkins on a tree growing at the Swiss Federal Office of Meteorology and Climatology on the sunny morning of 28 February 2019. Light scattering measurements were taken using the fresh pollen grains soon after they were collected. The remaining pollen grains were stored in tubes, and scattering patterns were reevaluated after storage for 1 h, 2 h, 6 h, and 10 d. Multiple comparisons using the Bonferroni method were performed on the side and forward scattering data to assess whether the light scattering distributions changed after storage. The Bonferroni method is a multiple comparison method used for non-parametric data sets. To carry out the multiple comparisons, 316 scattering data points were selected from each time step, because the Bonferroni method requires the same amount of data for each time step and 316 was the smallest number of data points among the time steps (10 d).
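The two bookkeeping steps described above, equalizing the group sizes and applying the Bonferroni adjustment, can be sketched as follows. This is a minimal illustration only: the voltages and raw p-values are hypothetical, the function names are invented for this sketch, and the pairwise test that produces the raw p-values is not specified here and is left to the reader.

```python
import random
from itertools import combinations

def subsample_equal(groups, seed=0):
    """Randomly reduce every group to the size of the smallest one."""
    rng = random.Random(seed)
    n_min = min(len(v) for v in groups.values())
    return {name: rng.sample(values, n_min) for name, values in groups.items()}

def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-value, significant?) for each raw p-value."""
    m = len(p_values)  # number of comparisons
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]

# Hypothetical voltages (mV) for three storage times; not measured data.
groups = {
    "0 h": [610.0, 650.0, 700.0, 720.0, 690.0],
    "1 h": [600.0, 640.0, 710.0, 730.0],
    "10 d": [620.0, 660.0, 705.0],
}
balanced = subsample_equal(groups)
pairs = list(combinations(balanced, 2))  # every pair of storage times

# Hypothetical raw p-values, one per pair, from whichever pairwise test is used.
raw_p = [0.04, 0.30, 0.01]
results = dict(zip(pairs, bonferroni(raw_p)))
```

With three comparisons, a raw p-value of 0.04 is no longer significant at the 5 % level after adjustment, whereas 0.01 still is.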

Light scattering patterns of different pollen taxa
Dried pollen grains from Alnus, Ambrosia, Artemisia, Betula, Castanea, Cedrus, Corylus, Fagus, Fraxinus, Helianthus, Olea, Phleum, Quercus, Taxus, and Zea were sampled in a similar way. These taxa are representative of the pollen types commonly observed in Europe. After collecting the light scattering distributions of each pollen type, multiple comparisons using the Bonferroni method were performed to evaluate whether these distributions differ significantly from each other. To carry out the multiple comparisons, 210 scattering data points were selected from each taxon, matching the smallest data set among the taxa (Helianthus).

Automatic discrimination theory
To carry out simple and fast automatic pollen discrimination in an AME system, the number of pollen grains of each type was calculated from the total number of pollen grains as follows.
For two different types of pollen (A and B), with a side scattering intensity range a-b and a forward scattering intensity range c-d, Eq. (1) holds, where P is the representative probability density function of the scattering intensity and p is the representative probability that the scattering intensity of a pollen grain lies in the integration interval. Next, the scattering intensity distribution, which gives the number of pollen grains at each scattering intensity, was fitted to a distribution function. In this experiment, the normal distribution was fitted to the number of pollen grains counted in 100 mV steps. The Gaussian function is given by Eq. (2), where α and c are coefficients, µ is the mean, and σ is the standard deviation.
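As a sketch, Eqs. (1) and (2) can be written in the following form, which is reconstructed from the definitions in the text and should be read as one consistent possibility rather than the authors' exact notation. The index X stands for either pollen type (A or B). Treating the coefficient c in Eq. (2) as an additive offset is an assumption, and this c is distinct from the lower bound c of the forward scattering interval.

```latex
% Eq. (1), sketched: probability that a grain of type X lies in each interval
p^{\mathrm{side}}_{X,\,a\text{-}b} = \int_a^b P^{\mathrm{side}}_X(x)\,dx, \qquad
p^{\mathrm{front}}_{X,\,c\text{-}d} = \int_c^d P^{\mathrm{front}}_X(x)\,dx

% Eq. (2), sketched: Gaussian function fitted to the counts per 100 mV step
f(x) = \alpha \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) + c
```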
Fitting the data to the normal distribution function enables one to calculate the probability of a pollen grain showing a certain light scattering intensity. The probability density of the normal distribution function (P) is given by Eq. (3). Fitting was performed by nonlinear optimization. The normal distribution was chosen so that the light scattering plots can be handled with a known function.
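The binning and the normal density can be illustrated with a short standard-library sketch. The voltages are hypothetical, the function names are invented for this sketch, and the fit uses the sample mean and standard deviation as a simple stand-in for the nonlinear optimization described in the text, so it is not the authors' fitting procedure.

```python
import math
from collections import Counter
from statistics import NormalDist

def histogram_100mv(voltages_mv, step=100):
    """Count grains per 100 mV step; keys are the lower edge of each step."""
    counts = Counter(int(v // step) * step for v in voltages_mv)
    return dict(sorted(counts.items()))

def normal_pdf(x, mu, sigma):
    """Probability density of the normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical side scattering voltages (mV), not measured data.
voltages = [520, 580, 640, 660, 690, 700, 710, 730, 760, 790, 840, 910]
counts = histogram_100mv(voltages)

# Moment-based fit (sample mean and standard deviation), used here as a
# simple stand-in for the nonlinear optimization described in the text.
fit = NormalDist.from_samples(voltages)
density_at_700 = normal_pdf(700, fit.mean, fit.stdev)

# Probability that a grain falls in the 600-800 mV integration interval.
p_600_800 = fit.cdf(800) - fit.cdf(600)
```

Once µ and σ are known, the interval probability follows directly from the cumulative distribution, so no numerical integration is needed for a normal fit.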
Equation (1) gives Eq. (4). Here, N is the number of sampled pollen grains of each pollen type, which is the value to be calculated. N_total is the total number of sampled pollen grains and n is the total number of sampled pollen grains in the integration interval; both are known. C is the correction factor defined by Eq. (5). C is needed to renormalize the probability distribution because the KH-3000-01 can detect scattering intensities only in the range of 0-4500 mV. By solving two of the equations in Eq. (4), N_A and N_B can be estimated theoretically.
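The estimation step can be sketched as a small linear solve. The sketch assumes that each count equation has the form N_A p_A C_A + N_B p_B C_B = n and that the correction factor is C = 1 / (probability mass of the fitted curve inside 0-4500 mV); both forms are inferred from the description above rather than quoted from it. The fitted means and standard deviations are hypothetical, and the function names are invented for this sketch.

```python
from statistics import NormalDist

V_MIN, V_MAX = 0.0, 4500.0  # detectable range of the KH-3000-01 in mV

def interval_term(dist, lo, hi):
    """Return p * C for one taxon and one integration interval.

    p is the fitted probability of lying in [lo, hi]; C = 1 / P(0..4500 mV)
    is an ASSUMED form of the correction factor (renormalization to the
    detectable range).
    """
    p = dist.cdf(hi) - dist.cdf(lo)
    c = 1.0 / (dist.cdf(V_MAX) - dist.cdf(V_MIN))
    return p * c

def solve_two_taxa(row1, row2):
    """Solve a1*N_A + b1*N_B = r1 and a2*N_A + b2*N_B = r2 by Cramer's rule."""
    (a1, b1, r1), (a2, b2, r2) = row1, row2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("the two taxa are not separable with these intervals")
    return (r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det

# Hypothetical fitted distributions (mean, standard deviation in mV).
side_a, side_b = NormalDist(700, 150), NormalDist(1100, 300)
front_a, front_b = NormalDist(400, 120), NormalDist(900, 250)

# Synthetic check: build the interval counts from known grain numbers ...
true_n_a, true_n_b = 300.0, 200.0
ka_side, kb_side = interval_term(side_a, 600, 800), interval_term(side_b, 600, 800)
ka_front, kb_front = interval_term(front_a, 300, 500), interval_term(front_b, 300, 500)
n_side = true_n_a * ka_side + true_n_b * kb_side
n_front = true_n_a * ka_front + true_n_b * kb_front

# ... and recover them from the side and forward equations.
n_a, n_b = solve_two_taxa((ka_side, kb_side, n_side), (ka_front, kb_front, n_front))
```

The determinant check makes the separability condition explicit: if both taxa put the same fraction of grains into both intervals, the two equations carry the same information and no unique solution exists.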
In this paper, Alnus and Artemisia were chosen as examples to evaluate the applicability of the theory above. Because fitting worked well in the range of 600-800 mV for the side scattering and 300-500 mV for the forward scattering, a = 600, b = 800, c = 300, and d = 500 were substituted in Eq. (4). The evaluation tests were carried out five times using the light scattering data for both Alnus and Artemisia (Fig. 2).
The magnitude of the estimation error is calculated with Eq. (6).
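One common way to express such an error, given here as an assumption rather than as the authors' definition, is the absolute difference between the estimated and the true count relative to the true count. The symbols N_est and N_true are introduced only for this sketch.

```latex
\mathrm{Error}\,(\%) = \frac{\left| N_{\mathrm{est}} - N_{\mathrm{true}} \right|}{N_{\mathrm{true}}} \times 100
```

Under this definition, errors above 100 % are possible whenever the estimate exceeds twice the true count.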

Temporal changes in light scattering pattern
The scattering distribution of Alnus pollen (Fig. 3) showed no significant temporal changes over 10 d (Table 1).

Light scattering distributions of different pollen taxa
Smaller pollen grains tend to show smaller voltage values (Fig. 4). The results of the multiple comparisons (Table 2) indicated that there is always a significant difference in the side and forward scattering of two different pollen types, except between the following:

Automatic counting
Counting the number of pollen grains of each type can be carried out by solving any two of the three equations in Eq. (4): side (n^side_a-b) and forward (n^front_c-d), side (n^side_a-b) and total (N_total), or forward (n^front_c-d) and total (N_total). The parameters of the probability density curve of the side and the forward (Fig. 5) [...] The results (Fig. 6) show that the estimated number of pollen grains had average errors of 47 %, 34 %, and 39 % for Alnus and 31 %, 19 %, and 21 % for Artemisia (Table 3).
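The three pairings can be compared with a few lines of code. In this sketch each equation is reduced to a triple (coefficient for taxon A, coefficient for taxon B, observed count). The coefficients and counts are hypothetical and chosen to be mutually consistent, so all three pairings return the same answer; with measured data the three estimates generally differ.

```python
from itertools import combinations

def solve_pair(eq1, eq2):
    """Solve two linear equations a*N_A + b*N_B = r by Cramer's rule."""
    (a1, b1, r1), (a2, b2, r2) = eq1, eq2
    det = a1 * b2 - a2 * b1
    return (r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det

# Hypothetical coefficients (p * C per taxon) and counts; not values from the paper.
equations = {
    "side": (0.50, 0.11, 172.0),     # grains counted in the side interval
    "forward": (0.60, 0.05, 190.0),  # grains counted in the forward interval
    "total": (1.0, 1.0, 500.0),      # total number of sampled grains
}

# One estimate of (N_A, N_B) for each of the three pairings.
estimates = {
    (k1, k2): solve_pair(equations[k1], equations[k2])
    for k1, k2 in combinations(equations, 2)
}
```

Comparing the three estimates on real data gives a quick indication of how sensitive the result is to the choice of equations.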

Discussion
Temporal changes in the shapes of pollen grains would be expected to change their light scattering patterns. However, our experimental data indicate that light scattering patterns show little to no change over time (up to at least 10 d). Thus, there should be no problem using pollen grains that are either fresh or have been stored for several days for studies with the KH-3000-01. Further investigation is required to understand whether this is true for species other than Alnus and for longer periods of time. Understanding the morphological stability of each pollen type would help explain the temporal stability of light scattering patterns. Light scattering data from various pollen taxa indicate that it is not possible to discriminate between the side scattering patterns of Alnus vs. [...] theory should be applicable to other pairs as long as they have different scattering intensity distributions.
K. Miki and S. Kawashima: Estimation of multi-taxa pollen counts from light scattering intensity
The estimation of the pollen counts of Alnus and Artemisia had average errors of approximately 40 % and 23 %, respectively. Test 4 had the largest error, with approximately 134 % for Alnus and approximately 44 % for Artemisia, which increased the average error. It is difficult to identify an obvious reason for these large values, but it is possible that the pollen samples were contaminated by dust or that the pollen grains selected for this experiment were biased in size or shape. Additionally, other estimations derived from the fitted curve of the forward and the side scattering distributions showed that even when the pollen counts are estimated only from scattering intensity data without using the total number of pollen grains, which is a known number, the pollen [...] Other taxa should, however, be investigated in the future. Pollen counts can be estimated by solving Eq. (4), which contains three equations, meaning that it is possible to make estimates for three different pollen taxa simultaneously. If more integration intervals were selected from the probability density curve of the scattering intensity and added to the equation, in theory it would be possible to count more pollen taxa. It is possible, however, that the accuracy of the estimated values would decline because of the limited accuracy of the fitted curve. Therefore, narrowing the target down to two or three pollen types according to the season should help make accurate automatic counts of several pollen taxa simultaneously.
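The extension to three taxa amounts to solving three linear equations in three unknowns. The following sketch uses plain Gaussian elimination. The coefficients are hypothetical, the function name is invented for this sketch, and the third taxon is labelled D to avoid a clash with the correction factor C.

```python
def solve_linear(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("taxa are not separable with these intervals")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution

# Hypothetical p * C coefficients for taxa A, B, and D; not values from the paper.
coefficients = [
    [0.50, 0.11, 0.30],  # side interval
    [0.60, 0.05, 0.20],  # forward interval
    [1.00, 1.00, 1.00],  # total count
]
counts = [202.0, 210.0, 600.0]
n_a, n_b, n_d = solve_linear(coefficients, counts)
```

Adding further integration intervals adds rows to the system. With more rows than taxa, a least-squares solution would be the natural replacement for the exact solve used here.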
In this study, the normal distribution function was chosen for fitting because of its universal property. However, further consideration is required to determine the best function for fitting actual light scattering characteristics.

Conclusion
By applying a statistical analysis method, the Bonferroni method, to the scattering patterns of Alnus at each time step, our experiment showed that there appear to be no significant temporal changes in the light scattering patterns. We also confirmed that different pollen types do not always have different light scattering patterns. However, when two pollen types do have different light scattering patterns, it was possible to calculate the number of pollen grains of each taxon from these patterns by solving equations built from their probability density functions.
Code and data availability. The authors confirm that the data supporting the findings of this study are available within the article. All the light scattering data used in this research are given in Figs. 2, 3, and 4. The code used for this experiment follows Eqs. (1)-(6).
Author contributions. KM established the system, performed the data analysis, and wrote the paper. SK arranged the experimental set-up and proofread the paper.