Accuracy in starphotometry

Starphotometry, the nighttime counterpart of sunphotometry, has not yet achieved the commonly sought observational error level of 1%: a spectral optical depth (OD) error level of 0.01. In order to address this issue, we investigate a large variety of systematic (absolute) uncertainty sources. The bright star catalog of extraterrestrial references is noted as a major source of error, with an attendant recommendation that its accuracy, as well as its spectral photometric variability, be significantly improved. The small Field of View (FOV) employed in starphotometry ensures that starphotometry, unlike sun- or moonphotometry, is only weakly dependent on the intrinsic and artificial OD reduction induced by scattering into the FOV by optically thin clouds. A FOV of 45 arc-seconds was found to be the best tradeoff for minimizing such forward-scattering errors concurrently with flux loss through vignetting. The importance of monitoring the sky background and using interpolation techniques to avoid spikes and to compensate for measurement delay was underscored. A set of 20 channels was identified to mitigate contamination errors associated with stellar and terrestrial-atmospheric gas absorptions, as well as aurora and airglow emissions. We also note that observations for starphotometers similar to our high-Arctic starphotometer should be made at high angular elevations, i.e. at airmasses lower than 5. We noted the significant effects of snow crystal deposition on the starphotometer optics, how pseudo-OD increases associated with this type of contamination could be detected, and how proactive techniques could be employed to avoid their occurrence in the first place. If all these recommendations are followed, one may aspire to achieve component errors that are well below 0.01 and, in the process, attain a total 0.01 OD target error.


Introduction
The nocturnal monitoring of semi-transparent atmospheric features, such as particles (aerosols, optically thin clouds) or gases (O3, H2O), can be performed using attenuated starlight to derive a spectral optical depth (OD). The passive remote sensing method of stellar spectrophotometry (known as starphotometry by the atmospheric remote sensing community) was accordingly introduced in the early 1980s (Alekseeva, 1980; Roddier, 1981). Despite some technological progress, accurate stellar spectrophotometry remains a challenge (Deustua et al., 2013; Bohlin et al., 2014). Its evolution, with an emphasis on problems particular to starphotometry, can be followed in Roscoe et al. (1993); Leiterer et al. (1995, 1998); Herber et al. (2002); Pérez-Ramírez et al. (2008a, b); Baibakov et al. (2009, 2015); Ivȃnescu (2015). The accuracy of the optical depth (OD) retrieval resolution camera (for two of the points on Figure 2). Details on the theoretical and empirical context needed to understand the star spot computations are given in the associated text (Appendix C).
In order to avoid any flux loss, the photometer FOV must be much larger than the FWHM of the short-exposure image, whose intensity profile (the star Point Spread Function, or PSF) can be approximated with a Gaussian profile (Racine, 1996). The total FWHM is then quadratically composed of ω = FWHM of the "seeing" spot (the blurring due only to the air turbulence) and ω_d = FWHM of the Airy diffraction spot (approximated by a Gaussian profile whose FWHM is set equal to the FWHM of the diffraction spot). Optical aberrations, especially coma for this type of telescope, may also play a role. However, tests done at AiryLab (2012) show that the C11, when correctly collimated, is not subject to optical aberrations that influence the size of such large star spots as those of Figure 2. The angular size of an Airy spot can be computed as ω_d = 1.03·λ/D, with λ being the measurement wavelength and D the telescope aperture. This gives less than 1″ (0.49″ for the C11 and 0.75″ for the M703), at λ = 640 nm (peak of CCD detection).
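The diffraction FWHM quoted above can be checked directly; a minimal sketch, assuming nominal apertures of roughly 0.28 m for the C11 and 0.18 m for the M703 (the aperture values are assumptions, not stated in the text):

```python
import math

ARCSEC_PER_RAD = 180.0 / math.pi * 3600.0  # ≈ 206 265

def airy_fwhm_arcsec(wavelength_m: float, aperture_m: float) -> float:
    """FWHM of the Airy diffraction spot, ω_d = 1.03·λ/D, in arc-seconds."""
    return 1.03 * wavelength_m / aperture_m * ARCSEC_PER_RAD

# At λ = 640 nm (peak of CCD detection):
print(round(airy_fwhm_arcsec(640e-9, 0.28), 2))  # ≈ 0.49″ (C11)
print(round(airy_fwhm_arcsec(640e-9, 0.18), 2))  # ≈ 0.76″ (M703; close to the quoted 0.75″, sensitive to the exact aperture)
```

Both values are well below 1″, i.e. 10-20 times smaller than the observed star spots, consistent with the discussion below.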
Since these values are 10-20 times smaller than the star spots, the observed FWHM is practically ω. Figure 2 indicates that, for typical atmospheric remote sensing sites (near sea level, not particularly dry, near heated buildings, etc.), the expected seeing could be ∼ 10 times larger than what is usual in professional astronomy (∼ 1″). Uncontrolled telescope motion in strong winds may also broaden the observed spot. It is important to note the dramatically large values associated with the sea-level (10 m altitude) Eureka station of Figure 2. However, the seeing at the 610 m altitude Ridge Lab (CANDAC site, also at Eureka) is relatively very small (Steinbring et al., 2013) and comparable with the best observing sites. One concludes that most of the turbulence at Eureka is confined to the first few hundred meters above sea level. It is instructive to characterize the vertical structure of the turbulence, notably its effect on the refractive index variation and, consequently, on star blurring (see, for example, Owens (1967) for basics on the refractive index of air). Unfortunately, a precise characterization solely based on radiosonde measurements may not be possible (Roddier, 1981). However, that vertical structure can nevertheless be approximated parametrically. Accordingly, we express the vertical variation of the star spot size due to turbulence via equation (1), where dv/v is the relative wind shear (whose kinetic turbulent energy is the primary influence on the refractive index variation (dn) between the atmospheric layers). This equation is a first-order, empirically derived convenience expression, whose goal is to arrive at a coarse representation of ω versus altitude. The constant k_t is an empirical normalizing constant that adjusts the right side of equation (1) so that its integration yields the surface-level ω values of Figure 2.
Employing an ensemble of Eureka polar winter sounding profiles acquired over a ∼ 6 week period within the Figure 2 measurement period, we integrated the dω² interpretation of those profiles from the maximum altitude of the radiosonde down to a given altitude in order to yield ω at every altitude (Figure 3). On the median (red) and average (green) curves one can identify major blurring increases: just below 3 km, below 200-400 m (suggesting a quasi-permanent turbulent layer), and again about 10-20 m from the surface.
This confirms the very low ω values at the Ridge Lab, despite the dramatically large seeing at sea-level.
Observing methodology

A photometric system, from the perspective of the astronomical community, is a system assessing the brightness of an object on a logarithmic scale, normalised to a standard reference (a natural source or a convenient synthetic spectrum).

[Figure 3 caption] Most of the turbulence is below the Ridge Lab elevation (magenta line). The black curves were derived from the two nearest soundings to the ω measurements in Figure 2. The k_t = 6 value (derived for the black curves) was employed for all the other curves.

Catalog photometric system
We denote by I the star irradiance expressed in absolute measurement units. By definition, the apparent magnitude (M) of a star is computed from the ratio between I (the observed irradiance) and I_0,cref, the unattenuated ("0") irradiance of a "catalog reference" ("cref") source:

M = −2.5 log (I / I_0,cref) = −2.5 log I + 2.5 log I_0,cref, (2)

where "log" is short for "log10". The quantity ZP = 2.5 log I_0,cref is usually referred to as the "zero-point" of the photometric system and serves, from a practical standpoint, to identify the photometric system. Star magnitudes are therefore photometric-system dependent. The magnitude of the reference source at any wavelength is, by definition, M_0,cref = 0 (i.e. when I = I_0,cref). Most of the photometric systems currently employed are based on Vega as the primary reference source ("primary standard") (Bessell, 2005).
One can recast equation (2) into its extraterrestrial form:

M_0 = −2.5 log (I_0 / I_0,cref) (3)
    = −2.5 log I_0 + 2.5 log I_0,cref. (4)

The adjective "extraterrestrial" can also be represented by "unattenuated", "extra-atmospheric", "exoatmospheric" or "zero-airmass" in the literature (ground-based measurements are also referred to as "attenuated"). The signature extraterrestrial magnitudes of each star are found in the catalog(s) of various observation campaigns. M_0, obtained with the V standard wide-band filter (Johnson and Morgan, 1953), covering most of the visible spectrum, is usually called the visual V magnitude. The blue B magnitude is then obtained with their B band filter, etc. Our reference is the Pulkovo spectrophotometric catalog (Alekseeva et al., 1996). This catalog provides near-UV to near-IR I_0 spectra¹ for most of the brightest stars (V < 3). Its magnitudes (Alekseeva et al., 1994) are simply expressed as

M_0 = −2.5 log I_0, (5)

with I_0 converted to cgs units of erg s⁻¹ cm⁻² cm⁻¹. From equations (3) and (4), one concludes that ZP = 0 in equation (5).
Therefore, the reference spectrum used to compute the Pulkovo catalog magnitudes is spectrally flat and equal to unity (I_0,cref = 1 erg s⁻¹ cm⁻² cm⁻¹). Such a unit reference defines a "raw" photometric system (or "raw" magnitudes). Its SI-units value of 0.1 W m⁻² m⁻¹ is near the Vega irradiance maximum of 0.0796 W m⁻² m⁻¹ at 402.5 nm (when measured at 8.2 nm bandwidth)².
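The raw-magnitude convention and the cgs/SI conversion can be made concrete with a short sketch (the Vega irradiance value is the one quoted above; the conversion factor follows footnote 2):

```python
import math

# 1 erg s⁻¹ cm⁻² cm⁻¹ = 1e-7 J/s · 1e4 m⁻² · 1e2 m⁻¹ = 0.1 W m⁻² m⁻¹ (footnote 2)
ERG_S_CM2_CM_TO_SI = 0.1

def raw_magnitude(irradiance_cgs: float) -> float:
    """Pulkovo "raw" magnitude: M0 = −2.5·log10(I0), I0 in erg s⁻¹ cm⁻² cm⁻¹ (ZP = 0)."""
    return -2.5 * math.log10(irradiance_cgs)

# Vega's irradiance maximum, 0.0796 W m⁻² m⁻¹ at 402.5 nm, expressed in the raw system:
i_vega_cgs = 0.0796 / ERG_S_CM2_CM_TO_SI  # = 0.796 erg s⁻¹ cm⁻² cm⁻¹
print(round(raw_magnitude(i_vega_cgs), 3))  # ≈ 0.248
```

A unit irradiance (the raw reference itself) gives a magnitude of exactly 0, as required by M_0,cref = 0.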

Theoretical considerations
The starphotometer measurement principle is based on the Beer-Bouguer-Lambert attenuation law applied to the starlight passing through the Earth's atmosphere (as described, for example, in Liou (2002)). The attenuation, due to the out-scattering and absorption of the incoming light by atmospheric particles and gases, is described by

I = I_0 · e^(−mτ), (6)

with τ being the total vertical optical depth, m the stellar airmass, and I and I_0 the attenuated and unattenuated star irradiances, respectively. For a plane-parallel atmosphere approximation, m = 1/cos θ, with θ being the zenith angle of a given star (the approximation is generally valid for θ ≲ 80°, or m ≲ 6).
The law formulated in equation (6) can be converted into a more practical linear form by expressing it in terms of apparent magnitudes (M and M_0), as defined in equations (2) and (4). Taking the logarithm of equation (6), one obtains

M − M_0 = 2.5 log e · mτ, (7)

where e is the natural logarithm base. The exponential law then becomes a linear relation in terms of apparent magnitudes

M = M_0 + 2.5 log e · mτ = M_0 + (m/0.921)τ. (8)

This expression, in conditions of approximately constant τ, can be used to retrieve the intercept M_0 from a linear regression of M versus m. This can be done, for example, by employing a series of irradiance measurements carried out over a clear night with significant changes in m (not always a given in the case of a high Arctic site). Such a procedure is referred to as the Langley calibration technique, or Langley plot (also described in Liou (2002)).
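The Langley procedure described above can be sketched numerically; a minimal synthetic example (the M_0 and τ values are hypothetical, and the series is noiseless for clarity):

```python
import numpy as np

# Synthetic clear-night series following M = M0 + (m/0.921)·τ with constant τ
M0_true, tau_true = 2.10, 0.08        # hypothetical star magnitude and optical depth
m = np.linspace(1.5, 5.0, 30)         # airmass variation over the night
M = M0_true + (m / 0.921) * tau_true  # simulated attenuated magnitudes

# Langley plot: linear regression of M against x = m/0.921
x = m / 0.921
slope, intercept = np.polyfit(x, M, 1)
print(round(intercept, 3), round(slope, 3))  # recovers M0 and τ
```

With real data the regression residuals (and the assumption of constant τ) set the quality of the retrieved intercept.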

¹ While the Pulkovo catalog irradiances are correctly expressed in SI units of W m⁻² m⁻¹ in the VizieR online database, their values in the published paper have to be divided by 10⁵ to yield W m⁻² m⁻¹. ² Where the conversion is as follows: 1 W m⁻² m⁻¹ = 10 erg s⁻¹ cm⁻² cm⁻¹, then I_0,cref = 1 erg s⁻¹ cm⁻² cm⁻¹ = 0.1 W m⁻² m⁻¹.

Practical considerations
The measured star signal (F) is expressed in counts per second (cnt/s). If F_0,iref is the unattenuated "instrument reference" ("iref") signal, defining the instrument photometric system (in cnt/s), the attenuated and unattenuated instrumental magnitudes (S and S_0, respectively) can be expressed, in a manner analogous to equations (2) and (4), as

S = −2.5 log F + 2.5 log F_0,iref,   S_0 = −2.5 log F_0 + 2.5 log F_0,iref. (9)

One can convert F into I with an instrument-specific conversion factor c,

I = c·F. (10)

Applied to the two system references, the ratio becomes

c_ref = I_0,cref / F_0,iref. (11)

This represents a transformation (scaling) factor from the instrument to the catalog reference system. The unitless c/c_ref ratio then incorporates the photometric-system scaling (c_ref) as well as the optical and electronic throughput of the instrument (c).
In terms of magnitudes, we can define the instrument-specific calibration parameter

C = −2.5 log (c / c_ref). (12)

Substituting S and S_0 from equations (9), as well as M and M_0 from equations (2) and (4) into equation (12), yields

C = M − S, (13)
C = M_0 − S_0, (14)

where the role of C as a conversion factor between the catalog and instrument magnitudes is made readily apparent by the elegant simplicity of this pair of equations.
If the catalog reference is the unattenuated source being observed, then I 0 = I 0,cref and M 0,cref = 0 as per equation (4).
Accordingly, from equation (14), C = −S_0,cref. Alternatively, if the instrumental reference is the unattenuated source being observed, then F_0 = F_0,iref and S_0,iref = 0 as per equation (9). Equation (14) then indicates that C = M_0,iref (i.e. the catalog magnitude of the instrument reference source). Equating the C values for those two special cases yields M_0,iref = −S_0,cref. In addition, the calibration parameter may be expressed as

C = 2.5 log (F_0,cref / F_0,iref), (15)

with F_0,cref the instrument signal measured when observing the star catalog reference.

In practice, equations (9) are often expressed as

S = −2.5 log F,   S_0 = −2.5 log F_0. (16), (17)

This either implies that F and F_0 are unitless (i.e. measurements are already normalised to the instrument reference), or that the reference is conveniently chosen as F_0,iref = 1 cnt/s (ZP = 0). Such a unit reference, as in the case of the catalog system, defines a "raw" photometric system (this is the system that is employed for our starphotometers). According to equation (11), having unit values for both photometric system references implies that the scaling factor is also unity (c_ref = 1). This yields

C = M_0 + 2.5 log F_0. (18)

The calibration procedure then reduces to the unattenuated measurement of any source of known irradiance. Equation (18) may be used in laboratory-based calibrations, or in "in-situ" calibrations, by measuring any accurately known star spectra.
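Under raw references on both sides, the calibration parameter reduces to simple arithmetic; a sketch with hypothetical values for an unattenuated source of known irradiance (both numbers are illustrative, not measurements from the text):

```python
import math

def calib_C(I0_cgs: float, F0_cnt_s: float) -> float:
    """C = M0 − S0 under raw references: M0 = −2.5·log10(I0), S0 = −2.5·log10(F0)."""
    M0 = -2.5 * math.log10(I0_cgs)    # catalog raw magnitude (I0 in erg s⁻¹ cm⁻² cm⁻¹)
    S0 = -2.5 * math.log10(F0_cnt_s)  # instrument raw magnitude (F0 in cnt/s)
    return M0 - S0

# Hypothetical: a star of I0 = 0.5 erg s⁻¹ cm⁻² cm⁻¹ yields F0 = 2.0e4 cnt/s unattenuated
print(round(calib_C(0.5, 2.0e4), 3))  # ≈ 11.505
```

C depends only on the instrument throughput, so the same value applies to every star as long as the instrument characteristics do not change.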

This may be done in a Rayleigh atmosphere (i.e. without aerosols or clouds), for which the attenuation can be accurately estimated (Bucholtz, 1995). Such conditions can generally be approximated at high-elevation calibration sites (supported by some independent estimate of the small but non-negligible aerosol optical depth).
If we define, for simplicity,

x = m/0.921, (19)

equation (8) can be rewritten as

M = M_0 + xτ. (20)

Substituting M from (13) into (20) yields a Langley calibration equation whose ground-based (τ-dependent) component is expressed in terms of the instrument signal S:

S = M_0 − C + xτ. (21)

This expression enables the retrieval of C when M_0 is provided by a catalog. However, if an accurate M_0 spectrum cannot be found, then equation (14) can be used to transform equation (21) into a pure instrumentation version,

S = S_0 + xτ, (22)

so that a catalog is no longer required. Instead of finding C, one has to employ Langley calibrations to estimate S_0 for all stars that are part of the operational protocol of a given starphotometer. Equation (21), on the other hand, has the advantage of casting the calibration procedure in terms of an explicit function of a single star-independent constant (C). C represents an intrinsic parameter that remains constant as long as the instrument characteristics do not change.

One-star method (OSM)
Considering that the main purpose of starphotometer measurements is to retrieve the optical depth (τ), we rearrange equations (21) and (22) to yield

τ = (S − M_0 + C)/x = (S − S_0)/x. (23)

Restricting measurements to one star speeds up the acquisition process. This is particularly useful in the presence of the rapid τ variations that one observes, for example, during cloud events. However, since equation (23) contains calibration values, any optical or electronic degradation of the instrument will propagate into the τ estimation. The Langley calibration enabled by equation (21) allows the direct retrieval of τ as the slope of a linear regression between S and x. In lieu of such a lengthy procedure (typically requiring hours of measurements acquired over a large range of x), or of directly applying the instantaneous OSM equation (23), one can use the measurements of two different stars at two different airmasses in the Two-star method (TSM)³. Equation (21) yields

τ = [(S_1 − S_2) − (M_0,1 − M_0,2)] / (x_1 − x_2), (24)

where the subscripts "1" and "2" refer to a low star (large airmass) and a high star (small airmass), respectively. In order to minimize OD errors associated with this technique, the airmass difference between the two stars should be large. However, beyond airmass 5, the impact of higher measurement errors may overcome the benefits of a large airmass range (see Young (1974) for an optimization analysis). In practice, the high star is in the range of 1-2 airmasses, while the low star is in the range of 3-5.
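The TSM relation can be verified with a self-consistent numerical sketch (all values are hypothetical, constructed from equation (21)):

```python
def tsm_tau(S1, M01, x1, S2, M02, x2):
    """Two-star method: eliminate C between two applications of equation (21)."""
    return ((S1 - S2) - (M01 - M02)) / (x1 - x2)

# Build consistent signals from equation (21) with assumed C and τ
C, tau = 11.5, 0.12
M01, x1 = 2.4, 4.0 / 0.921   # low star, airmass 4
M02, x2 = 1.1, 1.5 / 0.921   # high star, airmass 1.5
S1 = M01 - C + x1 * tau
S2 = M02 - C + x2 * tau
print(round(tsm_tau(S1, M01, x1, S2, M02, x2), 3))  # → 0.12, recovering τ without C
```

Note that the instrument constant C cancels exactly, which is the "auto-calibrating" property discussed next.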
The "auto-calibrating" feature of equation (24) (i.e. no need for C) is limited in its applicability: there are temporal and spatial restrictions on the variation of τ between the two observations. It is therefore a method that is more appropriate for the typically weak and slow variations associated with aerosols (as opposed to the typically strong and high-frequency variations associated with clouds). There are also restrictions on the optical throughput variation: specifically in terms of any dust, dew, frost or snow deposition on the telescope optics, or star vignetting and focusing variations between the two observations. In fact, the TSM can be interpreted as an OSM, with C being determined by regression through only two data points. The TSM-based calibration vector can be obtained from equation (23):

C = M_0 − S + xτ. (25)

Optical depth accuracy
In reality, we cannot measure the starlight alone, since the measurement always includes a background signal B. The latter is mainly due to the electronics readout signal and sky brightness. If R is the starphotometer measurement obtained while pointing towards the star, then B can be estimated by a slightly off-axis measurement. In dark sky conditions, B is dominated by the instrument dark current. The desired starphotometer (starlight) signal is estimated as

F = R − B, (26)

with attendant systematic error components

δF = δR + δB. (27)

For small relative errors δF/F, one obtains δS by taking the derivative |S′| of S with respect to F in equation (16):

δS = 2.5 log e · δF/F. (29)

If the only errors are in S, then equation (23) yields

δτ = δS/x. (30)

However, the optical depth accuracy is subject not only to errors in the observational parameter (S), but also to all the other physical parameters (M_0, C, x) involved in the starphotometry retrieval. All the contributions to the line-of-sight observation error can be explicitly listed by differentiating equation (23):

δτ = (δS + δM_0 + δC)/x + (τ/x)·δx. (31)

The other components of the observation error that represent magnitudes (M_0 and C, as per equations (5) and (18)) can, in a similar fashion to equation (29), be expressed as

δM_0 = 2.5 log e · δI_0/I_0,   δC = 2.5 log e · δF_0/F_0. (32), (33)
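The signal-error propagation above reduces to a couple of one-liners; a sketch with an illustrative 1% flux error:

```python
import math

LOG10_E = math.log10(math.e)  # ≈ 0.4343

def delta_S(F, dF):
    """δS = 2.5·log10(e)·δF/F, from S = −2.5·log10(F) (equation (16))."""
    return 2.5 * LOG10_E * dF / F

def delta_tau_from_S(dS, x):
    """If the only error is in S, equation (23) gives δτ = δS/x."""
    return dS / x

# A 1% flux error (δF/F = 0.01) observed at airmass m = 2, i.e. x = m/0.921:
dS = delta_S(1.0e4, 1.0e2)                    # ≈ 0.0109 magnitudes
print(round(delta_tau_from_S(dS, 2 / 0.921), 4))  # ≈ 0.005
```

Since 2.5·log10(e) ≈ 1/0.921, a relative flux error δF/F maps to δτ ≈ (δF/F)/m, which is why large airmasses relax the flux-accuracy requirement.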

A comprehensive description of starphotometry-related errors can be found in Young (1974) and Carlund et al. (2003). In the following sections we continue this work by quantifying the accuracy of each individual parameter of equation (31).

Spectrophotometric catalog (M 0 ) accuracy
In order to move from a star-dependent S_0 calibration, which is currently the standard (Rufener, 1986; Pérez-Ramírez et al., 2011), to the more convenient star-independent calibration in terms of C, one has to ensure that the exoatmospheric magnitudes M_0 are sufficiently accurate. The star dataset that we employed (Appendix B) was limited to stars with a maximum of 0.01 magnitude variation in the observations used to generate their Pulkovo catalog entry (that dataset was employed as the default catalog by the manufacturer of our instruments). They are mostly main-sequence stars (of luminosity class V) (Kippenhahn et al., 2012) at the most stable period of their life-cycle (five are of luminosity class II-III). Five are "early-type" spectral class B stars (i.e. B0-B3), one is of "late-type" class A (i.e. A7-A9) and one of class F. They are all characterized by weaker absorption lines and a cleaner continuum (Silva and Cornell, 1992). However, the "early-type" B stars may also experience non-negligible (0.01 magnitude) photometric variability (Eyer and Grenon, 1997). Beyond their intrinsic photometric stability, the M_0 accuracy remains a concern. Alekseeva et al. (1996) stated that: "to preserve the uniform absolute system for all our seasonal catalogues, we always used the same energy distribution of Vega based on the absolute calibrations by Oke and Schild (1970) and Kharitonov et al. (1978)". In other words, the Vega data was calibrated to the accuracy level achievable about 50 years ago. In addition, Knyazeva and Kharitonov (1990) specified that their (Kharitonov et al. (1978)) calibration values were actually subject to systematic errors that could be as large as 10%. In spite of the shortcomings of the Pulkovo catalog, it remains the most accurate catalog in terms of representing the entire bright star dataset of Appendix B. By comparison, the Hubble Space Telescope (HST) dataset includes only a few of those stars.
To better understand the impact of the Pulkovo catalog shortcomings, we compared its absolute irradiances with those measured by the HST. This higher-accuracy dataset only contains a few bright stars: Vega (HR7001) and Sirius (HR2491) from the CALSPEC Calibration database, and HR15, HR2618 and HR4295 from the STIS New Generation Stellar Library (NGSL) (Bohlin et al., 2001). Inasmuch as HST measurements are performed with a more recent technology, are not subject to atmospheric effects and have absolute errors below 1% (Bohlin, 2014), we considered them to be the reference. The corresponding magnitude differences between the Pulkovo and HST spectra, computed in terms of the Pulkovo photometric system, are presented in Figure 4. Given the potential impact of atmospheric errors, it is remarkable, for a catalog derived from ground-based measurements, that more than half of the standard starphotometer channels (open circles) are characterized by errors of less than 2%, or equivalently δM_0 < 0.02 (equation (32)). Based on the average difference of Figure 4, one nevertheless concludes that the Pulkovo catalog is characterized by a bias that is particularly large in the near-UV and in the 900-1000 nm range. These biases may, in part, be attributable to uncertainties related to the stronger aerosol scattering effects in the UV and to water vapour effects in the near-infrared (NIR). The average bias found in Figure 4 could then be used to correct the Pulkovo catalog. However, a bias will not actually affect the optical depth measurements. For example, in the TSM mode, such a bias is canceled out in the M_0 magnitude difference of equation (24). Even in the OSM mode of equation (23), the bias will actually propagate into C during the calibration process. This bias transfer is attributable to the fact that a bias will only affect the intercept of the Langley plot, not its slope, as expressed by equation (21).
The δM_0 standard deviation of Figure 4 (∼ 0.02) can, on its own merits, be compared with the accuracy of 0.015-0.02 claimed for the Pulkovo catalog (Alekseeva et al., 1996), although these values increase in the UV and water vapour channels. One should also note that for its primary reference stars, such as Vega and Sirius, the 0.02 Pulkovo catalog upper limit of error is halved. Such error levels will impact information extraction from optical depth spectra, especially as regards the required accuracy for aerosol retrievals sensitive to higher orders of the AOD spectrum. Alekseeva et al. (1996) re-processed the 5 nm measurements to synthesize a unique 10 nm bandwidth. Currently, we only use the 10 nm bandwidth version over the entire 310-1105 nm range. However, as noted in Young (1992), a bandwidth mismatch between the catalog and the instrument (i.e. 10 and 8.2 nm, respectively, in our case) may have an impact on the optical depth error and merits investigation. In order to assess the impact of the bandwidth mismatch, we compared the magnitude errors incurred when using M_5.0 and M_10, associated with the 5 nm and 10 nm bandwidths, instead of the actual magnitude M_8.2 at 8.2 nm bandwidth. We also assessed how a simple magnitude combination (M_5.0 + 2·M_10)/3 compares with the actual 8.2 nm bandwidth magnitude, as a means of improving the existing 10 nm bandwidth catalog. We synthesised star magnitudes for those three different bandwidths by applying Gaussian bandpass filters to the HST data (originally at 1 Å resolution). This is, in fact, a convolution operation that effectively blurs the stellar absorption lines.
In Figure 6 we compare the magnitudes computed for the three bandwidths, for a star of spectral class A0 (Vega). Figure 6a shows a spectral zoom about the 420 nm starphotometer channel. The increased broadening with increasing FWHM about the H γ and H δ Balmer lines demonstrates the blurring effect of the different bandwidths. The graph also shows that one may actually limit the blurring impact by optimizing the spectral location of a given channel. Moving the 420 nm channel to 423 nm will, for example, significantly reduce that impact. Figure 6b shows the contamination due to the different blurring levels for the entire spectrum (contamination expressed in terms of δM_0 which, from equation (31), is, in the absence of other errors, equivalent to x·δτ). The same exercise carried out for a star of early-type spectral class B (Adharaz) underscores the fact that the H Balmer lines are much weaker (Figure 7a). One expects a similar behaviour for our "late-type" A and F class stars. Consequently, the blurring contamination over the entire spectrum (Figure 7b) is, for the case that concerns us the most (M_8.2 − M_10), largely less than 0.01, except for the 958 nm channel, which is too close to the 954.6 nm H Paschen absorption line. Inasmuch as all our operational stars are of class A and B (except for one F class star), this analysis is representative. Since the bandwidth mismatch error is a bias that differs for the two star classes of Figures 6 and 7, it may be minimised by distinct photometric calibrations for each star class. However, this may be of limited applicability since the local sky does not present a sufficient array of photometrically stable stars of early-type B, late-type A and F spectral classes.
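The Gaussian-bandpass synthesis used for Figures 6 and 7 can be sketched as follows; the spectrum here is a toy continuum with a single artificial line at the 954.6 nm H Paschen wavelength (not HST data), so only the qualitative behaviour is meaningful:

```python
import numpy as np

def synth_magnitude(wl_nm, irradiance, center_nm, fwhm_nm):
    """Synthesize a band magnitude by weighting a high-resolution spectrum
    with a Gaussian bandpass of the given FWHM (simplified sketch)."""
    sigma = fwhm_nm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    w = np.exp(-0.5 * ((wl_nm - center_nm) / sigma) ** 2)
    band_irr = np.trapz(w * irradiance, wl_nm) / np.trapz(w, wl_nm)
    return -2.5 * np.log10(band_irr)

# Toy spectrum: flat continuum with a Gaussian absorption line at 954.6 nm
wl = np.arange(900.0, 1000.0, 0.1)
spec = 1.0 - 0.5 * np.exp(-0.5 * ((wl - 954.6) / 0.5) ** 2)

# The 958 nm channel: magnitude difference between 8.2 nm and 10 nm bandwidths
m_82 = synth_magnitude(wl, spec, 958.0, 8.2)
m_10 = synth_magnitude(wl, spec, 958.0, 10.0)
print(round(m_82 - m_10, 4))  # non-zero: bandwidth-mismatch contamination
```

The narrower 8.2 nm band weights the nearby line more heavily, so the difference is non-zero, illustrating why the 958 nm channel is singled out above.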
Up until this point, we have presumed a stable spectral calibration of the instrument. In Figure 8 we show SPST09 spectral calibration drifts of 0.5 nm from one year to the next. Such a spectrally variable drift is particularly harmful inasmuch as it will likely influence the spectral shape of the photometric calibration vector (i.e. C). A second consequence is that the channels may be subject to additional stellar absorption line contamination if the drift moves them closer to those lines.
A third, broadband, consequence of the spectral drift results from the stellar-magnitude spectra being generally characterized by a significant positive spectral slope over the 400-1100 nm range for both A and B class stars (c.f. Figure 9a and Figure 10a); the resulting δM_0 impact of a spectral shift is illustrated in Figure 9b. These results indicate that the maintenance of photometric bias values below 0.01 magnitudes requires a spectral calibration within 1 nm (excluding the case of strong water vapour absorption in the near-infrared). The same exercise is presented in Figure 10 for the early-type B star. While there are individual channel differences with respect to the class A star, the broad δM_0 results are similar because the M_0 slopes are similar.
As long as we employ the same class (similar spectral signatures) for both high and low TSM stars, any spectral drift is mitigated in real-time (i.e. similar δM_0 trends produce common biases, and thus the type of bias mitigation discussed in the case of Figure 4 will prevail). While the bias in the OSM case will be initially absorbed into the calibration constant, any additional drift will progressively propagate into a post-calibration δτ error. Based on the analysis of Figure 8, an annual spectral calibration (preferably at the beginning of the observing season) will likely ensure that the spectral drift be constrained to values ≲ 0.5 nm, with negligible effect on the measurement accuracy. Our experience indicates that the six absorption lines employed in the development of Figure 8 are sufficient to adequately characterize the spectral shift of all the starphotometer channels. The radial velocities (stellar center of mass moving away from or towards an Earth-bound observer) of our Eureka stars, as retrieved from Wenger et al. (2000), lead to a maximum Doppler spectrum shift, among our stars, of 0.15 nm at 1000 nm and 0.06 nm at 400 nm. Therefore, this effect can be neglected during spectral calibration.
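The quoted Doppler shifts correspond, via Δλ = λ·v/c, to radial velocities of roughly 45 km/s (an inferred illustrative value, not stated in the text):

```python
C_KM_S = 299792.458  # speed of light, km/s

def doppler_shift_nm(wavelength_nm: float, radial_velocity_km_s: float) -> float:
    """First-order Doppler shift Δλ = λ·v/c."""
    return wavelength_nm * radial_velocity_km_s / C_KM_S

# A radial velocity of ~45 km/s, of the order of the largest values among bright stars:
print(round(doppler_shift_nm(1000.0, 45.0), 2))  # ≈ 0.15 nm
print(round(doppler_shift_nm(400.0, 45.0), 2))   # ≈ 0.06 nm
```

Both shifts are well below the ≲ 0.5 nm spectral-calibration tolerance derived above, supporting the conclusion that the effect is negligible.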
An M 0 catalog whose bandwidths match those of the instrument is preferred in order to avoid bandwidth mismatch errors.
One natural approach would be to generate an S_0 catalog by calibrating the starphotometer at a high-altitude site. A single calibrating site may not, however, yield a sufficient number and class diversity of S_0 values (i.e. a sufficiently comprehensive catalog of stars) to satisfy the starphotometry requirements of a given operational starphotometer site. For spectrometer-based starphotometers, it is necessary to retrieve S_0 at all available spectrometer channels (not just the nominal operational channels), since the spectral drift calculations need to be done at the highest resolutions. This S_0 catalog can then be transformed into a corresponding M_0 catalog by first resampling HST M_0 values of a selected reference (Vega or Sirius) to the spectrometer resolution and then employing equation (14) to compute C. With that HST-derived value of C in hand, the same equation can be rearranged to yield M_0 = C + S_0 values for all the other stars. Accurate C values and spectral calibration may also be obtained in the laboratory with the help of a halogen calibration lamp (Paraskeva et al., 2013), or by doing simultaneous measurements on site with a collocated calibrated instrument.
The alternative to an instrument-specific catalog is to use a general-purpose high-resolution spectrophotometric catalog, from which one can synthesize magnitudes at any bandwidth (as we did with the HST spectra). Given the maximum bandwidth mismatch errors found in the Pulkovo catalog (∼ 0.04 in Figure 6b for standard channels, at 8.2 nm bandwidth), we estimate that a catalog with about 1 nm bandwidth, i.e. about a factor of 10 less, would be enough to limit the errors to < 0.01. We note that the generally sub-0.01 mismatch errors estimated for a 1 nm spectrum shift (Figures 9b and 10b) are not inconsistent with this affirmation. In general a higher-resolution catalog, such as the HST catalog with its 1 Å resolution, would be preferred. It is however surprising that there are no existing high-resolution, near-UV to near-IR, spectrophotometric catalogs that achieve 1% accuracy (Kent et al., 2009) for the bright (V < 3) stars. The stars observed by professional astronomers are usually much fainter (V > 6) in order to avoid saturating the detectors. This may explain the lack of interest from the astronomical community in improving the absolute spectrophotometry of bright stars. An effort to address this situation was pursued by Le Borgne et al. (2003), with their release of the STELIB catalog. However, we identified large biases in the blue/UV part of the STELIB spectra (Figure 11) in comparison with the HST NGSL catalog. The fact that the Pulkovo catalog also has the largest bias in that range (Figure 4) suggests a recurring issue for catalogs generated from ground-based observations (perhaps due to the higher optical depth in the blue, and deficient compensation for aerosol contributions), and accordingly, that an accurate catalog must be of extraterrestrial origin. Most of the ground-based measurements are focused on achieving 1% accuracy using broad-band photometry (Stubbs and Tonry, 2006). It is noteworthy however that Zhao et al.
(2012) reported a new spectrophotometric (high spectral resolution) catalog (including our entire bright star dataset) derived from LAMOST⁴ measurements that approaches the same 1% accuracy. The spectral resolution and bandwidth of this catalog are variable, but always sub-nm. The spectral range extends over most of our spectrum, but unfortunately not beyond 900 nm. A novel future approach for improving the ground-based catalogs would be to employ an accurately calibrated satellite light source in order to perform stellar differential photometry (Albert, 2012; Peretz et al., 2019).
As an alternative to satellite-based catalogs, the recent ACCESS rocket project (Kaiser and Access Team, 2016) was also a promising initiative, given their mandate to perform high spectral resolution photometry near the top of the atmosphere. Unfortunately their list of V < 3 bright stars is limited to Sirius and Vega. Another recent initiative is the NIRS STARS campaign (Zimmer et al., 2016), whose mandate is to produce a bright star spectrophotometric catalog using lidar measurements to back out the atmospheric contribution. However, once again, the brightest stars (V < 3) are largely excluded from consideration.
The most promising option is the use of GOMOS satellite-based star observations (Kyrölä et al., 2004). This sensor employs high resolution (1.2 and 0.2 nm, depending on spectral bands) limb starphotometry to retrieve ozone and other atmospheric components from space. Its off limb measurements, performed before each limb scan, can be used to build an exoatmospheric spectrophotometric catalog (Ivȃnescu et al., 2017). Unfortunately, GOMOS' spectral ranges of 250-675 nm, 756-773 nm and 926-952 nm don't cover our entire 400-1100 nm spectrum. They do, however, cover the problematic spectral ranges experienced in ground-based measurements (the UV/blue and across the O_2 and the H_2O absorption bands). The missing portions of the starphotometer spectra can be filled in by fitting the STELIB, LAMOST spectra, synthetic spectra (Rauch et al., 2013) or averaged star-type spectra (Pickles, 1998) to the GOMOS measurements.
Beyond that, the broadband photometric stability of bright stars remains an open question (as emphasized in Appendix B) and has to be investigated. A uniform photometric variation over the entire observed spectrum may, however, be less critical than a non-uniform one. In an example of the latter case, star temperature variations would lead to spectral distortions with potential impacts on aerosol retrievals. The analysis of the GOMOS measurements should enable a characterization of spectral variability. If however, an insufficient number of stable V < 3 stars are found (i.e. having differential M_0 variations of < 0.01 between channels), the use of fainter stars may be necessary: this would require the deployment of a larger starphotometer telescope. We will continue, in the short term, to employ the Pulkovo catalog spectra for the operational M_0 values of our star dataset. However we will use a synthesised 8.2 nm version, over the available spectral range, to match our starphotometer bandwidths.
Systematic errors in the calculation of the airmass m (or alternatively x) can be significant (see, for example, Rapp-Arrarás and Domingo-Santos (2011) for a review of analytical airmass formulae). The following operational equation characterizes m, in a spherically homogeneous, dry-air atmosphere, with an accuracy of better than 1% at m = 10 (Hardie, 1962):

m = sec z − 0.0018167 (sec z − 1) − 0.002875 (sec z − 1)² − 0.0008083 (sec z − 1)³,

where z is the apparent zenith angle (the zenith angle of the refraction-dependent telescope line of sight). This expression only departs significantly from the plane parallel expression of m = sec z at values of m > 5. If the target star position is computed using astronomical data rather than a measured instrumental mount position, then it is more appropriate to use the true zenith angle (z_t) formula of Young (1994). The computation of z_t can be effected using star coordinates, site location and time. It ensures an associated maximum 0.0037 airmass error at the horizon (with respect to calculations made on a standard mid-latitude atmospheric model).
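The Hardie (1962) polynomial and its departure from the plane-parallel sec z can be checked numerically; this is a minimal sketch with the coefficients as commonly quoted in the airmass literature:

```python
import math

def airmass_hardie(z_deg):
    """Hardie (1962) airmass for an apparent zenith angle z (degrees)."""
    sec_z = 1.0 / math.cos(math.radians(z_deg))
    t = sec_z - 1.0
    return sec_z - 0.0018167 * t - 0.002875 * t**2 - 0.0008083 * t**3

# Departure from the plane-parallel sec(z) only becomes significant at m > 5:
for z in (60.0, 78.5, 84.3):
    print(z, airmass_hardie(z), 1.0 / math.cos(math.radians(z)))
```

At z = 60° the two expressions agree to better than 0.3%, while near the m ≈ 10 limit the plane-parallel value overestimates the airmass by several percent.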
One should note that the airmass depends slightly on the vertical structure of the atmosphere (Stone, 1996; Nijegorodov and Luhanga, 1996): an effect which is particularly distinctive in a polar environment. The relative errors due to such environmental variations are however below 0.2% up to z ≈ 82° (m ≈ 7), and below 1% at z ≈ 87° (m ≈ 15) (Tomasi and Petkov, 2014).
Differences in airmass associated with different atmospheric constituents (Tomasi et al., 1998; Gueymard, 2001) have a negligible impact on the observation accuracy of starphotometry.
In spite of the generally high accuracy associated with airmass expressions, the airmass error can be significant if the recorded time stamps are inaccurate. Stars targeted by our starphotometers are recentered between several (3-5) consecutive exposures: a process that is of variable duration (usually 20-40 s). The airmass associated with the mean of all the measurement times (the one reported) may differ from the airmass associated with the mean observing time (the weighted mean time where the weights are exposure duration times). A δ_x error in x, induced by a δ_t time error, is equivalent to a measurement error δ ≡ δ_x·τ (equation (31)). Figure 12 shows the variation of δ with altitude (for hypothetical observation sites at different elevations), for a δ_t = +30 s case (i.e. time overestimation leading to δ_x > 0 for a descending star), at λ = 400 nm, and for three different airmasses in a Rayleigh atmosphere (the condition of molecular scattering domination; see Bucholtz (1995) for the optical parameterization of a Rayleigh atmosphere).
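The order of magnitude of such a time-stamp error can be sketched from spherical astronomy. The latitude, declination and hour angle below are purely illustrative, the plane-parallel x = sec z is used for simplicity, and the sea-level Rayleigh optical depth at 400 nm is a standard approximate value rather than the Bucholtz (1995) parameterization itself:

```python
import math

def zenith_angle(lat, dec, H):
    """True zenith angle (rad) from site latitude, star declination and hour angle (rad)."""
    cz = math.sin(lat) * math.sin(dec) + math.cos(lat) * math.cos(dec) * math.cos(H)
    return math.acos(cz)

def airmass_rate(lat, dec, H):
    """d(sec z)/dt in 1/s for the plane-parallel airmass x = sec z."""
    omega_e = 7.2921e-5                      # Earth rotation rate (rad/s) = dH/dt
    z = zenith_angle(lat, dec, H)
    dz_dH = math.cos(lat) * math.cos(dec) * math.sin(H) / math.sin(z)
    return (1.0 / math.cos(z)) * math.tan(z) * dz_dH * omega_e

lat, dec, H = map(math.radians, (45.0, 10.0, 40.0))  # illustrative geometry
tau_400 = 0.36        # approximate sea-level Rayleigh OD at 400 nm
dt = 30.0             # time-stamp error (s), as in the delta_t = +30 s case
delta_x = airmass_rate(lat, dec, H) * dt
delta = delta_x * tau_400                    # delta = delta_x * tau
print(f"delta_x = {delta_x:.4f}, delta = {delta:.4f}")
```

For this mid-latitude geometry the 30 s error stays well below the 0.01 target; the error grows with tan z and so becomes relevant mainly at large airmasses.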

The variation of δ with x is shown in Figure 13a for observations at 10 m (Eureka elevation) and 2360 m (Izaña observatory elevation). The real x variation at Eureka is weak (near the poles, stars carve out sky tracks that vary little in zenith angle). The δ variation, for unrestricted variation in x (up to x = 7), will be comparable for both sites. Figure 13b shows the corresponding variation of δ.

Observation (S) accuracy

Heterochromaticity
Wide-band optical depth calculations using starlight as the extinction source were first described in Rufener (1964) (in French).

A comprehensive description by Golay (1974) (pages 47-50) affirms that non-linear, wide-band radiation detection effects are negligible in terms of S estimation for spectral bandwidths narrower than 50 nm. The error associated with this non-linear component is about the squared ratio of the bandwidth to the central wavelength (i.e. (∆λ/λ)²; Rufener, 1986).
A bandwidth of less than 40 nm is then sufficiently small to achieve optical depth errors < 0.01 at 400 nm. These optical depth (heterochromaticity) errors should be well below the negligible value of 0.001 for our sub-10 nm starphotometer-channel bandwidths.
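The (∆λ/λ)² scaling behind these two statements can be verified directly; the bandwidths and the 400 nm reference wavelength are those quoted in the text:

```python
def heterochromatic_error(bandwidth_nm, wavelength_nm):
    """Order-of-magnitude heterochromaticity OD error, ~(dlambda/lambda)^2."""
    return (bandwidth_nm / wavelength_nm) ** 2

print(heterochromatic_error(40.0, 400.0))   # 0.01 at the 40 nm limit
print(heterochromatic_error(10.0, 400.0))   # well below 0.001 for sub-10 nm channels
```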

Lognormal fluctuations
The optical depth retrieval, as expressed by equation (23) or (24), is based on computing the instrumental magnitudes S through the logarithm of the measured star signal F. However, before doing so, one performs an arithmetic mean F̄ over several consecutive exposures. Since F is subject to log-normal fluctuations induced primarily by scintillation effects (Roddier, 1981), one should characterize its probability distribution in terms of its geometric mean ⟨log F⟩ and its geometric standard deviation σ_logF. The corresponding bias, called "misuse of least-squares" by Young (1974), is given by the classical relationship between the geometric and arithmetic means of a log-normal distribution:

δ_logF = log F̄ − ⟨log F⟩ = (ln 10 / 2) σ²_logF.

From equation (16) and the general definition of a standard deviation, δ_S = 2.5 δ_logF and, similarly, σ_S = 2.5 σ_logF. The bias then becomes δ_S = (ln 10 / 5) σ²_S ≈ 0.46 σ²_S. Since a single OD measurement is effectively the arithmetic mean of 3-5 measurements, then only observation fluctuations with σ_S > 0.22, which we basically never experienced (even at large airmasses), would lead to δ_S > 0.01. One can conclude from equation (30) that xδ_τ < 0.01 and thus that this issue is negligible in starphotometry.
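The classical arithmetic-versus-geometric-mean bias can be confirmed with a quick Monte Carlo experiment; the flux level and sample count below are arbitrary, and σ_S = 0.22 is the threshold value discussed above:

```python
import math
import random

random.seed(1)
sigma_logF = 0.22 / 2.5     # sigma_S = 0.22 expressed in log10-flux units
mu_logF = 3.0               # arbitrary geometric-mean log-flux

# Draw log-normal fluxes: log10(F) ~ Normal(mu, sigma)
logs = [random.gauss(mu_logF, sigma_logF) for _ in range(200_000)]
fluxes = [10.0 ** v for v in logs]

# Bias: log of the arithmetic mean minus the mean of the logs, in magnitudes
delta_logF = math.log10(sum(fluxes) / len(fluxes)) - sum(logs) / len(logs)
delta_S = 2.5 * delta_logF
predicted = 2.5 * (math.log(10) / 2) * sigma_logF ** 2
print(delta_S, predicted)
```

The simulated and predicted biases agree closely, confirming the (ln 10/2) σ² form of the log-normal correction.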

Forward scattering
Forward scattering into the photometer FOV by atmospheric particulates increases the magnitude of S and thereby induces an underestimate of the optical depth. This "forward scattering error" can be estimated with the single scattering expression

δ_τ/τ = ω · P_ΔΩ,

where ω is the single scattering albedo and P_ΔΩ is the integral of the normalised scattering phase function P over the angle Ω = FOV/2 (Shiobara et al., 1994). Figure 14 shows a variety of forward scattering error calculations obtained using equation (36) at a wavelength of 400 nm.

The red curve represents a typical biomass burning aerosol example (Qie et al., 2017), based on P given by the widely used Henyey-Greenstein (HG) phase function (Zhao et al., 2018). It underscores the negligible forward scattering error of such aerosols for any practical FOV size.
For ice-crystals, ω is practically unity. Two ice-crystal effective diameters were employed: 10 µm (non-precipitating clouds, magenta curves) and 120 µm (precipitating clouds, blue curves). Three crystal habit models were also employed.

The computations of Figure 14 assume that the contaminating particles (those that induce the FOV scattering effect) are also the particles that one seeks to detect. Those δ_τ/τ computations still apply when the contaminating particles differ from the particles to be detected, as long as the FOV effect of the contaminating particles dominates the FOV effect of the latter (for example the FOV effect could be dominated by small OD ice clouds while one seeks to detect fine mode aerosols of significantly larger OD). For optical depth measurements using CIMEL-like instruments in the presence of clouds, δ_τ/τ lies between 0.15 and 0.5 (the intersection of the dash-dotted CIMEL line with the ice cloud curves in Figure 14): this means that the δ_τ < 0.01 requirement can only be fulfilled for cloud τ less than 0.015 or 0.05, respectively. If the clouds are thicker than that (which is generally the case), cloud screening is required to ensure accurate AOD. In the case of our starphotometers, those errors are negligible in the presence of non-precipitating ice clouds (D_eff = 10 µm in Figure 14). Even in the case of precipitating clouds (D_eff = 120 µm in Figure 14), the SPST09/C11 instrument, for which δ_τ/τ ∼ 0.01, still provides the required accuracy as long as τ < 1.
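For the HG phase function, P_ΔΩ has a closed form, so the negligible-aerosol conclusion can be checked directly. This is a sketch: the asymmetry parameter g = 0.6 is an illustrative fine-mode aerosol value, not a value quoted in the text:

```python
import math

def hg_fraction(g, half_angle_rad):
    """Fraction of single-scattered light falling within a cone of given
    half-angle, for a Henyey-Greenstein phase function (g must be nonzero).
    Closed-form integral of P over solid angle from 0 to the half-angle."""
    u0 = (1.0 - g) ** 2
    u1 = 1.0 + g * g - 2.0 * g * math.cos(half_angle_rad)
    return (1.0 - g * g) / (2.0 * g) * (u0 ** -0.5 - u1 ** -0.5)

fov_arcsec = 45.0                                   # FOV diameter from the text
omega = math.radians(fov_arcsec / 2.0 / 3600.0)     # half-angle in radians
g = 0.6                                             # illustrative aerosol value
p_fov = hg_fraction(g, omega)
# delta_tau/tau ~ omega_0 * P_dOmega in the single-scattering picture
print(p_fov)
```

The result is of order 10⁻⁸, i.e. utterly negligible, consistent with the red aerosol curve of Figure 14.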

Night sky background
Airglow, and potentially aurora, can be important contributors to the night sky background (see Chattopadhyay and Midya (2006) on the importance of airglow). Their high frequency temporal and spatial variability (Dempsey et al., 2005; Nyassor et al., 2018) complicates their elimination in a background subtraction process. This can lead to significant optical depth systematic errors. Their spectra are similar: in particular both exhibit a strong 557.7 nm [OI] green emission line, whose intensity is used for classification of auroral strength. Unique signature features of each phenomenon are those due to OH band emissions in the case of airglow and N_2 (first positive system) emissions in the case of aurora (Chamberlain, 1995). The emission line intensities are usually expressed in Rayleigh (R) units (effectively a measure of directional panchromatic radiance, as per Baker (1974)). Auroral strength is classified into International Brightness Coefficient (IBC) classes, each a factor of 10 brighter than the previous; the strongest provides a total illumination on the ground equivalent to full moonlight (Chamberlain, 1995). We note that the assessment of the accuracy errors for those classes may help to infer the effect of moonlight and moonlit clouds too. Figure 15a shows their emission density spectra (Rayleigh per unit wavelength), converted to the 8.2 nm bandwidth of our starphotometers. The airglow data (the black solid curve) represent tropical nighttime observations made by Hanuschik (2003).
These include zodiacal light (sunlight scattered by dust from the solar system ecliptic plane). The actual airglow emissions should accordingly be even weaker. The aurora density spectra (the colored solid curves) are a compilation of observations from Jones and Gattinger (1972), Gattinger and Jones (1974), Jones and Gattinger (1975) and Jones and Gattinger (1976).
Their spectra were adjusted to produce three curves, representing the first three IBC levels. Their common continuum (irrespective of aurora class) is adjusted to 8 R/Å, the minimum value proposed by Gattinger and Jones (1974).
Figure 15b enables an appreciation of airglow and aurora effects on starphotometer measurements. It shows the ratio of those spectra to the Vega spectrum (artificially attenuated to magnitude V = 3, the faint limit of our star dataset). The resulting estimates of optical depth error (equation (30), converted to the observational error xδ_τ of equation (31)) differ between the two telescopes by the ratio of their solid angles, (57.3/36.9)². We note that, in spite of the fact that the C11 emission spectra are significantly higher in the near IR spectral region, they are, except for the IBC3 case, generally less than 1%. Short term airglow variability induced by air density fluctuations engendered by gravity waves may occur (Nyassor et al., 2018). Figure 15b indicates however that typical airglow conditions have a negligible error contribution. Even at twilight, when the sodium emission lines at 589.3 nm can be enhanced by a factor of 5 (i.e. the "sodium flash" reported by Krassovsky et al. (1962)), the potential accuracy error remains negligible.

On the other hand, the aurora is characterized by a much higher temporal and spatial variability (Dempsey et al., 2005).
Beyond that, the aurora shown in Figure 15 is of the green type (i.e. main visible line at 557.7 nm), but other types also occur, the most common being the red type, with its main visible line at 630 nm. One may therefore also encounter spectral variations.
Such variations may induce significant departures from the nominal emission background spectra of Figure 15a. Considering the results of Figure 15b as the worst-case estimation of those variations, the optical depth error remains well below 0.01 for the C11 telescope, even when observing a weak V = 3 star during an IBC2 aurora (solid red line). An IBC3 aurora can, given that a (factor of 10) IBC class change is equivalent to a magnitude change of 2.5, be accommodated by employing a sufficiently bright star: the IBC3 representation for a V = 0.5 star will decrease to the red (sub-0.01 error) IBC2 curve in Figure 15b. Fortunately, given the current location of Eureka in the auroral oval (Vestine, 1944), IBC3 aurora will only be seen occasionally near the horizon. Therefore, the accuracy errors of Figure 15b will only appear at airmasses above 5. However, this may change in the next decades, given the recent fast-paced migration of the magnetic pole (Witze, 2019; He et al., 2020).
The IBC definition provides a way to also infer δ_F/F errors associated with the presence of thin moonlit clouds, by simply arguing that the red IBC2 curve of Figure 15b also applies to the IBC2 analogy of "thin moonlit cirrus clouds". By definition, the δ_F spectrum for such a case corresponds to the IBC2 radiance of Figure 15a. The F value for a V = 0 star in a thin-cloud atmosphere can be modelled, in an order of magnitude fashion, by assigning a value of τ x = 3 to equation (21). Using this attenuated star signal as a rough model for the IBC2 moonlit clouds analogy, we employ the same equation to show that the V = 0, cloud-attenuated star magnitude is equivalent to an unattenuated (τ = 0) V = 3 star. In other words, the same F is used to obtain the red δ_F/F curve of Figure 15b but, with the added rider that the exoatmospheric star was a V = 0 star. Accordingly the acceptability of the sub-10⁻² red error curve in Figure 15b applies to the moonlit cloud IBC2 analogy, but for a V = 0 star. Actually, given the strong snow albedo in the Arctic, thin cloud brightness may even exceed IBC2 brightness during full moon conditions. Quantitative assessment of optical depth errors related to moonlit and twilight-lit sky brightness, especially in cloudy situations, would however require the development of a radiative transfer model informed by starphotometer background measurements. Given the complexity and specificity of such an endeavour, this will be addressed in a future study.
The typical polar wintertime night sky background spectrum at Eureka (in terms of catalog-photometric-system magnitude per square arc-second) is shown in Figure 16a, at two different times: mid-day (magenta curve, local time) and evening (blue). The evening sky is darker and approaches the detection limit of our instrument (as made evident by its noisier profile).
This detection limit may be the reason for the difficulty in identifying the aforementioned aurora and airglow lines in the visible range. An omnipresent weak line, unassociated with any major aurora emission lines, is however noticeable around 440 nm (436-445 nm band). Some absorption lines can also be identified: 532 nm, probably due to O_4 (Orphal and Chance), and 663 nm, probably due to NO_3. The midnight sky is expected to be even darker (higher visible magnitude). One also notices a brighter infrared spectrum, rather constant throughout the day, confirming the J band measurements of Sivanandam et al. (2012). This may be associated with the airglow OH lines, but at a factor of ∼ 10 higher than estimated in Figure 15b. The evening sky background with respect to a magnitude V = 3 star (simulated by dimming Vega) exceeds the 1% mark beyond 900 nm (red curve in Figure 16b). With respect to a magnitude V = 0 star (Vega, blue curve), the evening sky background remains below the 1% mark over the starphotometer spectral range, i.e. < 1050 nm. This indicates that accurate measurements, in the case of weakly radiating (V > 1) stars, can only be achieved by applying a reasonably accurate background subtraction for wavelengths larger than 1000 nm.

Calibration parameter (C) accuracy
The accuracy of the calibration parameter C retrieval is dependent on the performance of the calibration procedure and will accordingly be addressed in a separate study. C accounts for the optical and electronic throughput: we assess here the instrument instability or degradation that may alter it.

Misalignment issues
One way to get throughput degradation is by losing flux outside the boundary of the FOV, due to focusing error (blurring), to off-axis star centering errors, or because the FOV is simply too small (design error). The instrument was originally built for the M703 telescope specifications. The smaller FOV of the C11 telescope (almost half of the M703's) is at greater risk of focusing errors, particularly in Eureka, where the star spots are larger (Figure 2). An analysis of the impact of design shortcomings on both instruments is an instructive exercise. Figure 17 illustrates the effect of defocusing the optical train within the context of the associated OD errors (case of the C11 telescope) and of star centering errors (cases of both the C11 and M703 telescopes).
For the focusing error, the steps represent the focuser displacement (∼ 1.36 mm/step along the axis, or equivalently a ∼ 10″/step angular increase of the confusion circle), before and after passing through the on-focus position. For the centering error, they mean the spot shift, in pixels of the high resolution camera, before and after passing through the on-axis position.

Our focusing stage employs a continuously driven motor, subject to electronically controlled steps. Those steps represent approximately the same distance along the optical axis, based on a fixed driving time interval. The best that can be achieved is a half-step focus, for which the flux loss, in the C11 case, is a negligible ∼ 0.02%. In the absence of an automatic focusing procedure, the focus has to be checked and adjusted manually whenever there is an important temperature variation. This may happen because of weather changes or as the result of opening the dome (with significant optical impacts up to one hour after the opening). Based on our Arctic experience, the focus must be corrected by one focus step for each 10°C change of temperature: if this correction is performed, the flux loss is a negligible 0.35%. Any focusing errors larger than that will significantly affect the optical depth estimation (Figure 17b).
Star centering is based on an automatic tracking procedure that ends once a specified centering tolerance δ_c is satisfied (δ_c is an input parameter required as part of the starphotometry measuring sequence). Such a tolerance has to be small enough to ensure that, during the subsequent measurement, the star still remains in the accepted centering range, despite any drift due to its natural jitter (spot wandering due to the air turbulence). On the other hand, a faster centering procedure can be achieved using a larger tolerance. There is therefore a trade-off to be made between those two requirements. This is investigated in Appendix C, by taking the constraints posed by the FOV into account. We show, for a perfectly aligned star, that the maximum seeing that the FOV can accommodate is 16.7″ for our C11 Arctic telescope. This is borderline at m = 5 in Figure 2 (long exposure case). Obviously, this somewhat too small FOV is a design shortcoming that can be fixed, for example, with a larger limiting diaphragm.

In order to assess the accuracy, we employ the calculations of Appendix C to transform the Arctic star spot sizes of Figure 2 into the corresponding observation errors of Figure 18. The black curve represents a systematic throughput degradation due only to the flux loss at the edges of the star spot. Such degradation characterizes the case of a perfectly centered (δ_c = 0″) short exposure star spot (ω_s).
The colored curves account for the attendant error due to different centering tolerance choices (δ_c = 2″, 4″ & 6″). We compute them by quasi-quadratically summing (with a 5/3 Kolmogorov turbulence exponent) the natural jitter contribution and the position uncertainty inside the tolerance zone (i.e. (σ_θ^{5/3} + δ_c²)^{1/2}, with σ_θ from Appendix C). One has to keep in mind, however, that those calculations are based on the blue linear fit of data points used in Figure 2. The possible variations about that line can be estimated inasmuch as the short exposures indicate a standard deviation of 5-10%. This is about the 5% difference between the long and short exposure spot sizes in the Kolmogorov turbulence case, as computed in Appendix C (the approximation ω_s ≈ 0.95·ω following equation (C2)). However, for the purposes of our error modelling, we retained an empirical 8% standard deviation case. The ω_s values (of a Gaussian distribution in ω_s) may accordingly be greater than ω values 33% of the time (33% of the Gaussian distribution that extends across the red line of Figure 2 at one standard deviation from its blue-line mean). This 1.08ω case is represented by the dashed red-colored δ_c = 4″ curve of Figure 18. The difference with respect to the plain red curve then accounts for the seeing variation. Since it already exceeds our accuracy limit of 0.01 at x = 4.4 (or m ≈ 5), it represents the maximum acceptable δ_c for the constraints of our SPST09/C11 system.

Non-linearity
Non-linearity of detector response to incoming light flux is another source of systematic error. The onset of significant non-linearity conditions occurs at ∼ 8000 cnt/s (i.e. V = −0.47 with the C11 telescope, a level normally not reached by any star other than Sirius). If the sky brightness due to atmospheric scattering of sunlight is strong (at dawn or dusk at mid-latitudes, or for longer periods during seasonal shifts of the late and early winter in the Arctic), this limit will be exceeded. The culmination of the non-linearity is that, using our standard 6 s integration, the detector progressively approaches its saturation point at 2^16 counts, or 65535/6 = 10922.5 cnt/s (i.e. V = −0.8 for C11). The consequence, as illustrated in Figure 19, is an apparent decrease of star brightness (an artificial reduction of the difference between the star and the sky measurements) with a corresponding increase in the computed value of the optical depth. (In Figure 19, the star brightness has been corrected for sky brightness, the latter having been subtracted out from the former; the separate background measurement (blue line) is not affected by the non-linearity of the detector since B < 8000 cnt/s, but the sum F + B used to compute S is, leading to M = S + C, equation (13).) The onset of non-linearity in the case of Vega (whose signal is ∼ 5000 cnt/s at transit in Eureka) begins at a background (B) value of ∼ 3000 cnt/s (at a total signal of ∼ 8000 cnt/s, as indicated above).
One should never employ an instrument such as the C11 to make Sirius (V = −1.46) attenuation measurements unless the OD > 0.5. Data whose signal exceeds 8000 cnt/s should be discarded unless a subtraction process that accounts for the onset of non-linearity is applied.
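The count-rate thresholds and the quoted C11 magnitudes are mutually consistent, as a quick check of the magnitude scale shows (the 6 s integration and 16-bit depth are the values from the text):

```python
import math

full_well = 2 ** 16 - 1          # 16-bit ADC ceiling (counts)
t_int = 6.0                      # standard integration time (s)
sat_rate = full_well / t_int     # saturation count rate
nonlin_rate = 8000.0             # onset of significant non-linearity (cnt/s)

# Magnitude difference between the two thresholds: dm = -2.5 log10(r2 / r1)
dm = -2.5 * math.log10(sat_rate / nonlin_rate)
print(sat_rate)                  # 10922.5 cnt/s
print(-0.47 + dm)                # ~ -0.8, matching the quoted saturation magnitude
```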
The sky background is strongly influenced by O_3 absorption. This is likely due to the multiple scattering influence of the effective increase in the light path length (from the sub-horizon Sun to the telescope line of sight). This is underscored in Figure 20 where we compare, in a relative fashion, the starphotometer sky background measurements with sky irradiance (daylight) computations at the bottom of a standard atmosphere for a solar zenith angle of 48.12° (the "standard indirect solar reference spectrum"; ASTM-G173-03, 2012). The presumed multiple scattering impact of ozone is almost negligible in the latter case, when compared with the starphotometer measurements for the sub-horizon Sun case. One should also note that other absorption bands, like O_2 at ∼ 760 nm or H_2O at ∼ 940 nm (for example), remain comparable. This means that the non-linearity, as well as saturation, happens first in the blue, leading to a distortion of the retrieved aerosol optical depth spectrum.

Delayed background
Unlike the majority of instrumentally related calibration-degradation influences discussed in this section, the particular problem of delays in background measurements (and the background contamination problem discussed in the next subsection) is of a combined instrumental and observational nature. It concerns bright background conditions, mainly twilight, when the only feasible observation mode is OSM. If the background subtraction is effected using a background measurement which is delayed in time (∼ 30 s) relative to the star measurement (as is the case for our instruments), then S will sustain a systematic error that becomes progressively worse as the sky brightness increases. Figure 21 illustrates the sky background increase for seven standard channels, as acquired at Eureka in support of a morning series of OSM measurements. When those channels (notably the longer wavelength channels of Figure 21) approach the saturation point, near 09:00, the relative rate of increase is δB/B ≈ 0.01 over the 30 s delay (as computed from the local slope of the curves just before the onset of significant saturation).

This leads to an observation error xδ_τ ≈ δB/F = 0.01 · B/F. Accordingly, the sky brightness should never exceed the star's brightness if the OD error is to be less than 0.01. The minimum OD error due to a 30 s delay in such anomalously bright (dawn or dusk) conditions is 0.01 · 5000/3000 = 0.017 for Vega (for other stars it is larger, since their F is smaller).
One can nevertheless mitigate this error by extrapolating from outside the saturation regime and correcting for it in post-processing. This procedure is, however, less than ideal inasmuch as the duration spent on a given star measurement is not known precisely, due to the unknown duration of the star recentering process between exposures. In any case, one generally expects the residual δ_B to be 10-20% of the initial value. This yields OD errors < 0.01 for the entire linear range, even when observing V = 3 stars.

Background contamination
Background contamination can also be considered as both an observation issue and an instrumental issue (i.e. affecting the calibration parameter). This kind of error is a design shortcoming affecting our older instrument versions (SPST05 and 06, which both employ the Losmandy mount). The error has been corrected since it was first noted, but its existence is worth mentioning because the source of the problem was not obvious. As indicated above, background subtraction has to be performed subsequent to the star measurement. Based on the Appendix C calculations and on the Eureka star spot sizes shown in Figure 2, the background measurement should be made at a star separation larger than 35″ with the C11 telescope and 45″ with the M703 (i.e. 1.1·ω at m = 5, plus half of the FOV) to make sure the FOV encases less than 1% of the star flux on its border.
A separation of 60″ = 1′ would then be a safe enough margin. A separation of 8′ was, in actual fact, a feature of the original design (i.e. similar to the FOV of the high resolution camera). However, an oversight in the implementation of that design meant that, for some areas in the sky, the telescope mount fails to achieve the requested move. This can result in erroneous S values induced by the star spot signal contaminating the background measurement. Figure 22 shows one particularly extreme event that occurred during the Halifax campaign (see Section A2 for details of the campaign). Fortunately, we could correct this type of error in post-processing by interpolating between the neighbouring low level, spike-free points on either side of the spikes seen in Figure 22.
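The post-processing fix can be sketched as a simple despiking pass: flag background samples that tower over their neighbours and replace them by interpolation. The threshold criterion below is illustrative, not the exact screening used operationally:

```python
def despike(values, threshold=5.0):
    """Replace isolated spikes by linear interpolation between neighbours.
    A point is flagged when it exceeds `threshold` times both neighbours
    (an illustrative criterion for star-contaminated background samples)."""
    out = list(values)
    for i in range(1, len(out) - 1):
        left, right = out[i - 1], out[i + 1]
        if out[i] > threshold * max(abs(left), abs(right)):
            out[i] = 0.5 * (left + right)   # interpolate across the spike
    return out

# One star-contaminated point in an otherwise flat background series:
signal = [10.2, 10.1, 95.0, 10.3, 10.2, 10.4]
print(despike(signal))
```

The contaminated point is replaced by the mean of its spike-free neighbours while all other samples are left untouched.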

Internal temperature variation
The dark current of our detector (S7031-1006 Hamamatsu CCD) varies exponentially with temperature, according to the manufacturer's specs. Our instruments incorporate two-stage temperature stabilisation controllers, in order to increase the ambient temperature operation range and, accordingly, minimize any temperature sensitive OD retrieval errors. The first stage stabilizes the instrument enclosure to 30 ± 0.5°C. The instrument's cold-environment design features include internal heaters to help reach and maintain the temperature set point. It does not, however, incorporate coolers to compensate for warmer temperatures: the influence of warmer temperatures may become significant as a consequence of the heat generated by the enclosed electronics. The second stage is a thermoelectric cooler (TEC) that stabilizes the detector itself. The TEC can cool down 30° to 45°C below its environment (the instrument enclosure). However, from our measurements, this range is rather found to be 38.5° to 51.5°C.

In warm environments, one can maintain the control up to an enclosure temperature of 41.5°C (Figure 23). Above that, the dark current (the main component of B in dark sky conditions) increases exponentially with the temperature (slightly more pronounced in the near-infrared). In Figure 23 the exponential fit looks linear because of the short vertical range. In cold environments, the instrument enclosure can be subject to temperatures below the lower limit (28.5°C) of its nominal control range. This may happen, for example, during the instrument warm-up phase (Figure 24), or when the outside temperature drops below −45°C and the internal instrument heaters struggle to maintain the +30°C set point. The resulting dark current variation (δ_B) is illustrated in Figure 24a. Because the dark current decreases exponentially with decreasing temperature, its variation is much weaker than that induced by temperatures above the upper limit of the control range. This nonetheless results in a significant variation of the retrieved OD.
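The exponential temperature dependence can be modelled, for illustration, with a doubling-temperature law. All three parameters below are illustrative placeholders: the ∼6°C doubling interval is a typical silicon-CCD rule of thumb, not the S7031-1006 specification:

```python
def dark_current(temp_c, ref_rate=100.0, ref_temp_c=-20.0, doubling_c=6.0):
    """Dark-current rate (cnt/s) as an exponential in detector temperature.
    ref_rate, ref_temp_c and doubling_c are illustrative placeholder values."""
    return ref_rate * 2.0 ** ((temp_c - ref_temp_c) / doubling_c)

# A TEC holding the detector 40 C below a 30 C enclosure (i.e. -10 C detector)
# versus an uncontrolled excursion to a 42 C enclosure (i.e. +2 C detector):
b_controlled = dark_current(-10.0)
b_warm = dark_current(2.0)
print(b_warm / b_controlled)   # 2**(12/6) = 4x more dark current
```

Even a modest loss of control thus multiplies the dark-current component of B, which is why the background must be re-measured whenever the detector temperature drifts.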

Throughput degradation
Even in the type of clean environment typically found at a mountain-top astronomical site, one can notice an optical throughput degradation due to dust deposition on telescope mirrors (Burki et al., 1995). Our starphotometers employ off-the-shelf (amateur) telescopes, with an optical corrector plate sealing off the optical train and being the main contact surface for any particle deposition. The formation of dew, frost or the deposition of clear-sky snow crystals on that plate represents our greatest source of throughput degradation. Of particular concern is that humidity trapped inside the sealed telescope tube leads to dew or frost formation on the inside of the corrector plate (a degradation which cannot be easily removed by mechanical means). A dramatic event of frost formation that occurred during the Barrow campaign (in the absence of a dome or dew cap to protect the telescope) is illustrated in Figure 25. The auto-calibrating TSM (c.f. Section 3.4.2) used to derive the green OD points shows little variation. This indicates that there was likely no aerosol and/or significant cloud OD variation during that period.
However, the computed OD associated with the individual high and low stars varies strongly and is, based on our photographic evidence, attributable to frost formation on the plate. One should note that the ramping effects in the OD plot result from progressive frost formation and growth, after two separate damp-cloth cleanings. This operation did not apparently remove all the frost (the OD values at the beginnings of the two ramps are higher than those acquired prior to the 09:00 time stamp). One should also note that the low star measurements (red data points) are less affected by the frost: this is because the throughput error, as represented by the OD variation from the baseline, is, as per equation (30), divided by a larger airmass.
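The airmass division noted above can be illustrated with a minimal sketch (the 20% frost loss is a hypothetical value; OD here uses the natural-log convention F = F0·exp(−xτ)):

```python
import math

def pseudo_od_error(loss, airmass):
    """Apparent OD increase caused by a fractional throughput loss `loss`
    (frost, dew) at a given airmass; the magnitude offset it induces is
    divided by the airmass, as in equation (30) of the text."""
    return -math.log(1.0 - loss) / airmass

loss = 0.20  # hypothetical 20% flux loss to frost (illustrative value only)
print(round(pseudo_od_error(loss, 1.5), 3))  # high star (x = 1.5) -> 0.149
print(round(pseudo_od_error(loss, 4.9), 3))  # low star (x = 4.9) -> 0.046
```

The same physical throughput loss thus produces a pseudo-OD error roughly three times smaller on the low (high-airmass) star, matching the behaviour of the red data points.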
While at mid-latitudes one usually uses a dew cap to avoid fogging the optics, in the Arctic it cannot be used since it becomes a container for accumulating snowflakes and renders their mechanical removal difficult. One can usually sublimate the snow and frost from the external side of the corrector plate by closing the dome and increasing the dome temperature by a few degrees. However, this does not represent the real-time solution needed to prevent throughput degradation, nor does it remove any internal telescope frost. Other experiments with limited success were described in Ivȃnescu et al. (2014).
Although the problem initially appeared intractable, a working solution was nevertheless identified that addresses both the frost and the incoming crystals: a Kendrick Astro system using a controlled heating band wrapped around the telescope tube. It increases the temperature of the optics, particularly the corrector plate, by up to 10°C with respect to the environment. One expected this to increase the blurring of the star spots due to micro-turbulence near the telescope, but such an effect turned out to be negligible for our instruments.

Toward 1% accuracy
A relative photometric error of 1% in δF/F represents, in turn, a magnitude error of δS ≈ 0.01 and an observational error of δS = xδτ = 0.01. We seek to achieve the δτ < 0.01 accuracy goal discussed in the introduction by mitigating the non-negligible systematic uncertainties identified in this paper.
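The flux-to-magnitude conversion behind these numbers is a one-liner (the standard Pogson relation, not code from the paper):

```python
import math

# S = -2.5*log10(F) + const, so a small relative flux error dF/F maps into a
# magnitude error dS = (2.5 / ln 10) * dF/F ~= 1.0857 * dF/F.
k = 2.5 / math.log(10.0)
dS = k * 0.01          # a 1% photometric error
print(round(k, 4))     # -> 1.0857
print(round(dS, 4))    # -> 0.0109, i.e. the delta_S ~ 0.01 of the text
```

A 1% relative flux error therefore corresponds to a magnitude error of about 0.011, which the text rounds to 0.01.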

Optimum channel selection
Some of the largest errors in starphotometry are, as explained in Section 4, due to contamination by stellar and Earth-atmosphere (also called telluric) absorption lines, photometer spectral drift, bandwidth mismatch between the instrument and the catalog references, as well as airglow and aurora contamination, when present. These errors can be mitigated through a judicious channel wavelength selection. Avoiding high-frequency spectral influences is also a reason for having narrow (< 10 nm) channels. Remote sensing photometry is historically based on sunphotometry and much influenced by AERONET, whose channels were effectively selected for a G-class star (the Sun). Such stars have much weaker hydrogen absorption lines than the typical B and A stars of our catalog. Therefore, our channel selection needs to consider the starphotometry reality, with its specific constraints: mainly to avoid hydrogen (H) lines while ensuring a star brightness (particularly challenging in the near-infrared) much larger than the sky background. Also, selecting more channels than sunphotometers employ may help to compensate for the typically larger starphotometer observation errors. Selecting more channels is facilitated by the fact that the number of channels employed by our (spectrometer-based) starphotometers is not constrained by the time cost of an AERONET-type rotating filter-wheel system. In what follows we attempt to create an OD spectrum with the goal of identifying an optimal starphotometry band set in typical conditions. The method is constrained by an eventual fit to measured OD spectra.
The first step in our band selection process was to identify the spectral intervals free of stellar and aurora/airglow line contamination. To this end, we used the extraterrestrial (HST-measured) Vega spectrum (also shown in Section 4), at 8.2 nm bandwidth. Since Vega is a spectral class A0 star, its spectrum, strongly influenced by Balmer and Paschen H lines, is among the most affected by stellar absorption (Silva and Cornell, 1992). Inasmuch as the Vega spectrum can be considered the worst-case scenario, the systematic errors due to characteristic stellar absorption bands should be weaker for other stars. In order to obtain only the stellar absorption spectrum, we subtracted the continuum obtained by fitting the magnitude spectrum to off-line data points. The result shown in Figure 26a was divided by 1.6, to simulate the airmass of an actual star. An IBC2 aurora OD error spectrum with respect to a V = 2 star, together with an airglow contribution 10 times larger than that of Figure 15b, was employed to produce the gold "airglow & aurora" curve. The bottom red bars in Figure 26a delimit the contaminated intervals. A new set of 20 channels was identified as a potential replacement for the old set of 17 channels (dashed grey vertical lines) currently employed in our starphotometers (it also represents approximately 3 times the number of channels employed in the AERONET instruments).
The dotted green curve of Figure 26b shows the same dotted green cumulative spectrum of Figure 26a, to which aerosol scattering has been added. The aerosol scattering OD was assumed to vary as per the classical Ångström expression bλ^(−a), while b was incrementally perturbed until it matched an actual OD Vega spectrum (blue curve) measured at Eureka (a typical value of 1.3 was assumed for a). For reference, the same spectrum but without the stellar components is represented in purple.
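The b-matching step can be sketched as a one-parameter least-squares fit (the wavelengths and the b value below are illustrative; this is not the paper's actual retrieval code):

```python
def fit_b(wavelengths_um, tau_residual, a=1.3):
    """Least-squares estimate of b in tau = b * lambda**(-a) (Angstrom law),
    with the Angstrom exponent a held fixed, as in the text."""
    num = sum(t * w ** (-a) for w, t in zip(wavelengths_um, tau_residual))
    den = sum(w ** (-2 * a) for w in wavelengths_um)
    return num / den

# Synthetic check: generate a spectrum with a known b, then recover it.
wl = [0.402, 0.432, 0.500, 0.675, 0.870, 1.020]   # channel wavelengths in um
b_true = 0.05
tau = [b_true * w ** (-1.3) for w in wl]
print(round(fit_b(wl, tau), 4))  # -> 0.05
```

In practice the residual spectrum (measured OD minus the synthetic contaminant curve) would replace the synthetic `tau` here.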
The positions of the 20 new channels are duplicated on Figure 26a in order to better appreciate the total OD context at those positions.

The selection procedure identifies as many channels as possible, constrained by the avoidance of any absorption line contamination. The ultimate goal is the characterization of the low frequency (slowly varying) aerosol and cloud scattering spectrum.
Since there are large spectral intervals where that is not possible (mainly across the O3, O2 and H2O absorption bands), one also needs to include channels that independently facilitate the extraction of O3 and H2O column abundances (at least two channels per band, as they are noisier due to the strong absorption). The newly identified central channel wavelengths, as well as their application and the reason for their selection, are summarised in Table 2.

Table 2 notes:
a. Spectral region that is more sensitive to the characterization of fine-mode (FM) aerosol properties such as FM aerosol OD. The total aerosol OD (FM OD + coarse-mode OD) will be sensitive to the presence of FM aerosols.
b. O3 absorption is sufficiently strong to provide a retrieval of O3 columnar abundance, and thus O3 OD, from a spectrally dependent matching type of total OD retrieval, and accordingly to correct (eliminate) the O3 OD from the total OD for all O3-affected channels.
The justifications for the 20 selected channels (sequentially ordered as per Table 2) are given below:
1. Avoidance or minimization of H contamination. It better constrains the UV/blue trend of the fine-mode aerosol spectrum.
2. Avoidance or minimization of H contamination. This is the optimum λ identified in Section 4.

Figure 26. OD spectra of constituents that contaminate the retrieval of aerosol and cloud ODs in the visible and near-infrared. a) Starphotometer channel selection (vertical black lines) obtained by avoiding contaminants, such as the stellar (Vega) absorption OD spectrum (black curve), OD errors associated with airglow and IBC2 aurora (gold curve), as well as O2 (at least those parts of its red curve whose OD ≥ 0.01). The red bars delimit intervals where those contaminants are non-negligible (τ > 0.007). The channel selection also includes strategically selected regions of H2O and O3 absorption that allow for the dynamic identification and characterization of their OD and, subsequently, their removal from the total OD spectrum. The cumulative contaminant optical depth yields the total synthetic curve (green dotted curve). b) The synthetic curve (with an added aerosol scattering component) versus an OD spectrum (blue curve) retrieved from Vega measurements over Eureka (measured on 2019/11/03 14:01:02). The optimal fit shows generally good agreement except where the contaminant influence is misestimated. This is particularly true for O4 absorption which we realized, a posteriori, should have been included in the ensemble of contaminants. The numbers of the selected channels are superimposed for reference purposes.
7. Good channel for sampling the ozone profile shape while avoiding the 557.7 nm aurora and airglow peak.
8. We note that the difference between the measured and synthetic curves around 590 and 640 nm underscores what appears to be a shortcoming in our synthetic curve: the presence of significant O4 absorption features (see Wagner et al. (2002) for information on O4 absorption). This is a strong O3 OD channel that lies between double ozone peaks. It is one of three bands sensitive to O3 abundance (and thus O3 retrieval). It avoids the side-bands of O4 contamination and any possible twilight contamination by 589 nm Na flashes (Chamberlain, 1995).

20. Channel at the largest near-infrared wavelength that still provides accurate measurements, while avoiding H lines. Even if this channel is relatively sensitive to airglow emissions, it can be considered reasonably reliable for V = 0-1 B- and A-class stars, or bright (and colder) F-class stars, such as Procyon, whose near-infrared flux is relatively strong.

The major changes and improvements with respect to the original channel set are: a new 402 nm channel to better estimate the UV attenuation due to fine-mode aerosols; a 432 nm channel optimised for minimal contamination; an over-sampled ozone absorption profile to allow better O3 removal in post-processing; the exclusion of the 953 nm H2O channel (visible as the right-side peak, on top of the H2O band, in Figure 26a), inasmuch as it is likely influenced by an H Paschen line; a better-estimated H2O baseline with more strategically selected baseline channels (closer to the limits of significant absorption); and the exclusion of the original (persistently noisy) 1040 nm channel (the high-frequency variations seen in the retrieved ODs above approximately 1030 nm in Figure 26b are a symptom of the noisy nature of signals in that region of the near-infrared). In order to avoid near-IR airglow one needs only acquire data at wavelengths above 1050 nm: this however results in weaker star flux and the above-mentioned weak signal-to-noise. Finally, we note that our channel selection process is optimized for the peculiarities of our starphotometer. Signal-to-noise considerations aside, the spectral bandwidth is one of those peculiarities: different bandwidths may require slightly different channels.

Starphotometry recommendations
We recommend the general usage of the 20 starphotometry channels defined in the previous section. Those channels are dedicated to the extraction of aerosol and/or cloud ODs as well as the strong molecular absorbers of O 3 and H 2 O (either as corrections to achieve estimates of aerosol and cloud ODs or, as remote sensing targets on their own merit).

An important source of OD error is related to the accuracy of the spectrophotometric catalog. In the case of the Pulkovo catalog, we identified a particularly large bias in the UV and 900-1000 nm regions (c.f. the text associated with Figure 4) that could distort the retrieved aerosol spectrum. That bias aside, the errors in the individual star spectra are also prohibitive in terms of achieving the required accuracy. It is strongly recommended that a new and improved bright star catalog be made, preferably with magnitude measurements acquired by a space-based instrument, to avoid the uncertainty related to telluric absorption contributions. As discussed in Section 4, the requirements for such a catalog are 1 nm bandwidth and < 1 nm (preferably 1 Å) spectral resolution, with less than 0.01 differential magnitude variation across the measured spectrum. In the meantime, we continue to use the Pulkovo catalog, but in its 8.2 nm bandwidth version, which improves the bandwidth match with our instruments and offers a wider bright-star diversity than what is currently provided by HST. Alternatively, if the starphotometer is spectrometer-based, as ours is, then one can generate such a catalog from direct high-resolution observations (all spectrometer channels) at a high-altitude site. Such a catalog would perfectly match the instrument bandwidth.
We recommend that future starphotometer bandwidths be held to less than 10 nm: this is an easily attainable standard that ensures negligible heterochromatic errors (δτ < 0.001). The employment of all the spectrometer channels ensures that any high-resolution stellar features can be properly accounted for in future corrections of any spectral drift (before extracting a drift-modified set of optimized operational starphotometer channels). Observations and calibration should preferentially be performed with a B0-B3 (early-type B) star, or A7-A9 (late-type A) and F stars, to avoid the uncertainties related to the strong stellar absorption lines. Also, annual spectral calibration is advisable in the face of the drift results of Figure 8. Alternatively, measurements of a high-resolution spectrum of a particularly bright star of near-A0 type (notably Vega) could be carried out every few months. Its deep Hα,β,γ Balmer lines will serve, together with the Earth's O2 and H2O lines, as a reference for spectral calibration in post-processing. One particular concern at mid-latitude locations is that NO2 may be several times larger (Cede et al., 2006) than at Eureka (i.e. up to 0.03 OD at 400 nm) and its absorption will no longer be negligible. Since NO2 absorption is impossible to discriminate from the aerosol spectrum, it has to be assessed from independent sources.
Retrievals in the presence of rapid temporal variations of sky brightness (a measurement which must accompany every star measurement) must be corrected by interpolating from pre- and post-contamination sky brightness measurements to the time of the star measurement. Signals greater than the threshold for the onset of non-linearity (8000 cnt/s in the case of our starphotometer) should be discarded (Section 7.2). In such bright-sky conditions, one may expect OD errors > 0.017 unless the sky brightness is interpolated to the time stamps of the star measurements. One should be aware that, as the sky brightness increases, the blue part of the spectrum saturates first, leading to a distorted aerosol spectrum retrieval.
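A minimal sketch of this background handling (the sample times and counts are hypothetical; the 8000 cnt/s threshold is the one quoted above):

```python
NONLINEAR_CNT = 8000.0  # onset of non-linearity for our starphotometer

def sky_at(t_star, bg_times, bg_counts):
    """Linearly interpolate valid (linear-regime) sky-background samples to
    the star-measurement time t_star; returns None if it cannot be bracketed."""
    valid = [(t, c) for t, c in zip(bg_times, bg_counts) if c <= NONLINEAR_CNT]
    before = [(t, c) for t, c in valid if t <= t_star]
    after = [(t, c) for t, c in valid if t >= t_star]
    if not before or not after:
        return None
    (t0, c0), (t1, c1) = before[-1], after[0]
    if t1 == t0:
        return c0
    return c0 + (c1 - c0) * (t_star - t0) / (t1 - t0)

# Pre- and post-star sky samples; the 9000 cnt/s spike is rejected as non-linear.
t_bg = [0.0, 30.0, 60.0]
c_bg = [1000.0, 9000.0, 1400.0]
print(sky_at(45.0, t_bg, c_bg))  # -> 1300.0
```

The same interpolation, applied per channel, also guards against the blue-channel saturation mentioned above, since saturated samples are simply excluded.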
Airmass accuracy should be ensured by the use of a GPS time server. OD errors associated with airmass uncertainties can also be reduced at a high-altitude site, while they remain sensitive to time errors on low stars, i.e. at large x (c.f. Figure 12). The internal instrument temperature should be monitored, inasmuch as the temperature controller may eventually fail (for example, at the very low environmental temperatures found in the Arctic). One particularly needs to wait for the system to warm up to its stabilised range, as the low temperatures have a larger error impact (c.f. Section 7.5).
The stability of the throughput has a fundamental impact on the calibration process. Due to the excessively small FOV of the SPST09/C11 configuration (36.9″), the optical alignment proved to be critical to ensuring stable throughput. As demonstrated in the discussion surrounding Figure 18, the centering tolerance error should not exceed 4″ (2 CCD bins) for this instrument in Eureka. The focusing error of SPST09/C11 should always be within one adjustment step (confusion-circle variation of ∼10″/step, as per the legend of Figure 17). This means, for example, that the focus must be adjusted by one step for each 10°C change in outside temperature.
Turbulence analysis using star spot imaging revealed another large source of throughput degradation, which is exacerbated at Arctic sea-level sites: possible vignetting of star spots at large airmasses. This problem was ascribed to the small SPST09/C11 FOV in the context of the excessively large seeing of Eureka. The worst-case (m = 10) scenario of the red curve in Figure 2 (for which the star spot is ω ≈ 19″) can be accommodated by a FOV of 2.3ω ≈ 45″ (see Appendix C for details). In the light of the forward scattering error analysis, one should not increase it beyond ∼47″ (roughly where the mean of the most demanding cases of Figure 14, the "120 µm (ASC) ice cloud" case, crosses the 0.01 value of the δτ/τ axis). This FOV limitation also ensures accurate measurements (sub-1% errors associated with the brightness contamination cases of Figures 15 and 16) during faint aurora (IBC2) events (or their illumination equivalent of thin moonlit cirrus clouds) for even weak (V = 3) stars (with the near-infrared exception of Figure 16, where a bright V = 0 star such as Vega is needed to achieve the 1% threshold).
A 45″ FOV, which would be obtained with a 0.61 mm diaphragm in the C11 case, therefore appears to be a good compromise between the conflicting requirements of maximizing the FOV to accommodate all star spot sizes (red curve of Figure 2) and limiting the FOV to minimize the largest forward scattering errors (blue curves of Figure 14). The small FOVs employed in starphotometry ensure that this technique is significantly less dependent on the intrinsic and artificial OD reduction induced by scattering into the FOV by optically thin clouds. This singular capability renders starphotometry rather unique in extinction-based photometry, inasmuch as sun- and moon-based techniques require (or at least traditionally use) much larger FOVs and accordingly suffer from much larger FOV scattering contamination. We demonstrated (Section 7.1) that observations at airmasses higher than ∼5 should not be made with the C11 because of the influence of vignetting. Calibration may nevertheless be performed beyond this airmass limit, as long as the S values still show a linear dependence on x. This may happen in weaker air-turbulence conditions than those of Figure 2. Throughput degradation due to frost/dew or ice-crystal deposition on the telescope was a longstanding problem of our Eureka starphotometer (with critical accuracy implications). The use of the Kendrick system (or similar heating bands), together with a small wind shield, proved to be a reliable solution which would be appropriate for most Arctic observing sites. If all these recommendations are followed, one may aspire to achieve a reduction of each zenith OD error component to well below 0.01 and the total zenith OD error to 0.01 (i.e. the stated 1% photometric accuracy). Even if these goals are, in certain cases, still works in progress, any progress that substantially approaches the goal of 0.01 total zenith OD error would represent a significant advance in starphotometry reliability.
With the ultimate goal of improving starphotometry accuracy, we analysed a large variety of sources leading to systematic (absolute) errors and classified them by their impact on each parameter involved in the optical depth retrieval. Contamination from stellar and telluric gas absorption lines may potentially induce large OD errors. One of the newly identified contaminants is O4 absorption lines, which affect O3 estimation and removal, leading to a distorted aerosol OD spectrum. Such errors are nevertheless mitigated with proper channel allocation: this we demonstrated using synthetic and measured OD spectra to extract a set of 20 optimal channels. In order to further minimize the OD errors induced by absorption lines (stronger hydrogen lines tend to spill over into different bands), one may favour starphotometry observations using early-type B, late-type A and F spectral class stars, which have weaker hydrogen absorption lines. We may therefore particularly prefer them for calibration purposes.
Inaccuracies in the current exoatmospheric photometric catalog can be partly addressed in the TSM observation mode (where the catalog bias is cancelled out) or by circumventing the catalog with lengthy calibrations involving each star that one wishes to employ as an extinction target (Langley calibrations at a high-altitude site, for example). Given such restrictive options, the community is strongly encouraged to prioritize the development of a new spectrophotometric catalog with improved accuracy, supported by magnitude variability characterization. This will increase confidence in the accuracy of a star-independent calibration, and render that approach more operational and reliable.

Problems related to instrument instability (including spectral drift and star spot vignetting) were identified, and appropriate observation strategies and design improvements were proposed. Beyond the current accuracy assessment study, we will pursue starphotometry reliability improvement by also characterising the non-systematic, random errors. As an original by-product of this study, we developed a semi-empirical expression for estimating the seeing (star-spot blurring) profile from radiosonde measurements.

Code and data availability. The Matlab code and data employed in the generation of the figures are freely available (Ivȃnescu, 2021).
network has been reduced to the Eureka and Sherbrooke sites. Additionally, campaign-based observations took place in Halifax (NS), Barrow (Alaska, USA) and Izaña (Canary Islands, Spain).

A1 The instruments
Our starphotometers were built by Dr. Schulz & Partner GmbH, a German company that has now ceased operations. A total of 9 instruments, with serial numbers SPST01 to SPST09, were produced. The first three were initial-development versions (now decommissioned) and the remaining six are still in operation. Three of these are German owned: SPST04 is still at the manufacturer, while SPST07 and SPST08 are being operated, respectively, at the Lindenberg observatory in Germany and by … The focal ratio of a telescope is f# = f/D, where f is the focal length and D the diameter.
The telescope "plate scale" P_s on the focal plane can be computed (Carroll and Ostlie, 2007) as P_s = k_c / f, where P_s has units of [″/mm], k_c = 3600 · 360/(2π) = 206264.8″/rad is a radian-to-arcsecond conversion factor, and f and D are in [mm]. The version-to-version improvements concern mainly the robustness of the instrument. However, the throughput of the SPST09 instrument is a factor of 3.2 better than that of the previous version due to the use of an 11-inch-diameter telescope. 17 (multi-pixel) bands were selected by the manufacturer as a standard for regular operation (see Table 1). Near-star night-sky radiance, for background subtraction from the stellar signal, is measured by pointing the photometer about 8′ (arc-minutes) off-target. The star-acquisition procedure is based on star centering by two auxiliary SBIG ST-402ME-C2 CCD cameras. A square 504 × 504 pixel (px) sub-frame of the available 510 × 765 px CCD frame is employed. For speed and sensitivity, the acquisition mode uses 3 × 3 bins of 9 µm square pixels (i.e. 27 × 27 µm bins). The initial wide-field centering uses a 67 mm diameter refractive auxiliary telescope with a fast f# = 4 focal ratio. Its P_s = 12.65′/mm (arc-minutes per mm) plate scale provides a 57.4′ field of view (FOV) on its camera, with 20.5″/bin (arc-seconds per 3-pixel bin). The subsequent centering is done at high angular resolution, using the main telescope. The P_s = 73.7″/mm plate scale of the C11 telescope provides a 5.6′ FOV, at 2″/bin. The P_s = 114.6″/mm plate scale of the M703 telescope provides an 8.3′ FOV, at 3″/bin. Based on the Nyquist-Shannon sampling theorem (Shannon, 1948), one can track star spots at the maximum precision of 1 bin if one has at least one bin per standard deviation of the star spot (Robertson, 2017). As per Table B1, there are 24 such V < 3 stars in the Pulkovo catalog.
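The quoted plate scales can be checked directly from P_s = k_c/f (the 2800 mm C11 focal length used here is the standard Celestron f/10 value, assumed rather than taken from the paper):

```python
import math

K_C = 3600 * 360 / (2 * math.pi)   # 206264.8 arcsec per radian

def plate_scale(f_mm):
    """Plate scale in arcsec/mm for a focal length f in mm (P_s = k_c / f)."""
    return K_C / f_mm

ps_c11 = plate_scale(2800.0)       # assumed C11 focal length (f/10, 280 mm aperture)
print(round(ps_c11, 1))            # -> 73.7 arcsec/mm, as quoted in the text
print(round(ps_c11 * 0.027, 1))    # 27 um (3x3-binned) pixels -> 2.0 arcsec/bin
print(round(ps_c11 * 0.61, 1))     # 0.61 mm diaphragm -> 44.9 arcsec (~45" FOV)
```

The last line also reproduces the 45″ FOV/0.61 mm diaphragm pairing discussed in the recommendations section.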
Given the uncertainty in star variability, as evidenced by discrepancies between the ∆V and ∆Hp columns of Table B1, the selected star pairs have similar RA values (meaning that they are fairly close in azimuth) and, together, cover the entire 24 h period (while ensuring airmasses < 6 for low stars). We note that the spectral subclasses differ substantially for all three pairs (i.e. the 0-9 class suffix); however, if we loosen the RA criterion, then the alternate (HR 5191, HR 1790) pair may be of sufficiently similar subclass.

Appendix C: FOV constraints
The long-exposure PSF is characterized by a FWHM ω and a standard deviation σ = ω/2.355. A large part of the PSF is due to random star-spot movements, called jitter (θ). The fact that the short-exposure movement is tracked dynamically by the starphotometer means that the low-frequency jitter (θ_L) is largely reduced and thus will not contribute to the star spot fed into the photometer. We estimate σ²_θL by integrating the jitter power spectrum from 0 up to the tracking bandwidth (i.e. half of the low-frequency value given in equation (15) of Glindemann (1997)). When the tracking bandwidth tends to the sampling frequency (1/t), the missing jitter contribution to ω is given by an expression from the same reference (ibid); this equation, and all equations in this appendix, are homogeneous as a function of angle (i.e. the use of a nonstandard angular unit such as arc-seconds scales coherently on both sides of homogeneous equations). The turbulence length parameter was found to be r_0 ≈ 0.01 m for Eureka (Section 2) and v ≈ 10 m/s is the typical effective wind speed. The operational starphotometer exposure value of t = 6 s yields

ω_θL ≈ 0.215 · ω     (C2)

Table B1 (caption, partially recovered). The stars' HR (Harvard Revised) codes (Pickering, 1908), as well as their HD (Henry Draper catalog; Cannon and Pickering, 1918) and HIP (Hipparcos catalog; van Leeuwen et al., 1997) codes, are listed in order to facilitate their identification. The subsequent columns show their affiliated rank (Greek letter) and constellation, their common name, Right Ascension (RA) and Declination (DE) coordinates at epoch 2000, and visual magnitude (V). GCVS and Hipparcos peak-to-peak magnitude variations (∆V and ∆Hp, respectively) are indicators of star stability. The next column shows the spectral class (Sp) of the star (including its 0-9 numerical subclass) and its luminosity class (Lum). The last column is specific to the Arctic; it indicates the TSM role of each Arctic star ("High" or "Low"), as described in Section 3.4.2.
The FWHM of the t = 6 s (starphotometer) short-exposure spot in Kolmogorov turbulence is ω_s = (ω^(5/3) − ω_θL^(5/3))^(3/5) ≈ 0.95 · ω, or more, depending on the performance of the tracking system. This means that the tracking basically applies a negligible correction to ω. Averages of the Figure 2 (ω_s) points for a given value of m indicate a ratio relative to ω that is somewhat smaller than the 0.95 implied above. In effect, the preparation of Figure 2 necessitated short-exposure time reductions from the 6 s operational standard in order to circumvent problems such as signal saturation: that figure is more realistic in terms of providing a cross-section of short-exposure times that might be used by starphotometers in general.
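The ≈0.95 factor is easy to verify numerically from equation (C2):

```python
# omega_s / omega = (1 - (omega_thetaL/omega)**(5/3))**(3/5), with the
# t = 6 s value omega_thetaL = 0.215 * omega from equation (C2); the ratio
# is independent of omega itself.
ratio = (1.0 - 0.215 ** (5.0 / 3.0)) ** (3.0 / 5.0)
print(round(ratio, 2))  # -> 0.95
```

Since the ratio is so close to unity, dynamic tracking removes very little of the long-exposure blur, as stated above.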

Figure C1. Schematic of one possible position and size of the short-exposure star spot (bright red) relative to the SBIG-camera CCD grid (dashed lines), for the case δ_c = σ_θH. The telescope FOV is shown in black: its center (solid black circle) nominally defines the origin of the SBIG camera grid. We define the "centering error" as the distance between the center of the (black) telescope FOV and the center of the (red) short-exposure star spot.
Based on equations (77) from Tyler (1994) and (5) from Racine (1996), the standard deviation of the total jitter σ_θ is

σ_θ = 0.42 · (λ/D)^(1/6) · ω^(5/6)     (C3)

For our instruments and telescopes, (λ/D)^(1/6) ≈ 1. One can show that, for 5″ < ω < 15″ (i.e. the ω range at m < 5 in Figure 2), one can approximate ω^(5/6) ≈ 0.7 · ω and recast equation (C3) as

σ_θ ≈ 0.4 · ω^(5/6) ≈ 0.3 · ω     (C4)

Using equation (C2), one retrieves the high-frequency component of the jitter as σ_θH = (σ²_θ − σ²_θL)^(1/2) ≈ 0.21 · ω. This represents an ω-dependent estimation of the star spot displacement between the starphotometer measurements. A centering tolerance of δ_c = σ_θH, specified in the star-centering process (for an assumed Gaussian probability distribution of the random jitter), ensures that about 2/3 of the subsequent short-exposure measurements will still be centered (see Figure C1 for a schematic of star spot positions and their defining parameters). The (m = 5) long-exposure ω values at Eureka and Sherbrooke of 14.7″ and 8.9″, respectively (Figure 2), imply a (δ_c ≈ 0.21 · ω ≈ 0.2 · ω_s) centering tolerance of 3.1″ and 1.9″, or roughly 2 and 1 pixels, respectively. This is consistent with Baudat's (2017) ω/4 rule-of-thumb suggestion for acceptable tracking error. Figure C2 shows a snapshot of the C11 short-exposure tracking process for a high and a low star (4″ = 2 pixels centering error and 3 choices of centering tolerance). The high and low stars illustrate, notably in the latter (m = 4.9) case, the flux loss beyond the FOV boundaries for even short-exposure star spots. Using Gaussian distribution calculations and the ω_s ≈ 0.95 · ω relationship, one can show that the flux loss will be < 1% if ω < FOV/2.3. This translates to a maximum seeing (ω value) of

Figure C2. SPST09/C11 tracking of a 6 s short-exposure high-star spot (a) and low-star spot (b).
These two illustrations represent high-resolution SBIG camera data that correspond to two Eureka points on Figure 2, but whose airmasses of 1.3 and 4.9 were obtained from the intersection of their ω_s values with the blue regression line. The a) and b) fluxes are normalized to their maximum flux (their log-scale colour legend is shown to the right). The FOV, the 1% and 50% flux levels, and the centering tolerances are represented, respectively, by a magenta circle, the 2 white contours and the 3 black concentric circles (radii of 2″, 4″ and 6″). The spots are horizontally shifted by a 4″ centering error w.r.t. the FOV.
25″ for the M703 telescope. However, the same calculation gives 16″ as the maximum seeing that one can accommodate for a perfectly centred star using the Arctic C11 telescope. This value is problematic since it is close to the spot sizes at airmass m = 5 (Figure 2).
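The Gaussian flux-loss argument can be reproduced approximately with a brute-force integration (our own illustrative sketch, which ignores PSF wings and tracking jitter, so it only roughly matches the FOV/2.3 criterion):

```python
import math

def flux_loss(fov, omega_s, offset=0.0, n=400):
    """Fraction of a Gaussian star spot (FWHM omega_s, centred at the origin)
    falling outside a circular FOV of diameter `fov` whose centre is displaced
    by `offset` (all in arc-seconds). Grid integration; an illustrative
    approximation, not the paper's actual calculation."""
    sigma = omega_s / 2.355
    r = fov / 2.0
    half = r + abs(offset)                  # grid half-width covering the FOV circle
    h = 2.0 * half / n
    inside = 0.0
    for i in range(n):
        x = -half + (i + 0.5) * h
        for j in range(n):
            y = -half + (j + 0.5) * h
            if (x - offset) ** 2 + y ** 2 <= r * r:   # point lies within the FOV
                inside += math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)) * h * h
    return 1.0 - inside / (2.0 * math.pi * sigma * sigma)

# Worst-case spot at the omega = FOV/2.3 limit for the 45" FOV (omega_s = 0.95*omega):
loss_centered = flux_loss(45.0, 0.95 * 45.0 / 2.3)
loss_shifted = flux_loss(45.0, 0.95 * 45.0 / 2.3, offset=4.0)
print(loss_centered < 0.02)          # True: near the ~1% level at the FOV/2.3 limit
print(loss_shifted > loss_centered)  # True: a 4" centering error increases the loss
```

The sketch also makes explicit why the centering tolerance matters: displacing the FOV by the 4″ centering error of Figure C2 measurably increases the vignetted fraction.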