Calibrated digital images of Campbell–Stokes recorder card archives for direct solar irradiance studies

A systematic, semi-automatic method for imaging the cards from the widely used Campbell–Stokes sunshine recorder is described. We show how the application of inexpensive commercial equipment and practices can simply and robustly build an archive of high-quality card images and manipulate them into a form suitable for easy further analysis. Rectified and registered digital images are produced, with the card's midday marker in the middle of the longest side, and with a temporal scaling of 150 pixels per hour. The method improves on previous, mostly manual, analyses by simplifying and automating steps into a process capable of handling thousands of cards in a practical timescale. A prototype method of extraction of data from this archive is then tested by comparison with records from a co-located pyrheliometer at a resolution of the order of minutes. The comparison demonstrates that the Campbell–Stokes recorder archive contains a time series of downwelling solar-irradiance-related data with similar characteristics to that of benchmark pyrheliometer data from the Baseline Surface Radiation Network. A universal transfer function for card burn to direct downwelling short-wave radiation is still some way off and is the subject of ongoing research. Until such time as a universal transfer function is available, specific functions for extracting data in particular circumstances offer a useful way forward. The new image-capture method offers a practical way to exploit the worldwide sets of long-term Campbell–Stokes recorder data to create a time series of solar irradiance and atmospheric aerosol loading metrics reaching back over 100 yr from the present day.


Introduction
Several studies have examined the evidence for links between recent changes in climate and changes in the amount of direct solar irradiance reaching the earth's surface (Stanhill and Cohen, 2001; Stanhill, 2005; Wild, 2009, 2012; Wild et al., 2009). Some authors have associated these changes with changes in the amount of atmospheric aerosols (e.g. Ruckstuhl and Norris, 2009; Gilgen et al., 2009). Widespread reductions and increases in radiation at the earth's surface due to aerosol changes over long periods are termed "dimming" and "brightening", respectively. Typically, the spatial scales of "dimming and brightening" changes range from "regional" to "global" (e.g. 100–10 000 km) and their temporal scales range from decades to a century (10–100 yr); see, for example, Xia (2010), Pinker et al. (2005) and Ruckstuhl et al. (2008).
Studies of long-term dimming and brightening are mostly based on data acquired since the International Geophysical Year (1957/1958) (Wild, 2009) and, as Wild (2005) observes, data quality was a problem for trend analysis until the establishment of the Baseline Surface Radiation Network (BSRN) in 1992. Here we describe a method that may be used to fill this significant data gap by estimating amounts of direct solar radiation from long-term measurements of bright sunshine that have been conducted at globally dispersed locations over the past century or more.
The modern Campbell-Stokes sunshine recorder is the result of the development of the instrument created by J. F. Campbell in 1853. Practical modifications by R. H. Scott (recorder cards, 1877) and Sir G. G. Stokes (card holder frame, 1880) produced the pattern of instrument that is recognisable to this day (see Stanhill, 2003, for a fuller description of the instrument's history). With further minor modifications, instruments of the Campbell-Stokes (henceforth abbreviated to CS) recorder pattern have been used in many parts of the world as the standard instrument for recording bright sunshine for over 100 yr, which makes the CS recorder a good source of long-term, self-consistent data. Instruments of the CS pattern have been constructed by a number of makers, for instance Negretti and Zambra or Casella, and instruments of the CS recorder type can be reported using the makers' names (e.g. Bentley, 2011), but the design of these instruments is so simple and reliable that the method we describe here is applicable to any instrument similar to the CS recorder pattern. The current standard instrument of the UK Met Office (the national weather service) consists of an optical-quality glass sphere of 101.6 mm nominal diameter and focal length 74.9 mm mounted in a bronze frame. The frame carries three pairs of slots designed to hold pieces of card of different patterns at the focal point of the sphere at different times of year. During periods of bright sunshine the direct solar irradiance chars the card at the focal point diametrically opposite the sun. The motion of the sun causes the focal point to move across the card, producing a burned track on the card; hence any point on the track can be associated with a time of day. Figure 1 shows a portion of a well-charred card, illustrating how the width of the burn (orthogonal to the progress of the burn track) can vary. The burn is typically wider when the direct solar irradiance is stronger and narrower when it is weaker. It follows that the instrument is effectively recording the rapid fluctuations of direct irradiance as it is affected by atmospheric processes, e.g. clouds, aerosols, and the path length of sunshine through the atmosphere, subject, of course, to instrumental efficiencies and lags.
In everyday use the burn signal is normally integrated by the meteorological observer and reported as total hours of bright sunshine per day, where bright sunshine is defined as an occurrence of direct irradiance above a threshold intensity needed to char the recorder card. Therefore the CS record cannot be regarded as a complete record of direct solar irradiance, but we hypothesize that there is sufficient information to make a study of the archive profitable. We discuss the required threshold in Sect. 7. Painter (1981) notes a correlation between sunshine duration measured by the CS recorder at Kew Observatory and a co-located pyrheliometer. It has long been recognised that increased clear-sky atmospheric opacity can prevent charring of the recorder card following the Lambert-Beer law (Maurer and Dorno, 1914), and also that the overall duration of bright sunshine is not a direct measure of clear-sky atmospheric opacity because of obscuration by clouds. Helmes and Jaenicke (1984, 1986) and Jaenicke and Kasten (1978) developed a method to measure atmospheric turbidity based on the lowest solar elevation that produces charring under clear skies, which worked well for the polluted conditions of mid-twentieth century northern Europe, so long as concurrent observations of cloudiness were available. More recently Horseman et al. (2008) showed how some of this card-burn information can be extracted and related to atmospheric aerosol loading using a process that targets periods of low solar elevation towards the ends of the day (when the paths of sunshine through the atmosphere are relatively long) and that does not need the concurrent cloud occurrence data required by Helmes and Jaenicke (1986).
The development of inexpensive digital imaging offers the prospect that cards can be digitally archived with metadata, preserving detailed information on bright sunshine for solar irradiance studies. Both Boardman (2010) and Wood and Harrison (2011) describe how a digital image of a card can be analysed to quantify the amount of burn, which can then be linked to the strength of the irradiance that caused it at temporal resolutions of the order of minutes. The widespread and long-term use of the CS recorder makes it theoretically possible to obtain a time series of direct solar irradiance variation reaching back over 100 yr over a range of sites covering different climates, latitudes and longitudes. However, the construction of a long-term record of irradiance is still limited; the by-eye analysis method of Horseman et al. (2008) is constrained by the subjectivity of human observation, and the scanning processes of Boardman (2010) and Wood and Harrison (2011) take a significant amount of time to perform.
Here we show how inexpensive scanning equipment can be used to extract a metric related to direct downwelling short-wave radiation from CS recorder cards. We show that this information has similar characteristics to direct sunshine measurements made concurrently by pyrheliometer. The method can be used to extract data from many cards covering long periods and many stations and so has the potential to fill the information gap on radiation trends for dimming and brightening studies. A subsidiary design aim was to make the data capture process robust so that in future it would be possible for a consistent archive to be built by a network of agencies or even by volunteer "citizen scientists". As a test of this semi-automatic method, we have processed cards from the meteorological station at Lerwick in the Shetland Islands, Scotland. Lerwick was chosen as it has a long CS recorder card archive and, as a member of the BSRN, maintains a pyrheliometer which can be used for comparison.

Overview of the process
Initial trials using digital scanning (including Boardman, 2010) showed that, to produce a consistent set of images when different operators use different equipment, controls and constraints must be applied to the image capture process. An important part of the process is therefore managing the operator, which is achieved through a regulated and standardised procedure implemented in bespoke software that guides the user through a defined process to ensure that image quality is maintained and that appropriate metadata are gathered.
The ergonomics of the operation are important to ensure that a high-quality image archive covering decades of cards is constructed at a practicable rate. For instance, by working through a set of cards in chronological order the date and card type can be predicted; the operator then only has to intervene in case of anomalies, e.g. where the wrong type of card for the time of year had been used by the recording station. These simple measures save only a little time per card but, when multiplied by thousands of cards, make the task more tractable. It was the intention from the outset that any archive of images must be suitable for repeated reanalysis, for example, to explore different signal detection algorithms, rather than combining scanning and analysis in a single step. Therefore the method produces rectified and registered images of each card to simplify geometrical calculations during analysis, i.e. so that analysis to extract solar data can use straightforward Cartesian coordinate geometry. In this case "rectified" means transforming the curved card images into a rectangular form and "registered" means that all images use a common baseline for time information. The process of adding a card image to the archive is divided into four stages as illustrated in Fig. 2.

Image capture
Our initial experiments showed that consumer-grade scanning equipment can capture images with geometric distortion levels far too small to affect card analysis and that an image resolution of 200 dpi (dots per inch, which is the unit convention used for specifying scanners and is the equivalent of pixels per inch) is sufficient to capture any significant charring of the card. The consistency of image colour representation is more of a problem, hence colour calibration of the scanner is a fundamental part of the archive building process.
Fig. 2. Schematic illustration of the image capture method. The steps are executed in numerical order and are: (1) capture a quality image (the operator enters metadata at this point), (2) apply a colour correction using the embedded calibration profile, (3) identify the card within the image, and (4) extract and rectify the card image.

Scanner colour calibration
Building a self-consistent long-term card image archive will require many thousands of scans. The sensors in consumer-grade image scanners alter with time. Consequently it is important to calibrate the scanner at regular intervals. The calibration process used is adopted from the field of commercial digital imagery. A standardised colour test card or "colour target" is scanned and the image compared with a digital reference copy of that particular target provided by the colour target manufacturer. Colour management software is then used to compare the captured and digital reference images to quantify deficiencies in the scanner's colour representation. These deficiencies are stored in a calibration profile conforming to the International Color Consortium (ICC) standard. This type of procedure is widely understood and does not require detailed description here; see http://www.jiscdigitalmedia.ac.uk/ (last accessed 10/02/2013) for more information on the management of digital imagery. We used the open source Argyll Color Management System (CMS); the CMS's website (http://www.argyllcms.com/, last accessed 10/02/2013) describes the procedure to be used in detail. The optimum frequency of calibration is difficult to determine precisely; we followed empirical evidence, and our software prompts the user to calibrate the scanner using a standard IT8.7/2 colour target every 1000 scans.

Image type and resolution
The charring or burning process can produce a range of effects from a discolouring of the surface to perforation completely through the card (see Fig. 1). The presence of burn can be related to the time of day of the bright sunshine, and the width of the burn gives an indication of the strength of the direct downwelling short-wave radiation. The cards have three different patterns, one each for the summer (12 April to 2 September inclusive in the Northern Hemisphere), winter (from 15 October to the last day of February inclusive) and the intervening equinoctial periods. The dimensions of the card are standardised, although detailed documentation of the current standard is difficult to obtain; the most recent design description available to the authors is Bilham (1932). The basic design of the instrument has changed little in the intervening years, although the dimensions of the sphere and therefore the frame and cards have altered slightly. Our method does not require knowledge of these exact dimensions as the image manipulation uses an empirical method based solely on the scanned image. The optical geometry of the instrument means that both the summer and winter cards are portions of the same annulus, the summer cards being longer to accommodate the greater number of daylight hours. Here "length" refers to the dimension along the direction that records time and "width" to the dimension orthogonal to it. The equinoctial cards are straight with a trapezoidal shape and are wider than the curved cards as the rate of change of solar zenith, and hence lateral position of the burn track, is greater around the equinoxes than the solstices.
For the archive we used a standard image format, 24-bit RGB TIFF (Tagged Image File Format), i.e. three 8-bit colour channels (Red, Green and Blue), with the EXIF (Exchangeable Image File Format) extension to the image used to carry the metadata. The metadata captured are site name, site latitude and longitude, card pattern type, card trimming, the equation of time (see Sect. 6) and a quality code. The quality codes are subjective, e.g. physical state of the card and qualitative assessment of the strength of the burn, but give a basic method of screening poor cards at the point in the process when every card will be examined by eye.
Cards of the design used by the UK Met Office at the stations we examined have a blue-green face with hour markers and other information left unprinted; this shows as the card's natural off-white colour. On used cards, the burned traces are for the most part black, occasionally with some grey ash. This means that in standard RGB (Red, Green and Blue channel) images the majority of the direct solar irradiance related data can be found in the green and blue channels. A bright red sheet fixed to the scanner to serve as an image background means that pixels with high red, and low blue and green channel values can be considered not part of the card, simplifying the detection of the card within the image. A caveat to this simplification is that the card frequently burns through, in which case the red background appears within the card area. This problem is minor and is dealt with in the description of the method.
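The background test described above can be sketched in a few lines of Python. This is only an illustration: the function names and the two thresholds (red_min, other_max) are assumptions made for the example and are not values taken from the method; real thresholds would need tuning against colour-corrected scans.

```python
def is_background(pixel, red_min=180, other_max=90):
    """True when an (R, G, B) pixel looks like the red backing sheet.

    red_min and other_max are hypothetical 8-bit thresholds chosen for
    illustration; they are not values from the paper.
    """
    r, g, b = pixel
    return r >= red_min and g <= other_max and b <= other_max


def background_mask(image):
    """Map an image (list of rows of RGB tuples) to rows of booleans."""
    return [[is_background(px) for px in row] for row in image]
```

With these illustrative thresholds a red pixel such as (230, 20, 25) is flagged as background, whereas the off-white markers, the blue-green card face and black char are not.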

The operator's task
Automation of much of the card scanning process simplifies the operator's task, reducing the scope for operator error and making the construction of a long archive practical. Later in our process the portion of the image that represents the card must be located, so to make the location step more straightforward the operator is required to place the card on the scanner platen in a particular orientation. The positioning need not be precise, which avoids compromising the speed of manual card handling. In our case the curved cards are placed with the longest edge towards the top left of the resulting image and the straight cards roughly vertical in the image with the longest edge to the left of the final image.
The operator works through a series of cards in chronological order; our experience suggests that observers already archive the cards in this manner. The date of the first card in the sequence must be input at the start, along with other information regarding the CS recorder site. After this is done, dates and hence the different patterns of cards can be predicted and offered as a default input that the operator can override if necessary.
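As a minimal sketch of how the default card pattern could be predicted from the card date, the function below applies the Northern Hemisphere changeover dates given earlier (summer cards from 12 April to 2 September, winter cards from 15 October to the last day of February, equinoctial cards otherwise). The function name and return strings are hypothetical.

```python
from datetime import date


def card_pattern(d: date) -> str:
    """Expected card pattern for a Northern Hemisphere station on date d."""
    month_day = (d.month, d.day)
    if (4, 12) <= month_day <= (9, 2):
        return "summer"
    # Winter runs from 15 October to the last day of February,
    # which covers 29 February in leap years.
    if month_day >= (10, 15) or d.month <= 2:
        return "winter"
    return "equinoctial"
```

The operator would still be able to override the predicted value where a station used the wrong pattern for the time of year.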
After a scan of the card is taken the operator is shown the raw scanned image to evaluate, and can accept, reject or repeat the scan. If the scan is accepted the operator is then prompted to make a subjective judgement about the quality of the card from a number of predefined category codes. This evaluation is necessary as meteorological observers sometimes write on the card face and cards with very long burns close to the card edge can tear. Acceptable images are saved using an automatically generated filename time stamp. If the card shows no evidence of burn the scanning stage can be skipped for speed, but the fact that the card is blank is logged so as not to skew later time series analysis. All of the metadata regarding the card, including the quality code and description, are embedded in the saved image's EXIF data fields to permit automatic filtering later in the processing and analysis.

Image manipulation
The first step in extracting the sunshine card data from an image is to use the ICC profile embedded in the scan to produce a colour corrected image. All pieces of digital image production equipment inherently influence the way the colour content of an image is interpreted, and the ICC profile mechanism is intended to control this. Therefore the colour correction process also requires a target profile to correct to. In previous methods (Boardman, 2010) the identification of the card within the image has required manual intervention, thus slowing the process of archive building; our method automates this stage. In the following sections the image described should be thought of as an A4 size page in portrait orientation with the card's longest edge to the left of the page.

Image location
With cards correctly placed, the edge of the card can be found to the left of the image by reading across each row of the image starting from the left-hand edge and searching for a transition from the bright red background to colours associated with the card, i.e. white (markers) or blue-green (card face). If the correct card has been used for the time of year any burn will not reach the card edge. The thickness of the card does produce a small edge-shadow and the card edge itself does not appear cleanly cut at 200 dpi resolution, so the position of pixels representing the edge is smoothed using a moving average filter. The position of the edge and hence the ends of the card in the image, along with knowledge of the card patterns, provide sufficient information to locate the rest of the pixels representing the card.
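The row-by-row edge search and the smoothing step might look like the following sketch. The background thresholds and the window length are assumptions made for the example; the method does not state the filter length it uses.

```python
def is_background(pixel, red_min=180, other_max=90):
    """Hypothetical test for the red backing sheet (thresholds assumed)."""
    r, g, b = pixel
    return r >= red_min and g <= other_max and b <= other_max


def left_edge(image):
    """For each row, the column of the first non-background pixel.

    image is a list of rows of (R, G, B) tuples; rows that contain only
    background yield None.
    """
    return [
        next((x for x, px in enumerate(row) if not is_background(px)), None)
        for row in image
    ]


def moving_average(values, window=5):
    """Centred moving average; the window shrinks at the ends of the list."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

Rows reported as None lie beyond the ends of the card and would be dropped before the remaining edge positions are passed to moving_average.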

Card area identification
The next step in the method is to identify the image area containing the card and is dependent upon the shape of that card. The method diverges at this point to treat curved and straight cards separately. The curved summer and winter cards are the more difficult to process so the rest of this description covers these types; equinoctial cards can use a subset of the method. The size and shape of each card type are known and standardised, so they could be used as a mask to select the image area containing the card. However, standard meteorological observation practice requires that the ends of the cards be trimmed to prevent obscuration at certain latitudes. In addition, the exposure of the cards to the elements in the CS recorder means that used cards may be distorted and not conform to a standardised mask, so the method is designed to extract the actual card image in a consistent manner.
The detected left-hand card edge closely approximates a circular arc, so we recover the radius of the arc using the following procedure. A series of chords are constructed on the winter card's lower edge (upper edge of summer cards), which corresponds to the left-hand edge described earlier. Simple geometry shows that a straight line that passes through the middle of a chord and is orthogonal to that chord will pass through the centre of the circle that the arc is part of. The equation of a straight line for a chord can be readily determined from two points selected from the list of pixel positions of the card edge already obtained. Chords of at least 300 pixels in length are used because short chords will amplify the effects of any fine-scale deviations in the card edge. The straight-line equations of the chords can be transformed into equations of radial lines by rotating the gradient through 90° and calculating a new y-intercept based on the coordinates of the middle of the chord (Eqs. 1, 2), where x, y are the coordinates of a point on a straight line, c is the y-axis intercept and m and n are the gradients of the original straight line and a straight line orthogonal to it, respectively. The centre of the card arc is then found by calculating where pairs of these radial lines intersect. A large number of chords and radii are used to improve the accuracy of the location of the arc centre. To further reduce the effects of defects and distortions in the detected card edge, the coordinates of the intersections are obtained for all pairings of radial lines, and the mean of these x and y coordinates is taken as the centre of the card arc. For the same reason, the radius in pixels of the outer card edge is not predetermined and is obtained by calculating the average distance between all of the card edge pixels and the calculated centre.
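The chord construction can be illustrated with the sketch below. It departs from the gradient-and-intercept description in one respect, which is an assumption of this example rather than part of the method: each perpendicular bisector is held in the general form a·x + b·y = c, so that horizontal and vertical chords (where a gradient m or its orthogonal counterpart n = −1/m would be undefined) need no special handling. The function names, the way chords are chosen (n_chords chords, each spanning half of the edge points) and the parallel-line tolerance are all hypothetical.

```python
from itertools import combinations
from math import hypot


def bisector(p, q):
    """Perpendicular bisector of chord pq as (a, b, c), meaning a*x + b*y = c."""
    a, b = q[0] - p[0], q[1] - p[1]
    mx, my = (p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0
    return a, b, a * mx + b * my


def intersect(l1, l2):
    """Intersection of two lines in (a, b, c) form, or None if near parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        return None
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det


def fit_arc(points, min_chord=300.0, n_chords=8):
    """Estimate (centre, radius) of the arc through ordered edge points."""
    n = len(points)
    span = n // 2                      # each chord joins points half the edge apart
    stride = max(1, (n - span) // n_chords)
    chords = [(points[i], points[i + span]) for i in range(0, n - span, stride)]
    # Discard short chords, which amplify fine-scale edge deviations.
    lines = [bisector(p, q) for p, q in chords
             if hypot(q[0] - p[0], q[1] - p[1]) >= min_chord]
    hits = [h for h in (intersect(a, b) for a, b in combinations(lines, 2)) if h]
    if not hits:
        raise ValueError("not enough usable chords to locate the arc centre")
    cx = sum(h[0] for h in hits) / len(hits)
    cy = sum(h[1] for h in hits) / len(hits)
    # Radius is the mean distance from every edge point to the centre.
    radius = sum(hypot(x - cx, y - cy) for x, y in points) / n
    return (cx, cy), radius
```

Averaging the intersections of every pair of bisectors, and then averaging the point-to-centre distances, mirrors the two averaging steps described in the text.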
Inspection of a selection of the used cards showed that the width of the card is relatively unaffected by card alterations or distortions, so this can be converted into distance in pixel space and the radius in pixels of the inner card edge found. The portion of the original image that contains the card now falls within an annulus whose inner and outer radii have been obtained.
This stage of the process for an equinoctial card is much simpler. The left-hand card edge is found in the same manner as for curved cards and is used as a baseline for the direct extraction of a rectangular area sufficiently large to enclose the whole card.

Midday marker location
The registration and rectification process requires that the midday marker of the card is placed at the middle of the long edge of a rectified and registered output image (see Fig. 2). The midday marker is found by first locating the ends of the cards. This is achieved by searching for background colour pixels along a notional arc of radius halfway between that of the inner and outer edges. The search is started from positions known to be beyond the card ends and works towards the position of the card until the ends are found. Inspection of sample cards, including those left untrimmed, shows that the midday marker does not always fall exactly in the geometric middle of a card. Consequently the detection of the ends of the card does not provide enough information to locate the midday marker. To overcome this uncertainty a longitudinal (parallel to the longer card edges) search for the white pixels of the midday marker line is performed around the middle of the card area. The search area is longitudinally constrained to avoid misidentifying other hour markers and laterally constrained to exclude the area of potential burn, which can be estimated using the card date from the image metadata. For instance, as Fig. 1 shows, the burn occurs in the lower portion of the card in the middle of the period between 12 April and 2 September when summer pattern cards are used. Towards the ends of this period the burn will occur in the upper portion of the card.

Card image extraction
At this point sufficient information has been diagnosed from the image to define the region to be extracted. The radii of the inner and outer arcs define the top and bottom card edges of the rectified output area and the midday marker sets the middle of the image. The dimensions of the output image for the curved card shapes are fixed for each card pattern to ensure all of the potentially trimmed cards will be extracted and to simplify future analysis. The dimensions of this area are derived from the size of unmodified cards. The method ensures that the midday marker falls in the middle column of the rectangular output image regardless of the way the card has been cut.
The rectification and registration extraction process is then performed as shown in Fig. 3. Each row of the rectangular output image corresponds to a circular arc with a radius that falls between that of the inner and outer edges found earlier. These arcs share the same centre that was also found earlier in the process and are arranged so that the midday marker will be located in the middle of the output image. The radii of successive extraction arcs are incremented in 0.127 mm steps (corresponding to 200 dpi) to create an output image that has the same resolution as the original scan. The columns of the output image are also spaced at 0.127 mm to create an output with a 1 : 1 aspect ratio in scale. The transformation of a row of pixels selected from a series of arcs with increasing radius into a corresponding series of straight rows with constant pixel spacing means that the outer arcs will be undersampled compared to the inner ones, i.e. the along-arc spacing of the pixels selected for the output image will increase with the arc radius.
Fig. 3. Diagram showing the method of extracting pixels from an image of a curved Campbell-Stokes recorder card that are required for the production of a rectified output image. The filled circles represent pixels that will form the bottom row of the rectangular output image. The empty circles represent pixels for the second from bottom row. C denotes the centre of the circular arc that is the inner edge of the card image (R_inner is the radius of this arc). The pitch of the selected pixels is 0.127 mm along the inner edge so that selecting other pixels on the radial lines shown will also be spaced at 0.127 mm to maintain 200 dpi pitch in the output image. The proportions of the card geometry in this diagram have been distorted to fit the page.
The location of the pixels to use is simplified by calculating the required position in polar coordinates using the radii values discovered above. The angular (along arc) spacing is derived from the angle at the centre of a sector defined by two radial lines with the radius of the inner card edge and a curved edge 0.127 mm long. The coordinates of the pixels to select are calculated using floating point arithmetic for precision and are truncated to integers when extracting the pixels. A refinement of the method would be to derive a pixel value interpolated from the four nearest neighbours of the calculated position. Registration is achieved by using a radial line that passes through both the centre of the card arc and the midday marker located earlier. This line will bisect each of the arcs shown in Fig. 3. Equal numbers of pixels are then selected from each side of the bisection line to produce rectangular images of 2150 × 244 pixels for winter and 3000 × 244 pixels for summer cards. Figure 4 shows the effectiveness of the rectification and registration process. Equinoctial cards do not require the rectification process and are never trimmed, so the midday marker and ends of the card can be easily detected, producing images that are typically 2170 pixels along the longest side. As has been previously described, the width (shortest side) is fixed at 308 pixels to accommodate the wider equinoctial card pattern. All rectified and registered images have the midday marker in the middle of the longest side and have a temporal scaling of 150 pixels per hour.
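The polar-coordinate sampling can be sketched as follows, working in scan pixels (one pixel is 0.127 mm at 200 dpi), so that the angular step subtending one pixel along the inner edge is 1 / r_inner radians. The function name, the fill colour for samples that fall outside the scan, and the assumption that output columns advance with increasing angle and output rows with increasing radius are all choices made for this illustration; the actual orientation depends on how the card lies in the scan.

```python
from math import atan2, cos, sin


def rectify(image, centre, r_inner, midday_xy, width, height, fill=(255, 0, 0)):
    """Unwrap an annular card region into a width x height rectangle.

    image      list of rows of (R, G, B) tuples (the scan)
    centre     (x, y) of the card arc centre, in pixels
    r_inner    radius of the inner card edge, in pixels
    midday_xy  (x, y) of a point on the midday marker
    Output row j samples the arc of radius r_inner + j; the middle output
    column lies on the radial line through the midday marker.
    """
    cx, cy = centre
    theta_mid = atan2(midday_xy[1] - cy, midday_xy[0] - cx)
    dtheta = 1.0 / r_inner             # one pixel of arc along the inner edge
    n_rows, n_cols = len(image), len(image[0])
    out = []
    for j in range(height):
        r = r_inner + j
        row = []
        for i in range(width):
            theta = theta_mid + (i - width // 2) * dtheta
            xf, yf = cx + r * cos(theta), cy + r * sin(theta)
            if 0 <= xf < n_cols and 0 <= yf < n_rows:
                row.append(image[int(yf)][int(xf)])   # truncate to integer
            else:
                row.append(fill)
        out.append(row)
    return out
```

Because every row uses the same angular step, the along-arc spacing of the sampled pixels grows with radius, which is the undersampling of the outer arcs noted earlier. The four-neighbour interpolation mentioned as a refinement is not implemented here.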

Equation of time
An objective of this archive-building process is to be able to convert the burn on the CS recorder cards into a time series of direct solar irradiance strength with a temporal resolution of the order of minutes. The standard meteorological use of the CS recorder is to measure the duration of bright sunshine within a day. This only requires the temporal scale to be self-consistent; it need not be synchronised to any time standard. If the CS record is to be compared with time series of atmospheric parameters from other solar instruments, the card's temporal scale must be standardised. Meteorological standard practice is to level the instrument and align it to face due south, which in turn means that the midday marker on the card represents 12:00 in local apparent time (LAT), or true solar time. Levelling of the instrument means that a burn track will be orthogonal to the hour markers, which means that they can be used as a time reference. The clock time at which the burn actually passes through the midday marker varies through the year because of the eccentricity of the earth's orbit and the obliquity of its axis, and can differ from 12:00 on a mean-time clock by roughly 16 min in either direction. Correction of these variations uses the equation of time (see Müller, 1995, for a detailed explanation and calculation) and yields a uniform time termed Local Mean Time (LMT). LMT has a fixed departure from UTC (Coordinated Universal Time) that is solely due to the longitude of the meteorological station. To correct for a station's longitude, four minutes must be added to the recorded time for every degree west of Greenwich and, equally, four minutes subtracted for every degree east of the prime meridian.
The processing program calculates the value of the equation of time for each card when it is scanned and embeds this information in the image metadata. The formula used requires just the date of the card and is taken from the Astronomical Almanac for 2012, which states a precision of 3.5 s for years between 1950 and 2050. The value can then be applied by later analysis if UTC is required.
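The chain of corrections described above (LAT to LMT via the equation of time, then LMT to UTC via longitude) can be sketched as follows. This is a minimal illustration, not the processing program itself: it assumes the widely quoted low-precision solar-coordinate approximation, which may differ in detail from the formula the authors took from the Almanac, and the function names and the sign convention for longitude (east positive) are choices made here for the example.

```python
import math
from datetime import date, datetime, timedelta

J2000 = date(2000, 1, 1)  # epoch J2000.0 falls at 12:00 on this date


def equation_of_time_minutes(day: date) -> float:
    """Approximate equation of time (apparent minus mean solar time),
    in minutes, evaluated at 12:00 UT on the given date."""
    n = (day - J2000).days  # whole days from J2000.0 to noon UT on `day`
    mean_long = (280.460 + 0.9856474 * n) % 360.0  # degrees
    mean_anom = math.radians((357.528 + 0.9856003 * n) % 360.0)
    ecl_long = math.radians(
        mean_long + 1.915 * math.sin(mean_anom) + 0.020 * math.sin(2 * mean_anom)
    )
    obliquity = math.radians(23.439 - 0.0000004 * n)
    right_asc = math.degrees(
        math.atan2(math.cos(obliquity) * math.sin(ecl_long), math.cos(ecl_long))
    )
    # Wrap the difference into [-180, 180) degrees; one degree of rotation
    # corresponds to four minutes of time.
    diff = (mean_long - right_asc + 180.0) % 360.0 - 180.0
    return 4.0 * diff


def lat_to_utc(day: date, lat_hours: float, lon_deg_east: float) -> datetime:
    """Convert a local apparent time read from the card (decimal hours)
    to UTC. Longitudes west of Greenwich are negative."""
    lmt_minutes = lat_hours * 60.0 - equation_of_time_minutes(day)
    utc_minutes = lmt_minutes - 4.0 * lon_deg_east  # adds time for west longitudes
    return datetime(day.year, day.month, day.day) + timedelta(minutes=utc_minutes)
```

For example, `lat_to_utc(date(2003, 4, 21), 12.0, -1.18)` gives the UTC time at which the burn would cross the midday marker, taking roughly 1.18° W as an illustrative longitude for Lerwick (an approximate value assumed here, not taken from the text).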

Comparison with pyrheliometers
The purpose of the method at this point is to transform the scans of CS recorder cards into a self-consistent archive of rectified and registered images and metadata in a form that can be readily analysed. The rectified and registered images show that the burn tracks do not conform to perfectly circular arcs. This is clearest in a close inspection of the summer cards (see Fig. 4), which shows that the burn track deviates towards the lower edge of the card at both ends of the day. This evidence suggests that the deviation is not due to misalignment of the card in the instrument and is most likely due in part to the continual change in solar zenith and in part to the increased effects of atmospheric refraction at low elevation angles.
The purpose of the automation and future analysis is to provide a record of direct solar irradiance data similar to that available from pyrheliometers, but from locations and times beyond the existing solarimeter network. The development and optimisation of the analysis techniques are the next stage of the work, but the results of a simple thresholding approach show promise and are worth describing here. Wood and Harrison (2011) showed that it is reasonable to assume that the amount of card burned at any point is related to the strength of the direct solar irradiance focused on the card at that time. This is supported by Fig. 5, which compares the CS recorder data with the direct solar irradiance from a pyrheliometer co-located with the CS recorder at Lerwick (Fishwick, 2007). The Lerwick site was chosen, in part, because it has a number of co-located solarimeters that form part of the BSRN (see http://www.bsrn.awi.de/, last accessed 10/02/2013). The solar data used here come from a Kipp & Zonen CH1 pyrheliometer measuring direct solar irradiance at normal incidence using a sun tracker. Figure 5 shows data from the pyrheliometer and the CS recorder on a common time base. The CS recorder data are a metric derived by applying simple thresholds to the rectified and registered image for the appropriate day. In the figure the pyrheliometer data are measured in W m⁻², whereas the CS recorder data are simply the count of pixels in each column of the image that pass a particular value threshold. Two thresholds are used, one to capture dark charred areas and the other to record pixels where the card has burned through. Although no attempt is made here to derive a scaling factor to translate these pixel counts into W m⁻², it is readily apparent in Fig. 5 that the patterns of the CS recorder and pyrheliometer signals show considerable similarities in both time and signal strength. The World Meteorological Organization suggests that solar irradiance must exceed a nominal threshold of 120 W m⁻² for the card to start charring (WMO, 2008), and the comparison in Fig. 5 bears this out. It should be borne in mind that the day chosen for this example was only a moderately bright spring day; on a clear day at this time of year the pyrheliometer can register irradiance over 800 W m⁻² for periods of several hours. In general, the data from the CS recorder show a systematic lag of approximately 3-4 min with respect to the pyrheliometer data. This may be because the card takes a small amount of time to reach charring temperature and also to extinguish, but is more likely to be due to a slight misalignment of the instrument with due south.
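The per-column pixel count described above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the authors' analysis code: the image is represented as a plain list of rows of grey levels (0 = black, 255 = white) so that the example has no dependencies, the function names are hypothetical, and any threshold value passed in would have to be determined empirically for a given archive. The time axis uses the image properties stated for the archive: 150 pixels per hour, with the midday marker at the middle of the longest side.

```python
PIXELS_PER_HOUR = 150  # temporal scaling of the rectified and registered images


def column_counts(gray, threshold):
    """Count, for every image column, the pixels at or below `threshold`.

    `gray` is a list of rows of grey levels; all rows have equal length.
    """
    width = len(gray[0])
    return [sum(1 for row in gray if row[col] <= threshold) for col in range(width)]


def column_times_lat(width):
    """Local apparent time (decimal hours) at the centre of each column,
    taking the midday marker to lie at the middle of the image width."""
    centre = (width - 1) / 2.0
    return [12.0 + (col - centre) / PIXELS_PER_HOUR for col in range(width)]
```

Calling `column_counts` once per threshold gives the two series plotted against the pyrheliometer data. At 150 pixels per hour each column spans 0.4 min, so the 3-4 min lag noted above corresponds to a shift of roughly 8-10 columns.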

Future enhancements
The current pixel thresholding is sufficient to show a correlation between burn amount and the level of direct solar irradiance, but it needs to be more sophisticated because the burn habit goes through three phases that are evident in the resulting burn colours. The weakest burn is a discolouring as the card chars through brown to black; the next level is where the card is consumed, leaving a light-coloured ash. The ash is fragile and is usually lost, but in certain circumstances it can remain and is then difficult to distinguish from the unburned card by colour alone. However, in these cases the light-coloured ash is surrounded by a border of black char, so a more sophisticated thresholding system could correctly count these ash-coloured pixels as burned.
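One possible form of such a scheme, offered only as a sketch and not as the method the authors intend to adopt, is to treat any light pixel that is completely enclosed by char as burned. The example below does this with a flood fill from the image border: every non-char pixel that cannot be reached from the border without crossing char is counted as burned. The image representation (a list of rows of grey levels), the function name and the 4-connected neighbourhood are assumptions made for the illustration.

```python
from collections import deque


def burned_mask(gray, char_threshold):
    """Return a mask of burned pixels: char, plus lighter pixels that are
    fully enclosed by char (for example, ash left inside a charred border)."""
    rows, cols = len(gray), len(gray[0])
    char = [[value <= char_threshold for value in row] for row in gray]
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the flood fill with every non-char pixel on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and not char[r][c]:
                outside[r][c] = True
                queue.append((r, c))
    # Spread through 4-connected non-char pixels reachable from the border.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                if not char[nr][nc] and not outside[nr][nc]:
                    outside[nr][nc] = True
                    queue.append((nr, nc))
    # Anything not reached is either char itself or enclosed by char.
    return [[not outside[r][c] for c in range(cols)] for r in range(rows)]
```

A limitation of this simple approach is that a single gap in the char border, or ash that touches the image edge, leaves the ash connected to the unburned card and therefore uncounted.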
The Lerwick Meteorological Station is operated 24 h per day; consequently, the cards in the CS recorder are changed during the hours of darkness. It is more common for the cards to be changed at 09:00 (UTC), which means that one card will record sunshine from 09:00 one day to 09:00 the following day. In periods of fine weather this is often visible as a "step" in an apparently continuous burn track, because the solar zenith varies from day to day. An enhanced analysis process can use the image's UTC-corrected time base to split the burn data at the "step" point and associate each part with the correct day.
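A minimal sketch of that split is given below. It assumes that each image column already carries a UTC time in decimal hours and a burn count, and that the card was changed at exactly 09:00 UTC; the function name and the data layout are hypothetical.

```python
def split_at_card_change(times_utc, counts, change_hour=9.0):
    """Split per-column burn counts at the card-change time.

    Columns at or after `change_hour` were burned on the day the card was
    fitted; columns before it were burned on the following morning.
    """
    fitted_day = [(t, n) for t, n in zip(times_utc, counts) if t >= change_hour]
    following_day = [(t, n) for t, n in zip(times_utc, counts) if t < change_hour]
    return fitted_day, following_day
```

In practice the position of the visible "step" in the burn track could be used to check that the assumed change time is correct for a given card.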

Building the archive
Using the method described here, we have built an archive of high-quality CS recorder card images. Our Lerwick archive covers the periods 1978-1986 and 2000-2006 and includes every card that shows evidence of burn: an archive of over 3000 images. By refining the basic process of capturing an image, we found that a practised operator could scan all the usable cards from a whole year in less than one working day. This still implies that capturing an archive of a century of cards would require approximately 100 working days, but, because the data-gathering process is tightly constrained, the procedure could be performed relatively quickly using pieceworkers or citizen scientists.
In this work we have shown that it is possible to use readily available technology to mine the archives of a simple, reliable and widely used meteorological instrument to provide a proxy record of direct solar irradiance strength. This is a step towards building an archive of direct solar irradiance data at a resolution of the order of minutes that extends back to the end of the 19th century. The fine-scale information lends itself to long-term statistical analysis of cloud frequency.

Fig. 1. Excerpt from a scan image of a sunshine recorder card that illustrates how the amount of card burn can vary in width (across track). The maximum width of burn in this image exceeds 5 mm. This card is from Lerwick for 3 June 2006.

Fig. 4. Illustration of the rectification and registration method using a card from 17 July 2003. The straight yellow lines in the figure are parallel with the longer image edge. One line passes through the "+" markers on the card, which are on the centre-line of the card. The other line is placed to show the deviation from linearity of the burn track described in the text. The image has been split to better fit the page.

Fig. 5. Comparison between the solar data extracted from a rectified and registered Campbell-Stokes recorder card and that from the co-located pyrheliometer at Lerwick. These data are for 21 April 2003. The card image has been cropped and scaled to match the timescale used for the data panels. There is a clear correlation between the data from the two sources in both time and amount despite this being only a moderately sunny day.
output types or "colorimetric intent". The image output of this work is only for numerical analysis, so the particular output profile chosen is not important as long as it is used consistently throughout the process and does not require extreme corrections that may clip the image's dynamic range. The choice of profile will affect the optimum values of the thresholds used in detecting card features, but these thresholds are empirically determined and their absolute values are not important, again, as long as they are used consistently throughout the card archive. This work used the profile in file sRGB_v4_ICC_preference_displayclass.icc, available from http://www.color.org/ (last accessed 10/02/2013), as it was intended for RGB colour spaces used in later analysis. In our system the conversion is performed using the open source ImageMagick toolset, http://www.imagemagick.org (last accessed 10/02/2013).