There are several other papers already describing CRDS installations at field sites. The novel aspect of this paper is that two different drying methods were deployed -- each for a significant amount of time, and laboratory tests have been performed to quantify their errors. Currently the manuscript does not adequately describe the extent to which systematic errors introduced by the CRDS water correction are mitigated by calibrating through the nafion. This is a crucial question and should be relatively easily addressed with existing data. Specific suggestions for figures to address this question are given below. If it is the case that passing calibration gases through the nafion substantially mitigates the systematic errors, then the initial configuration would seem to be clearly superior in terms of ease of implementation/maintenance and remaining systematic errors.
It would be especially useful for the discussion to weigh the implications of a complex, instrument-specific annual water correction strategy against the ease of operations of the original configuration. If dried air samples and humidified calibration gases have nearly the same water content, then any bias associated with the Picarro water correction should be nearly or entirely calibrated out (see specific comments below). To what extent does the daily measurement of the ambient standard remove bias introduced by the CRDS water correction? I think the systematic errors should be estimated based on the difference between the humidity of the humidified drift-tracking standard and the humidity of the dried ambient sample. It would be useful to provide some details in section 2.3.8 about the extent to which standards are humidified and how much variability in ambient humidity is observed between daily (drift) calibration episodes. It would be interesting to include a figure in the paper or in the supplement showing the complete h2o time series prior to removal of the nafion, i.e. showing the CRDS measurements of post-nafion humidity with enough detail to show how it varied over different timescales, including seasonally and diurnally (some variability in drying performance is expected given that the effectiveness of nafion drying is strongly dependent on the temperature of the membrane). Also consider including one or more panels showing the typical humidity difference between ambient sample air and calibration gas during a daily drift calibration (i.e. how much water does the calibration gas pick up and does it dry out perceptibly during the course of a 20 minute calibration interval?). It is critically important to understand the extent to which calibrating through the nafion once per day mitigates errors associated with the CRDS water correction.
If the amount of water stored in the nafion membrane is sufficient to (1) humidify the standard to the same humidity to which the ambient sample is dried and (2) largely mitigate within-day variability of sample h2o, then errors related to h2o are likely negligible. It seems that the original nafion strategy may have had only very small systematic errors and that these could likely be further reduced if necessary by increasing the frequency of the daily CRDS analyzer drift calibrations through the nafion.
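To illustrate why the humidity match matters, here is a minimal sketch. It assumes a quadratic wet/dry correction of the form 1 + a*H + b*H^2 (H in % H2O) and a hypothetical analyzer whose true linear coefficient differs slightly from the assumed one. The coefficient values, function names and 400 ppm sample are illustrative assumptions, not values from the manuscript. The model treats the daily standard as a single-point scale correction applied to the sample.

```python
def water_factor(h2o_pct, a, b):
    """Wet/dry mole fraction ratio for a quadratic water correction."""
    return 1.0 + a * h2o_pct + b * h2o_pct ** 2


def residual_bias(c_dry, h_sample, h_standard, assumed, true):
    """Bias (same units as c_dry) left after a single-point calibration.

    assumed / true are (a, b) coefficient pairs: the correction applied by
    the software and the correction the instrument actually needs.
    """
    # Fractional error of the reported dry value at each humidity.
    r_sample = water_factor(h_sample, *true) / water_factor(h_sample, *assumed)
    r_standard = water_factor(h_standard, *true) / water_factor(h_standard, *assumed)
    # The standard rescales the sample, so only the ratio of errors survives.
    return c_dry * (r_sample / r_standard - 1.0)


# Illustrative placeholder coefficients (assumptions, not from the manuscript).
ASSUMED = (-0.0120, -2.674e-4)
TRUE = (-0.0121, -2.674e-4)  # hypothetical instrument-specific difference

for h_sample, h_standard in [(0.20, 0.20), (0.25, 0.20), (2.0, 0.0)]:
    bias = residual_bias(400.0, h_sample, h_standard, ASSUMED, TRUE)
    print(f"sample {h_sample:.2f}%  standard {h_standard:.2f}%  bias {bias:+.4f} ppm")
```

In this simplified model the bias vanishes when standard and sample reach the analyzer at the same humidity, whatever the error in the coefficients, and it grows with the humidity mismatch. This is why a figure comparing the two humidities would be so informative.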
For the second phase, where the nafion dryers were removed, the authors have gone to great trouble to track the instrument-specific water corrections, and the findings are very interesting. In the end, I wonder if perhaps the errors associated with using a single pre-deployment water correction are actually tolerable for the application of estimating UK-scale fluxes (at least in the case where some drying is used to keep h2o < ~2%). Looking at Figure 8a and b, and given the large differences among the h2o corrections for humidity > ~2%, it seems that the errors are unacceptably large, and unlikely to be improved even with weekly testing.
It would be useful to discuss the significance of systematic errors with respect to the current model errors and goals of the UK network (i.e. for national-scale evaluation of reported emissions by atmospheric monitoring to be useful). Based on all of the tests and analysis, what are the recommendations about how to simplify the operations and reduce the systematic errors going forward? My reading is that the systematic errors would be greatly reduced if nafion dryers were to be reinstalled, so long as the calibration gases are also routed through the nafion dryer and so long as the membrane provides an adequate reservoir of h2o such that the humidity of cal gases and ambient samples is the same. It is worth reemphasizing the need to calibrate through the nafion for any system that has a large partial pressure gradient of co2 and/or ch4 across the membrane.
page 4, line 19: uncertainties related to other components (such as?) sample collection
page 7, line 2: Nafion drying performance is a strong function of temperature (see product literature). How much did nafion membrane temperature vary seasonally and/or diurnally? Might nafion temperature be the driver of output sample humidity variability rather than input sample humidity? In order to understand the extent to which water correction systematic errors are calibrated out, it would be useful to include a figure showing the extent to which humidified calibration gas air typically resembles dried sample air (and/or worst case difference between sample and standard humidity as measured by the CRDS). To what extent do daily drift correction calibrations capture variations in humidity at the CRDS? (See Figure 7 of Andrews et al., https://www.atmos-meas-tech.net/7/647/2014/amt-7-647-2014.pdf) If nafion output humidity is very smooth in time, then the daily drift corrections may entirely mitigate errors due to the CRDS water correction. If this is the case, then the original configuration with the nafion would seem to be the superior method, since instrument-specific and time-dependent h2o corrections would be unnecessary. What can you say about the extent to which variations in the dried sample air humidity drive day-to-day drift in the CRDS signal during the period when the nafion was installed?
page 8, line 10: wet/mean(dry)?
page 9, line 12: Here, dry air from four cylinders with varying CO2... was humidified.
page 9, line 22: An experiment was designed *to* observe
page 9, line 24: Not sure why it is necessary to mention the series of inconclusive experiments
page 11, line 15, "all components of the cylinder air path between the DPG and the multiport valve, excluding the water trap, and the pump"
--> Does the water trap refer to the cryotrap? This is confusing because according to the text and the drawing neither the cryotrap nor the pump is located between the DPG and the multiport valve.
Page 12, equation (1), it seems unnecessary/trivial to define TrueCP = CPin. I can't think of a reason to suspect that it would be otherwise.
Page 13, line 12, "These experiments assume that any changes in the CO2 or CH4 mole fraction are driven solely by the Nafion drying process..."
--> Is it not the case that the experiment was designed to isolate the impact of the other possible sources of error or bias that are identified? That is, I think the setup aims to measure differences that are solely due to the Nafion. Maybe I do not understand how the data are analyzed.
page 14, line 4: dependent, not depended
page 14, line 25: According to Stanley et al 2018, the daily standard is measured for 20 minutes. Please state that here also. Is the standard always measured at the same time of day? Or does it run on e.g. a 21 or 23 hour cycle so that it rotates throughout the day in order to check for sensitivity to room temperature variations or similar?
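As a small illustration of why the cycle length matters, the sketch below lists the hours of day at which successive calibrations would begin for a given cycle length. The function name and the assumption that the first calibration starts at hour 0 are mine, chosen only for illustration.

```python
def start_hours(cycle_hours, n_cycles):
    """Hour of day (0-23) at which each successive calibration begins,
    assuming the first one starts at hour 0 (illustrative assumption)."""
    return [(i * cycle_hours) % 24 for i in range(n_cycles)]


for cycle in (24, 23, 21):
    distinct = sorted(set(start_hours(cycle, 24)))
    print(f"{cycle} h cycle: {len(distinct)} distinct start hours {distinct}")
```

A 24 hour cycle always samples the same hour, a 23 hour cycle visits every hour of the day over 24 cycles, and a 21 hour cycle visits only eight distinct hours because 21 and 24 share a factor of 3.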
page 15, line 16: Can you rule out the worst case scenario of -0.155 ppm bias (i.e. based on the likely fraction of added industrial air in your standards)? Possible bias of -0.155 is obviously quite large compared to other systematic errors related to water that are the focus of this paper. Why has there not been more effort to quantify this possible source of bias for these installations? It could be easily addressed by having the cylinders measured for 13CO2. Perhaps the labs that supplied the calibration gases could provide information about the isotopic abundance of their spiking/dilution cylinders so that you could make a specific estimate of the errors (and correct them) for each set of unique calibration gases used at the sites. Given the amount of verbiage devoted to the 0.02 ppm nafion error above, the authors seem remarkably unconcerned about the possible impacts of isotopic composition of the standards on the reported ambient CO2 values.
page 26, line 20: Is the average standard deviation of the 15 min block means really 0.002 ppm, or is that the standard error (where the std dev has been divided by sqrt(N))?
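For clarity about the distinction being asked for, a minimal sketch using made-up block means (the values below are hypothetical and are not taken from the manuscript):

```python
import math
import statistics


def sd_and_se(values):
    """Return (sample standard deviation, standard error of the mean)."""
    sd = statistics.stdev(values)          # scatter of the individual block means
    se = sd / math.sqrt(len(values))       # uncertainty of their average
    return sd, se


# Hypothetical 15 min block means in ppm, for illustration only.
block_means = [400.010, 400.018, 399.995, 400.004, 399.989,
               400.012, 400.001, 399.991, 400.007]

sd, se = sd_and_se(block_means)
print(f"std dev = {sd:.4f} ppm, std error = {se:.4f} ppm (N = {len(block_means)})")
```

With N = 9 the standard error is one third of the standard deviation, so the two quantities can differ substantially and the text should state which one is reported.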
page 26: A nafion membrane can store quite a lot of water. I think the results of this test may depend on how the membrane was conditioned. A membrane that has been conditioned with ambient air at e.g. 2.5% humidity for several days may be more permeable than a membrane that has been conditioned with dry air from a cylinder. The permeability of the membrane might also depend on its temperature. How might these tendencies map onto the dataset (i.e. how was the membrane conditioned prior to the experiment and how does the level of water stored in the membrane compare to what is expected in the field in the summer during a high-humidity event)? Might the membrane be more permeable if it were equilibrated with high-humidity air? Perhaps the day-to-day variations of the ambient standard for the period when the nafion membrane was installed at the field sites might reflect changes in nafion permeability. It would be useful to repeat this experiment to explore whether permeability is impacted by the amount of water stored in the membrane (i.e. flow air with >2% h2o through the nafion for several days prior to starting the test). Andrews et al. also reported permeability of nafion for co2 (page 652,653), but with a somewhat different configuration. They found cross-membrane transport of 0.1 ppm CO2 for a partial pressure gradient of ~1700 hPa - 265 hPa = 1435 hPa, which is approximately proportional to your finding of 0.02 ppm co2 difference for a ~400 hPa gradient.
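The proportionality can be checked with a few lines of arithmetic using only the figures quoted in this comment. The variable and function names are mine, and a simple linear dependence on the gradient is assumed.

```python
def transport_per_hpa(delta_ppm, gradient_hpa):
    """CO2 change across the membrane per unit pressure gradient (ppm/hPa),
    assuming transport scales linearly with the gradient."""
    return delta_ppm / gradient_hpa


andrews_gradient = 1700.0 - 265.0                     # hPa, figures quoted above
andrews = transport_per_hpa(0.1, andrews_gradient)    # Andrews et al. configuration
this_study = transport_per_hpa(0.02, 400.0)           # manuscript result

print(f"Andrews et al.: {andrews:.2e} ppm/hPa")
print(f"This study:     {this_study:.2e} ppm/hPa")
print(f"Ratio:          {andrews / this_study:.2f}")
```

Under this linear assumption the two sensitivities agree to within a factor of about 1.4, which seems reasonable given the different configurations.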
Page 28, line 3: I don't agree with the current wording. So long as calibration gases are passed through the nafion, no significant bias is introduced by this implementation of the nafion drying method. The rest of the paragraph is fine, but I think the opening sentence is confusing.
Page 28, line 15: I am confused about how these data are being post-processed to apply the calibration data from the field analyzer. The text says: "Altering the drying method to better match the moisture content of the calibration gases to the sample may minimise this error." For the early period where the nafion was deployed, is it not the case that the nafion-humidified standards and the nafion-dried ambient air samples have the same humidity by the time they enter the CRDS? Or does the nafion membrane dry out significantly during the 20 minute calibration interval?
Page 28: In this conclusions section, it would be helpful to make a clear distinction between the early period where the nafion was installed versus the later period with no nafion. It seems that the systematic errors associated with the h2o correction are likely substantially larger during the latter period. Also it would be appropriate to include a brief discussion/summary of the pros and cons of each method, including complexity of implementation for the latter period when annual water calibrations for each CRDS analyzer were used but evidently with little reduction in uncertainty (given the relatively large variability seen in the daily and weekly lab water corrections).
page 28, line 20: How are you computing the max errors reported in Table 5? It looks like they are perhaps based on the residuals plots shown in Fig S5. But how are you accounting for uncertainty in the annual h2o correction itself? For cases such as BSD 2016 vs 2017 in Fig 8a, there are large differences between subsequent H2O corrections. Given that and the week-to-week variability for UoB in the weekly tests, how do you estimate the errors associated with assuming that a single realization of the H2O correction equation is valid on a particular day?
page 28, line 27: "While drift in the instrumental water correction typically small it is important that it is identified and accounted for through regular water tests." This could be clarified and elaborated upon. My reading is that the annual h2o tests don't seem to provide much reduction in uncertainty, given the results in Fig 8e and f versus Fig 8a and b. Weekly tests would apparently provide little benefit given that the week to week scatter is comparable to the differences among the annual tests. I do think it's a good idea to check the h2o correction at least annually if possible, but the droplet test has documented problems and returning an instrument to the lab for a more rigorous test could lead to lengthy data gaps. Given that the nafion errors are quite small based on tests done to date, it would be useful to include a few sentences about whether the nafion dryers should be reinstalled in the future.
page 29, line 6: Perhaps reprocessing of the data to implement a post-hoc water correction to remove/mitigate the systematic and humidity-dependent errors that are evident in Fig S5 should be considered. Such a correction would be different for the nafion and no-nafion periods. Perhaps no correction would be required for the initial period with the nafion dryers.
Figure 3: define DPG in the legend or caption.
Fig 4: Have these been filtered? It is surprising that 42m CO2 is not significantly higher than upper levels during summer nighttime.
Fig 5: caption refers to mean diurnal cycle, but that seems to appear in Fig 6.
Fig 6, 7. Figures are hard to read due to small size and overlapping symbols/bars. Maybe it is not practical to show figures for all molecules in the body of the paper.
Figure 8: references to the individual panels in the caption are awkward and confusing. Also, (c & d) look to be daily and (e & f) look to be weekly, which seems to be opposite from the caption.
Figure 10: I don't understand the point of Figure 10 b & d, which seem to show only that the cryogenic drier was working.
Supplement page 1, line 16: error in stated range for F3: 0.5 - 0.5 L per min
Figure S1 caption: instead of writing "TOC" in the caption, consider writing "gas generator used to supply the counterpurge flow"
Figure S5g and Figure S6: What is the explanation for the negative values at high water in the CO2 water correction residuals at U of Bristol but not seen at the field sites?
Figure S5: Given that the residuals show similar structure for most sites/years (i.e. CO2 and CH4 residuals are negative at low H2O), could the water correction be improved by using a piecewise correction or a post-hoc correction to remove the systematic residuals that are typical with the current approach?
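One simple form such a post-hoc correction could take is a piecewise-constant offset per humidity bin. The sketch below shows that approach only as an illustration. The bin edges, the residual values, the sign convention (residual = reported minus reference) and the function names are all hypothetical and would need to be replaced by the actual water test data.

```python
from bisect import bisect_right
from statistics import mean


def binned_residuals(h2o, residuals, edges):
    """Mean residual in each H2O bin defined by consecutive edges."""
    bins = [[] for _ in range(len(edges) - 1)]
    for h, r in zip(h2o, residuals):
        i = bisect_right(edges, h) - 1
        if 0 <= i < len(bins):            # ignore points outside the edges
            bins[i].append(r)
    return [mean(b) if b else 0.0 for b in bins]


def apply_correction(value, h, edges, bin_means):
    """Subtract the mean residual of the bin containing humidity h.
    Humidities outside the edges use the nearest bin."""
    i = min(max(bisect_right(edges, h) - 1, 0), len(bin_means) - 1)
    return value - bin_means[i]


# Hypothetical water test results: H2O in %, residuals in ppm.
h2o = [0.1, 0.2, 0.3, 0.8, 1.2, 1.6, 2.2, 2.6]
residuals = [-0.04, -0.05, -0.03, 0.00, 0.01, 0.02, 0.01, 0.03]
edges = [0.0, 0.5, 2.0, 3.0]

offsets = binned_residuals(h2o, residuals, edges)
print("bin offsets (ppm):", [round(x, 3) for x in offsets])
print("corrected value:", round(apply_correction(400.0, 0.2, edges, offsets), 3))
```

A smooth fit to the residuals would avoid steps at the bin edges. The binned version is shown here only because it makes the idea easy to inspect.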