In their manuscript “Mobile air quality monitoring and comparison to fixed monitoring sites for instrument performance assessment”, Whitehill and coauthors present analyses of stationary and mobile comparison measurements of ozone, NO, NO2, and Ox (O3 + NO2), made on board mobile platforms and at air quality network stations, which are intended to identify instrument bias during ongoing field measurements with the mobile measurement platforms. This is a strongly revised version of a previous manuscript, with the data analysis taken considerably further than in the initial submission.
While the authors mainly based their analysis in the initial version of the manuscript on linear regression and presented multiple correlation plots, they have now picked up my suggestion to show the quality of agreement between the air quality network results and their mobile platform measurements as a function of the distance between both measurements, separately for different road types. They have taken this further and now focus on average or median bias instead of slope and intercept. In addition, the authors have added a discussion on how well their approach of mobile comparisons could be used for other types of pollutants, depending on their nature and the homogeneity of their concentrations in the environment.
The revised version of the manuscript goes well beyond the first version in analysis depth and coherence. In addition, the authors have addressed all relevant points raised in my first review to an acceptable degree. There are several minor points to be worked on, as detailed below. After these points are addressed, I suggest publishing the manuscript in AMT.
Detailed comments:
L24: “… highways showing the most variance.” – Shouldn’t it be “… highways showing the strongest biases.”?
L53: “In addition, natural variability …” → “In addition, natural spatial variability …”
L82: “… commonly measured and air quality …” → “… commonly measured in air quality …”
L121: “… next-generation air quality instruments”: This sounds like very sophisticated air quality instruments. Don’t you just mean “low-cost sensors”?
L141: The drivers were instructed to park facing into the wind when possible. Was it assessed whether the measurements were affected by the vehicle's own exhaust under still conditions or when the wind came from behind (both of which could also occur during the mobile measurements, e.g., when stopping the vehicle)? Were such self-sampling intervals removed from the data sets?
L151, Figure 1: From the satellite view and the scale at the bottom of the image, the maximum distance for the CAMP site seems to be about 35 m rather than 85 m.
L162: The QA evaluations showed instrument bias of 3%-6% for O3 and 3%-8% for NO. How do these percentage biases translate into ppbv biases? If they are calculated from the span gas concentrations, 80 and 360 ppbv, this would mean an observed bias of up to 5 ppbv for O3 and up to 29 ppbv for NO. This is larger than the minimum necessary bias for observation as stated in the abstract and in the results section (4 and 8 ppbv). How can this be?
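The arithmetic behind this question can be sketched as follows. This is only a rough check and assumes, as in the comment above, that the percentages refer to the span gas concentrations of 80 ppbv (O3) and 360 ppbv (NO); the manuscript does not confirm this, and the function name is purely illustrative.

```python
def percent_to_ppbv(percent_bias, span_ppbv):
    """Convert a percentage bias into ppbv, assuming it is relative to the span concentration."""
    return percent_bias / 100.0 * span_ppbv

# Assumption: the upper ends of the reported ranges (6 % and 8 %) refer to the span gases.
o3_max_ppbv = percent_to_ppbv(6, 80)    # upper end of the O3 range
no_max_ppbv = percent_to_ppbv(8, 360)   # upper end of the NO range
print(f"O3: up to {o3_max_ppbv:.1f} ppbv, NO: up to {no_max_ppbv:.1f} ppbv")
```

If the percentages instead refer to typical ambient concentrations, the corresponding ppbv values would be much smaller, which is why the reference concentration should be stated.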
L206f: The mean or median DeltaX values are claimed to be a more direct assessment of systematic bias than slope and intercept of a linear regression. This is true for an offset in the instrument data. However, a bias due to a change in instrument sensitivity would rather be detected by linear regression than using mean or median DeltaX values.
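To illustrate this distinction, here is a minimal sketch with synthetic data (not the manuscript's data). It assumes a uniform concentration range of 0 to 60 ppbv, Gaussian noise of 1 ppbv, a 4 ppbv offset in one case and a 10 % sensitivity loss in the other; all of these numbers and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic reference concentrations in ppbv (illustrative, not manuscript data)
ref = rng.uniform(0.0, 60.0, size=5000)
noise = rng.normal(0.0, 1.0, size=ref.size)

offset_case = ref + 4.0 + noise   # constant 4 ppbv offset, unchanged sensitivity
gain_case = 0.9 * ref + noise     # 10 % loss in sensitivity, no offset

def mean_delta(mobile, reference, mask=None):
    """Mean DeltaX = mobile - reference, optionally restricted to a subset."""
    diff = mobile - reference
    return float(np.mean(diff if mask is None else diff[mask]))

def ols_slope(mobile, reference):
    """Slope of an ordinary least-squares fit of mobile against reference."""
    slope, _intercept = np.polyfit(reference, mobile, 1)
    return float(slope)

low, high = ref < 10.0, ref > 50.0   # low- and high-concentration subsets
offset_mean = mean_delta(offset_case, ref)
offset_slope = ols_slope(offset_case, ref)
gain_slope = ols_slope(gain_case, ref)
gain_low = mean_delta(gain_case, ref, low)
gain_high = mean_delta(gain_case, ref, high)

print(f"offset case: mean DeltaX = {offset_mean:.2f} ppbv, slope = {offset_slope:.3f}")
print(f"gain case:   slope = {gain_slope:.3f}, "
      f"mean DeltaX = {gain_low:.2f} ppbv (low) vs {gain_high:.2f} ppbv (high)")
```

In this sketch the offset appears directly in the mean DeltaX while the slope stays close to 1. The sensitivity change appears directly in the slope, whereas the mean DeltaX it produces depends on the concentration level at which the comparison happens to be made.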
L230: “… differences in concentrations due to the distance between the locations …” - The differences in concentrations are due less to the distance between the locations than to the differences in distance to the sources of emission plumes. Distance between the locations alone would only result in more scatter, but not in a systematic bias.
L239: Why does the influence of local traffic emissions reduce the applicability of a linear regression approach but not of a measurement bias approach?
L244-246: The authors state that mobile collocations allow a larger amount of air to be sampled in the same sampling duration. Mobile measurements sample from the amount of air that passes by the moving vehicle (i.e., this depends on the difference between the velocities of the ambient air and the vehicle). Stationary measurements sample from the amount of air that passes by the station due to the ambient air movement. On average (i.e., when moving with and against the ambient wind for the same amount of time), both are similar in size. In the case that the vehicle moves with the air (in wind direction), the mobile measurement would even probe a smaller volume of air than the stationary measurement.
L257-258: “While travelling on high traffic roads, the cars are more likely to be impacted by direct emission plumes.” - This is likely true. Nevertheless, I think that the critical point is the traffic density in the vicinity of the mobile measurement vehicle, not only the road classification. There can also be several cars in front of the mobile measurement vehicle on residential roads, e.g., at intersections or traffic lights. This would also strongly affect the measurements of NO and O3. While the road type classification is an easy-to-perform starting point, a more sophisticated approach would be desirable for the future.
L271: “… bias values can reflect …” → “… bias values from in-motion collocations can reflect …”
L272: I would remove the “random” in “… generally reflect the random variability …” because non-random variability, e.g., due to persistent spatial differences, would also be reflected in r2.
L288: “the relationship” does not seem to be the right expression, since there is no interaction between the stationary and the mobile collocation measurements. Something like “in comparison to” would be better.
L292: The larger magnitude of bias for the Highway road class, compared to the other road classes, does not indicate larger spatial variability (this would be shown by lower r2 values) but larger (or smaller in case of O3) average or median concentrations.
L305, Figure 3: Please use either r2 or R2 consistently in the text and in the figures.
L306: What does "maximum distance" mean in this context: are the median and mean bias values taken from all measurements up to the respective distance buffer shown, i.e., for larger distance buffers the measurements from closer to the fixed site are also included; or are only the increments in distance buffer used as the basis for the respective data? The former provides only very indirect information on how the measured bias depends on distance to the fixed site. It should be stated more clearly which data are included in the individual data points.
This approach is not very helpful in the analysis of how strongly spatial variability affects the comparison measurement, because it is strongly affected by the number of data points in the individual distance buffer increments. E.g., for residential roads there seem to be barely any data points available beyond a distance of 2 km. This results in barely any change in bias values beyond this distance, which is not a result of spatial homogeneity (as it seems) but a result of a lack of data points in this distance range. Normalizing the contribution from individual distance buffer increments by the number of data points within the respective increments would provide more direct information on the influence of distance on the comparability of the measurements and would therefore allow the results to be applied to other environments, where the distribution of various road types might be different.
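The difference between the two readings of "maximum distance" can be sketched with synthetic data (not the manuscript's data). The sketch assumes 2000 points within 2 km, only 50 points between 2 and 5 km, and a bias that grows by 1 ppbv per km; the buffer edges and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data (not from the manuscript): many points close to the site,
# few points far away, and a bias that grows linearly with distance.
dist_km = np.concatenate([rng.uniform(0.0, 2.0, 2000), rng.uniform(2.0, 5.0, 50)])
delta_ppbv = 1.0 * dist_km + rng.normal(0.0, 0.5, dist_km.size)

edges = np.arange(0.0, 6.0)  # hypothetical buffer edges: 0, 1, ..., 5 km
cumulative, incremental, counts = [], [], []
for lo, hi in zip(edges[:-1], edges[1:]):
    in_buffer = dist_km <= hi                     # everything up to the maximum distance
    in_ring = (dist_km > lo) & (dist_km <= hi)    # only the added distance increment
    cumulative.append(float(np.median(delta_ppbv[in_buffer])))
    incremental.append(float(np.median(delta_ppbv[in_ring])))
    counts.append(int(in_ring.sum()))

for hi, c, i, n in zip(edges[1:], cumulative, incremental, counts):
    print(f"<= {hi:.0f} km: cumulative median = {c:5.2f}, ring median = {i:5.2f}, n = {n}")
```

In this sketch the cumulative median barely changes beyond 2 km because the few distant points are outweighed by the many nearby ones, whereas the per-increment median follows the assumed distance dependence. Reporting the per-increment values together with their counts would make the distance dependence visible.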
L326-331: In order to judge whether the observed bias values are significant and whether they would allow a reasonable identification of measurement bias, it would be interesting to also see the average absolute concentration values and their variability during the measurements. This is shown in Figures S4 and S5, but this information would be essential in the main text as well.
Furthermore, according to the time series, e.g., for NO, it looks as if the minimum, or better something like the 5th percentile, would represent the measurements not affected by local plumes (i.e., those of the stationary sites) much better than average or median values.
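A minimal sketch of this idea with synthetic data (not the manuscript's data) is given below. It assumes a plume-free NO level of 2 ppbv and, as a deliberately heavy-traffic case, that 60 % of the samples carry a plume excess with a mean of 40 ppbv. With sparse plumes the median would also stay close to the background, so the numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3600  # e.g., one hour of 1 s data (illustrative)
background = 2.0 + rng.normal(0.0, 0.3, n)  # assumed plume-free NO level in ppbv
# Assumption: 60 % of the samples are affected by plumes (dense traffic);
# the plume excess is drawn from an exponential distribution with a 40 ppbv mean.
in_plume = rng.random(n) < 0.6
no_mobile = background + np.where(in_plume, rng.exponential(40.0, n), 0.0)

mean_no = float(np.mean(no_mobile))
median_no = float(np.median(no_mobile))
p5_no = float(np.percentile(no_mobile, 5))
print(f"mean = {mean_no:.1f}, median = {median_no:.1f}, 5th percentile = {p5_no:.1f} ppbv")
```

Under these assumptions, both the mean and the median are pulled upwards by the plumes, while the 5th percentile stays close to the assumed background level.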
L426: “… random spatial biases.” → “… random spatial variability.”
L433: I agree that the higher r2 for NO2 could be due to the larger dataset used; however, it may also be due to the fact that only 1-hour averages were used there, in which short concentration peaks are largely averaged out.
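The effect of the averaging time on r2 can be sketched with synthetic data (not the manuscript's data). The sketch assumes a shared hourly background between 5 and 30 ppbv, 1 ppbv noise at both sites, and short peaks in 10 % of the 1-minute mobile samples; all values and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
hours, per_hour = 200, 60
# Shared background (ppbv) that changes from hour to hour, repeated for each minute
background = np.repeat(rng.uniform(5.0, 30.0, hours), per_hour)
fixed = background + rng.normal(0.0, 1.0, background.size)
# Mobile data: same background plus sporadic short peaks in 10 % of the minutes
peaks = np.where(rng.random(background.size) < 0.1,
                 rng.exponential(30.0, background.size), 0.0)
mobile = background + rng.normal(0.0, 1.0, background.size) + peaks

def r_squared(x, y):
    """Squared Pearson correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

def hourly_mean(x):
    """Average consecutive blocks of per_hour samples."""
    return x.reshape(hours, per_hour).mean(axis=1)

r2_minute = r_squared(fixed, mobile)
r2_hour = r_squared(hourly_mean(fixed), hourly_mean(mobile))
print(f"r2 (1-min data) = {r2_minute:.2f}, r2 (1-hour averages) = {r2_hour:.2f}")
```

In this sketch the same underlying data give a much higher r2 after hourly averaging, so a higher r2 for NO2 does not by itself indicate better agreement.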
L439f: This statement again shows that without taking the number of data points per road type or distance increment into account, the results reflect to a certain degree the distribution of road types and not necessarily the actual spatial variability of pollutant concentrations.
L458: Indeed, it seems advantageous to remove Highway road segments from the data. However, it would probably also be advantageous to remove all data points that stem from short-term peaks (i.e., from plumes of nearby sources), including those in the data of the other road types.
L491-496: It is not really clear to me how the upper and lower traces were calculated. Are these the lowest and highest 1-hour medians within the running median? Or is this the minimum and maximum of the running medians (over different window sizes) of the 1-hour medians? It would be desirable to explain this a bit more clearly.
L519-525: So, this means that under typical conditions, the response time would likely be several weeks, correct? In this case, wouldn’t it be easier to perform a quick calibration check every couple of weeks, in which the instrument setup samples a calibration gas mixture for a couple of minutes, than to rely on the ongoing analysis of mobile collocations with their higher bias uncertainty?
L567: Why are averages of the data used here instead of medians (as in the rest of the manuscript)?
L577, Figure 10: Why does the 40-hour running median start (and end) at the same time as the data points start (and end)? Shouldn't there be a lag, with the running median starting only after 40 hours (as in the previous figure)?
L589: How does the analysis have implications for the spatial heterogeneity? I guess the latter is unaffected by the analysis.
L608: Removing highway-related data does not reduce the spatial heterogeneity of pollutant concentrations; it reduces the influence of local emission plumes on the measurement data.
L633: It is not "the influence of highways" that needs to be removed, but the influence of emission plumes on the measurements, which are more frequent on highways than on other road types.