I reviewed the previous version of this paper. My main concern then was that it did not provide enough technical detail on the TROPOMI product and its performance relative to OMI to inform a data user. I also thought the case studies of recent major aerosol events weren’t fleshed out enough for that to be the main focus. In this revision the authors have expanded the study significantly, especially in terms of TROPOMI-OMI comparability, which removes my main concern. This version has sufficient new content and aligns better with the scope of AMT; it is also clearer to read. In these respects it is a very good paper and will be of a lot of interest to the community.
Unfortunately, the authors did not address a statistical problem (invalid use of linear least-squares regression) that I pointed out in my previous review. The data violate the assumptions made by this analysis technique. This concerns Section 3.1.2, Table 2, Figure 3, and some later discussion (including the Summary at the end of the paper). I therefore recommend further revisions to fix this, and here I will try to articulate more clearly why it is an issue.
The remedy is simple: just delete the regression lines, intercepts, and slopes, and the associated discussion. Removing them will not harm the paper. I appreciate the authors adding cautionary wording (page 7) against over-interpretation, but it would still be better to delete these.
I don’t believe it is responsible to publish bad statistics, especially when authors and editor are aware of the fact; it does nothing except inform people they can get away with it. I am open to a valid counter-argument to this but am yet to hear one; “it is common” (as here) is not a scientific argument to me. I am not trying to shut down the paper; it is a good paper other than this.
As an alternative, the authors could overplot the binned median and standard deviation of the error (or similar) as a function of AOD on Figure 3 instead of the regression. It will be more informative as to the actual distribution of retrieval errors. We can look at the data, at the correlation and RMSE (which are not ideal but are less problematic diagnostics for the present purpose), and see that TROPOMI is better.
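To make the suggestion concrete, here is a minimal sketch of the binned statistics I have in mind. It is only an illustration: the function name `binned_error_stats`, the bin edges, and the synthetic matchup data are my own assumptions and have nothing to do with the authors' data or code.

```python
import numpy as np

def binned_error_stats(aeronet_aod, satellite_aod, bin_edges):
    """Median and standard deviation of (satellite - AERONET) AOD per AERONET AOD bin."""
    aeronet_aod = np.asarray(aeronet_aod, dtype=float)
    error = np.asarray(satellite_aod, dtype=float) - aeronet_aod
    # np.digitize gives i with bin_edges[i-1] <= x < bin_edges[i]; shift to 0-based bins.
    bin_index = np.digitize(aeronet_aod, bin_edges) - 1
    rows = []
    for i in range(len(bin_edges) - 1):
        in_bin = error[bin_index == i]
        if in_bin.size == 0:
            continue  # skip empty bins rather than report meaningless statistics
        rows.append({
            "aod_low": float(bin_edges[i]),
            "aod_high": float(bin_edges[i + 1]),
            "n": int(in_bin.size),
            "median_error": float(np.median(in_bin)),
            # Sample standard deviation needs at least two points.
            "std_error": float(np.std(in_bin, ddof=1)) if in_bin.size > 1 else float("nan"),
        })
    return rows

# Synthetic matchups (assumption): error that grows with AOD, no real data involved.
rng = np.random.default_rng(0)
truth = rng.gamma(shape=1.5, scale=0.2, size=2000)
retrieved = truth + rng.normal(0.0, 0.05 + 0.2 * truth)
edges = np.array([0.0, 0.1, 0.2, 0.3, 0.5, 0.8, 1.2, 2.0])  # hypothetical bin edges

for row in binned_error_stats(truth, retrieved, edges):
    print(f"{row['aod_low']:.1f}-{row['aod_high']:.1f}  n={row['n']:4d}  "
          f"median={row['median_error']:+.3f}  std={row['std_error']:.3f}")
```

Overplotting the per-bin median with error bars of one standard deviation would show directly how bias and scatter change with AOD, without assuming a single linear relationship across the whole range.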
The regression just muddles the issue as it invites the authors and readers to make an interpretation which is flawed because of the use of inappropriate statistics. As a case in point, Table 2 gives intercepts around +0.25 for Figures 3b and 3c. If you look at the data, it is clear that the point cloud of AOD up to about 0.5 is not pointing towards those being the actual intercepts if an AOD of 0 were measured. The true intercept looks smaller (but still positive). There are a small number of outliers pulling it up which are likely not reflective of the bulk of the data. Regression amplifies these because the outliers are more extreme than the technique assumes. The position and orientation of that cloud (AOD up to 0.5) may be different from that for higher AOD, so the relationship is not linear in aggregate. And, as the authors note, a relative uncertainty means the higher-AOD points shouldn’t be weighted as heavily anyway. All of which is why you get an artificially high intercept and low slope.
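The outlier mechanism can be illustrated with a small synthetic experiment. This is a sketch under my own assumptions (a perfectly unbiased 1:1 retrieval, Gaussian noise, and 40 artificially contaminated low-AOD points); it does not use or reproduce the paper's data, and it demonstrates only the outlier effect, not the nonlinearity or the relative uncertainty.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "truth" skewed toward low AOD (assumption, not the paper's sample).
n = 2000
aeronet = rng.gamma(shape=1.5, scale=0.2, size=n)

# Unbiased synthetic retrieval: 1:1 relationship plus small Gaussian noise.
clean = aeronet + rng.normal(0.0, 0.03, size=n)

# Contaminate 40 low-AOD points (2% of the sample) with a large positive error.
contaminated = clean.copy()
low_idx = np.flatnonzero(aeronet < 0.2)
outlier_idx = rng.choice(low_idx, size=40, replace=False)
contaminated[outlier_idx] += 1.5

# Ordinary least squares; np.polyfit with degree 1 returns (slope, intercept).
slope_clean, intercept_clean = np.polyfit(aeronet, clean, 1)
slope_cont, intercept_cont = np.polyfit(aeronet, contaminated, 1)

# Median error at low AOD, which is barely affected by the 40 outliers.
low = aeronet < 0.2
median_low_error = np.median(contaminated[low] - aeronet[low])

print(f"clean:        slope={slope_clean:.3f}  intercept={intercept_clean:+.3f}")
print(f"contaminated: slope={slope_cont:.3f}  intercept={intercept_cont:+.3f}")
print(f"median error for AERONET AOD < 0.2: {median_low_error:+.4f}")
```

Because least squares is linear in the retrieved values, positive outliers placed below the mean AOD necessarily raise the fitted intercept and lower the fitted slope, even though the typical low-AOD retrieval in this synthetic sample remains essentially unbiased.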
Sure, TROPOMI calibration issues likely are real and cause a bias, but it seems a stretch to imply this is causing an offset of +0.25 at low AOD. Figure 3a (for OMI) has a similar issue: the regression intercept is +0.1, with again a small number of extreme outliers pulling it up. If you look at the OMI AOD, when AERONET AOD is low, most of the time OMI is in fact around or below the 1:1 line. So what is this intercept telling us that is useful? Nothing; it is misleading us compared to if we look at the data. Yet these are the numbers highlighted in the paper’s Conclusion. The regression adds nothing of value and hides information in a biased way. Just take a close look at Figure 3.
I am not trying to be negative. I respect the authors’ work a lot and they (here and elsewhere) do a very nice job getting the most out of spaceborne UV measurements. They continue to make improvements which enable people to do new and exciting science unavailable from other platforms. I just want bad statistics to stop being published when there is no need to.
I had a couple of other small comments:
Figures 5, 6: How are the standard deviations here calculated? I am not sure if this is the standard deviation of all retrievals in the month, or between days from a daily average, or spatially from a monthly average, for example. This should be stated in the text or caption. I ask as some of these (e.g. eastern US, Jan 2020 AOD) have a very high standard deviation and I am not sure if this is attributable to spatial variability across the region, or an event causing temporal variability within that month, or something else.
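To show why the definition matters, the sketch below computes the three candidate standard deviations from the same synthetic month of data. The array shape, the baseline AOD, and the three-day "event" are all hypothetical choices of mine for illustration, not a description of how the authors processed their data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical month: 30 days x 20 pixels in a region, low background AOD.
n_days, n_pixels = 30, 20
aod = 0.15 + rng.normal(0.0, 0.03, size=(n_days, n_pixels))

# Hypothetical 3-day event raising AOD across the whole region.
aod[10:13, :] += 1.0

# (1) Standard deviation of all individual retrievals in the month.
std_all_retrievals = aod.std(ddof=1)

# (2) Standard deviation between days, from daily regional averages.
std_between_days = aod.mean(axis=1).std(ddof=1)

# (3) Spatial standard deviation, from each pixel's monthly average.
std_between_pixels = aod.mean(axis=0).std(ddof=1)

print(f"all retrievals:  {std_all_retrievals:.3f}")
print(f"between days:    {std_between_days:.3f}")
print(f"between pixels:  {std_between_pixels:.3f}")
```

In this constructed case a short-lived event inflates the first two measures while the spatial one stays small, so the same "standard deviation" label can describe very different things depending on how it was computed.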
Page 11 lines 9-10: “where TROPOMI measured monthly average AOD in the vicinity of 1.0 0.9 are reported. downwind over the southeast” It looks like some text got cut off here as “1.0 0.9” does not make sense and then there is a loose sentence fragment.