Comment on amt-2021-353
Anonymous Referee #1
Referee comment on "Satellite measurements of peroxyacetyl nitrate from the Cross-Track Infrared Sounder: Comparison with ATom aircraft measurements"

This statement is not so clear to me when looking at Fig. 6. Without any dependence, we would expect randomly distributed differences (around the mean bias of -0.08 ppbv), whereas in ATom-1 and ATom-3 the biases look larger for latitudes 5-40°N, and in ATom-2 and ATom-4 they look larger for latitudes of about 30°S-10°N (although this is less clear, due to the smaller sampling). This could have been explained by a systematic bias rather than a latitudinal effect if CrIS showed a bias relative to the aircraft that scales with the PAN amount, but the comparisons show a slope of 0.99, so only a constant absolute bias is expected (and furthermore Fig. 3 does not show a PAN maximum at these latitudes, except maybe for ATom-2). Can the authors explain in more detail why they assume that there is no dependence on latitude? What could be the reason for the larger bias in ATom-1/3 at 5-40°N?
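One way to make this check quantitative would be to bin the CrIS-minus-aircraft differences by latitude and compare each bin mean with the overall bias. The following is only a minimal sketch under my own assumptions: the function names, the bin edges and the 2-standard-error threshold are hypothetical choices, not taken from the manuscript.

```python
from math import sqrt
from statistics import mean, stdev


def binned_bias(lats, diffs, edges):
    """Per latitude bin [lo, hi): count, mean difference and standard error."""
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        vals = [d for lat, d in zip(lats, diffs) if lo <= lat < hi]
        n = len(vals)
        if n < 2:
            # Too few coincidences to estimate a spread for this bin.
            out.append((lo, hi, n, None, None))
            continue
        out.append((lo, hi, n, mean(vals), stdev(vals) / sqrt(n)))
    return out


def flag_bins(bins, overall_bias, k=2.0):
    """Bins whose mean differs from the overall bias by more than k standard errors.

    Bins with zero spread are skipped because no standard error is available.
    """
    return [
        (lo, hi)
        for lo, hi, n, m, se in bins
        if m is not None and se > 0 and abs(m - overall_bias) > k * se
    ]
```

Applied per campaign with the mean bias of -0.08 ppbv as `overall_bias`, any flagged bins would indicate where the differences are not consistent with a latitude-independent bias.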

-Section 3.1 Retrieval algorithm and strategy
You mention in Sect. 4 that you also used Rodgers (2000) to derive estimated uncertainties (noise, model parameters, ...), and that the theoretical error values are too small by a factor of 2-3. I guess you are referring only to the random part. Do you also have an estimate of the systematic error budget (spectroscopy, …)? Providing a "theoretical" random and systematic uncertainty budget for individual PAN retrievals here in Sect. 3.1 would be helpful for all users (in addition to the precision estimated in this study from the standard deviation of the comparisons).
P5, l. 130-135: Do I understand correctly that PAN is retrieved in the red windows, but that the previous steps (temperature, H2O, …) are fitted in the large 760-860 cm-1 window? I ask because I guess that fixing H2O is important if the strong H2O line is not included in the PAN fit. Please clarify.
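For reference, the kind of theoretical budget I have in mind for Sect. 3.1 follows the standard linear error analysis of Rodgers (2000). The notation below is my assumption and may differ from the symbols used in the manuscript's Eq. 1: K is the Jacobian, S_epsilon the measurement noise covariance, S_a the a priori covariance (used here as a stand-in for the true ensemble covariance), and K_b and S_b the Jacobian and covariance of the non-retrieved parameters (e.g. spectroscopy, temperature, H2O).

```latex
% Standard optimal-estimation error terms (Rodgers, 2000); notation assumed.
\begin{align}
\mathbf{G} &= \left(\mathbf{K}^{T}\mathbf{S}_{\epsilon}^{-1}\mathbf{K}
              + \mathbf{S}_{a}^{-1}\right)^{-1}\mathbf{K}^{T}\mathbf{S}_{\epsilon}^{-1},
  \qquad \mathbf{A} = \mathbf{G}\mathbf{K} \\
\mathbf{S}_{\mathrm{noise}}  &= \mathbf{G}\,\mathbf{S}_{\epsilon}\,\mathbf{G}^{T} \\
\mathbf{S}_{\mathrm{smooth}} &= (\mathbf{A}-\mathbf{I})\,\mathbf{S}_{a}\,(\mathbf{A}-\mathbf{I})^{T} \\
\mathbf{S}_{\mathrm{param}}  &= \mathbf{G}\,\mathbf{K}_{b}\,\mathbf{S}_{b}\,\mathbf{K}_{b}^{T}\,\mathbf{G}^{T}
\end{align}
```

The noise term is random, whereas the parameter term contains both random contributions (e.g. temperature) and systematic ones (e.g. spectroscopy), so reporting the two groups separately would give users the budget requested above.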

Fig.5:
It's nice to see what is measured by CrIS within one day. I guess OE retrievals take too much time to produce global seasonal maps, so you have to focus on short periods and collocated regions (such as for these aircraft campaigns)? It would have been nice to see the seasonal global distribution of PAN from CrIS: how long would it take to process, e.g., one year of data?

-Section 3.2 Initial guess and a priori constraints
"The initial guess profile values for these CrIS PAN retrievals are set to a vanishingly small number." Do I understand from this sentence that the a priori profiles (used in Eq. 1) are different from the initial guess profiles? If so, why not use the a priori profiles as the initial guess?

-Section 3.2 (should be 3.3) Vertical sensitivity
Could you give the mean DOFS and its standard deviation for, for example, a typical day as in Fig. 5? If the standard deviation is large, could you add a little more information on which conditions give the best DOFS (e.g. enhanced PAN values, though other factors may also influence the DOFS)? You could also provide the mean and standard deviation of the DOFS for the PAN retrievals used in the comparisons, which reflect more "background conditions". Finally, what percentage of the DOFS do you lose by taking the 800-300 hPa information instead of total columns?

Section 4 Results
P7, l. 198-199: "(most of the aircraft measurements are at the low end of the range)": What do you mean? This is not clear to me.
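Regarding the DOFS questions raised under the vertical sensitivity section, the requested percentage could be obtained from the averaging kernel of each retrieval. The sketch below rests on my assumptions: it uses the common convention that the DOFS is the trace of the averaging kernel and that the partial DOFS of a layer is the sum of the diagonal elements on the levels inside it, and the function and argument names are hypothetical.

```python
def dofs(avg_kernel):
    """Total DOFS: trace of the (square) averaging kernel matrix."""
    return sum(avg_kernel[i][i] for i in range(len(avg_kernel)))


def partial_dofs(avg_kernel, pressures_hpa, p_bottom=800.0, p_top=300.0):
    """DOFS carried by the levels with p_top <= pressure <= p_bottom (in hPa)."""
    return sum(
        avg_kernel[i][i]
        for i, p in enumerate(pressures_hpa)
        if p_top <= p <= p_bottom
    )


def dofs_loss_percent(avg_kernel, pressures_hpa, p_bottom=800.0, p_top=300.0):
    """Percentage of the total DOFS lost by restricting to the pressure layer."""
    total = dofs(avg_kernel)
    part = partial_dofs(avg_kernel, pressures_hpa, p_bottom, p_top)
    return 100.0 * (total - part) / total
```

The mean and standard deviation of `dofs` and `dofs_loss_percent` over all retrievals of a typical day, and over the retrievals used in the comparisons, would answer the questions above.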

P7, l. 203-205:
The correlation with the a priori is already good, especially for GT_CIMS, where the improvement from the retrieved data could appear limited (0.64 compared to 0.53 for the a priori). But looking at Fig. 7, this might be due mainly to the isolated point at 0.46 ppbv. Consider using robust statistics to derive the correlation and the slope, which would reduce the effect of a single point (it might lower the correlation with the a priori while keeping a good correlation for the retrieved values, and so strengthen the added value of your retrievals). The slope for the retrieved values would also be more accurate with robust statistics.

Section 5 ("Discussion and Conclusion")
P8, l. 234-235: While the comparisons are made very carefully and the conclusion "The results … demonstrate the ability of the CrIS PAN retrievals to capture variation in the "background" PAN values observed over remote ocean regions from Atom" is certainly reached, I would be less assertive concerning the statement "Based on this study, we expect this bias to apply to all parts of the world".
Indeed, while we see in Fig. 2 that the aircraft measurements often sample PAN levels up to 0.48 ppb, the comparisons with CrIS are limited to 0.32-0.34 ppb (with the exception of a single coincidence), I guess because of the collocation criteria. Given that Fig. 5 shows PAN retrievals reaching 0.90 ppb in some regions, I think that the present study does not cover the full range of PAN concentrations needed to support this statement. Validation under high-concentration conditions should be done before concluding that the bias would be the same. It is quite common for satellite biases to differ between clean and high-concentration sites (e.g. TROPOMI HCHO, Vigouroux et al., 2020; TROPOMI tropospheric NO2, Verhoelst et al., 2021; …). Of course, with the interesting approach used here (validating 800-300 hPa rather than the tropospheric column), the results might be different, but I think it is worth looking at additional validation under high-concentration conditions before concluding that the bias will apply to all parts of the world.