Thank you for providing additional information and clarifications. Though most of my concerns were addressed adequately in your response, some issues still remain. I wrote additional comments concerning those issues below. Where relevant, I reproduced my original remarks/questions and your responses.
Original comment: Section 3.3.3: Since you do not precisely explain what the Matlab Neural Code does and how the blue trace is derived, I suggest you remove that part and shorten this section.
Author: I have had several discussions at NDACC lidar meetings and at the last IRLC about using machine learning and MatLab’s neural network toolbox to estimate lidar profile backgrounds. I think that this plot will be interesting to several people.
Reviewer: I do not question potential interest within the community in machine learning and using Matlab’s toolbox. Machine learning is a complex and much hyped topic, but many issues regarding e.g. reproducibility of the results are not yet well understood, as results often depend on how the particular network was trained and what training data were used.
If you do not provide any information about how you set up the neural network and how it was trained, nobody will ever be able to reproduce your findings or compare with other results. Thus, publishing those results will be worthless for the science community.
You may show results obtained with the Matlab Neural Network tool in your paper. However, you should briefly explain what you did. Just saying “I used this toolbox and that is what I got” is not scientific work. On the other hand, you could say “I took the Fourier transform of time series x and here is the spectrum”, as the Fourier transform is a well-defined algorithm which will always produce the same results when applied to the same data sets. With neural networks many things can go wrong because their behavior is not so well-defined. In your case that is especially important given your statement “the software requires an exhaustive set of example bad profiles which we cannot supply”. So how was the network trained without a representative training data set? Did you use a trick nobody else knows?
Coming up with a precise description of what the neural network did will likely be difficult. For that reason, and because the neural network part is not essential for your paper’s main results, I suggested removing this part. However, you may insist on showing these results. In that case I will insist on a description of the neural network. That description may be brief, but all essential information which allows someone to repeat your steps must be given.
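To illustrate the reproducibility point, here is a minimal Python sketch. It is not the Matlab toolbox and not the network used in the manuscript; the function names, the tiny tanh network, the training data, and all parameter values are assumptions made purely for illustration. It contrasts a direct discrete Fourier transform, which has no free choices, with a small network whose result depends on the random seed, the architecture, and the training settings.

```python
import cmath
import math
import random


def dft(x):
    """Direct discrete Fourier transform: same input always gives same output."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def train_tiny_net(xs, ys, seed, hidden=3, epochs=200, lr=0.05):
    """Train y ~ sum_j v[j] * tanh(w[j] * x + b[j]) by stochastic gradient descent.

    Hypothetical toy model: the seed controls both the initial weights and
    the order in which training samples are visited.
    """
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    v = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            h = [math.tanh(w[j] * xs[i] + b[j]) for j in range(hidden)]
            err = sum(v[j] * h[j] for j in range(hidden)) - ys[i]
            for j in range(hidden):
                # Gradient through the tanh unit, using v[j] before its update.
                grad_pre = err * v[j] * (1.0 - h[j] ** 2)
                v[j] -= lr * err * h[j]
                w[j] -= lr * grad_pre * xs[i]
                b[j] -= lr * grad_pre
    return w, b, v


# Illustrative training data (not from the manuscript): y = x^2 on [-1, 1].
xs = [i / 10.0 for i in range(-10, 11)]
ys = [x * x for x in xs]

same_seed_a = train_tiny_net(xs, ys, seed=0)
same_seed_b = train_tiny_net(xs, ys, seed=0)
other_seed = train_tiny_net(xs, ys, seed=1)

print("DFT repeatable:", dft(ys) == dft(ys))
print("Network repeatable with same seed:", same_seed_a == same_seed_b)
print("Network identical with different seed:", same_seed_a == other_seed)
```

In this sketch the trained weights can only be repeated if the seed, the number of hidden units, the number of epochs, the learning rate, and the training data are all known, which is the kind of information a brief description of the network would need to give.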
Line 245: “bad scans” -> “bad profiles”?
Original comment: Lines 267-269: What is the reason for choosing “the point where the signal to noise equals one in the density profile”?
Author: It seems like a reasonable choice for deciding on an arbitrary starting point. We were motivated by getting temperatures in the UMLT. However, the point is well taken. I’m aware that other groups use other definitions. It would be good to see a study devoted specifically to this topic.
Reviewer: I may sound harsh, but your answer “it seems like a reasonable choice” does not convey any scientific value. To be more precise, why is that a reasonable choice? If you did not investigate this question, it is ok to say so (given the limited time, no study can be perfect). I am aware of different groups using different definitions for the starting point, and I was just curious whether you have any convincing arguments for a particular definition.
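For concreteness, one possible reading of “the point where the signal to noise equals one” can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the definition used in the manuscript: it assumes pure Poisson counting noise, so that the SNR of the background-subtracted signal in a bin with N total counts and background B is (N - B) / sqrt(N), and it uses a noise-free synthetic profile with an assumed 7 km scale height. The function names and all numbers are hypothetical.

```python
import math


def poisson_snr(total_counts, background):
    """SNR of the background-subtracted signal, assuming Poisson counting noise."""
    if total_counts <= 0:
        return 0.0
    return (total_counts - background) / math.sqrt(total_counts)


def snr_cutoff_altitude(altitudes_km, total_counts, background, threshold=1.0):
    """Scan upward and return the last altitude before SNR first drops below threshold.

    Returns None if even the lowest bin is below the threshold.
    """
    cutoff = None
    for z, n in zip(altitudes_km, total_counts):
        if poisson_snr(n, background) < threshold:
            break
        cutoff = z
    return cutoff


# Synthetic, noise-free example (illustrative values only):
# exponentially decaying signal plus a constant background, 1 km bins.
altitudes_km = list(range(30, 121))
background = 10.0
total_counts = [
    1.0e6 * math.exp(-(z - 30) / 7.0) + background for z in altitudes_km
]

cutoff = snr_cutoff_altitude(altitudes_km, total_counts, background)
print("Start altitude for SNR >= 1:", cutoff, "km")
```

Other definitions (a different threshold, or an SNR computed on the density or temperature profile after smoothing) would move the starting point, which is why the reasoning behind a particular choice is worth stating.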
Original comment: Section 3.5.3: You should not attempt to “correct” signal induced noise. It is fundamentally impossible to properly characterize signal induced noise in lidar signals because the noise is superposed on the atmospheric signal. Determining the signal induced noise from the background signal above the lidar signal is bound to fail because you are essentially observing the noise at times outside the period you are actually interested in. Signal induced noise is highly non-linear and therefore it is impossible to properly correct it. The data should be regarded as corrupt and not be used in lidar analysis. Besides, significant signal induced noise (e.g. blue trace in Figure 9) indicates that detectors are operated outside safe limits or there is a general technical problem with the lidar. If you insist on using the questionable data, you should assess how the retrieved temperature profile changes when you tweak your model representing the signal induced noise (e.g. cubic versus linear). How do your retrieved profiles compare to independent observations e.g. radiosondes at lower altitudes?
Author: I disagree with the conclusion that we should not make the attempt at a SIN correction. You are quite right that a perfect correction might be impossible. However, we have found that a correction of the sort described in the paper, for the types of signal induced noise that we see at OHP, can be adequately applied for the purposes of our temperature retrievals. The effect of this signal induced noise on our profiles, when uncorrected, is to warm the upper altitude regions of the temperature profiles. Conveniently, we have two measurement channels (the high and low gain channels) which make coincident measurements in this region. Typical count rates within this region are well within the linear response regime of the high gain channel; therefore dead time correction is not required at these altitudes, and we can believe the high gain channel temperature profile in this region. The quadratic correction for signal induced noise in the low gain channel brings the resulting low gain temperatures into agreement with those from the high gain channel at these high altitudes.
While it would be wonderful to eliminate every stray source of noise in the lidar, we cannot do this for the measurements going back 40 years and more - which form a valuable data set. We also point out that the effect of this quadratically-characterized signal induced noise is negligible at low altitudes: For example, in Fig. 9, the SIN contribution at 30 km is less than 100 counts, compared to a bg + signal value in the tens of MHz (see fig03). In terms of contribution to temperature, this is so small as to not be observable.
I did some initial quality testing between my 3 channel lidar temperature retrieval and the radiosondes launched from the station at Nimes (~150 km west) and the results are reasonable. There are some expected differences, but the results can be very good when the sonde travels directly east. That said, the focus of this paper is above 30 km, and a full radiosonde comparison study with calculated air mass trajectories would be a good project for the next student.
Reviewer: I agree, we can’t change the past and need to work with the data at hand. Above you mention the agreement between high and low channel temperatures, which improves when the quadratic correction is applied. That is very valuable information, because it gives credibility to your approach, and should be mentioned in your manuscript as well. On the other hand, your validation works only for the lower channel. In the absence of any validation for the upper channel, which shows a completely different behavior (linear versus cubic), you could at least provide an estimate of the magnitude of the correction, e.g. x K at 75 km, where x is the difference between temperature profiles retrieved with and without correction. If x is sufficiently large (e.g. >1 K), that should be acknowledged as a potential source of error, as signal induced noise is a dynamic phenomenon which commonly depends on several factors (e.g. peak intensity, average intensity, particular type of detector) and thus likely varies over a broad range of time scales (from pulse-to-pulse to months). The problem is that signal induced noise causes a non-Gaussian error, so integrating longer does not help you. Your correction most likely helps alleviate the problem, but it won’t be perfect. How well it really works, we do not know. For example, you may unknowingly overcorrect the noise, resulting in a cold bias, or undercorrect and still retain a warm bias. Only the comparison with an independent data set can tell whether your correction is working as it is supposed to. However, if your x is small (I think it is; unless both lidar systems show exactly the same behavior, I would expect larger differences in Fig. 15 for a large x), you may argue that the effect of signal induced noise on temperature is small as well.
Because of the problems it causes, most groups try to avoid signal induced noise by limiting the peak count rate to safe levels.
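The estimate of x requested above could be computed along the following lines. This is a minimal Python sketch: the function names are hypothetical, and the temperature values are placeholders invented for illustration, not results from the manuscript. It takes two profiles on the same altitude grid, one retrieved with and one without the signal induced noise correction, interpolates both to a target altitude, and reports the difference.

```python
def interpolate(altitudes_km, values, target_km):
    """Linearly interpolate a profile to target_km (altitudes must be increasing)."""
    for i in range(len(altitudes_km) - 1):
        z0, z1 = altitudes_km[i], altitudes_km[i + 1]
        if z0 <= target_km <= z1:
            frac = (target_km - z0) / (z1 - z0)
            return values[i] + (values[i + 1] - values[i]) * frac
    raise ValueError("target altitude outside profile range")


def correction_magnitude(altitudes_km, temp_corrected, temp_uncorrected, target_km):
    """x = T(with correction) - T(without correction) at target_km, in kelvin."""
    return (
        interpolate(altitudes_km, temp_corrected, target_km)
        - interpolate(altitudes_km, temp_uncorrected, target_km)
    )


# Placeholder profiles (illustrative numbers only, not from the manuscript).
altitudes_km = [70, 72, 74, 76, 78, 80]
temp_uncorrected = [215.0, 211.0, 207.0, 204.0, 202.0, 201.0]
temp_corrected = [214.8, 210.5, 206.0, 202.4, 199.5, 197.0]

x = correction_magnitude(altitudes_km, temp_corrected, temp_uncorrected, 75.0)
print(f"x at 75 km: {x:+.1f} K")
if abs(x) > 1.0:
    print("Correction exceeds 1 K: acknowledge as a potential source of error.")
```

Repeating the same calculation with profiles retrieved under different noise models (for example linear versus cubic) would give the spread that the original comment asked about.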
Original comment: Figure 13: It is hard to estimate absolute temperature differences. I suggest you use a segmented color bar with 6-10 different colors. Can you provide a plot showing combined temperature error estimates of both lidar data sets? There is a period in mid 2001 with distinct blue color (negative temperature differences) between 30 and 55 km altitude. Could these observations also have been affected by misalignment? A similar area can be found right after the last marked region in 2011.
Author: The same information is already presented in a more compact way in Fig. 14. I’ve added the following text: ‘For reference, a typical LTA temperature profile with an effective vertical resolution of 2 km has an uncertainty due to statistical error of 0.2 K at 40 km; 0.4 K at 50 km; 0.6 K at 60 km; 0.7 K at 70 km; 1.8 K at 80 km; and 602 K at 90 km. For reference, a typical LiO3S temperature profile with an effective vertical resolution of 2 km has an uncertainty due to statistical error of 0.3 K at 40 km; 0.5 K at 50 km; 1.0 K at 60 km; 2.7 K at 70 km; and 10 K at 80 km.’ I cannot account for the blue regions in Fig. 13 based on either lidar uncertainty budget or through geophysical explanations. Yes, you’re correct: the blue bias between 30-50 km is likely due to misalignment. Given 5 mirrors in LTA and 4 mirrors in LiO3S, there are many possible ways to be misaligned, and the severity of the misalignment can vary as well.
Reviewer: If blue (or red) biases outside the boxes may be caused by misalignment, then misalignment is obviously a major source of error which ultimately limits the accuracy (and, depending on time scale, also precision) of your measurements.