Determination of the Emission Rates of CO2 Point Sources with Airborne Lidar

Anthropogenic point sources, such as coal-fired power plants, produce a major share of global CO2 emissions. International climate agreements demand their independent monitoring. Due to the large number of point sources and their global spatial distribution, the implementation of a satellite-based observation system is convenient. Airborne active remote 10 sensing measurements demonstrate that the deployment of lidar is promising in this respect. The integrated-path differentialabsorption lidar CHARM–F is installed onboard an aircraft, in order to detect weighted column-integrated dry-air mixing ratios of CO2, below the aircraft along its flight track. During the Carbon Dioxide and Methane mission (CoMet) in spring 2018, airborne greenhouse gas measurements were performed, focusing on the major European sources of anthropogenic CO2 emissions, i.e., large coal–fired power plants. The flights were designed to transect isolated exhaust plumes. From the resulting 15 enhancement in the CO2 mixings ratios, emission rates can be derived via the cross–sectional flux method. On average, our results roughly correspond to reported annual emission rates, with wind speed uncertainties being the major source of error. We observe significant variations between individual overflights, ranging up to a factor of 2. We hypothesize that these variations are mostly driven by turbulence. This is confirmed by a high–resolution large eddy simulation that enables us to give a qualitative assessment of the influence of plume inhomogeneity on the cross–sectional flux method. Our findings suggest 20 avoiding periods of strong turbulence, e.g., midday and afternoon. More favorable measurement conditions prevail during nighttime and morning. Since lidars are intrinsically independent of sunlight, they have a significant advantage in this regard.


Introduction
CO2 causes the strongest radiative forcing among all anthropogenic greenhouse gases (e.g., Myhre et al. (2014)). Therefore, it plays a crucial role with respect to human-induced climate change. In 2018, CO2 has reached a global annual average of 407.4 25 ppm at the Earth's surface, an increase of 47 % compared to the year ~1750 (Friedlingstein et al., 2019). One-third of all anthropogenic CO2 emissions stem from localized point sources, in particular coal-fired power plants (Oda and Maksyutov, 2011). For Europe, they even account for 45% of CO2 emissions (Super et al., 2020). The Paris Climate Agreement aims to reduce anthropogenic greenhouse gas (GHG) emissions by means of nationally determined contributions (NDCs), which are based on national capabilities and the level of economic development (UNFCCC, 2015). Therein it is foreseen that as of 2023 30 a global stocktake will take place every 5 years. This requires independent measurements to verify each nation's emission reports of CO2, but also of other greenhouse gases, such as CH4. Currently, there is no independent global emission verification system available, and a complete record of all emissions globally is still far from reality. To achieve this goal, satellite missions are indispensable. Satellite missions are expected to detect CO2 emissions from large power plants and cities, e.g., the future European Carbon Constellation CO2M (Bézy et al., 2019;Broquet et al., 2018;Kuhlmann et al., 2020), and other mission 35 ideas still in the pre-development phase Strandgren et al., 2020). Furthermore, CH4 emissions can also be detected, as is done by GHGSat-D for coal mine ventilation shafts (Varon et al., 2020), or the Sentinel-5 Precursor for the oil and natural gas-producing sector (Pandey et al., 2019;Zhang et al., 2020). Under particularly favorable conditions, it is already possible to detect CO2 emissions of power plants from space, as is done with data from NASA's OCO-2 mission (Nassar et al., 2017;Reuter et al., 2019). However, at the moment no operating satellite mission is able to quantify emissions 40 from large power plants, on a regular basis. In the development phase for potential missions, airborne measurement campaigns serve as a test of the methods. During the operating phase, they are needed for verification of the space-borne results.
In May/June 2018, the CoMet (Carbon Dioxide and Methane mission) field campaign took place. The objective of CoMet was to investigate the fluxes of the major human-influenced GHG on local, regional, and sub-continental scales. These fluxes were to be determined more precisely than previously possible. Furthermore, supporting activities for GHG stocktaking were 45 provided. The CoMet campaign saw the deployment of a suite of airborne instruments to measure atmospheric CH4 and CO2, alongside a variety of ground-based instruments. In particular, the synergetic use of active remote sensing (lidar) Wildmann et al., 2020), passive spectrometry (Krautwurst et al., 2021;Luther et al., 2019), and in situ measurements Gałkowski et al., 2021;Kostinek et al., 2020) supported by modeling activities , as well as the validation of existing (e.g., Sentinel-5P, GOSAT (Greenhouse Gases Observing Satellite)) and the 50 preparation of upcoming (e.g., MERLIN (Methane Remote Sensing Lidar Mission)) GHG satellite missions were aimed at.
Hereby, the German research aircraft HALO (High Altitude and Long Range Research Aircraft) acted as the airborne flagship of that campaign. HALO was equipped with the new airborne CO2 and CH4 IPDA (integrated-path differential-absorption) lidar CHARM-F (CO2 and CH4 Remote Monitoring-Flugzeug) built and operated by DLR as an airborne demonstrator for the upcoming MERLIN mission . CHARM-F simultaneously measures the column-averaged dry-air mixing 55 ratios of carbon dioxide (XCO2) and methane (XCH4) between aircraft and ground . The influence of other trace gases, in particular H2O, on the mixing ratio measurements is negligible. As a result of the pulse repetition frequency (50 Hz, double pulse) and divergence (~1.5 mrad) the pattern on the ground is a sequence of overlapping footprints. The vertical column measurements are insensitive to vertical redistribution of the trace gases. The insensitivity towards optically thin clouds, aerosol layers, and varying surface albedo, and the instrument design with e.g., active laser frequency control is a 60 further strong asset of the IPDA lidar approach. Albedo variations basically affect the measurement precision (statistical uncertainty), whereas the influence on the bias is negligible (Amediek et al., 2009).
During the CoMet campaign, HALO probed various local plumes of different coal-fired power plants. As a case study, the paper in hand focuses on the measurement flight of 23 May 2018, where the CO2 exhaust plume of the power plant plant. An established method for quantifying emission rates of point sources is the cross-sectional flux method, which is a product of mean wind speed and an integrated concentration enhancement along a cross-sectional overflight of the exhaust plume. This principle has been applied for air-/space-borne nadir-viewing remote sensing (Krings et al., 2018;Menzies et al., 2014;Varon et al., 2018;Reuter et al., 2019), mobile ground-based sun-viewing remote sensing (Luther et al., 2019), as well as airborne in situ measurements (Cambaliza et al., 2014;Conley et al., 2016;Fiehn et al., 2020;White et al., 1976). Amediek 70 et al. (2017) have described how this principle can be realized with CHARM-F. Using CHARM-F data from the respective overflights, we strive to accurately assess the error and advance the general methodology.
When determining the cross-sectional flux, one of the major error sources is the local wind field. On the one hand, because the wind speed is directly included in the calculation, on the other hand, because atmospheric turbulence can broaden or constrict the spatial extent of the exhaust plume. This is a well-known problem, which contributes significantly to the measurement 75 error (Kuhlmann et al., 2019;Luther et al., 2019;Strandgren et al., 2020;Varon et al., 2018;Jongaramrungruang et al., 2019;Kumar et al., 2020). Consequently, the observed CO2 column enhancements between subsequent plume transects may vary considerably, despite a constant emission rate. We hypothesize that these turbulence-induced variations dominate the measurement error of the emission estimates rather than the GHG column measurement uncertainty itself. To assess the impact of this atmospheric turbulence on our measurement results, we perform a large eddy simulation (LES) in order to resolve local 80 plume structures. By doing so, we can to compare different ambient weather and turbulence conditions. We aim to separate more and less favorable conditions, to determine an adequate distance between emission source and measurement locations, and to find out how many independent plume measurements will be necessary, in order to obtain an appropriate emission rate accuracy, as a function of those environmental conditions. This paper is organized as follows: Section 2 introduces the IPDA lidar method, describes the retrieval of the emission rate, 85 and the methodical errors. Section 3 reports on the plume measurement results. Section 4 provides the simulation setup, while the subsequent results are presented in Sect. 5. A discussion is given in Sect. 6, followed by the conclusion in Sect. 7 and the outlook in Sect. 8.

Flux calculation 90
The dataset underlying this work originates from IPDA lidar CHARM-F. A more detailed description of the lidar system can be found in Amediek et al. (2017). At its core, CHARM-F consists of a pulsed, tunable laser source and a detector. Installed on an aircraft or satellite, the nadir-oriented lidar emits two laser pulses that propagate through the atmosphere until they are backscattered at a surface. The two backscattered laser pulses are detected by the lidar. The wavelength of one laser pulse corresponds to the absorption wavelength of the greenhouse gas under consideration. In the following, this laser pulse is 95 referred to as online. Due to molecular absorption the intensity of the online laser pulse decreases while propagating through the atmosphere. The wavelength of the other (offline) laser pulse is slightly shifted such that almost no absorption by the greenhouse gas takes place, but the interaction with the remaining atmospheric components is unaltered.  Amediek et al. (2017). The measurement geometry of a lidar, illustrated by the example of CHARM-F carried on the aircraft HALO. Two laser pulses are emitted towards the earth with a delay of 500 µs. The laser pulse with the wavelength λon is on the absorption line of CO2 (1572.02 nm), the laser pulse with the wavelength λoff is not (1572.12 nm). By comparing the backscattered intensities, the CO2 concentration in the measured cone volume can be calculated. The measured volume is usually referred to as a vertical air column, since the column length, i.e., the aircraft's altitude above ground (~ 6500 m), is very large compared to the diameter of the reflecting surfaces (~ 10 m). The distance of consecutive laser pulse pairs is 3 m. The black line in (b) schematically depicts the measurement principle, not the actual spectral absorption line shape of CO2.
Using a beam splitter, a small part of both the online and the offline laser pulse energy (Eon/off) is deflected onto a detector while still in the lidar system. Together with the radiation fluxes entering the lidar telescope Pon/off the differential optical 100 absorption depth (DAOD) can be calculated (Ehret et al., 2008): Note that for the DAOD a single value is obtained for the entire vertical air column. It is a metric for the greenhouse gas concentration of the measured column and is also defined by the following relationship: Here, DAODb is the background differential absorption optical depth, ΔDAOD is the enhancement in the DAOD induced by the plume, M is the molecular mass of CO2 in gram, and Δc(z) is the enhanced CO2 density induced by the plume in gram per cubic meter. Δσ(z) is the difference between the CO2 absorption cross section for the two laser wavelengths given in square meter (cf. Fig. 1). It is referred to as the differential-absorption cross section. Generally, Δσ(z) is not constant over the plume's vertical extension, due to the decreasing pressure with altitude. However, the decreases in pressure associated with typical 110 vertical plume extensions are small. As an approximation, we use the mean value over the vertical extent of the plume Δσ .
This aspect is discussed in more detail in Sect. 3. The vertical integral limits are the ground (z = 0 m) and the respective height of the aircraft fl. Variations in flight altitude, as well as topography, may cause variations in the surveyed column length and thus ultimately in the measured DAOD. In this study, these variations are negligible, since the flight altitude was deliberately kept constant and the topography around the power plant under consideration is sufficiently flat. 115 This DAOD dataset is used to determine the CO2 emission rate of a point source utilizing the flux calculation method introduced by Amediek et al. (2017). As schematically depicted in Fig. 2, a crossing of the exhaust plume leads to a DAOD enhancement. This is caused by the additional absorption of laser radiation by the CO2 molecules of the plume. The instantaneous flux through the lidar cross-section, in the moment of the overflight, is given in kilogram per second and denoted by q: 120 Given in meter, the parameter A corresponds to the integrated DAOD enhancement over the background DAOD in the direction of the aircraft flight track as shown in Fig. 2b. In the following, it is referred to as integrated enhancement. The mean horizontal wind speed u is given in meter per second and the angle between the wind direction and the aircraft's flight direction is denoted as φ (in the following referred to as relative wind direction). Furthermore, it is assumed that no uptake by the soil 125 takes place when the gas plume hits the ground and that the flight altitude is high enough (i.e., well above the Planetary Boundary Layer) to cover the entire vertical extent of the plume. The two closely spaced sounding wavelengths are selected in such a way that the impact from unknown particles is minimized while keeping the absorption by water vapor is as low as possible. This is due to the very weak water vapor differentialabsorption cross section, which is more than 4 orders of magnitude smaller than the differential-absorption cross section for 130 CO2. Thus the influence of additional water vapor in the plume released by the cooling or coal drying systems of the power plant is negligible . Moreover, the selected CO2 absorption line is sufficiently temperature-insensitive, such that the influence of temperature variations within the plume can be neglected (see also Kiemle et al. (2017)). Under these conditions, the flux error is mainly driven by uncertainties of the four parameters A, Δσ , u, and φ. Assuming that these parameters are not correlated, the relative accuracy in the flux calculation can then be estimated by error propagation means: 135 With δA/A, δ( Δσ )/ Δσ , δu/u denoting the relative uncertainties of these parameters. From this, it is obvious that crossing the plume perpendicular to the wind direction as displayed in Fig This minimum value is also referred to by Sharan et al. (1996), arguing that above this threshold, advection dominates over diffusion.

Background separation
For the calculation of the integrated enhancement A and its uncertainty, it is crucial to distinguish between the DAOD value 145 attributable to the background concentration of CO2 and the fraction attributable to the exhaust plume of the point source. As shown in Eq. (2), the measured DAOD along the flight track is the sum of background term DAODb and the enhancement due to the plume interaction ΔDAOD. A complicating factor is that the background term may not be constant. There are small variations in local CO2 concentration from other anthropogenic sources (traffic, cities, etc.) or local interaction with the biosphere. Also, small CO2 gradients caused by the sounding of different air masses in the vicinity of the plume may have an 150 impact on the background term. In the following, we describe a suitable method that enables us to extract ΔDAOD from the measured dataset, while allowing the background term to be variable.
An example of the plume extraction procedure is shown in Fig. 3. The plume must first be detected as an enhancement not attributable to noise in the data. For this, we examine a 0.2 km running mean of the DAOD dataset (Fig. 3a). The choice of 0.2 km is made because it corresponds to the diameter of the pixels of the simulation (see Sect. 4). The larger the window for the 155 running mean, the less noise is present and the clearer the plume enhancement can be seen. Then again, peaks threaten to be blurred if the window width becomes too large. Figure 3: Plume crossing at a point source distance of 1.53 km. In (a) the gray curve shows the raw data, with a standard deviation of 5.2 %, while the black curve shows a 0.2 km (64 data points) running mean (RM), with a standard deviation reduced to 0.9 %. In (b) the green curve is a 4 km (1293 data points) RM. Green vertical dashed lines mark the intersections between the 0.2 km RM and 4 km RM, which are defined as the plume's limits. Colored purple in (c) shows the region of the data used to construct a mean value of the data before and after the plume's limits. This mean value is used to bypass the plume enhancement and is also colored purple. In (d) again a 4 km running mean over the bypassed dataset is shown in brown. This data, which has slight variability, is used as the background term DAODb. Finally, in (e) and (f) the enhanced term ΔDAOD, i.e., difference between 0.2 km RM and DAODb, is plotted in black. Note the different scales on the y-axis. In (e) the area underneath the curve is colored red, as an example of the parameter Asum determined with a Riemann sum. Alternatively, a Gaussian fit can be applied to ΔDAOD, providing the parameter Afit as a fit-parameter, as shown as a blue line in (f).
Starting from the middle of the plume enhancement we define the plume's limits as the intersections between the 0.2 km running mean (RM) and another 4 km running mean (Fig. 3b). Applying a running mean broadens and flattens the plume. For larger running mean widths, such as 4 km, the flattening is so severe that the plume is only distinguishable from the background 160 as a raised plateau (see Fig. 3b). The limits are then defined as the intersections between the 0.2 km running mean and the 4 km running mean. For the calculation of this mean we consider a window with a width equal to that of the plume, colored violet in Fig. 3c. At last, we execute another 4 km running mean over the raw dataset, with bypassed plumes, resulting in the background term DAODb, shown in brown in Fig. 3d. This procedure allows for a variability in the background term on a scale of a few hundred meters. Smaller scale gradients cannot be attributed to the background and are incorporated in the 165 enhancement term ΔDAOD, thereby not being distinguishable from noise.
The mean wind speed u and its mean relative direction φ are extracted from model data provided by ECMWF. The molecular mass M and the differential-absorption cross section Δσ are physical properties of CO2 and available in various databases such as HITRAN2016 (Gordon et al., 2017). The only parameter that results from a measurement by CHARM-F, or a respective simulation, is the integrated enhancement A. For this purpose Amediek et al. (2017) described two distinct methods. The first 170 method is a Riemann sum over all enhancement values ΔDAODi, multiplied with their respective spatial distance Δyi between two successive data points: The second method makes use of the fact that, on average, the plume is subject to Gaussian dispersion behavior. According to the function F(y) in Eq. (6), a nonlinear least squares fit is applied to the ΔDAOD values of the plume. 175 By doing so the integrated enhancement is obtained as the fit parameter Afit. A1 is the peak's position along the flight track and A2 the turbulent dispersion parameter, which is a measure for the width of the plume. The fit method yields very low values for the uncertainty of the parameter A. However, the Riemann sum is not depending on any model assumption for the calculation of the integrated ΔDAOD along the flight track. Both methods were investigated in the course of this work and 180 showed nearly identical results. Therefore, the results in Table 1 correspond to the mean value of the two methods.  Figure 4 shows the flight track of the aircraft, along with a picture of the cooling towers. In total, the point source was flown over seven times downwind, two times upwind, and once directly overhead the cooling towers. In three of the downwind overflights, no enhancement in DAOD is visible. For these transects, 190 the distance to the point source is greater than 4.6 km. At such distances, it can be assumed that the exhaust gases are too diluted with the surrounding air to generate a measurable signal.  Table 1) are calculated using the slender plume approximation Seinfeld and Pandis, 1997): In this equation, the ground (z = 0 m) and max = 4000 m denote the integration boundaries. h is the height of the cooling 210 towers. The key parameter in this equation is the turbulence parameter σz, which is a proxy for the plume extension in the vertical direction, at the respective distance. Different expressions for this parameter for various atmospheric stability conditions can be found in the literature, e.g., Seinfeld and Pandis (1997).
Assuming a moderately turbulent atmosphere, we found plume widths of σz = 170 m and 600 m for the two distances. However, if the atmospheric turbulence is less pronounced, the vertical plume widths are only σz = 90 m and 250 m, respectively. Due 215 to the lack of further information on turbulence characteristics during our measurements, we consider both plume widths in the calculation below. The change of Δσ(z) versus altitude above ground in Eq. (8)  The overscore indicates the mean value of the aforementioned turbulence scenarios with corresponding plume vertical widths 225 at each distance and the errors indicate the differences. Close to the source (~ 1500 m), the relative cross section uncertainty is ~ 0.6 % and therefore negligible, whereas, at a distance of 4700 m, the relative error is ~ 3.2 % and not negligible in the overall error budget outlined by Eq. (4).
Possible systematic errors, due to uncertainties of the line parameters, are less than 2 % (Gordon et al., 2017). Errors resulting from the wavelength setting with the CHARM-F instrument are considered very small compared to the other contributors and 230 therefore need not to be extensively discussed in this study (~ 0.5 %, see Amediek et al. (2017)).
The wind data are taken from operational analysis data of the ECMWF model. This is done by first interpolating the 4dimensional gridded model data onto the flight path at the altitude of the power plant's exhaust shaft. Secondly, a mean value of the wind speed and direction along the flight track, together with an estimate of their relative errors are calculated according to Ackermann (1983). 235  can be attributed to the uncertainty of the mean differential-absorption cross section of CO2 δ( Δσ )/ Δσ and the mean relative wind direction δφ/tan(φ). The major contributor to the flux uncertainty, 245 however, is the uncertainty of the mean wind speed δu/u, which accounts for  Table 1). These variations cannot be explained by our uncertainty estimation, but rather by atmospheric 250 turbulence that distorts the plume. Therefore, this work further investigates the influence of atmospheric turbulence and the resulting inhomogeneity in the propagation of exhaust plumes. To achieve this, we make use of the mesoscale numerical weather prediction system model WRF (Weather Research and Forecasting model).

Simulation setup
To investigate the influence of atmospheric turbulence and the resulting inhomogeneity in the propagation of exhaust plumes, 255 we use WRF-ARW, the Advanced Research Version of the Weather Research and Forecasting model (Skamarock et al., 2008).
It is a well-established platform to investigate the transport of plumes (Zhao et al., 2019;Bhimireddy and Bhaganagar, 2018;Yver et al., 2013). The model configuration can be found in Table 2. Considering typical source distances of the measurement crossings (see Table 1), in addition to the spread of the plumes (see Fig. A1), it is clear that our investigations need to be implemented with a horizontal resolution in the sub-kilometer range. To 260 achieve this, we introduce three nested domains, with the coordinates of the middle cooling tower as the center of the domains (see Fig. 5). The domain configurations can be found in Table 3. As meteorological initial and boundary conditions, operational ECMWF analysis data is used with a horizontal resolution of 9 km.  (Moeng et al., 2007). Several studies show that WRF-LES is an adequate tool to model 265 plume trajectories, in conjunction with turbulence and passive tracer dispersion (Nunalee et al., 2014;Nottrott et al., 2014).
Only the plume of the power plant is simulated, without any CO2 background field. WRF-ARW has the option to predefine a tracer variable tr(t,x,y,z) which has the properties of a passive tracer, as used in Blaylock et al. (2017). It represents a fourdimensional field of space-time. A detailed description of the calculation of simulated DAOD can be found in Appendix A.
Therein Eq. (A3) is used to calculate the DAOD enhancement corresponding to the horizontal dispersion of the tracer, as 270 shown in Fig. 6. The WRF simulation provides a data output every 2 minutes. One virtual plume crossing is evaluated for each output time step at a point source distance of 1.5 km. This corresponds to our measurements (see Table 1). Since neither background field nor noise is simulated, it does not matter at which distance to the point source the virtual flyover takes place. Nevertheless, we try to match the virtual survey as closely as possible to real conditions. Just as in the real measurement, the virtual crossings are 275 arranged perpendicular to the propagation direction of the plume (cf. Sect. 2.1). However, in a turbulent atmosphere, it is not trivial to precisely identify this direction of propagation. In this work, we consider the center of mass of the emitted tracers within a radius of twice the point source distance, i.e., 3 km. A connecting line between this center of mass and the point source corresponds to the propagation direction. , at the bottom α denotes the local solar altitude. The first colorbar represents the DAOD enhancement and refers to the respective middle panel, which shows the horizontal dispersion of the plume. The second colorbar represents the mass per area and refers to the top and right panels, which show the vertical dispersion. In a corresponding measurement, DAOD enhancement values beneath 0.008 would not be distinguishable from noise and are therefore displayed blue. Values higher than 0.01 exceed the noise and can be identified as plume enhancement in a real measurement. A ΔDAOD value of 0.02 corresponds to an enhancement of 4 % with respect to a background of 0.5 (cf. Fig. 3). The color maps follow the guidelines for a perception-based color map presented by Stauffer et al. (2015).
For the calculation of the virtually retrieved emission rate, the mean wind speed and direction are needed (see Eq. (3)). To 280 obtain these from the simulation, the following procedure is performed. First, for each data output step the horizontal wind components at the mean height of the plume are retrieved by vertical integration, weighted with tracer mass content. Second, the resulting 2D wind field is linearly interpolated onto the virtual flight path, yielding a 1D field with the horizontal wind components along the flight track. Last, the wind components are integrated, weighted with the DAOD along the flight track, resulting in the mean wind used for calculation. 285

Simulation results
WRF is able to simulate realistic plume dispersion. The DAOD enhancement values correspond to our measurements.
Exemplary snapshots of the simulated plume during the course of a whole day can be found in Appendix A in Fig. A2.
Additionally, an animated GIF of the simulated plume can be found under https://doi.org/10.5281/zenodo.4266513 (Wolff, 2020). In the nocturnal absence of solar irradiation, the turbulence decreases, leading to narrow, homogeneous plume 290 dispersion, within a laminar flow. The exhaust plume follows Gaussian behavior, as depicted in Fig. 6 at 23:30. Contrary to this, we find boundary layer turbulence during daytime.
Strong solar heating of the surface generates convective air masses, which in turn cause a cascade of eddies. Consequently, locally reverse and counter-gradient flow, i.e., flow opposite to the main wind direction, emerges. This results in local puffs of above-average column concentration enhancements within the exhaust plume, while eddy-generated local flow in the same 295 direction as the ambient wind causes constrictions of lower column concentrations in a plume (Stull, 1988). Such plume structures deviate from Gaussian behavior, as can be seen in Fig. 6 at 17:54. Figure 7: Histogram of virtually retrieved emission rates. The histogram shows a slightly skewed distribution towards smaller emission rates than the input emission rate (red dashed line). It depicts all 1980 emission rates retrieved over 66 simulated hours between 12 UTC on 21 May and 6 UTC on 24 May.
Locally increased CO2 column concentration results in a high value in the integrated enhancement A, in contrast to an overflight over a constriction. Following Eq. (3) this corresponds to a high value of the emission rate q. It should also be stressed that the spatial extent of such puffs is smaller than that of complementary constrictions. Therefore, a skewed 300 distribution of the retrieved emission rates is to be expected, as Fig. 7 confirms.
On 23 May 2018, four measurement flyovers of the power plant Jänschwalde took approximately one hour, as presented in Sect. 3. As spin-up we discard the first 6 hours of the simulation (see Table 2). That is 66 hours of simulation, which leaves us with a total of 1980 virtual plume flyovers. The corresponding results of the emission rate, which are calculated using Eq. (3), are displayed as a histogram in Fig. 7 and as a time series in Fig. 8a. 305 Figure 8: Virtual overflight results in the course of the day. In (a) it can be seen that rising solar altitude α entails turbulence. Especially, midday turbulence causes deviations of the retrieved emission rates q from the input emission rate qin. In (b) the integrated enhancement A shows equivalent behavior, while the variations in wind speed u are comparatively small. It is during the midday turbulence, that the virtual flight tracks are not exactly perpendicular to the instantaneous wind direction at the plume crossing, which becomes apparent in the correction factor sin(φ) in (c). In the night hours, as well as the morning the retrieved emission rates agree very well with the input emission rate qin. The wind speed u surpasses the threshold value of 2 m/s at all times.
In Fig. 8a it can be seen how the diurnal course of solar altitude α influences the retrieved emission rates q. The random occurrence of inhomogeneities in the plume propagation, caused by local turbulence, leads to large variations in the results of successive crossings. Turbulence lags behind solar altitude because the surface needs time to heat up. It is also apparent that the emission rate deviations vary from day to day, both in intensity, as well as in dwell time.
The implications for the measurement results can be reduced by averaging over a multitude of retrieved emission rates. Next, 310 we investigate how often the exhaust plume must be surveyed, to achieve a mean emission rate with satisfactory accuracy.
From experience with the Jänschwalde measurement presented in Sect. 3, as well as other point source measurements during the CoMet campaign, which are not presented in this work, we assume a time delay in the range of 6 to 18 minutes between two successive crossings. With typical wind speeds in the range of 5-8 m/s and spatial scales of puffs and constrictions of about 1-2 km, our range of time delay exceeds the residence time of coherent plume structures, thus preventing repeated 315 measurements of identical air masses. The model setup provides one measurement every 2 minutes, resulting in a vast number of permutations of successive virtual crossings available for merging (see Table A1). For each of these permutations, a mean value is calculated, which is then compared with the initiated emission rate qin. To evaluate the turbulence-induced inhomogeneity in the daily course, we compare two-hour time frames. We execute a total of 60 virtual overflights in such a two-hour time frame. The number of possible permutations increases exponentially to 5000 if four crossings are merged and 320 even on to 312500 if seven crossings are merged (see Table A1). This high number of permutations is based on the identical 60 single crossing emission rates, which are displayed in the histograms in Fig.9. Figure 9: Box-whisker-plot of the relative difference to the input emission rate qin within two-hour time frames. The right axis shows the associated retrieved emission rates. The width of the distribution decreases with a higher number of crossings. For all time frames it can be stated that with an increasing number of merged crossings the width of the distribution decreases. The largest differences to qin are observed in the afternoon. Different colors represent a different number of virtual crossings merged for averaging. The inner boxes range from the first to the third quartile, thus containing 50 % of the values. The median is marked within as a black dash. The upper whisker is drawn up to the 95 th percentile, while the lower whisker is drawn to the 5 th percentile Consequently 90% of the values are in between the two whiskers. All values outside the whiskers are outliers and plotted as dots. Figure 9 presents the resulting distribution of this relative difference to the input emission rate as a box-whisker-plot. The spread of the respective box-whisker-plot is an indicator of turbulence. It is evident that with an increasing number of overflights merged for averaging the spread of the relative differences decreases, while the measurement precision increases. 325 A high emission rate measured by a single overflight scanning a puff is compensated if the subsequent overflight measures a lower emission rate. With a higher number of overflights averaged, it is more likely to measure both high and low concentration air masses. Yet, although the precision can be improved by increasing the number of overflights, even for ten overflights it is inferior to the precision of nighttime measurements. Additionally, not only the precision but also the accuracy is compromised during times of strong turbulence, i.e., in the afternoon. As aforementioned, the spatial extent of turbulence-induced puffs is 330 smaller than the one of the complementary constrictions. Therefore, such puffs are likely to be less frequent and only partially scanned when measured at a low sampling frequency. Consequently, the retrieved emission rates will be biased low. This is an effect that occurs especially during strong turbulence. In Fig. 9 a strongly turbulent day (22 May) is compared to a less turbulent day (23 May). Both precision and accuracy are superior on a less turbulent day.
In contrast, the night hours show little turbulence and high precision. Even with a single overflight, small differences to the 335 true emission rate are to be expected. Here, a higher number of overflights will only cause minor improvements. At this point, it should be mentioned that the representation of nightly plume propagation must be critically reviewed. The plume height decreases so much that the propagation takes place only in the lowest four model layers. The fact that a bias of approx. ±5% remains at night is not surprising from this point of view. This study should therefore be understood as a qualitative assessment.
The key finding is that avoiding situations of high turbulence brings an enormous improvement for both precision and accuracy. 340 Even with a significantly higher number of measurement overflights, a comparable improvement cannot be attained.

Discussion
Regarding the lidar measurements during the CoMet campaign on 23 May 2018, we find that the mean emission rate, derived from repeated plume overflights, is in rough agreement with the average emission reported by the power plant operator for the year 2017. The cross-sectional flux method is straight forward to apply. The exhaust plume generates column enhancements 345 in the differential absorption optical depth (DAOD), with a good signal-to-noise ratio, in the order of 10 %. The product of enhanced column concentration integrated along the flight track and mean wind speed, provides the flux through the lidar cross-section, at the instant of the overflight. This instantaneous flux of an individual overflight measurement can be determined with an error ranging between 8-10 %. This error is mainly driven by uncertainties of the integrated enhancement, the mean differential-absorption cross section, the mean relative wind direction, and the mean horizontal wind speed. On 350 average, we find that 2 10 of the flux error can be attributed to uncertainty in the determination of the integrated enhancement, i.e., the integrated enhancement of the DAOD signal, which is the only parameter that needs to be derived from the IPDA lidar measurement.
1 10 can be attributed to the uncertainties of the mean differential-absorption cross section of CO2 and the relative wind direction, taken together. The main source of error, however, is the mean horizontal wind speed, with a contribution of 7 10 . This highlights the need for more accurate wind information. Future studies will examine CHARM-F measurements in the 355 Upper Silesian Coal Basin to determine CH4 emissions from coal mines. In this area, ground-based Doppler wind lidars have been installed. It is expected that nudging the simulation towards the wind soundings will result in an improvement of the wind vector estimation, ultimately reducing the overall error in the flux determination.
It is necessary to distinguish between instantaneously measured flux and actual emission rate. In theory, an exhaust plume behaves Gaussian on average and the mean emission rate of the point source lies within the error range of the instantaneously 360 measured fluxes. Contrary to this, our overflights reveal large variations between the individually retrieved instantaneous fluxes, which cannot be attributed to measurement uncertainties. These variations do not occur because the measurement error increases, but because plume segments with varying CO2 content are probed. The actual measurement error is minor compared to these variations (cf. Table 1). As described in Sect. 5, strong solar heating causes turbulence, which forces the plume to deviate from Gaussian behavior. This deviation can be restricted by averaging over a multiple of instantaneous fluxes, as the 365 results from our measurement flights suggest.
To analyze this effect in more detail, we employ the atmospheric transport model WRF in a high-resolution large eddy simulation (LES) setup. We find that the model simulates realistic plume dispersion. Typical DAOD values, as well as turbulence-induced distortions, show the same order of magnitude as our measurements. However, as we evaluate only four overflights in the measurement, we cannot make any statement about the absolute accuracy of the simulation, which is also 370 not the intention of this work. Qualitatively, the simulation provides the following insights. During the night the plumes are weakly distorted and have a Gaussian shape because laminar flow dominates. Over the course of the day, turbulence increases, reaching its peak in the mid-afternoon and distorting the plumes to non-Gaussian shapes. Thus, with increasing turbulence, a larger number of crossings is required for averaging in order to obtain sufficient emission rate precision. According to our simulation, nighttime measurements require fewer overflights. Under such conditions, even a single instantaneous cross-375 sectional flux measurement yields an accuracy of up to ~95 %. In cases of very pronounced turbulence (i.e., in the afternoon), even an impractically high number of overflights will neither reach the precision, nor the accuracy of a single nighttime overflight.
At this point, we cannot derive any limits for solar altitude or local times that should be avoided, as the simulation reveals that the turbulence intensity varies from day to day (see Fig. 8 and Fig. 9). Generally, we find that the most significant turbulence 380 occurs in the afternoon. For future campaign planning, we recommend to also perform measurements at night or in the morning, which is possible with lidar.

Conclusion
The present study continues the investigations by Amediek et al. (2017) on the quantification of fluxes of local greenhouse gas emission sources using the integrated-path differential-absorption (IPDA) lidar CHARM-F and the cross-sectional flux 385 method. While the preceding study was concentrated on CH4 emissions from hard coal mines, here, we exploit the results from the CoMet campaign in 2018. We investigate CO2 plume overflights of the coal-fired power plant Jänschwalde, conducted to quantify its emission rate and to assess how accurately the cross-sectional flux method can be applied. Since CHARM-F measures both greenhouse gases simultaneously, our findings also apply to isolated CH4 point sources.
With regard to cross-sectional flux measurements, the current work suggests avoiding mid-afternoon periods of strong 390 turbulence. On the one hand, this is because the uncertainties in the wind speed are most pronounced at these times, being the major source of error in a single measurement. On the other hand, this is due to the distortions of exhaust plumes in a turbulent wind field, which lead to substantial deviations from Gaussian plume dispersion. Under strong turbulence, the cross-sectional flux method cannot provide an accurate measurement of the emission rate, not even in the average of a vast number of overflights. Therefore, measurement flights performed during nighttime are beneficial. In this respect, intrinsic independence 395 from solar irradiation is a clear advantage of active remote sensing over passive approaches. Whenever sunlight is needed to perform the measurement, less turbulent conditions, for example in the morning after sunrise, or winter, should be preferred.
Further, it shall be pointed out that, with a lidar, cross-sectional plume measurements can also be performed over water bodies, whose detrimental reflective properties often impede the use of passive remote sensing (Gerilowski et al., 2015;Krautwurst et al., 2021;Larsen and Stamnes, 2006). Therefore, plumes from offshore installations can also be addressed with this approach. 400 Independent of the location of the point source, there are restrictions regarding the adequate distance of a plume overflight to the point source. We report that at a point source distance of more than 4.6 km no enhancement is visible and therefore no plume detection can be performed. In addition to that, we find that the uncertainty of the mean differential-absorption cross section increases with a larger vertical extension of the plume, which correlates with distance. At a point source distance of 1.5 km, this uncertainty is negligible. Concerning the detectability of the plume, we can locate distinct enhancements at a 405 distance of 1.5 km. Nevertheless, the closer to the point source the overflight takes place, the more constrained the plume and consequently the more pronounced the column enhancement is. It must be considered, however, that the horizontal extension is also smaller and thus fewer data points lie within the plume. In the case of CHARM-F, this can be compensated by a higher repetition rate.

8
Outlook 410 Apart from the CHARM-F measurements, the CoMet campaign also saw the deployment of other airborne instruments to measure atmospheric CH4 and CO2, supported by a variety of ground-based, in situ and remote sensing instruments. They were predominantly based in the vicinity of one of the major hot-spot regions of CH4 emissions in Europe, the Upper Silesian Coal Basin (USCB). Investigations of local and regional CH4 emissions from this region are, in view of the preparation for the upcoming MERLIN mission, a particular field of interest. The possibility to synergistically use active remote sensing (lidar), 415 passive spectrometry, and in situ measurements supported by modeling activities, allow for unique cross comparisons, which are beyond the scope of the present paper. Such cross comparisons will be subject of subsequent investigations, as well as other HALO measurement flights, as it flew along latitudinal trajectories, performed regional survey flights (e.g., over the Mount Etna) and also probed the local plume of not only Jänschwalde, but also Bełchatów in Poland, which is considered Europe's largest coal-fired power plant, in terms of CO2 emission. The measured data can make an important contribution to 420 the validation of existing satellite missions (e.g., Sentinel-5P, GOSAT). Further aircraft campaigns (e.g., CoMet-2.0) are foreseen which will provide additional opportunities for methodical refinements, including advancements on modelmeasurement synergies.  Table 1. Red vertical lines mark the smallest data extract used for the Riemann sum, as described in Sect. 3.

Calculation of simulated DAOD:
At the initialization position of the power plant, the value of the tracer variable is increased by the value 1 at each time step.
In this study, it is defined only in the inner domain D3.
( + Δ , , , ) = ( , , , ) + 1 (A1) 430 Here, Δt corresponds to the computational time step of the third domain, i.e., one second (Table 3) For the input emission rate qin a constant value of 760 kg(CO2)/s (24.0 Tg(CO2)/yr) is initialized, which corresponds to the 435 total annual emission for the year 2017 reported to the European Environment Agency by the operators (E-PRTR, 2020). Δx and Δy correspond to the temporally and spatially constant horizontal size of a grid point (0.2 km). The vertical layer size Δz(t,x,y,z) corresponds to the spatial distance between two model levels. In the simulation this distance is computed in pressure coordinates and depends on all four dimensions. Since the pressure varies only slightly between successive time steps, the temporal dependence of Δz is small. At locations with flat topography the dependence of Δz on the horizontal coordinates x 440 and y is also small, at locations with large topographic changes (e.g., steep slopes) the dependence is more significant. The product Δx•Δy•Δz(t,x,y,z) corresponds to the volume of the respective grid box. Within this volume the value of the tracer variable and thus the concentration is constant.
In order to compare the simulated data with an IPDA lidar measurement, the concentration array must be summed up vertically and multiplied by the quotient of the mean differential-absorption cross section and the molecular mass: 445 ( , , ) = Δσ ̅̅̅̅ · ∑ ( , , , ) · Δ ( , , , ) =1 (A3) The index j marks the respective vertical layer. Consequently j∊{1,56} applies, and zj is defined to correspond to the lower edge of the respective layer.
Exemplary snapshots of simulated DAOD: 450 Figure A2: Exemplary snapshots of simulated plume and virtual flight track. At the respective top local time is given in Central European Summer Time (CEST, i.e., UTC+2), at the bottom α denotes the local solar altitude. The daily solar irradiation causes a deep, convective boundary layer with turbulent plume dispersion within. In the nocturnal absence of solar irradiation, the boundary layer shrinks, leading to narrow, homogeneous plume dispersion, within a laminar flow. Every two minutes a virtual measurement is performed yielding 60 measurements within a time frame of 2 hours. One representative snapshot within the two-hour time frame is shown. Some snapshots show disjointed exhaust plumes. This is due to vertical wind shearing and the resulting different vertical advection directions. 10 9765625 Data availability.
The data are available from the author upon request and will be made publicly available through the HALO Data Base www.halo.dlr.de/halodb/ in due time.
Author contribution.
SW analyzed the measurement data, performed the simulation and wrote most of the manuscript. AF, CK and GE supervised 455 the measurement data analysis, as well as the simulation post processing. AA, AF, MQ and MW developed the lidar system and operated it during CoMet. AF was principal investigator of the CoMet mission. AA, AF, CK and GE designed the measurement flights. AF, CK, GE and MQ contributed to the manuscript's text and figures.

Competing interests.
The authors declare that they have no conflict of interest. 460