Facility level measurement of offshore oil and gas installations from a medium-sized airborne platform: method development for quantification and source identification of methane emissions

Abstract. Emissions of methane (CH4) from offshore oil and gas installations are
poorly ground-truthed, and quantification relies heavily on the use of
emission factors and activity data. As part of the United Nations Climate
& Clean Air Coalition (UN CCAC) objective to study and reduce short-lived
climate pollutants (SLCPs), a Twin Otter aircraft was used to survey CH4
emissions from UK and Dutch offshore oil and gas installations. The aims of
the surveys were to (i) identify installations that are significant CH4
emitters, (ii) separate installation emissions from other emissions using
carbon-isotopic fingerprinting and other chemical proxies, (iii) estimate
CH4 emission rates, and (iv) improve flux estimation (and sampling)
methodologies for rapid quantification of major gas leaks. In this paper, we detail the instrument and aircraft set-up for two
campaigns flown in the springs of 2018 and 2019 over the southern North Sea
and describe the developments made in both the planning and sampling methodology
to maximise the quality and value of the data collected. We present example
data collected from both campaigns to demonstrate the challenges encountered
during offshore surveys, focussing on the complex meteorology of the marine
boundary layer and sampling discrete plumes from an airborne platform. The
uncertainties of CH4 flux calculations from measurements under varying
boundary layer conditions are considered, as well as recommendations for
attribution of sources through either spot sampling for volatile organic compounds (VOCs) ∕ δ13CCH4 or using in situ instrumental data to determine
C2H6–CH4 ratios. A series of recommendations for both
planning and measurement techniques for future offshore work within marine
boundary layers is provided.



Overview
Methane is a potent greenhouse gas in the atmosphere, with a global warming potential 84 times that of carbon dioxide when calculated over a 20-year period (Myhre et al., 2013). Increases in atmospheric CH4 mixing ratios are expected to have major influences on Earth's climate, and emission mitigation could go some way toward achieving goals laid out in the UNFCCC (United Nations Framework Convention on Climate Change) Paris Agreement.
Offshore oil and gas fields make up ∼ 28 % of total global oil and gas production and are expected to be significant sources of CH4 to the atmosphere, given that 22 % of global CH4 emissions are estimated to be from the oil and gas (O&G) sector (Saunois et al., 2016). Some emissions arise from routine operations or minor engineering failures, while others stem from large unexpected leaks (e.g. Conley et al., 2016; Ryerson et al., 2012). In some O&G fields, large amounts of non-recoverable CH4 can be flared or vented due to a number of factors. Thus, the composition of O&G emissions can be influenced by several variables, including the targeted hydrocarbon product (oil or gas), extraction techniques and gas capture infrastructure. O&G installations co-emit volatile organic compounds (VOCs) such as alkanes, alkenes and aromatics in addition to CH4. Some of these VOCs are toxic and can have direct health impacts or, together with NOx, can produce ozone, having an impact on the regional air quality. VOC and δ13CCH4 measurements can be utilised to fingerprint the main processes or likely location responsible for associated CH4 emissions (Cardoso-Saldaña et al., 2019; Lee et al., 2018; Yacovitch et al., 2014a).
A recent study has also demonstrated the cost-effectiveness of airborne measurements for leak detection and repair at O&G facilities relative to traditional ground-based methods (Schwietzke et al., 2019).
There is thus a need to develop reliable methodologies to locate emissions, determine sources in sufficient detail to allow for the quantification of emissions and validate against publicly reported inventory emissions to enable the design of suitable mitigation. To date, a number of approaches have been used. Airborne measurements of both individual facilities and clusters of facilities, along with production data, have been used to scale up to an inventory of CH4 emissions for the US Gulf of Mexico (Gorchov Negron et al., 2020). Ship-based measurements of CH4 and associated source tracers have been made in both the Gulf of Mexico (Yacovitch et al., 2020) and in the North Sea (Riddick et al., 2019). The latter reported fluxes of CH4 from offshore O&G installations in UK waters that were derived from observations made from small boats at ∼ 2 m above sea level. This approach has advantages in terms of cost, but the authors recognised a number of key uncertainties in their approach associated with assumptions around boundary layer conditions and a lack of 3D information (i.e. Gaussian plume modelling and assumptions of constant wind speed). Measurements from aircraft can provide this 3D spatial information, enabling better characterisation of both plume morphology and boundary layer dynamics.
Here we report a project that was designed around the use of a small aircraft with a flexible instrument payload suitable for agile deployment. Key objectives were (i) to identify and quantify emissions of CH4 from a suite of offshore gas fields within a limited geographical area and (ii) to develop methodologies that can be applied to gas fields elsewhere to assess emissions at local scales. The project was part of the United Nations Climate & Clean Air Coalition (UN CCAC) objective to characterise global CH4 emissions from oil and gas infrastructure. Targeted observations of atmospheric CH4 and C2H6 plus sampling for VOC and δ13CCH4 analysis were made from a Twin Otter aircraft operated by the British Antarctic Survey (BAS). Two campaigns were conducted, one in April 2018 and one in April-May 2019, with a total of 10 flights (∼ 45 h) over the two campaigns.
The specific aims of the surveys were:
1. CH4 surveying of facilities with a range of expected (from inventories) CH4 emissions
2. resolution of types of emission from installations (such as flaring, venting, combustion and leaks) using carbon-isotopic fingerprinting and analysis of co-emitted species (including VOCs)
3. estimation of total CH4 emissions for the target region
4. improvement of flux estimation (and sampling) methodologies for rapid quantification of major gas emissions.
Here, we provide an overview of the measurement platform configuration and sampling strategy during these campaigns, including instrument comparisons for hydrocarbon plume detection, spot sampling strategies for VOCs and δ 13 C CH4 , and flight planning to cope with complex boundary layer meteorology to allow for the estimation of emission fluxes. Analysis methods to determine diagnostic hydrocarbon plume characteristics such as C 2 H 6 -CH 4 ratios and δ 13 C CH4 source attribution are also discussed. A sister publication will present the estimated facility level emissions in detail and discuss the results in a regional context.

Experimental
A DHC6 Twin Otter research aircraft, operated by the British Antarctic Survey, was equipped with instrumentation to measure atmospheric boundary layer parameters, including the boundary layer structure and stability, as well as a number of targeted chemical parameters. These included CH 4 , CO 2 , H 2 O and C 2 H 6 as well as whole-air sampling for subsequent analysis of δ 13 C CH4 and a suite of VOCs. Here we describe the aircraft capability, aircraft fit and the instruments deployed.

Aircraft capability
The maximum range of the Twin Otter aircraft during the flight campaigns was approximately 1000 km. Although the aircraft is capable of flying up to 5000 m altitude, most of the flying was limited to below 2000 m; in regions with no minimum altitude limit, the aircraft could be flown at the practical limit of 15 m above sea level. The instrument fit included use of a turbulence boom, which limited the speed to a maximum of 140 kn (∼ 70 ms −1 ); throughout the campaigns, the target aircraft speed for surveying was 60 ms −1 . The aircraft was limited to a minimum safe separation distance of 200 m from any O&G production platforms. The total weight of the aircraft on take-off is limited to 14 000 lb (6350 kg). Allowing for fuel and crew, this left 2086 kg for the instrumentation. The total power available on the aircraft is 150 A at 28 V, and inverters were used to provide 220 V to those instruments that required it. Altitude and air speed were determined by static and dynamic pressure from the aircraft static ports and heated Pitot tube, logged using Honeywell HPA sensors at 5 Hz. A radar altimeter recorded the flight height at around 10 Hz. An OxTS (Oxford Technical Solutions) inertial measurement system coupled to a Trimble R7 GPS was used to determine the aircraft position and altitude. This system gives all three components of aircraft position, altitude and velocity at a rate of 50 Hz. The chemistry inlets on the Twin Otter are similar to those fitted to the FAAM (Facility for Airborne Atmospheric Measurements) BAe (British Aerospace) 146 large atmospheric research aircraft (e.g. O'Shea et al., 2013) and were fitted with the inlet facing to the rear (Fig. A1). A single line (1/4 ′′ Synflex tubing) was taken from the inlet to a high-capacity pump with the instruments branching from this line. 
The aircraft was fitted out during the week before each of the two flight campaigns, allowing for significant changes to be made between 2018 and 2019 based on instrument performance and data from 2018 (Fig. 1).

Boundary layer physics instrumentation
A fast-response temperature sensor and a nine-hole NOAA BAT "Best Air Turbulence" probe (Garman et al., 2006) were mounted on a boom on the front of the aircraft (see photo, Fig. A2). This instrumental set-up was chosen to reduce flow distortion effects by the aircraft. These fast-response measurements of wind and temperature fluctuations were made with a frequency of 50 Hz. Garman et al. (2006) investigated the uncertainty of the wind measurements by testing a BAT probe in a wind tunnel. They assessed that the precision of the vertical wind measurements due to instrument noise was approximately ±0.03 m s−1. Garman et al. (2008) showed that an additional uncertainty in the wind data occurs when a constant up-wash correction value is used, as proposed by the model of Crawford et al. (1996). We use the Crawford model, which increases the uncertainty in the vertical wind component, w, to approximately ±0.05 m s−1. We assume for the two horizontal wind components, u and v, similar high uncertainties due to aircraft movement. A detailed description of the Twin Otter turbulence instrumentation and associated data processing can be found in Weiss et al. (2011).
Ambient air temperature was observed with Goodrich Rosemount Probes, mounted on the nose of the aircraft. A non-de-iced model 102E4AL and a de-iced model 102AU1AG logged the temperature at 0.7 Hz. Atmospheric humidity was measured with a Buck 1011C cooled-mirror hygrometer. The 1011C Aircraft Hygrometer is a chilled-mirror optical dew point system. The manufacturer stated a reading accuracy of ±0.1 °C in a temperature range of −40 to +50 °C. Chamber pressure and mirror temperature were recorded at 1 Hz.

In situ atmospheric chemistry instrumentation
A Los Gatos Research (LGR) Ultraportable Greenhouse Gas Analyser (uGGA) was installed to measure CH4, CO2 and H2O. The expected manufacturer precision for the CH4 measurement was < 2 ppb averaged over 5 s and < 0.6 ppb over 100 s. The response time of the LGR uGGA itself (i.e. the flush time through the measurement cell) was over 10 s. To achieve higher-temporal-frequency data, a fast Picarro G2311-f was installed to provide measurements of CH4, CO2 and H2O at ∼ 10 Hz, with 1σ precision of ∼ 1 ppb over 1 s for CH4. A third greenhouse gas analyser, an LGR Ultraportable CH4/C2H6 Analyser (uMEA), was used to measure CH4 and C2H6. In-house laboratory measurements suggest C2H6 1σ precision at 1 s is ∼ 17 ppb for the LGR uMEA. During the 2019 airborne campaign, atmospheric C2H6 was also monitored by a tuneable infrared laser direct absorption spectrometer (TILDAS, Aerodyne Research Inc.) (Yacovitch et al., 2014b) with an expected precision of 50 ppt (parts per trillion) for C2H6 over 10 s. This instrument utilises a continuous wave laser operating in the mid-infrared region (at λ = 3.3 µm). A further description of the TILDAS instrument set-up and performance is available in the Appendices along with instrument precisions and response times in Table A1.

CH 4 and CO 2 calibration
In situ CH 4 and CO 2 instruments were calibrated in flight using a manually operated calibration deck, shown in schematic form in Fig. 2. The calibration gases consisted of a suite of WMO-referenced (World Meteorological Organization) standards with a "high", "low" and "target" designation. The high CH 4 concentration was ∼ 2600 ppb; low was ∼ 1850 ppb; and target was ∼ 2000 ppb. CO 2 concentrations were high at ∼ 468.5 ppm, low at ∼ 413.9 ppm and target at ∼ 423.6 ppm. The absolute values of the cylinders varied between years as they were re-filled and re-certified to the NOAA WMO-CH 4 -X2004A and WMO-CO 2 -X2007 scales. The calibration deck is designed so that upon the calibration valve opening, the calibration gas flow rate is sufficient to overflow the inlet. A similar approach to in-flight calibration is also applied on the NOAA WP-3D aircraft (Warneke et al., 2016). Full details of the calibration procedure are recorded in the Appendices. CH 4 uncertainty (1σ ) is calculated from the in-flight target gas measurements as 1.24 ppb for the Picarro G2311-f and 1.77 ppb for the uGGA, giving performance comparable with similar instrumentation on the FAAM aircraft (O'Shea et al., 2014). The excellent agreement between measured and expected values of CH 4 for the target cylinder (for the Picarro and uGGA) gives us confidence in being able to operate to high levels of accuracy with a very limited period of instrument fitting and testing. CO 2 uncertainty (1σ ) at 1 Hz is calculated as 0.20 ppm for the Picarro G2311-f and 0.35 ppm for the uGGA. More details on the calibration and associated uncertainties are shown in the Appendices.
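The role of the "high", "low" and "target" cylinders can be illustrated with a generic two-point linear calibration. This is a minimal sketch rather than the procedure recorded in the Appendices: the function names and the raw instrument readings are hypothetical, and only the nominal cylinder values are taken from the text.

```python
def two_point_calibration(meas_low, meas_high, ref_low, ref_high):
    """Return (gain, offset) mapping raw readings onto the reference scale."""
    gain = (ref_high - ref_low) / (meas_high - meas_low)
    offset = ref_low - gain * meas_low
    return gain, offset


def apply_calibration(measured, gain, offset):
    """Convert a raw reading to a calibrated mole fraction."""
    return gain * measured + offset


# Nominal CH4 cylinder values from the text (ppb).
REF_LOW, REF_HIGH, REF_TARGET = 1850.0, 2600.0, 2000.0

# Hypothetical raw readings (ppb) while sampling the low and high cylinders.
gain, offset = two_point_calibration(1846.0, 2594.5, REF_LOW, REF_HIGH)

# The target cylinder is not used in the fit, so its residual is an
# independent check on the calibrated measurement.
target_calibrated = apply_calibration(1995.8, gain, offset)  # hypothetical reading
target_residual = target_calibrated - REF_TARGET
```

In this scheme the scatter of `target_residual` over repeated in-flight target measurements is the kind of quantity from which a 1σ uncertainty could be derived.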

C 2 H 6 calibration
The calibration cylinders installed on the Twin Otter during both campaigns did not contain measurable amounts of C2H6, and therefore in-flight calibrations could not be performed. This represents a limitation on the accuracy and traceability of the C2H6 measurements during these campaigns and will be addressed for future studies using the BAS Twin Otter. The uMEA was calibrated in the laboratory post-campaign for the 2018 campaign and both pre- and post-campaign for the 2019 season. The uMEA instrument cavity is not temperature stabilised, resulting in significant measurement drift during the course of operation. Corrections for C2H6 and CH4 measurement drift as a function of cavity temperature were determined experimentally by analysing two calibration cylinders alternately over the course of several hours as the cavity temperature increased. These corrections were then applied to the uMEA C2H6 and CH4 measurements obtained from both the 2018 and 2019 flight campaigns.
The TILDAS (deployed in 2019) measures a water line, allowing for measurements to be corrected to dry mole fractions using the TDLWintel software (Nelson et al., 2004) to account for changes in humidity during the flight (as discussed in Pitt et al., 2016). The raw measured data were calibrated pre- and post-flight using two cylinders of a known concentration, whose mole fractions spanned the measurement range observed during flights for C2H6. By assuming a linear relationship, the calibrated mole fraction corresponding to each measured TILDAS mole fraction was given by interpolating the scale between the pre- and post-flight calibration reference points. Previous studies have reported the sensitivity of TILDAS systems to aircraft cabin pressure (Gvakharia et al., 2018; Kostinek et al., 2019; Pitt et al., 2016). This sensitivity means that the C2H6 mole fractions measured during the flight contain a systematic altitude-dependent bias. However, as cabin pressure only affects the spectroscopic baseline, the zero offset of the measurements is affected but not the instrument gain factor. Therefore, as long as each plume measurement is referenced to a measured background at the same altitude, this cabin pressure sensitivity does not significantly impact the calculated C2H6 mole fraction enhancements. As stated above, future deployments will mitigate this issue by employing in-flight calibration cylinders that are certified for C2H6. The potential to use a fast, frequent calibration for baseline correction as described by Gvakharia et al. (2018) and Kostinek et al. (2019) will also be investigated, although this has payload implications, as it requires an extra calibration cylinder. Alternatively, the optical bench could be re-engineered to sit within a hermetically sealed pressure vessel, as described by Santoni et al. (2014).
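The pre- and post-flight calibration and the background-referencing argument can be sketched as follows. This is an illustration under stated assumptions: the scale is interpolated linearly in time between the two calibrations (the text does not specify the interpolation variable), and all cylinder values, raw readings, times and function names are hypothetical.

```python
def linear_scale(meas_low, meas_high, ref_low, ref_high):
    """Gain and offset mapping raw readings onto the two reference cylinders."""
    gain = (ref_high - ref_low) / (meas_high - meas_low)
    return gain, ref_low - gain * meas_low


def calibrate(t, raw, t_pre, scale_pre, t_post, scale_post):
    """Apply a scale linearly interpolated in time between two calibrations."""
    w = (t - t_pre) / (t_post - t_pre)
    gain = (1 - w) * scale_pre[0] + w * scale_post[0]
    offset = (1 - w) * scale_pre[1] + w * scale_post[1]
    return gain * raw + offset


REF_LOW, REF_HIGH = 1.0, 20.0                      # hypothetical cylinders (ppb C2H6)
pre = linear_scale(1.1, 20.3, REF_LOW, REF_HIGH)   # hypothetical pre-flight readings
post = linear_scale(1.3, 20.9, REF_LOW, REF_HIGH)  # hypothetical post-flight readings
T_PRE, T_POST = 0.0, 14400.0                       # seconds; a 4 h gap is assumed


def enhancement(t, raw_plume, raw_background, zero_bias=0.0):
    """Plume minus background; a shared zero offset cancels in the difference."""
    plume = calibrate(t, raw_plume + zero_bias, T_PRE, pre, T_POST, post)
    background = calibrate(t, raw_background + zero_bias, T_PRE, pre, T_POST, post)
    return plume - background
```

Because `zero_bias` is added to both the plume and the background reading, it drops out of the difference, which mirrors the argument that a baseline-only pressure sensitivity does not affect enhancements referenced to a background at the same altitude.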

Spot sampling
Manually triggered spot sampling provides a cost-effective and relatively simple sample collection method to allow for analyses which cannot be performed mid-flight or require specialist laboratory facilities to gain useful levels of precision. Two discrete air-sampling systems were used during these flights to enable post-flight analysis for VOCs and δ 13 C CH4 .

Son of Whole Air Sampler (SWAS)
The Son of Whole Air Sampler (SWAS) is a new, updated version of the parent WAS system fitted to the FAAM BAe 146 large atmospheric research aircraft (e.g. as used by O'Shea et al., 2014), which it is designed to supersede. The system comprises a multitude of inert Silonite-coated (Entech) stainless steel canisters, grouped together modularly in cases with up to 16 canisters per case. Onboard the Twin Otter, two cases can be fitted allowing for up to 32 canisters to be carried per flight. The theory of operation is to capture discrete air samples from outside of the aircraft and compress the sample either into 1.4 or 2 L canisters at low pressure (40 psi; 275 kPa) via pneumatically actuated bellows valves (PBVs, Swagelok BNVS4-C). Full details of the operation of SWAS are included in the Appendices. For the 2019 campaign, SWAS was updated with the addition of 2 L flowthrough canisters, making narrow plumes easier to capture due to reduced sample line lag and fill times.
SWAS canister sampling was manually triggered during the flights according to in situ observations made by fast-response instrumentation of CO2, C2H6 and CH4, with the aim of capturing specific oil and gas plumes. The samples were analysed at the University of York for VOCs post-flight using a dual-channel gas chromatograph with flame ionisation detectors (Hopkins et al., 2003). Firstly, 500 mL aliquots of air are withdrawn from the sample canister and dried using a condensation finger held at −30 °C; then they are pre-concentrated onto a multi-bed carbon adsorbent trap consisting of Carboxen 1000 and Carbotrap B (Supelco) and transferred to the gas chromatography (GC) columns (Al2O3, Na2SO4 deactivated and open tubular; PLOT: porous layer, open tubular) in a stream of helium. Chromatogram peak identification was made by reference to a calibration gas standard containing known amounts of 30 VOCs ranging from C2 to C9. Compounds of interest include C2H6, propane, butanes, pentanes, benzene and toluene; a full list is shown in Table A2.

FlexFoil bag sampling
Spot sampling for δ 13 C CH4 by collecting whole-air samples into FlexFoil bags (SKC Ltd) has been in use on both the FAAM BAe 146 research aircraft (e.g. Fisher et al., 2017) and during ground-based mobile studies (e.g. Lowry et al., 2020) and provides a relatively cost-effective and rapid methodology for sample collection. The method does have some limitations, however, as the FlexFoil sample bags are only stable for a number of compounds (including CH 4 ). Samples captured in both FlexFoil bags and SWAS were measured at Royal Holloway using continuous-flow isotope ratio mass spectrometry (CF-IRMS; Fisher et al., 2006), and each measurement has a δ 13 C CH4 uncertainty of ∼ 0.05 ‰. Each sample is also measured for CH 4 mole fraction using cavity ring-down spectroscopy to allow for direct comparison to in-flight data (Fig. A3). Alternative, continuous in-flight δ 13 C CH4 instrumentation currently cannot replicate the precision of laboratory sampling, and the few seconds of enhanced CH 4 that would be encountered during flight is not sufficient for averaging of continuous δ 13 C CH4 data to gain a meaningful source δ 13 C CH4 signature (e.g. Rella et al., 2015).

Overall approach to flight planning
The majority of flights were conducted during good operating conditions, i.e. daytime, no precipitation, clear or broken cloud, winds < 10 ms −1 , and visibility, to allow for flying at a minimum safe altitude around the task area. Two approaches were trialled to assess CH 4 emissions from offshore gas installations: (i) regional survey and (ii) specific plume sampling. The flight modes are demonstrated in Fig. 3, with the dark-grey pattern showing a flight plan for regional measurements and the orange and white patterns demonstrating specific plume sampling flight patterns. Flight plans to sample specific installations were designed to capture a full range of expected emissions using the UK National Atmospheric Emissions Inventory (NAEI) as a guide.
Regional survey intentions were twofold: firstly, to offer an identification process for emitters of interest that could specifically be targeted for plume sampling modes and, secondly, to build a picture of aggregate bulk emissions for multiple upwind platforms. This method has been successfully employed during a Gulf of Mexico airborne study (Gorchov Negron et al., 2020). However, in the work presented here, regional surveys were poor for identifying plumes (being too far downwind of platforms or not intercepting thin filament layers containing CH4 enhancements), and attempts to aggregate bulk emissions were hindered by the often encountered complex boundary layer structure over the area, which controlled dispersion of CH4 emissions from rigs. From the regional flight data derived in 2018 and considering the work in other offshore studies in this area (e.g. Cain et al., 2017), the regional flight mode was determined to be of limited scientific value in the context of this project, and this flight pattern was not used during the 2019 campaign.
Plume sampling flights were conducted in both 2018 and 2019. These flights involved the use of a box pattern to create both upwind and downwind transects on either side of the infrastructure of interest. Upwind transects provided an understanding of other methanogenic sources (such as other installations, ships or long range transport of air masses from onshore sources) that could interfere with observed CH 4 plumes downwind and were conducted to be confident that plumes were solely originating from the targeted infrastructure. Vertically stacked downwind transects at a distance of 1 to 10 km away from emission sources were conducted to better capture the vertical extent of the plume in a 2D Lagrangian plane for CH 4 flux quantification using mass balance analysis (e.g. O'Shea et al., 2014). The vertically stacked transects in profile, as planned from the 2019 field deployment, are demonstrated in Fig. 3. The separation between vertically stacked transects was usually 60 m with a minimum absolute height of 45 m above sea surface up to approximately 260 m to capture the entire extent of a downwind plume. Plume dispersion was dependent on meteorology and emission type (venting, fugitive or combustive emissions), and as such, maximal plume heights varied between individual pieces of infrastructure. Upwind transects were flown at a median height between the minimum and maximum stacked runs.

Assessing and addressing issues encountered during flights
A number of issues were encountered during the flights that influenced the measurements made. An initial presentation of these issues is given here, with recommendations for improvements given in Sect. 6 below.

Complex marine boundary layers
Boundary layer structure proved to be an important influence on observed CH4 mixing ratios. Figure 4 shows the measured profiles of CH4 (left-hand panel) and potential temperature (right-hand panel) during an offshore flight in April 2018 along with the corresponding synoptic chart. Potential temperature was calculated as described by Stull (1988). The potential temperature profile demonstrates that the boundary layer structure on this day (and many other days) was partly stably stratified, showing mostly an increase in potential temperature with height, and the boundary layer showed complex layering. The prevailing meteorological situation at that time, illustrated by the synoptic chart in Fig. 4, was of a persistent anticyclonic ridge, stretching from the south-west over the British Isles and western Europe, with associated low wind speeds and poorly defined airflow over the southern North Sea sector. The observed layering was partly also caused by residual boundary layers from previous days and nights which had not dispersed. The structure of the boundary layer in Fig. 4 clearly had an important influence on the vertical profile of CH4, which varied in a complex manner with height. Due to the complexity of the boundary layer structure, it was concluded that it would be inappropriate to use a particle dispersion model such as the Numerical Atmospheric-dispersion Modelling Environment (NAME) (Jones et al., 2007) to derive a bulk regional emission estimate. The residual layers of CH4 enhancement make in-flight decisions very challenging for two main reasons: (i) it is difficult to determine which enhancements are from installations and require further investigation, especially if flying at some distance downwind from a potential source or on a regional survey pattern, and (ii) emissions being actively released can become trapped in vertically thin filaments, which can be easily missed when flying stacked legs, depending on flight altitude.
In contrast, on days with a well-mixed boundary layer the CH 4 profile stays relatively constant with height and shows an increase only near a CH 4 source. Figure 5 shows an example of CH 4 and potential temperature profiles, in a well-mixed boundary layer during a flight in May 2019; the synoptic situation on that day was consistent with a slow-moving cyclonic south-easterly airflow. It can clearly be seen how the potential temperature and CH 4 profiles stay almost constant with height and only show structure when intercepting a CH 4 emission at 300 to 350 m altitude. The potential temperature profile indicates neutral stratification of the boundary layer.
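The contrast between the stratified and well-mixed cases rests on the potential temperature profile. A minimal sketch of the standard dry-air definition (the form given in texts such as Stull, 1988) is shown below; the reference pressure of 1000 hPa, the exponent of 0.286 and the tolerance used to label a layer as neutral are assumptions of this example, not values stated in the paper.

```python
KAPPA = 0.286     # R_d / c_p for dry air (assumed standard value)
P0_HPA = 1000.0   # reference pressure (assumed standard value)


def potential_temperature(temp_k, pressure_hpa):
    """Temperature an air parcel would have if brought adiabatically to P0."""
    return temp_k * (P0_HPA / pressure_hpa) ** KAPPA


def stability(theta_lower_k, theta_upper_k, tol_k=0.1):
    """Classify a layer from the change in potential temperature with height.

    tol_k is a hypothetical threshold for treating the layer as neutral.
    """
    delta = theta_upper_k - theta_lower_k
    if delta > tol_k:
        return "stable"
    if delta < -tol_k:
        return "unstable"
    return "neutral"


# Hypothetical two-level profile: near the surface and a few hundred metres up.
theta_low = potential_temperature(283.0, 1010.0)
theta_high = potential_temperature(282.5, 975.0)
layer = stability(theta_low, theta_high)
```

A profile in which potential temperature rises with height suppresses vertical mixing, which is the condition under which thin CH4 filaments can persist; a near-constant profile corresponds to the well-mixed case.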

Instrument response times
The role of the continuous in-flight measurements is to provide the backbone of the dataset and ensure that, at a bare minimum, the flights are able to identify areas of CH 4 enhancement and inform on the likely sources of the CH 4 enhancement, hence the decision to run redundancy measurements of CH 4 utilising an LGR uGGA. Figure 6 shows typical instrument responses to a CH 4 plume, and it is clear that the cell turnover time of the uGGA is not sufficient to capture the fine detail of the plume. Whilst the uGGA and uMEA are capable of determining the whole infrastructure mass balance and average infrastructure ethane-methane ratios, the refined understanding of the true plume is lost in these slower response instruments. This is important, as the combined Picarro G2311-f and TILDAS data can detect several sources from the same installation (Fig. 6) because of their rapid measurement cell turnover. This information can be used to infer either cold venting (CH 4 and C 2 H 6 ) or combustion from flares or generators (CO 2 , CH 4 and C 2 H 6 ), which could then be used to determine CH 4 emission factors from identified flares (Gvakharia et al., 2017).
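The effect of cell turnover time on a narrow plume can be illustrated with a simple first-order (single-pole) response model. This model is an assumption made for illustration and is not taken from the paper; the 10 s time constant is based on the stated uGGA flush time of "over 10 s", and the plume shape and magnitude are hypothetical.

```python
import math


def first_order_response(signal, dt_s, tau_s):
    """Smooth `signal` with a single-pole low-pass filter of time constant tau_s."""
    alpha = 1.0 - math.exp(-dt_s / tau_s)
    out, y = [], signal[0]
    for x in signal:
        y += alpha * (x - y)   # relax towards the current input value
        out.append(y)
    return out


DT_S = 0.1   # 10 Hz sampling, as for the fast analyser
N = 1200     # 120 s of synthetic data

# Hypothetical plume: a 200 ppb CH4 enhancement lasting 3 s.
true_enh = [200.0 if 100 <= i < 130 else 0.0 for i in range(N)]
slow_enh = first_order_response(true_enh, DT_S, tau_s=10.0)

peak_ratio = max(slow_enh) / max(true_enh)   # attenuation of the peak
area_ratio = sum(slow_enh) / sum(true_enh)   # preservation of the integral
```

Under this model the peak is strongly attenuated and smeared in time while the integrated enhancement is approximately preserved, which is consistent with slower instruments remaining usable for whole-installation mass balance while losing the fine structure needed to separate co-located sources.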
There are a number of other implications that arise from slow measurement response. For example, in-flight spot sampling requires guidance from fast-response instruments that can indicate the optimum timing to collect samples that span the plume and thereby capture the representative chemical nature of the plume. Further, in-flight calibrations must be matched to the slowest-response instrument to ensure stabilisation of the measurement of calibration gases across all instruments. Although useful from a cross-checking purpose, use of slower-response instruments can introduce additional, unwanted loss of measurement time and excessive use of calibration gases, and the benefits of instrument redundancy should be carefully considered.

Spot sampling improvements between the 2018 and 2019 campaigns
In-flight spot sample collection was carried out during both the 2018 and 2019 campaigns. Such sampling is challenging and requires fast-response instruments to be viewable to the operator to give the best chance of collecting samples at appropriate points across the plumes. For 2019, a number of simple adaptations were introduced that significantly increased the success of capturing plumes (Fig. A3). The improvements included modified flight planning, with an increased number of passes through discovered plumes. This approach resulted in increased fuel consumption per plume but contributed to the higher success rate of plume capture. The comprehensive update to the SWAS system, which included continuous sample throughflow, allowed for more precise spot sampling to be achieved.
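The symbol definitions in the paragraph that follows refer to a mass balance flux equation that does not appear in this copy of the text. A reconstruction consistent with those definitions is given here as a sketch; the exact typeset form, and the choice of integral rather than product notation, are assumptions.

```latex
\mathrm{Flux} = \int_{z}\int_{x}
  \left( X_{\mathrm{plume}} - X_{\mathrm{background}} \right)
  \, n_{\mathrm{air}} \, V \, \mathrm{d}x \, \mathrm{d}z
```

If X_plume is taken as a plume-average mole fraction, this reduces to the product of the mean enhancement, n_air, V, the plume width and the plume depth, giving a flux in mol s−1.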
where Flux is the bulk net flux passing through the x-z plane per unit time, n_air is the molar density of air (mol m−3), X_plume is the average CH4 mole fraction measured within the plume and X_background is the CH4 mole fraction of the background. V is the wind component perpendicular to the flight track; x is the plume width perpendicular to the upwind-downwind direction; and z relates to the vertical extent of the plume. The CH4 and CO2 measurements from the 10 Hz response instruments were used to provide the highest accuracy in the (i) lateral plume width and (ii) number of unique plumes identified from each individual platform. Slower-response instruments would allow for flux calculations but would not be able to identify individual plumes from the same platform. Resolving individual plumes is useful to distinguish, for example, multiple plumes from different emission processes that are spatially distinct within the same platform (e.g. a fugitive source versus a flare). A background mixing ratio was selected to best represent the conditions observed during the flight at the specific time of survey. An average of 30 s of data from either side of the plume on each run was used if this was deemed appropriate with a clean upwind sampling leg. When the upwind sampling was contaminated, more caution was needed when selecting an appropriate background so that the background value was not distorted by extraneous far-field sources.
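The background selection and single-transect calculation described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names, the synthetic 1 Hz trace, the wind speed and the use of the ideal gas law for n_air are assumptions, and the along-track sample spacing is taken to be perpendicular to the wind.

```python
from statistics import mean

R_GAS = 8.314462618  # molar gas constant, J mol-1 K-1


def molar_density_air(pressure_pa, temp_k):
    """Molar density of air (mol m-3) from the ideal gas law."""
    return pressure_pa / (R_GAS * temp_k)


def background_from_edges(ch4_ppb, start, end, n_edge):
    """Mean of n_edge samples on either side of the plume slice [start, end)."""
    before = ch4_ppb[max(0, start - n_edge):start]
    after = ch4_ppb[end:end + n_edge]
    return mean(before + after)


def transect_flux_per_metre(ch4_ppb, start, end, n_edge, n_air, v_perp, dx_m):
    """Flux per unit plume depth (mol s-1 m-1) for one horizontal transect."""
    bg = background_from_edges(ch4_ppb, start, end, n_edge)
    excess = sum((c - bg) * 1e-9 for c in ch4_ppb[start:end])  # ppb -> mol/mol
    return excess * n_air * v_perp * dx_m


# Synthetic 1 Hz trace: 30 s background, 10 s plume (+50 ppb), 30 s background.
trace = [1900.0] * 30 + [1950.0] * 10 + [1900.0] * 30
n_air = molar_density_air(101325.0, 288.15)
bg = background_from_edges(trace, 30, 40, 30)
# 60 m between samples corresponds to the 60 m s-1 survey speed at 1 Hz;
# the 5 m s-1 perpendicular wind is hypothetical.
flux_per_m = transect_flux_per_metre(trace, 30, 40, 30, n_air, 5.0, 60.0)
```

Multiplying `flux_per_m` by a plume depth in metres gives a flux in mol s−1 for the layer represented by that transect.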
Figure 6. A cross section of CH4, CO2 and C2H6 measurement response during one plume sample as recorded by Picarro G2311-f in pink and green (10 Hz as dashed lines and downsampled to 1 Hz as solid lines), TILDAS 1 Hz in cyan and Los Gatos uGGA 1 Hz in brown. The difference between the uGGA and Picarro at 1 Hz arises from the slower uGGA response time, which is due to the slower cell turnover. The blue-shaded area shows enhancement in C2H6 and CH4, indicating cold venting; the orange-shaded area shows enhancement in C2H6, CH4 and a small amount of CO2, potentially indicating a co-located combustion source.

For the flux analysis, a flux across each individual stacked horizontal run downwind of a plume was calculated before scaling in the vertical component. The flux was then integrated across potential minimum and maximum plume depths. Figure 7 (upper panel) represents a reduced vertical resolution of the plume where transects at intermediate altitudes through the plume were not conducted. In this case, the minimal plume depth is the narrow span captured by observation in the 45.9–51.9 m altitude window. The maximal plume depth is taken as the height difference between the highest and lowest transects without CH4 enhancements, which are above and below the plume, respectively; this value has to be used as the maximum due to incomplete sampling of the void area seen in the upper panel of Fig. 7. In cases where the base and top of the plume were not sampled (e.g. during 2018 sampling), the lower limit was selected as the sea surface, and the upper limit of the plume was selected as the top of the marine boundary layer. The greatest uncertainty in bulk flux arises when the vertical extent of the plume is not fully captured. For the 2019 campaign, the flux uncertainty related to plume depth was reduced by a factor of 10 compared to the 2018 campaign (as seen in Table 1) by completing a rigorous set of stacked transects at multiple heights throughout the plume. The fluxes presented here serve to demonstrate the approach and the impact of sampling strategy and meteorological conditions on the calculation. Flux estimates for all sampled platforms will be presented in a future study, including a full treatment of component uncertainties.
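The minimum and maximum plume depth bounds described in this section can be illustrated with a short Python sketch. It assumes a single representative flux per metre of plume depth; the 45.9 and 51.9 m altitudes come from the text, while the plume-free transect altitudes and the flux value are hypothetical.

```python
def depth_bounded_flux(flux_per_metre, plume_hits_m, clean_below_m, clean_above_m):
    """Return (min_flux, max_flux) for a flux given per metre of plume depth.

    plume_hits_m  : altitudes of transects that intercepted the plume
    clean_below_m : highest plume-free transect below the plume
                    (0.0 if the sea surface has to be used instead)
    clean_above_m : lowest plume-free transect above the plume
                    (or the boundary layer top if none was flown)
    """
    min_depth = max(plume_hits_m) - min(plume_hits_m)
    max_depth = clean_above_m - clean_below_m
    if not 0.0 < min_depth <= max_depth:
        raise ValueError("plume-free transects must bracket the plume")
    return flux_per_metre * min_depth, flux_per_metre * max_depth


# Plume seen only in the 45.9-51.9 m window; the clean transects at 30 m and
# 90 m and the flux of 0.01 mol s-1 per metre are hypothetical.
low, high = depth_bounded_flux(0.01, [45.9, 51.9], 30.0, 90.0)
print(f"flux between {low:.2f} and {high:.2f} mol s-1")
```

The spread between the two bounds shrinks as more stacked transects are flown close to the plume top and base, which is the motivation for the denser 2019 sampling.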

Ethane-methane ratios (C2 : C1) as a source tracer
It has already been well established that continuous C2H6 measurements can be an excellent diagnostic tool for ascribing enhancements of co-located CH4 and C2H6 to natural gas emissions in urban areas (e.g. Plant et al., 2019), in semi-rural areas (e.g. Lowry et al., 2020) and during large-scale evaluations of oil and gas fields from aerial studies in the USA (e.g. Peischl et al., 2018), Canada (Johnson et al., 2017) and the Netherlands (Yacovitch et al., 2018). During this work, two methods were used to establish C2H6–CH4 ratios (hereafter described as C2 : C1). In 2018 the LGR uMEA was used to measure C2H6–CH4 ratios. The benefits of such instrumentation are its simplicity of operation and that few considerations are required for corrections or variable lags, as both species are measured at the same rate and within the same optical cavity. C2 : C1 can therefore be readily determined as the gradient of a linear regression between the C2H6 and CH4 measurements. However, the low sensitivity to C2H6 (standard deviation of > 10 ppb in C2H6 over 10 s of background flying) only allowed emissions from two platforms to be characterised for C2 : C1 ratios during the whole of the 2018 campaign and none during 2019 using the LGR uMEA method.
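As a minimal illustration of the regression approach described above, the following Python sketch takes C2 : C1 as the ordinary least-squares gradient of C2H6 against CH4. The choice of ordinary least squares is an assumption (the regression type is not specified here), and the data are synthetic and noise-free.

```python
def c2c1_gradient(ch4_ppb, c2h6_ppb):
    """Ordinary least-squares slope and intercept of C2H6 (y) against CH4 (x)."""
    n = len(ch4_ppb)
    mean_x = sum(ch4_ppb) / n
    mean_y = sum(c2h6_ppb) / n
    sxx = sum((x - mean_x) ** 2 for x in ch4_ppb)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ch4_ppb, c2h6_ppb))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic transect: C2H6 rises by 0.05 ppb for every ppb of CH4 enhancement.
ch4 = [1940.0, 1950.0, 1980.0, 2040.0, 2100.0, 2010.0, 1945.0]
c2h6 = [2.0 + 0.05 * (x - 1940.0) for x in ch4]
ratio, offset = c2c1_gradient(ch4, c2h6)
print(f"C2:C1 = {ratio:.3f} ppb/ppb")
```

With real data the scatter about the fitted line is set largely by the C2H6 precision, which is why a noisy C2H6 channel limits the number of platforms that can be characterised this way.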
In 2019 the addition of the TILDAS 1 Hz C2H6 instrument allowed for better precision of C2H6 (< 1 ppb) with a faster flush time in the measurement cell. The C2H6 data are time-matched with the 1 Hz Picarro CH4 dataset to allow C2 : C1 derivation. As the instruments have different flow rates and cell residence times, the C2 : C1 ratios were determined from the integral of each CH4 and C2H6 enhancement, obtained by Gaussian peak fitting. A comparison between the 2018 flight, 2019 flight and published data derived from the same geographical area is shown in Table 2. Although both instruments have been operated for this work without in-flight calibration or engineering solutions to address cabin-pressure-sensitivity issues (Gvakharia et al., 2018) due to weight and time constraints, the agreement between years and with published expected values is highly reassuring. The added value in high-precision C2 : C1 demonstrates that C2H6 is not just a tracer for matching emissions to natural gas; it can give information as to proportions of emissions from mixed sources (as previously used by Peischl et al., 2018) or can be used to identify a likely emission point in a process chain depending upon enrichment or depletion of C2H6 relative to CH4. The inclusion of a continuous instrument with a sub-parts-per-billion (sub-ppb) limit of detection for C2H6 is considered vital for future work with thermogenic sources of CH4 to allow for more precise source attribution of emissions where no spot sampling has occurred.
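The reason for using peak integrals rather than point-by-point values can be shown with a short Python sketch. Note the substitution: for brevity it integrates each enhancement with the trapezoidal rule instead of fitting a Gaussian, so it illustrates the area-ratio idea rather than the fitting procedure itself. The synthetic peaks, backgrounds and the 2 s lag are hypothetical.

```python
import math


def gaussian(t, amplitude, centre, sigma):
    """Gaussian peak used only to build the synthetic test signals."""
    return amplitude * math.exp(-0.5 * ((t - centre) / sigma) ** 2)


def enhancement_area(values, background, dt_s=1.0):
    """Trapezoidal integral of (value - background) over the transect."""
    enh = [v - background for v in values]
    return sum(0.5 * (a + b) * dt_s for a, b in zip(enh, enh[1:]))


def c2c1_from_areas(ch4, ch4_background, c2h6, c2h6_background, dt_s=1.0):
    """Ratio of the integrated C2H6 enhancement to the integrated CH4 one."""
    return (enhancement_area(c2h6, c2h6_background, dt_s)
            / enhancement_area(ch4, ch4_background, dt_s))


# Same plume seen by two 1 Hz instruments: the C2H6 cell smears the peak to
# twice the width and lags by 2 s, but the peak area is conserved.
times = range(120)
ch4_series = [1940.0 + gaussian(t, 100.0, 60.0, 3.0) for t in times]
c2h6_series = [1.5 + gaussian(t, 3.0, 62.0, 6.0) for t in times]

area_ratio = c2c1_from_areas(ch4_series, 1940.0, c2h6_series, 1.5)
peak_ratio = (max(c2h6_series) - 1.5) / (max(ch4_series) - 1940.0)
print(f"area ratio = {area_ratio:.3f}, peak-height ratio = {peak_ratio:.3f}")
```

In this example the area ratio recovers the true C2 : C1 of the synthetic plume, whereas the peak-height ratio is biased low by the broader C2H6 response.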

δ13CCH4 for CH4 source attribution
The principal method of δ13CCH4 source characterisation utilises the principles outlined by Keeling (1961) and Pataki et al. (2003) and has been well utilised since to create δ13CCH4 databases for a plethora of known CH4 sources (e.g. Sherwood et al., 2017). In order for a Keeling plot to give useful results to determine a δ13CCH4 source signature of a CH4 emission, the emission must have been successfully captured multiple times and with a range of CH4 mixing ratios (which could be achieved by passes at different distances or heights downwind of a point source). This sampling process takes time (especially on an aircraft), where the emission plume is only intercepted once per transect and time in the plume is limited so that only one spot sample can be taken whilst "in-plume". Beyond the time limitations, sampling of a range of CH4 mixing ratios from emissions and appropriate background samples is not straightforward. Background sampling must capture the air into which emissions are released, but during flights the meteorological conditions often resulted in significant variation of CH4 mixing ratios and δ13CCH4 with altitude, in addition to horizontal variations.
Where repeat transects were conducted at different altitudes, this made selection of appropriate background samples for Keeling plots challenging, since the background CH4 mixing ratio and δ13C varied over the different altitudes. This becomes particularly detrimental to Keeling plot validity where the range in sampled emission mixing ratios is small, since uncertainty in the background samples then becomes more important.
In Fig. 8, a sensitivity analysis is presented from one of the flights investigating the effect of reducing the number of samples on the uncertainty in the δ13CCH4 source signature determined for a plume. In this case nine samples were collected, but this took place over eight downwind transects and one upwind transect of a cluster of installations, which is not feasible to repeat for sampling large numbers of installations. As shown in Fig. 8, the uncertainty in the δ13CCH4 source signatures increases only slightly with a reduction in number of sampling points, with the exception of one n = 3 run where the source signature is poorly defined. A minimum of three data points can therefore be sufficient for classifying a source of CH4 emissions (such as thermogenic, microbial or pyrogenic sources), provided that the background and point samples are captured with a large enough range of CH4 concentration and that there is no mixing of sources. This will typically require collection of more than three samples, given that some may miss the targeted plumes or be lost during storage or processing. Although a two-point Keeling plot is technically possible, it gives no means of gauging the quality of the regression and so cannot confirm that only a single source has been captured.
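A Keeling plot regresses δ13CCH4 against the reciprocal of the CH4 mixing ratio, with the intercept giving the source signature. The Python sketch below assumes an ordinary least-squares fit (the regression method used for Fig. 8 is not stated here) and uses synthetic two-end-member mixing data; the background and source values are hypothetical. It also mimics the reduced-sample experiment by refitting random subsets.

```python
import math
import random


def keeling_fit(ch4_ppm, delta13c_permil):
    """OLS fit of delta13C against 1/[CH4]; returns (intercept, std_error).

    The intercept is the source signature. std_error is None for fewer than
    three points, where no residual degrees of freedom remain.
    """
    x = [1.0 / c for c in ch4_ppm]
    y = list(delta13c_permil)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    if n < 3:
        return intercept, None
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_error = math.sqrt(ssr / (n - 2) * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, std_error


def subsample_fit(ch4_ppm, delta13c_permil, size, rng):
    """Refit the Keeling plot on a random subset of the samples."""
    idx = rng.sample(range(len(ch4_ppm)), size)
    return keeling_fit([ch4_ppm[i] for i in idx],
                       [delta13c_permil[i] for i in idx])


# Hypothetical end members: background air and a single thermogenic source.
BG_CH4, BG_DELTA, SOURCE_DELTA = 1.94, -47.5, -35.0
ch4 = [BG_CH4 + e for e in (0.0, 0.02, 0.05, 0.08, 0.12, 0.2, 0.3, 0.45, 0.6)]
delta = [(BG_CH4 * BG_DELTA + (c - BG_CH4) * SOURCE_DELTA) / c for c in ch4]

full_intercept, full_se = keeling_fit(ch4, delta)
rng = random.Random(0)
for size in (9, 6, 4, 3, 2):
    intercept, se = subsample_fit(ch4, delta, size, rng)
    print(size, round(intercept, 3), se)
```

With two points the fit returns an intercept but no standard error, which is the code-level counterpart of the statement that the quality of a two-point regression cannot be gauged.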

Conclusions
Given the restrictions and time constraints on the science flights, important lessons for offshore oil and gas airborne measurement campaigns have been learned for rapid instrument re-fitting and agile deployment of a small aircraft for future campaigns. A key finding from this study is that offshore meteorological conditions define the ability of the flights to produce valuable data, and suitable meteorology with a well-mixed (neutral) boundary layer is critical to deriving a regional emission estimate through regional modelling. Flying in conditions with multiple residual boundary layers makes interpretation difficult and pin-pointing emissions especially challenging, as emission plumes can easily be missed when they are trapped in thin filaments, increasing the uncertainties of measurement-based emission flux calculations. Although not possible for this work given aircraft scheduling, it is recommended that offshore observations are scheduled with a long window of opportunity to ensure optimal flying conditions. Predictions of the likelihood of a residual boundary layer over a coastal area could be achieved through high-spatial-resolution forecast models such as the UK Met Office forecast model (Milan et al., 2020). Information on the temperature structure over the previous few days using all the assimilated information, such as tephigrams and synoptic charts, would help determine the likelihood of residual boundary layers versus a simpler stratified, well-mixed layer. For methods using alternative platforms such as ships or drones, coincident measurements of vertical profiles must be made to capture the true nature of the emission plume in the current meteorology. Due to the size of the aircraft, payload restrictions and power limitations demand challenging decisions for instrument selection. We recommend deploying at least one instrument measuring CH4 (and CO2) at 10 Hz, allowing several plumes emitted from a single installation to be resolved (Fig. 6). Priority should next be given to a C2H6 instrument capable of a sub-ppb limit of detection at 1 Hz (or higher) in order to give certainty to the source of the CH4 emission. Using C2 : C1 appears to be the simplest method for source attribution and is robust for distinguishing natural gas emissions, where the gas has a C2H6 component (Lowry et al., 2020; Plant et al., 2019). Spot sampling is challenging, payload heavy and time consuming, as several passes are needed to collect enough samples (especially for δ13CCH4 source attribution). However, results can be very informative, such as the ability to distinguish between a gas leak and a geological reservoir from depth or a near-surface reservoir (Lee et al., 2018). The improvements to SWAS, allowing for continuous throughflow, have increased the success rate of peak sampling, but success still relies on accurate user triggering.

Figure 8. (b) An illustration of the variation in δ13CCH4 source signature and its uncertainty determined by Keeling plot analyses for reduced sample sizes. Each analysis represents a single Monte Carlo experiment with the original data, reducing the number of data points to the sample size indicated at random; the δ13CCH4 source signature is then calculated with the remaining sample points. Error bars are 2 times the standard error.
For mass balance flux calculations, an emission plume and the surrounding background variation in the species of interest, alongside local meteorology, must be fully resolved during the observation stage. This requires instruments with appropriate response times to fully capture the plume and identify any internal structure that may suggest a mixed source. An upwind leg must be conducted to ensure the plume and background are not contaminated by extraneous far-field sources, and the plume must be significantly distinct from this background for meaningful flux calculations. The plume must be laterally and vertically resolved in the 2D plane as much as possible at a fixed distance downwind of the source. Straight and level runs must extend to either side of the plume, and the vertical resolution must include multiple stacked transects with an identification of the top and bottom of the plume (where feasible) to reduce uncertainty in the plume bulk net flux. When appropriate radiosonde soundings are not available, the meteorology must be fully characterised on the flight day using on-board meteorological instrumentation and a complete profile of the marine boundary layer from the top to the surface, including determination of inversion heights. The observed impact of complex boundary layer dynamics on plume dispersion also highlights an important limitation of ship-based plume measurements, which are unable to resolve the vertical structure of the plume and therefore rely on the assumption of idealised models of plume dispersion.

Appendix A

A1 TILDAS data processing and performance
The TILDAS data were processed as follows. Rapid tuning sweeps of the laser frequency (2996.8 to 2998.0 cm−1), made by varying the applied current, result in the collection of thousands of spectra per second, which are co-averaged. The resulting averaged spectrum is processed at a rate of 1 Hz using a non-linear least-squares fitting algorithm to determine mixing ratios within the operating software, TDLWintel (© Aerodyne). Averaging of these spectra and the path length of 76 m achieved using a Herriott multipass cell provide the sensitivity required for trace gas measurement. Continuously circulated fluid from the Oasis chiller unit is used as a heat sink for the thermoelectrically cooled components, and a flow interlock cuts power to the relevant components if the coolant flow stops. Other optical components of the instrument include a 15× Schwarzschild objective in front of each laser, a germanium etalon for measuring the laser tuning rate, a reference gas cell containing air at 25 Torr and numerous mirrors for adjusting the laser beam alignment. During the airborne campaign the instrument was operated remotely via an Ethernet connection. The TILDAS C2H6 instrument accuracy has been tested against two standards containing C2H6 in mixing ratios of 39.79 ± 0.14 ppb and 2.08 ± 0.02 ppb (high-concentration standard and target gas, respectively). As the TILDAS technique relies on highly precise alignment of the focussing and beam-alignment optics before and after the multipass measurement cell, it is particularly sensitive to motion that applies torque to the optical bench. To remove measurement artefacts associated with this sensitivity, all data collected for roll angles greater than 20° have been flagged.
The presence of the TILDAS in the 2019 campaign ruled out using the multiple circular pass method around a potential emission source as developed by Scientific Aviation for installation emission flux measurements, as there was a risk of invalidating data due to the roll angle of the plane if circling tightly around an installation.
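The roll-angle flagging described in this section amounts to a simple threshold test. The Python sketch below assumes that the 20° limit applies to the magnitude of the roll angle in either direction and that flagged values are simply masked; both are assumptions, and the sample values are hypothetical.

```python
def flag_high_roll(roll_deg, limit_deg=20.0):
    """True where the magnitude of the aircraft roll exceeds the limit."""
    return [abs(r) > limit_deg for r in roll_deg]


def mask_flagged(values, flags):
    """Replace flagged measurements with None, keeping the time base intact."""
    return [None if flagged else v for v, flagged in zip(values, flags)]


roll = [0.0, 5.0, 20.0, 25.0, -30.0, 10.0]   # degrees, hypothetical
c2h6 = [1.8, 1.9, 2.4, 6.0, -3.1, 2.0]       # ppb, hypothetical
flags = flag_high_roll(roll)
clean = mask_flagged(c2h6, flags)
print(clean)
```

Keeping masked entries in place, rather than deleting them, preserves the alignment with the other 1 Hz data streams.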

A2 CO2 and CH4 calibration
The three cylinders were sampled periodically in flight to determine the instrument gain factor (slope) and zero offset for each analyser. These parameters were linearly interpolated between calibrations and used to rescale the raw measured data (for further details see Pitt et al., 2016). The uncertainties associated with instrument drift and any instrument nonlinearity were assessed by sampling the target cylinder midway between high-low calibrations. The raw target cylinder measurements were rescaled as per the sample data; the mean offsets of these target measurements from the WMO-traceable cylinder value (and associated standard deviations) are given for the LGR uGGA and Picarro instruments and are plotted in Fig. A4. The typical duration of calibration cylinder measurements during the 2018 campaign was 45 s. The Picarro G2311-f analyser had a high flow rate of ∼ 5 SLPM (standard litres per minute), resulting in rapid flushing of both the inlet tubing and sample cavity. The measured value for each calibration was taken as the average over 15 s prior to the calibration end, as this allowed sufficient time for the measured value to reach equilibrium. The uGGA and uMEA both had much lower flow rates of ∼ 0.5 SLPM, resulting in a much longer equilibration time. Consequently, the calibration duration was not of sufficient length for the uGGA and uMEA measurements to reach equilibrium, and their calibration routine was compromised. For these instruments each calibration run was fitted to an offset exponential function in an attempt to predict the mixing ratio at which equilibration would have occurred, given an infinite amount of calibrating time. In order to improve the data quality and to reduce the post-processing time, the calibration periods were run for 75 s per cylinder during the 2019 campaign to ensure that all instruments reached equilibrium. Target cylinders were run approximately every 1 h of flight.

Table A1. Response rates and precision for the instrument set-up on the BAS Twin Otter. All measurements were time-shifted to match the Picarro G2311-f for analysis.

| Instrument | Species | Response rate | Precision for species of interest |
|---|---|---|---|
| LGR uGGA | CH4, CO2 | 17 s | (CH4) 1 ppb over 10 s |
| Picarro G2311-f | CH4, CO2 | 0.4 s | (CH4) 1.2 ppb over 1 s |
| LGR uMEA | C2H6, CH4 | 17 s (a) | (C2H6) 17 ppb over 1 s |
| TILDAS | C2H6 | < 2 s (b) | (C2H6) 50 ppt over 10 s |

(a) Measured in laboratory. (b) Manufacturer's expected precision.
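The calibration steps described in this section can be sketched in Python as follows. The sketch assumes a correction of the form gain × raw + offset derived from a high and a low cylinder, linear interpolation of both parameters in time, and a simple recurrence-based estimate of the asymptote of an offset exponential (standing in for whatever fitting routine was actually used). All cylinder values, times and the 17 s time constant in the example are hypothetical.

```python
import math


def two_point_cal(raw_low, raw_high, true_low, true_high):
    """Gain and zero offset such that true = gain * raw + offset."""
    gain = (true_high - true_low) / (raw_high - raw_low)
    return gain, true_low - gain * raw_low


def interpolate(t, t0, v0, t1, v1):
    """Linear interpolation of a calibration parameter between two times."""
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def rescale(raw, t, cal_before, cal_after):
    """Apply time-interpolated gain and offset; each cal is (time_s, gain, offset)."""
    gain = interpolate(t, cal_before[0], cal_before[1], cal_after[0], cal_after[1])
    offset = interpolate(t, cal_before[0], cal_before[2], cal_after[0], cal_after[2])
    return gain * raw + offset


def equilibrium_value(samples):
    """Asymptote of an offset exponential sampled at a fixed interval.

    For y(t) = y_inf + (y0 - y_inf) * exp(-t / tau), successive samples obey
    y[i+1] = a * y[i] + b with y_inf = b / (1 - a); a and b come from OLS.
    """
    x, y = samples[:-1], samples[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    a = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    b = mean_y - a * mean_x
    return b / (1.0 - a)


# Two calibrations one hour apart (hypothetical cylinders of 1850 and 2100 ppb).
cal_a = (0.0, *two_point_cal(1852.0, 2103.0, 1850.0, 2100.0))
cal_b = (3600.0, *two_point_cal(1854.0, 2105.0, 1850.0, 2100.0))
corrected = rescale(1853.0, 1800.0, cal_a, cal_b)

# A 45 s calibration run on a slow analyser that has not yet equilibrated.
run = [2100.0 + (1950.0 - 2100.0) * math.exp(-t / 17.0) for t in range(45)]
predicted = equilibrium_value(run)
print(f"corrected = {corrected:.2f} ppb, last sample = {run[-1]:.1f} ppb, "
      f"predicted equilibrium = {predicted:.1f} ppb")
```

On noisy data the extrapolated asymptote becomes sensitive to the scatter in the samples, which is consistent with the decision to lengthen the calibration periods in 2019 so that no extrapolation was needed.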

A3 SWAS operation
Each sample is compressed into the canisters using a modified metal bellows pump (Senior Aerospace 28823-7) capable of 150 SLPM open flow but filling the canisters at a measured average of ∼ 50 SLPM, integrated over ∼ 6 and ∼ 9 s for the 1.4 and 2 L canisters, respectively. Canister fill pressure is controlled electronically using a back-pressure controller (Alicat PCR3; BPC). The BPC can maintain flow at any set point pressure (in general 40 psi; 275 kPa), including the final fill pressure set point. This allows the 2 L flow-through canisters to be filled, even before the operator activates the sampling, enabling air masses to be sampled through which the aircraft has already flown seconds earlier.

Figure A4. Target gas data from flights during 2018 for the Picarro G2311-f and Los Gatos uGGA instruments for both CO2 and CH4.
Bespoke software was created to allow control of the SWAS system wirelessly from any position in the aircraft using the Ethernet network. Bespoke software was also created for the analysis of the canisters once in the laboratory. The SWAS flown on the 2018 campaign (V1) was a prototype and was updated to the current final version (V2) to fulfil the requirements of the FAAM BAe 146 and to address potential issues experienced with the prototype. V2 uses the same canisters and valves as V1 but differs slightly in the size of each case and the plumbing of gas lines. In V2, the canister and valve geometry was optimised to allow an elbow compression fitting between the valve and the canisters to be eliminated, with the valve mounted directly to the canister. This reduces the risk of leaks by 66 %. The geometry also allowed the case size to be reduced by one rack unit (1U), so that more canisters can be fitted in the same space. V2 further includes improved control electronics and sample logging to ensure canister fill times are captured accurately and stored securely. V2 also saw the addition of 2 L flow-through canister cases to complement the 1.4 L to-vacuum canister cases. These allow sample air to be flushed through the canister at a user-defined pressure and make capturing narrow plumes easier due to reduced sample line lag and fill time.
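As a rough consistency check on the fill times quoted in this section, the Python sketch below estimates how long a canister takes to fill at ∼ 50 SLPM. It assumes that the 40 psi set point is a gauge pressure, that the canister starts evacuated, that the gas behaves ideally and that temperature effects are negligible; none of these assumptions is stated in the text.

```python
PSI_TO_KPA = 6.894757     # kPa per psi
STANDARD_KPA = 101.325    # reference pressure for standard litres


def fill_time_s(volume_l, fill_gauge_psi=40.0, flow_slpm=50.0):
    """Seconds needed to fill an evacuated canister to the set point pressure."""
    absolute_kpa = fill_gauge_psi * PSI_TO_KPA + STANDARD_KPA
    standard_litres = volume_l * absolute_kpa / STANDARD_KPA
    return 60.0 * standard_litres / flow_slpm


t_small = fill_time_s(1.4)   # 1.4 L to-vacuum canister
t_large = fill_time_s(2.0)   # 2 L canister
print(f"1.4 L: {t_small:.1f} s, 2 L: {t_large:.1f} s")
```

Under these assumptions the estimates are close to the ∼ 6 and ∼ 9 s quoted for the 1.4 and 2 L canisters. A flow-through canister that is already flushed at pressure would need less additional gas than this evacuated-start estimate implies.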