Towards an operational Ice Cloud Imager (ICI) retrieval product

. The second generation of the EUMETSAT Polar System (EPS-SG) will include the Ice Cloud Imager (ICI), the ﬁrst operational sensor covering sub-millimetre wavelengths. Three copies of ICI will be launched that together will give a measurement time series exceeding 20 years. Due to the novelty of ICI, preparing the data processing is especially important and challenging. This paper focuses on activities related to the operational product planned, but also presents basic technical characteristics of the instrument. A retrieval algorithm based on Bayesian Monte Carlo integration has been developed. The main retrieval quantities are ice water path (IWP), mean mass height ( Z m ) and mean mass diameter ( D m ). A novel part of the algorithm is that it fully presents the inversion as a description of the posterior probability distribution. This is preferred for ICI as its retrieval errors do not always follow Gaussian statistics. A state-of-the-art retrieval database is used to test the algorithm and to give an updated estimate of the retrieval performance. The degrees of freedom in measured radiances, and consequently the retrieval precision, vary with cloud situation. According to present simulations, IWP, Z m and D m can be determined with 90 % conﬁdence at best inside 50 %, 700 m and 50 µm, respectively. The retrieval requires that the data from the 13 channels of ICI are remapped to a common footprint. First estimates of the errors introduced by this remapping are also presented.


Introduction
Satellite data are today an indispensable part of numerical weather prediction (NWP); see for example Bauer et al. (2015). The first observations from space directed towards weather prediction were made during the early 1960s by the TIROS (Television InfraRed Observation Satellite) program, using optical and infrared sensors (Bandeen et al., 1961). According to Staelin et al. (1976), the first satellite-based microwave observations of Earth's atmosphere were made by Cosmos 243 and 384, launched by the Soviet Union in 1968 and 1970, respectively. Atmospheric humidity and liquid cloud water were measured using channels at 22.235 and 37 GHz. These first, brief measurements (two weeks and two days, respectively) were followed by NEMS (Nimbus E Microwave Spectrometer), which was functional 2.4 years after its launch 1972 with Nimbus-5. The channels of NEMS were placed at 22.235, 31.4, 53.65, 54.9 and 58.86 GHz. The additional channels around 55 GHz gave information on the atmospheric temperature profile (Waters et al., 1975). Another microwave sensor on board Nimbus-5 was ESMR (Electrically Scanning Microwave Radiometer), which had a single channel at 19.35 GHz and showed that rainfall can be detected from space (Kidd and Barrett, 1990).
More regular microwave soundings started around 1979 with the MSU (Microwave Sounding Unit) and SSM/T (Special Sensor Microwave -Temperature) sensor series. Both of these instruments only had channels between 50 and 60 GHz (see for example Grody, 1983;Liou et al., 1981). The SSM/I Published by Copernicus Publications on behalf of the European Geosciences Union.
(Special Sensor Microwave -Imager), introduced in 1987, had humidity, cloud liquid water and precipitation as the main atmospheric targets, with channels at 19.4, 22.2, 37.0 and 85.5 GHz (see for example Schluessel and Emery, 1990), and thus extended the coverage of the microwave region to higher frequencies. The next main step was taken during the 1990s with the SSM/T-2 (Special Sensor Microwave -Humidity) and AMSU-B (Advanced Microwave Sounding Unit -B) instruments that both included three channels around 183.3 GHz (see for example Spencer et al., 1989;Saunders et al., 1995). One of the main motivations for extending the coverage up to 183 GHz was to obtain vertical information on humidity, and not only column values. Today a relatively high number of microwave sensors are operational, mainly in sun-synchronous orbits but also in other orbits, such as SAPHIR (Sondeur Atmospherique du Profil d'Humidite Intertropicale par Radiometrie) and GMI (Global precipitation mission Microwave Imager). Across-track and conically scanning microwave radiometers are by tradition denoted as sounders or imagers, respectively. The origin to this classification is that conically scanning instruments have tended to be optimised for deriving surface properties, while vertical sounding of the atmosphere has mainly been implemented through across-track instruments.
In NWP the capability of providing information on temperature and humidity with no or a small impact of clouds has traditionally been seen as the main justification for launching microwave sounders. It has been recognised that passive microwave data also contain valuable information on clouds and precipitation and these features have been used in various stand-alone retrievals (e.g. Spencer et al., 1989;Weng et al., 2003;Andersson et al., 2010), but this fraction of the data has been rejected inside NWP as assimilation systems have been incapable of using these observations. This situation has started to change, and there are already indications of a strong increase in the relative impact of microwave data inside NWP .
The present growing impact of microwave data is mainly due to improved assimilation software in combination with increased computing power, but new versions of the instruments having a higher number of channels has also been beneficial. But one limitation has remained for two decades: that operational microwave observations are so far limited to frequencies below 195 GHz. This situation will change in 2023 with the launch of ICI (Ice Cloud Imager), which will extend the coverage up to 670 GHz. ICI is one of the instruments planned for the next generation of Metop satellites; see Sect. 2 for further details. The frequencies 195 and 670 GHz correspond to wavelengths of 1.5 and 0.45 mm, respectively, and ICI will thus open up the sub-millimetre region for NWP.
The main objective of ICI is to provide data on humidity and ice hydrometeors, particularly the bulk ice mass. The advantage of using sub-millimetre observations for deriving such information was first pointed out by Frank Evans and coworkers in a series of articles (Evans and Stephens, 1995a, b;Evans et al., 1998Evans et al., , 1999Evans et al., , 2002. The initial idea was to have a sub-millimetre instrument on board CloudSat to complement its cloud radar, but this part was later descoped. The idea of a sub-millimetre cloud ice sounder was picked up again in a mission called CIWSIR, which was proposed to the European Space Agency ESA as an "Earth Explorer" in 2002 and again in 2005 . CIWSIR was not selected, but ESA funded preparatory studies that lead to a consolidated mission proposal called CloudIce for Earth Explorer 8 in 2010 (Buehler et al., 2012). It featured channels near 183.31, 243.20, 325.15, 448.00 and 664.00 GHz. Shortly thereafter, a similar sensor was also proposed for the international space station (ISS-ICE), with a reduced set of channels.
While CloudIce was not selected for Earth Explorer 8, it was taken as blueprint for the ICI instrument part of EUMET-SAT Polar System -Second Generation (EPS-SG). Its channel configuration, given explicitly in Table 1, is identical to CloudIce, except that the number of channels near 183 GHz was reduced from 6 to 3.
ICI will be the first operational sub-millimetre mission, but measurements by other instruments of our atmosphere at such wavelengths already exist, mainly by limb sounding instruments. The main objective of these instruments is to monitor gases in the strato-and mesosphere, but at their lowest tangent altitudes they perform observations that have similarities with ICI. Retrievals of ice cloud mass have also been developed for all three sub-millimetre limb sounders launched so far: Aura MLS (Wu et al., 2006), Odin/SMR  and SMILES (Millán et al., 2013;Eriksson et al., 2014). Observations at 887 GHz were recently demonstrated by a "cubesat" mission (IceCube, Wu, 2017). The observation approach behind ICI has also been used by some airborne instruments. The pioneering instruments were MIR (Millimeter-wave Imaging Radiometer) and CoSSIR (Compact Scanning Submillimeter Imaging Radiometer) (Wang et al., 2001;Evans et al., 2005). More recently, ISMAR (International SubMillimetre Airborne Radiometer) has been developed largely to support the preparations for ICI . These instruments and the associated data analysis, besides their intrinsic scientific value, provided justification for ICI in the selection process and provide useful input when designing processing algorithms for ICI.
It is expected that ICI data will be used in two main ways. In NWP the data will mainly be ingested as basic radiances; for a review of challenges, expected benefits and approaches of "all-sky" assimilation, see Geer et al. (2018). The data of ICI can also be "inverted" in stand-alone algorithms to produce a number of geophysical quantities; see Buehler et al. (2012). The produced retrieval datasets can be concerning for short-term weather forecasting, but will likely mainly be used for different climate applications, such as the verification of global models made by the similar ice cloud products derived from limb sounders (e.g. Li et al., 2005;Eriksson et al., 2010;Jiang et al., 2012). This article describes activities performed Atmos. Meas. Tech., 13, 53-71, 2020 www.atmos-meas-tech.net/13/53/2020/ under the auspices of EUMETSAT in preparation of a "dayone" retrieval product (i.e. the product released directly after commissioning), as well as to provide general support for using ICI data. The ICI instrument and its main characteristics are introduced in Sect. 2, while the following Sect. 3 outlines the retrieval algorithm in focus. The expected performance is investigated in Sect. 4 using simulated data. The two final sections provide an outlook and conclusions.
2 The Ice Cloud Imager

Overview of EPS-SG
The ICI mission is part of the EUMETSAT Polar System second-generation system (EPS-SG). The space segment will consist of two satellites, referred to as Metop-SG satellite A and B. There will be three satellite pairs, where each satellite will have a nominal lifetime of 7.5 years to span a total operational lifetime over 21 years. These satellites will fly, like the present Metop, in a sun-synchronous mid-morning orbit at 09:30 local time of descending node. The altitude profile over the Earth geoid varies between 848 and 823 km (832 km mean altitude). The orbit repeat cycle will be 29 d (412 orbits per repeat cycle). The main ground station will be Svalbard, but also McMurdo will be used to improve the timeliness of data. The ground segment also includes regional ground stations for receiving a Direct Data Broadcast. See https:// www.eumetsat.int/website/home/Satellites/FutureSatellites/ EUMETSATPolarSystemSecondGeneration/index.html (last access: 15 December 2019) for further details.
ICI will be on board the B satellites, also carrying MWI (Micro-Wave Imager), SCA (Scatterometer), RO (Radio Occultation sounder) and ARGOS-4 (Advanced Data Collection System). In particular, MWI is a conically scanning radiometer which observes 18 frequencies ranging from 18 to 183 GHz. All channels up to 89 GHz will observe in dual polarisation, while only vertical polarisation will be provided for higher frequencies. MWI has the same requirements for incidence angle and fore-view observation as ICI. Combined, the MWI and ICI radiometers will provide an unprecedented set of microwave passive measurements, from 18.7 GHz up to 664 GHz.
It is noteworthy that MWI will cover the 118.75 GHz oxygen and the 183.15 GHz water vapour molecular transitions with four and five channels, respectively. This gives MWI sounding capabilities, and this instrument narrows down the traditional separation between "imagers" and "sounders".

The receiver package
The ICI radiometer (Bergadá et al., 2016) consists of seven double sideband front ends, operating with local oscillator (LO) frequencies of 183.31, 243.20, 325.15, 448.00 and 664.00 GHz. The frequencies 183.31, 325.15 and 448.00 cor-respond to three water vapour transitions, while 243.20 and 664.00 GHz are "window" channels (see Buehler et al., 2007, Fig. 10). There is a receiver at each of these LO frequencies providing data matching vertical (V) polarisation inside the atmosphere. At both the two window frequencies there is also a second receiver covering horizontal (H) polarisation. A spectrometer of filter-bank type is attached to each front end. The receiver package will be kept in thermal balance by passive cooling. Presently, the receiver noise temperature is expected to be about 600, 900, 1700, 1500 and 2600 K at the five LO frequencies, respectively. See for example Janssen (1993) for an introduction to the concept of receiver noise temperature.
For the window frequency receivers the filter-bank consists of a single channel, while the other filter-banks have three channels each. Position and width of all the channels are reported in Table 1 and are visualised in Fig. 1.

Antenna system, scanning and calibration
The receiver package is integrated with a conically scanning antenna system. The diameter of the main reflector is 0.26 m (slightly elliptical), and the system is rotating at 1.333 Hz (i.e. 45 r.p.m.). Atmospheric observations are made over about 120 • , around the platform's (forward) flight direction. This gives a swath width of roughly 1500 km. The platform will perform yaw manoeuvres to keep the swath centred around the sub-nadir orbit track. During the remaining part of each rotation, calibration data will be obtained by observing "cold sky" and an internal calibration target that will have a temperature of around 300 K. The overall requirement on random (NE T) and systematic (bias) uncertainties of calibrated antenna temperatures are found in Table 1.
The horn antennas are designed to keep the angular resolution the same between channels (about 0.5 • ), but the footprints of the receivers still differ, as the antenna of each front end is placed at a different position in the focal plane. The angular offsets are found in Table 1. The reference angle for the elevation offsets is 44.767 • , measured from the nadir direction. This gives a configuration of instantaneous footprints at surface level as depicted in Fig. 2, with surface incidence angles varying between 51.5 • and 53.8 • . The instantaneous footprint sizes at surface level are about 17 (20 km) along the track and 7.3 (8.5 km) across-track for the footprints having a positive (negative) elevation offset (at −3 dB, and slightly varying with latitude). The angular movement inside the integration time increases the effective across-track size.
Although the combination of conical scanning and the platform's movement in total gives a continuous coverage over the swath, there will not be any perfect matches in horizontal coverage between the channels. Accordingly, some post-processing is required to obtain data suitable for an inversion using channels from more than one front end. To support footprint "remapping" a high across-track sampling will be applied, and data will be recorded every 0.661 ms. This Table 1. Specifications of the ICI receiver. ICI has double sideband receivers, indicated by ± in the third column, and the bandwidth refers to the width of single passbands, i.e. the intermediate frequency bandwidth. "NE T" and "Max bias" are reported as the requirements, and final performance should be better. Further comments are found in the text (the last two columns are discussed in Sect. 3).

Channel
Frequencies Bandwidth Polari-NE T Max bias Elevation Azimuth τ = 1 Ozone corresponds to an across-track movement of the bore sights between samples of about 2.7 km, giving 785 samples per scan. The distance along the track between subsequent scans will be about 9 km. This gives substantial overlap of sample footprints, both in along-and across-track dimensions, giving some freedom in setting the target resolution in the remapping of footprints. The requirement on final footprint size is 16 km (as the average between along-and across-track resolution), and the requirement of NE T, for example, is defined for this horizontal resolution. The noise in individual samples will be higher. It is expected that averaging over four subsequent across-track samples will meet the requirements, and about 200 footprints per scan will effectively be provided. L1b data will only contain the original samples; the optimal remapping will differ depending on application.

Aim and constraints
The planned output of the EPS-SG Overall Ground Segment at EUMETSAT Headquarters includes the MWI-ICI-L2 product, which will contain retrievals based on MWI and ICI and be delivered in near real time. The objective of the IWP product of MWI-ICI-L2 is to provide a day-one robust retrieval that reflects the main information content of ICI radiances. For some centrally generated level 2 products, the EUMETSAT Satellite Application Facilities (SAFs) provide support by specifying the level 2 processing algorithms and share responsibility for the products. The SAF supporting nowcasting (NWC-SAF) retains the scientific ownership of the IWP product and delivered the IWP algorithm theoretical basis definition . To allow for the procure-ment and implementation in the ground segment, the IWP algorithm definition had to be finished during 2018, with further changes in the algorithm specifications so that the basic architecture and design would not be impacted. The efforts so far have focused on the core algorithm and the retrieval database discussed below has been produced as an initial working basis. Future studies will be required to elaborate the final database. Additional products from ICI will be generated directly by the SAFs located at weather services in EUMETSAT member and co-operating states.

Overview
A first, crucial decision was the selection of retrieval approach. "Optimal estimation" (a.k.a. 1DVAR) was not selected as it would demand a forward model handling multiple scattering of polarised radiation and capable of providing the Jacobian with respect to the retrieval quantities. Such a model was simply not at hand. With respect to sub-millimetre cloud observations, optimal estimation has so far only been used for theoretically inclined studies (Birman et al., 2017;Grützun et al., 2018;Aires et al., 2019). Further, the retrieval problem at hand is non-linear and involves non-Gaussian statistics, and a more general solution of the Bayes theorem should be preferable. For practical reasons this leads to approaches based on a retrieval database (Rydberg et al., 2009). The most straightforward implementation can be denoted as BMCI (Bayesian Monte Carlo integration), and has been the method of choice for Evans and coworkers (e.g. Evans et al., 2002).
There are close connections between BMCI and the standard use of neural nets (NNs Pfreundschuh et al., 2018). Such NNs, a form of machine learning, have been applied on both simulated ICI data Wang et al., 2017) Atmos. Meas. Tech., 13, 53-71, 2020 www.atmos-meas-tech.net/13/53/2020/  and ISMAR field data . Both approaches (BMCI and NNs) were considered initially, but NNs were eventually rejected as it was found that a very high number of nets would be required and there was no established way to estimate retrieval uncertainties. Following the selection of BMCI, a complete retrieval algorithm was designed (Fig. 3). The algorithm consists of two main parts: a series of pre-processing steps and the actual inversion by BMCI. Only the most critical aspects are discussed in the following sections; for details we refer to . The generation of the final retrieval database is a task for the future, but a possible manner to generate the database is still outlined in Sect. 4.2.

Input and output
The main input to the retrieval algorithm are geo-located and calibrated antenna temperatures, i.e. L1b data. Data from a number of footprints will be involved in each inversion, being remapped to the target footprint specified (Sect. 3.4.1). The target footprint also governs the extraction of geophysical variables (Sect. 3.4.3). All important retrieval parameters are set by a configuration data structure.
The main output variables (L2) are ice water path (IWP), mean mass height (Z m ) and mean mass diameter (D m ). These three variables are all reported as percentiles of the estimated posterior distribution (Sect. 3.5.1) and are defined as antenna weighted means. For example, the reported IWP is an estimation of where r is the antenna (or radiation) pattern (in sr −1 , with the satellite as reference point); is antenna pattern solid angle, x, y and z are Cartesian coordinates (with arbitrary origin); and IWC is ice water content: where n is particle size distribution, m is particle mass and d veq is equivalent volume diameter (ρ is the density of ice): That is, d veq is the diameter of an "ice sphere" with the same mass. The start of the altitude integration in Eq. (1), z 0 , is presently set to be the surface altitude, but it can be changed. Mean mass height is defined as and mean mass size as (cf. for example Delanoë et al., 2014, Eq. 3) These equations are applied to calculate the IWP etc. of the database cases, and thus will represent the "true" values. As these equations take inhomogeneities into account, both vertically and horizontally, the impact of "beam filling"  will automatically be included in the estimated retrieval uncertainty. The L2 data will contain further data, such as retrieved water vapour column, but the exact L2 format is not finalised and only the three main retrieval quantities are discussed below.
3.4 Pre-processing part 3.4.1 Target footprint and remapping of data The exact geo-location of samples differs between channels (Sect. 2.3), but the time integration of individual samples is shorter than the time period necessary to sweep out a single projected field of view. This allows for a footprint-matching procedure by remapping of the original data. A toolbox for performing such remappings has been developed in a dedicated study issued by EUMETSAT (Rydberg and Eriksson, 2019). The toolbox is based on the Backus-Gilbert methodology (Backus and Gilbert, 1970;Stogryn, 1978), which was previously successfully applied for footprint-matching between various satellite data (e.g. Bennartz, 2000; Maeda and Imaoka, 2016).
In short, the Backus-Gilbert methodology can be used to obtain a set of optimal weighting coefficients for neighbouring samples, both within the scan and from adjacent scans, to create a remapped representation of the data matching a specified target footprint. A remapped value is a linearly weighted combination of data of the channel of concern. The weights are found, after a trade-off analysis, by minimisation of a penalty function that considers both the effective noise of the remapped data and the fit to the target footprint.
The centre position of a retrieval is set by selecting one of the sample footprints of ICI-1V. The exact shape of the target footprint around this position will be determined later, but it is expected to be ≈ 16 km and close to circular. The effective noise of remapped samples should be equal or below the "NE T" reported in Table 1. Example results are found below, in Sect. 4.1.

Bias correction
The algorithm allows for a simple "bias correction" of the data: where T c a,j is the corrected antenna temperature for channel j , T a,j is the value as given by the remapping toolbox and a j , and b j are channel specific coefficients.
The purpose of the bias correction is to remove systematic differences between remapped L1b data and the simulations behind the retrieval database. A bias can originate from calibration issues, the remapping and incorrect spectroscopic data in the simulations, for example. This module will only be applied as a rough temporary solution if any bias is de-Atmos. Meas. Tech., 13, 53-71, 2020 www.atmos-meas-tech.net/13/53/2020/ tected, until the source to the bias has been understood and corrected.

Geophysical data and RTTOV
The retrieval performance can be improved by incorporating various geophysical data. These data will be taken from the ECMWF forecast system. Data of dynamic character that will be used include temperature, ozone and surface wind speed, while static data are various parameters to characterise surface altitude and type. The water vapour profile is also imported from ECMWF, but it is modified below the tropopause to have a constant relative humidity (a configuration setting). The logic behind this approach is to incorporate information on atmospheric temperatures and ozone from ECMWF (ICI has no temperature channels), for example, while letting humidity be constrained by the ICI data. The last column in Table 1 gives the mean impact of ozone based on a set of simulations. The maximum impact found was 2.1 K, for ICI-11 and a mid-latitude winter scenario.
Using the ECMWF data as input, radiative transfer calculations will be performed by applying the RTTOV software (Saunders et al., 2018), to obtain the first estimate of the atmospheric optical thickness and a reference antenna temperature (T r a ). These calculations assume "clear-sky" conditions (i.e. no impact of hydrometeors), are run for all ICI channels and are discussed further below.

Channel selection
Modelling of surface effects will, at least initially, be one of the main obstacles for these retrievals. Simulating these effects for land surfaces is already a challenge at low microwave frequencies. The situation for water bodies is better, particularly as the TESSEM sea-surface emissivity parameterisation has been updated to cover the full frequency range of ICI . Some validation of TESSEM has been made (using ISMAR), but presently relatively large model uncertainties are expected even for water surfaces.
The impact of surface effects on measured radiances depends mainly on the atmospheric transmission. The transmission varies strongly between the ICI channels, as exemplified in Fig. 4. It also varies with the atmospheric situation. Estimates of the altitude at which the transmission to ICI equals e −1 , for clear-sky conditions, are found in the column "τ = 1" of Table 1. The lowest altitudes are associated with the driest atmospheric scenario considered, and vice versa. The table shows that surface effects are in general of no concern for ICI 7V, 10V, 11V and 11H, for example, while for some channels the surface must always be considered.
As a consequence, an adaptive selection of data is required. A channel mask is formed by evaluating where τ cs,j is the clear-sky optical thickness of channel j obtained by RTTOV, c hm is a configuration setting, τ hm,j is estimated additional optical thickness due to hydrometeors and τ s t is a threshold value for surface type s. Data from channels fulfilling this criterion are included in the calculations. In the pre-processing part τ hm is set to zero. The channel mask is re-evaluated as part of the BMCI module. At this stage, hydrometeor attenuation is considered. Both c hm and τ s t are configurable variables, and the latter is specified for five different surface types. The selection of τ s t should consider the extent to which surface emissivity variability is represented in the final retrieval database, as well as the extent to which the error model in Sect. 3.5.3 covers remaining modelling uncertainty.

Detection of clear-sky data
The algorithm includes an optional module for identifying observations that with a high probability match clear-sky conditions, which thus can be set to give IWP = 0 without doing an actual inversion. This procedure results in the L2 structure not being fully filled, e.g. the water vapour column will not be retrieved, and this module will only be activated if it will be necessary to decrease the overall calculation burden of the processing. As the module likely will not be applied, we refer to  for details.

Theory and retrieval representation
The retrieval is performed by the BMCI method (Sect. 3.2). For a description of BMCI and its relationship to Bayesian estimation, see for example Kummerow et al. (1996) or Pfreundschuh et al. (2018). In short, BMCI is based on a "retrieval database" consisting of n pairs of atmospheric state, x i , and corresponding observation, y i , with the constraint that x i is approximately distributed according to reality, i.e. represents the prior distribution of x. The essence of BMCI is, for a given measurement y, to attribute a posterior probability, p i (x i |y), to each database state as where w i is a measure on the agreement between y and y i , and with S o being the covariance matrix describing measurement and forward model uncertainties (Kummerow et al., 1996). The factors a i can be seen as a priori weights. They can be used to optimise the retrievals for a given database size. For example, it could be justified to accept cases with IWP = 0 only with some probability r < 1 during database generation (Sect. 5). If this thinning is performed, remaining database cases with IWP = 0 will obtain a i = 1/r (instead of 1). As Eq. (9) involves S o , this has the consequence that all uncertainties covered by this covariance matrix must approximately follow Gaussian statistics (as for 1DVAR). On the other hand, BMCI allows any prior distribution of variables (unlike 1DVAR), and "outliers", for example, can be included in the generation of the retrieval database.
The actual solution of BMCI is the estimated posterior distribution (as for all Bayesian methods), but it is unpractical to report sets of p. Some more compact description is needed. If the posterior distribution follows a Gaussian distribution it suffices to report the expectation value and the width of the distribution. ICI retrievals do not fall into this category and it was decided to instead use a more general description based on the cumulative distribution function, in the continuous case defined as where p denotes a probability density function, and in the framework of BMCI is obtained by summing the probability of all cases with x i < x: Using Eq. (11), F x|y is calculated on a wide grid of x values. These data are then used to obtain the inverse distribution function, F −1 , numerically by interpolation to a set of fixed percentiles. A more descriptive name of F −1 is the quantile function. For example, F −1 (0.5) is the median and the 90th percentile is F −1 (0.9). Figure 5 exemplifies prior and posterior quantile functions. It is presently planned to report the 5th, 16th, 50th, 84th and 95th percentiles in the L2 data. If the retrieval must be condensed to a single value, the first candidate to "best estimate" should be the 50th percentile. The other percentiles can be used in different ways. For example, if the 5th percentile for IWP is > 0 then a correct detection of ice hydrom-Atmos. Meas. Tech., 13, 53-71, 2020 www.atmos-meas-tech.net/13/53/2020/ eteors is highly probable. The 16th-84th percentile range matches ±1σ for a Gaussian distribution. The true value is between the 5th and 95th percentiles with a probability of 90 %, etc.

Database extraction and iterations
Not all database cases are included in the BMCI summation, a filtering is done based on surface type, pressure, wind speed and temperature, as well as T a (as defined below in Eq. 12).
Wind speed is applicable only over water. The database extraction is done in an iterative manner, where the filter limits are adjusted with an iteration counter, in order to fetch both the most relevant and a sufficient number of matches. The filtering does not involve latitude or season. This results in, for example, a tropical database case being able to influence the inversion of a mid-latitude summer measurement, if there is a match in surface temperature etc. An additional iteration scheme has been added around the core BMCI calculations. One of the first reasons is to better make use of the observations in situations with significant hydrometeor contents. The optical thickness associated with hydrometeors is estimated alongside of the L2 data in each iteration. Based on this updated estimate of the total optical thickness, Eq. (7) is reevaluated for all channels. If this results in more channels being able to be included, BMCI is reiterated with the new channel mask. This iteration is important as the channels that are sensitive to the surface in a clear-sky situation, and thus ignored in the initial iteration, are the most important ones to obtain good estimates at high IWP.
The second reason is to handle the fact that the retrieval database only provides a discrete coverage of the distribution of y. If one y i happens to agree closely with y, one w i can be orders of magnitude bigger than all other w and the summation in Eq. (8) will be dominated by one database case. While the median value found can be realistic, this results in an underestimation of the retrieval uncertainty. It could also be the case that no y i gives a significant match with y. Both of these situations are primarily handled by increasing the variances in S o , effectively making the "search radius" larger. If this does not suffice, channels will be rejected until an acceptable number of significant weights are obtained.
For further details of the filtering and iteration schemes, see . All critical parameters are part of the configuration data.

Measurement vector and uncertainties
The measurement vector (y) incorporates data from channels fulfilling the optical thickness criterion of Eq. (7) as a difference: where T c a,j is defined by Eq. (6) and T r a,j is a simulated antenna temperature (by RTTOV, Sect. 3.4.3). To match this, the retrieval database contains the difference between a full (all-sky) simulation and one (clear-sky) matching T r a,j . The matrix S o (Eq. 9) represents both instrument and simulation uncertainties. It is kept diagonal due to a lack of relevant information on uncertainty correlations between channels. The knowledge regarding such correlations is especially poor for surface emissivity. The variances σ 2 are set as where NE T is uncertainty due to thermal noise and calibration. The second term aims to represent the impact of unknown surface emissivity, where is emissivity uncertainty, T skin is the ECMWF surface skin temperature, and it is assumed that the emissivity is relatively high. The antenna temperature is then approximately T a = T skin e −τ cs ,j + T e (1−e −τ cs ,j ), where T e is an effective temperature of the atmosphere, and thus dT a /d ≈ T skin e −τ cs ,j .
The last term covers uncertainty in modelling of hydrometeor scattering. To our best knowledge, no investigation of such modelling errors has been made. The uncertainty is zero for clear-sky conditions, and it should in general increase with the strength of scattering. Based on these two simple observations, we decided to simply model the error as proportional to the deviation from the clear-sky reference simulation. NE T for each channel (j ), for water and land, and c are constants, part of the configuration data.

Remapping of data
Samples from all ICI channels will be convolved into the field of view of ICI-1V. This section summarises the main findings obtained by applying the Backus-Gilbert toolbox developed (Sect. 3.4.1).

Simulate test data
To test the toolbox four full orbits were simulated. The orbit parameters were taken from Metop-A (orbits 4655, 4656, 6985 and 9744). Geophysical data for the time of the four orbits were taken from ERA5 (https://www. ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5, last access: 15 December 2019). ERA5 lacks data on precipitation of a convective nature. To compensate for this, precipitation was added from a separate database (provided by Alan Geer at ECMWF), based on similarities of large-scale precipitation profiles and some other variables. Using these data, radiances were simulated for all MWI and ICI channels, covering an area broader than the instrument's swath and representing a set of incidence angles.
These simulations were done by running the ARTS software in its three-dimensional mode .

P. Eriksson et al.: Towards an operational ICI retrieval product
Absorption due to gases and liquid water content was calculated following Rosenkranz (1993Rosenkranz ( , 1998 and Ellison (2007), and surface emissivities following Prigent et al. (2017) and Aires et al. (2011). The size distribution of rain drops and ice hydrometeors were set following Abel and Boutle (2012) and Field et al. (2007), respectively. Particle properties were taken from Eriksson et al. (2018), applying ID 25 for rain and IDs 15 and 20 for ice hydrometeors (the name of these habits are found in Table 2).
Using the set of pre-calculated pencil-beam radiances as a "look-up" table, antenna weighted brightness temperatures could be generated with a relatively low calculation burden, taking full account of MWI's and ICI's scanning and footprints characteristics, for different assumptions of the exact syncing between the instruments. In all parts, the WGS-84 reference ellipsoid was applied.

Main findings
The assumption here is that the goal of the remapping is to obtain data as would be observed with a synthetic instrument with a common footprint for all channels (implying the same surface incidence angle for all channels). As will be shown, this cannot be achieved perfectly. However, these "errors" can at least partly be considered in the retrieval process and the final impact can be relatively low. Most importantly, the basic impact of different incidence angles can be included in both 1DVAR and BMCI retrievals.
A bias-free convolution was demonstrated as long as the remapping does not involve a change in incidence angle. However, this is strictly true only for ICI-2V and ICI-3V. These two channels share bore sight with ICI-1V, but the antenna patterns differ somewhat and a remapping is still required. Figure 6 exemplifies the issues that appear for the other channels, with an incidence angle that differs from the one of ICI-1V (Table 1). Considerable remapping errors are found for ICI-11V. The elevation offset of this channel deviates to ICI-1V with 1.53 • , which scales to a ∼ 2 • lower incidence angle at surface level. This angular difference results in a remapping error even in the absence of hydrometeors, as exemplified by the upper-left portion of the simulated area. The remapping generates data that are 0.4-0.6 K too warm, as the toolbox cannot compensate for the original difference in incidence angle. The "clear-sky" brightness temperature is higher at a lower incidence angle.
The same effect can be noted for areas with relative homogeneous cloud distributions, but brightness temperatures vary more strongly with incidence angle in cloudy conditions and the error for ICI-11V is here instead about 0.5-2 K (see for example the area directly south of 0 • N, 10 • E). Further, at the edges of areas with hydrometeors even higher errors can be noted, as well as errors of opposite sign. These errors originate in horizontal inhomogeneities. The target footprint is defined at the altitude of the reference ellipsoid and the remapping is optimal for this altitude. However, the effective footprint for some altitude inside the atmosphere is the one at zero altitude projected upwards, following the incidence angle of each channel. That is, the footprints will not overlap perfectly, with a horizontal displacement that increases with altitude. As example, for ICI-11V it is about 700 m at 12 km. Hence, the noted errors match the change in brightness temperature for horizontal shifts of that order. However, the atmospheric data used in the simulations do not have this high horizontal resolution and the magnitude of these errors are just indicative.
The errors found for ICI-5V (Fig. 6) show a similar spatial pattern, but have the reversed sign and are of lower magnitude. This is expected as the zenith offset of ICI-5V is only 0.04 • smaller than the one of ICI-1V, in contrast to the larger, positive shift for ICI-11V.

Generation of retrieval database
Retrieval databases for ICI must so far be generated by radiative transfer simulations. The input to the simulations can be obtained from atmospheric models, providing a sufficiently detailed description of hydrometeors Brath et al., 2018). This approach relies on the model mimicking reality with sufficient accuracy, as it represents the estimated posterior distribution for the BMCI retrieval. Another option is to base the simulations directly on obser-Atmos. Meas. Tech., 13, 53-71, 2020 www.atmos-meas-tech.net/13/53/2020/ vations as far as possible. As the spatial resolution of ICI is limited, the most important input is information on vertical and horizontal structures in hydrometeor fields. Today such data are available through cloud radars, even on global scale by CloudSat (Stephens et al., 2002). The cloud radar data can be used in various ways. Some options are explored by Evans et al. (2002Evans et al. ( , 2012, while the results reported below are based on the methodology developed in Rydberg et al. (2007Rydberg et al. ( , 2009. The basic idea is to produce simulated passive observations that are consistent with the basic information provided by the radar, i.e. measured reflectivities. This is done for some assumptions of particle size and shape distributions. That is, external retrievals of IWC, for example, are not involved; the mapping from radar reflectivities to particle optical properties at the frequencies of the passive data is done by an internal, implicit retrieval. The retrieval database should contain simulations for a set of different particle assumptions, to reflect the variability and our limited knowledge of particle shapes and sizes. Remaining atmospheric data can be taken from some analysis (such as ECMWF's ERA5), but this still represents a drawback of the approach as consistency is not guaranteed. Most importantly, corrections are likely required to avoid improbable relative humidities where hydrometeors are present.

Test retrieval database
At this stage, the simulations are based on stretches of Cloud-Sat reflectivities, selected randomly and with no preference regarding longitudes, land-ocean etc. That is, the simulations have two dimensions, vertical and along the track. The ICI slant geometry and antenna pattern are represented fully inside this 2-D geometry. So far the extension of the antenna pattern in the across-track dimension is neglected, but can be included by mapping the CloudSat data to three dimensions (Rydberg et al., 2009). Consideration of the antenna pattern is required to avoid systematic modelling biases due to "beam filling" .
The procedure applied to map radar reflectivities to microwave radiances is described in detail by Ekelund et al. (2019). The radiative transfer calculations were performed with the ARTS software , using its interface to the RT4 (Evans and Stephens, 1995b) scattering solver. RT4 is applied following the "independent beam approximation" (see further Sect. 5), inside the twodimensional atmosphere formed based on the CloudSat data. Absorption due to gases and liquid water was treated as in Sect. 4.1.
The microphysical models applied are described in Table 2. For middle and high latitudes simulations with a modified gamma distribution (for two habits) were also produced, but the resulting D m was found to be unrealistically high (a significant fraction above even 2 mm) and this part of the database is here rejected. Oriented and melting particles are so far ignored.
Observations over both water and land were simulated. Ocean surface emissivity was modelled according to Prigent et al. (2017). Due to a lack of any model for ICI's frequency range, land emissivity was simply set to vary randomly around 0.9 (with a log-normal distribution). The data used below contain in total 6.2 × 10 6 cases, based on 1373 CloudSat orbits between September 2015 and January 2016.
Besides the retrieval database, a smaller dataset was also simulated for channels 16-21 of the Advanced Technology Microwave Sounder (ATMS) (Weng et al., 2012) and a statistical comparison to actual observations was made. The simulated data were generated exactly as done for the retrieval database, except that the footprint averaging followed the specifications of ATMS. Example results are displayed in Fig. 7. The peak in the distribution around 255 K corresponds to "clear-sky" situations (low-level cloud can still be present), while most cases below ∼230 K should contain influences of ice hydrometeors. The agreement between simulations and observations is high down to about 200 K. For lower brightness temperatures the simulations show higher occurrence rates than the observations. This deviation is at least partly a consequence of the full antenna pattern and particle orientation not yet being considered in the simulations. The better agreement for nadir simulations, where ATMS has a smaller footprint, indicates the impact of the first of these two effects. By assuming totally random particle orientation, radar back-scattering is under-estimated and our procedure Table 2. Combinations of particle size distribution and habit model included in the test retrieval database. McFarquhar and Heymsfield (1997) is shortened to MH97. Field et al. (2007) defined a tropical and a mid-latitude version of their size distribution, and both are used. Each habit model consists of single scattering data selected from Eriksson et al. (2018), where both name and ID number are specified. will generate clouds with a high bias in IWC. There is a compensating effect when simulating the passive data, by a similar under-estimation of extinction, but it is smaller, at least for angles away from nadir where particle orientation has a smaller impact on the projected cross section . The approach behind the database generation reproduces GMI (Draper et al., 2015) data in a similar manner, even when focusing on the tropical Pacific where deep convective systems control the impact of ice hydrometeors on ICI and the radiative transfer simulations are especially challenging .
A similar comparison is found in Fig. 13 of Geer and Baordo (2014). They obtained a poorer agreement with observations, with an underestimation starting at about 225 K. Similar particle models were used and the better agreement found here is likely a consequence of the simulations being based on CloudSat, and not model data. The agreement is similar for the other ATMS channels considered; see . A graphical manner for exploring whether the retrieval database covers the multi-dimensional space spanned by the observations to be inverted is found in Brath et al. (2018, Fig . 2).

Degrees of freedom
As an introduction to the information provided by ICI, Fig. 8 displays an estimate of the measurements' degrees of freedom (DoF) for tropical conditions. The DoF can be seen as a measure of the effective number of channels.
Each DoF value is calculated by finding the (left) eigenvectors (E) of the simulated set of measurement vectors in consideration (without noise added). These eigenvectors and the covariance matrix (S y ) of the data are related as follows: where is a diagonal matrix, holding the eigenvalues. See for example Eriksson et al. (2002) for further details. The uncertainty due to thermal noise, in the eigenvalue space, is where S has NE T 2 as its diagonal elements and is zero elsewhere (see Eq. 13). As S is diagonal, S will also be diagonal due to properties of the eigenvectors (orthonormality). The number of diagonal elements in S y that are larger than the corresponding value in S can be taken as the DoF. This calculation of DoF is essentially the same as the analysis described in Sect. 2.4.1 of Rodgers (2000), but is somewhat more general as it is based on S y and does not involve the Jacobian matrix, so it can be easily computed even in cases where Jacobians are not available. For very low IWP and most wet atmospheres, the DoF is only 2. For these conditions, ICI is primarily sensitive to humidity in the middle and upper troposphere. The DoF increases with decreasing IWV, as humidity at lower altitudes then gets a growing impact. The DoF is here about 3, consistent with the fact that ICI has three channels around each water vapour transition covered (1V-3V, 5V-7V and 9V-11V, respectively), and that there is a high redundancy in information between these groups of channels (which together give an improved precision for water vapour retrievals). Figure 4 shows that the two innermost 448 GHz channels cover higher altitudes than the other channels, but it appears that these two channels add little information in single-footprint retrievals due to relatively high noise (Table 1). A further analysis of ICI's overall performance for clear-sky conditions is left for a future study. For most dry atmospheres, there is also a surface contribution to the DoF, mainly by channels 4V and 4H, from the various variables affecting surface emission and reflectivity.
The DoF is considerably higher at high IWP. The maximum DoF in Fig. 8 is 8, but the true number is likely higher. The figure is based on simulations only including totally random particle orientation and thus the full information given by the dual polarisation channels is not reflected. The simulations also lack melting particles and still use a relatively low number of particle models, and the full variability of hydrometeors is probably not yet reflected.
There is an intermediate range, extending between about 10 and 500 g m −2 , where DoF is increasing with IWP. This analysis shows that ICI acts mainly as a coarse humidity sounder for IWP below ∼ 10 g m −2 , but, as designed, pro-Atmos. Meas. Tech., 13, 53-71, 2020 www.atmos-meas-tech.net/13/53/2020/ vides more rich data with increasing ice hydrometeor content. This indicates that ICI is suitable for measuring IWP, but the DoF gives no information on retrieval precision or if other quantities also can be constrained.

Overall performance
The retrieval performance was estimated by repeatedly dividing the data generated between a retrieval database and test data (Fig. 9). The algorithm described in Sect. 3 was followed, except that no footprint remapping or run of RTTOV was performed. Since particle orientation is not yet included, these retrievals did not include the extra 243 and 664 GHz channels that measure H polarisation. Noise was added following the NE T of Table 1, but present tests indicate that lower noise will actually be achieved. Both these aspects should lead to a conservative estimate of the performance at low IWP, or compensate for error sources not yet considered. The results in Fig. 9 are averages of retrievals over both water and land. The best performance is found for tropical conditions where IWP above about 50 g m −2 can be retrieved without a clear bias. ICI also provides information for lower IWP, down to about 10 g m −2 , but then with an increasing influence of prior information causing a low bias. This bias occurs because the prior data are dominated by cases with IWP = 0. Accordingly, the bias could be decreased strongly by an independent method of cloud detection, effectively removing all, or most, IWP = 0 from the prior distribution.
The retrieval precision in Fig. 9 is reported as the range between the 5th and 95th percentile. This range corresponds to a 50 % uncertainty above about 200 g m −2 . The precision is poorer for lower IWP, particularly on the 5th percentile side. This percentile reaches IWP = 0 when the true value is ∼ 15 g m −2 . Mean altitude, Z m , is well estimated over its full range (for the type of ice clouds of concern for ICI), i.e. between about 4 and 12 km, with a median precision in the order of 700 m. The retrieval of D m is best between 175 and 400 µm, where the median precision is about 50 µm, but the retrievals should be competitive between about 100 and 800 µm.
As a contrast, results for mid-latitude late-autumn-winter conditions are also found in Fig. 9. There are likely higher uncertainties in these simulations (e.g. they involve only a single particle habit) and these results should be approached with more care. Compared to tropical conditions, the performance is poorer, especially for IWP below 100 g m −2 . This is the case because the ice hydrometeors here are found at lower altitudes, often below the sounding range of the high-frequency channels. Low IWP is best estimated by the 664 GHz channels, but they have sensitivity only down to about 5 km (Fig. 4). Low-altitude clouds also make the choice of τ s t (Eq. 7) critical. For these test retrievals, it was set to 1 for oceans and 3 for all other surface types. Z m is retrieved without any significant bias between 2 and 10 km, but the posterior distribution is highly skewed below 3 km. That is, the 50th percentile is in general a good estimate, but the retrieval cannot fully rule out considerably higher Z m . The accuracy is good for D m between 150 and 600 µm, while there is a quickly growing low bias above 650 µm.
These results do not deviate significantly from earlier similar studies. The most similar one is Jimenez et al. (2007), particularly as it also used IWP, Z m and D m as retrieval quantities. They found a better retrieval performance for low IWP, which likely is due to a smaller retrieval database and fewer considered error sources. Our results should be more realistic, albeit somewhat conservative, as explained at the start of this section. Wang et al. (2017) made a study focusing on relatively severe weather over Europe and obtained similar IWP accuracy to that reported here. They did not consider Z m and D m , but retrieval of separate hydrometeor classes as well as joint inversion of data from MWI and ICI. When comparing results between studies, the error range definition considered must be noted. We use a wider range (matching ±2σ ) compared to most others.

Outlook
The basic algorithm will not be modified until some time after the launch of ICI and the main concern for the coming years is to refine the retrieval database generation. A required extension is to include particle orientation, as shown by Defer et al. (2014) and Gong and Wu (2017). The first data on scattering properties at sub-millimetre wavelengths of oriented particles have just been presented . Varying orientation distributions should be used in the database generation. Scattering solvers handling oriented particles include RT4 (Evans and Stephens, 1995b) and DOIT (Emde et al., 2004). The performance for Z m and D m is shown for states with an IWP above 25 and 50 g m −2 for tropical and middle latitudes, respectively.
In the database used in this work, a strict separation between liquid and ice hydrometers was assumed. This is a simplification in several ways. Super-cooled liquid cloud droplets are common in the atmosphere (e.g. Zhang et al., 2010), frequently as part of "mixed-phase" clouds. Results in Pfreundschuh et al. (2019) indicate that ICI has some sen-sitivity to such super-cooled liquid water and it should thus be considered in future work. Also the super-cooled liquid water in updraft regions of convective cells should be taken into account, especially as the drops here can be of millimetre size and the liquid water content can reach several grams per cubic metre (Lawson et al., 2015). This should lead to a significant impact on both CloudSat and ICI data. Finally, the impact of melting ice hydrometeors should be assessed and included if found relevant. However, data on single scattering properties of such particles are still lacking for the frequency range of ICI. A broader range of particle size distributions and particle shapes should be used, compared to the simulations used in this work. The simulations should of course make use of most recent studies of these particle properties, preferably applying data tailored for each cloud type of concern. IS-MAR should be an essential tool for validating microphysical assumptions. The first study of this type has already been performed .
On the instrument side, a more detailed treatment of the full antenna pattern is needed. This will increase the calculation burden, but to what extent is not yet known. The present assumption is that an independent beam approximation (IBA) can be applied, i.e. that the radiance at one location can be sufficiently well estimated by a simulation of onedimensional character. Test simulations have revealed that this is not true for all situations, but full three-dimensional, polarised simulations can so far only be performed by computationally costly Monte Carlo methods, and therefore IBA would be preferable. The error by applying IBA is being assessed as part of a EUMETSAT fellowship project.
As discussed in Sect. 4.1, the necessary spatial remapping of channels causes some errors due to the differences in incidence angle. These remapping errors must either be incorporated in the generation of the database or be treated as an observation uncertainty. In the latter case, an error model must be derived to set S o (Eq. 9) accordingly. The information on temperature and ozone obtained from ECMWF (Sect. 3.4.3) has uncertainty and the resulting impact on the retrievals has not yet been studied. The same is true for errors in assumed spectroscopic parameters, used to calculate the absorption due to gases. As ICI will operate in a relatively unexplored wavelength region, considerably spectroscopic uncertainties cannot be ruled out at this point (Mattioli et al., 2019). IS-MAR should also be a useful tool for validation here.
A number of retrieval configuration settings need to be determined. For example, the optical thickness thresholds (τ s t , Eq. 7) should be reevaluated at some point, preferably with improved knowledge, obtained by ISMAR, of the variability of surface emissivity at the frequencies of ICI. Another example is that a clear strategy for the database thinning discussed in Sect. 3.5.1 is lacking, and only rudimentary tests have so far been made.
In a longer perspective, joint inversions of data from MWI and ICI shall be considered. Such synergistic retrievals Atmos. Meas. Tech., 13, 53-71, 2020 www.atmos-meas-tech.net/13/53/2020/ should be especially beneficial for obtaining consistent data on liquid and ice hydrometeor properties . The remapping toolbox is prepared to handle this extension, but application of BMCI becomes more problematic as dealing with the combined measurements drastically increases the required retrieval database size. Machine learning could be an alternative. In Pfreundschuh et al. (2018) it is shown that quantiles of the posterior distribution can be estimated by neural networks more efficiently than with BMCI.

Conclusions
Ice hydrometeors presently constitute one of the components in Earth's atmosphere that are least constrained by observation and modelling systems. There is even a persistent large spread among zonal means of IWP (Waliser et al., 2009;Eliasson et al., 2011;Duncan and Eriksson, 2018). ICI will provide observations that could be used to decrease these uncertainties inside both weather forecasting and stand-alone retrievals, as well as by model verification through "satellite simulators". ICI does not offer the spatial resolution of cloud radars, such as the CloudSat one, but has the swath width needed for obtaining semi-global coverage on a daily basis. The focus of this article is the ICI retrieval algorithm ) that will be applied operationally at EU-METSAT. At the time of the algorithm selection, BMCI was judged a safer option than existing machine learning alternatives. However, since machine learning is developing rapidly, future scientific retrieval algorithms may well employ it.
The "day-one" algorithm described here aims to extract the basic information of ICI on ice hydrometeors, which is the ice water path, as well as cloud altitude and particle size. ICI also has the potential to provide profiles of ice water content Birman et al., 2017;Grützun et al., 2018;Aires et al., 2019), but that possibility has been so far left for research groups to investigate.
An innovative aspect of the new algorithm is that, to our best knowledge, it is the first example where the retrieval result is presented fully as a description of the posterior distribution (by reporting five percentiles), and not as the expectation value and some uncertainty value. This more general approach is preferable for ICI as the retrieval uncertainty can exhibit a highly skewed distribution.
The core algorithm has successfully been tested using IS-MAR and simulated ICI data, but the final retrieval performance is mainly determined by the quality of the retrieval database provided to the processing system. Such a database has been produced for test purposes and to provide updated estimates of the retrieval precision. The database reflects the state of the art, but the retrieval error estimates should still be considered as tentative because some tools needed to cover the full complexity of the observations are still lacking.
It is hard to find stringent uncertainty estimates of other IWP retrievals, but we note that the global mean of IWP given by the DARDAR inversions (mainly based on Cloud-Sat) changed by 26 % between the two most recent versions (Cazenave et al., 2019). Based on present simulations, ICI will deliver a similar accuracy at least above IWP = 200 g m −2 . Above this IWP, there is no intrinsic cause of bias in the retrievals, and the precision for single retrievals is ±50 % (at quantiles matching ±2σ ).
The use of ICI in numerical weather prediction (NWP) is not discussed here, but several activities described are also relevant for this application. The most notable example should be the development of ice hydrometeor single scattering data , which is of direct relevance for "all-sky" assimilation of ICI radiances. A problem common for NWP and stand-alone retrievals is how to incorporate the effect of ice particle orientation without making the radiative transfer calculations too costly. This extension is required to make full use of ICI's double polarisation channels at 243 and 664 GHz.
In this paper we have tried to reflect the efforts already performed to prepare for inversions using ICI, but also to indicate the work that remains to be done. Combining the data of ICI and MWI is an especially interesting prospect for future extensions. That combination can possibly provide a relatively full view of water in all of its three phases (gas, liquid and ice).
As a last remark we would like to stress that ICI will provide the first "operational" observations of our atmosphere in the sub-millimetre region and its data will cover more than 20 years. This will give the weather forecasting and climate communities a new important data source.
Code availability. The footprint remapping toolbox can be obtained by contacting EUMETSAT. It will be distributed "as is" with no warranties and on the condition of no redistribution, and this article and the EUMETSAT study with contract EUM/C0/18/4600002075/CJA are cited where used. Other numerical results in the article are based on various MATLAB and Python scripts, kept in local SVN repositories, that can be obtained by contacting the first two authors. Usage of these scripts requires assess to the ARTS and Atmlab packages, available at http://www. radiativetransfer.org/ (last access: 15 December 2019).
Author contributions. PE and BR participated in most of the activities described and lead the manuscript writing. VM and CA contributed to the work, supervising and reviewing the algorithm development in its various phases, and contributed to the paper writing. AT is in charge of the algorithm development inside NWC-SAF and has revised the paper. UK is the main ICI scientist at ESA/ESTEC and has provided the technical description. SAB has contributed input and text, and revised the paper.
Acknowledgements. The first version of the algorithm, based on neural nets, was outlined by Gerrit Holl as project scientist at SMHI. Figure 8 was produced by Simon Pfreundschuh (Chalmers). The science advisory group around MWI and ICI has given feedback on the work described here. This project would not have been possible without the many individuals that are contributing to the technical development of ICI and the development of the ARTS infrastructure.
Financial support. This research has been supported by the Swedish National Space Agency (grant no. 169/16) and the Deutsche Forschungsgemeinschaft (project no. 390683824). The development of footprint remapping routines for ICI was performed under the EUMETSAT study "Application of optimal interpolation procedures to EPS-SG MWI and ICI", contract EUM/C0/18/4600002075/CJA. Review statement. This paper was edited by Alexander Kokhanovsky and reviewed by Ralph Ferraro and two anonymous referees.