Dimensionality reduction in Bayesian estimation algorithms
Abstract. An idealized synthetic database loosely resembling 3-channel passive microwave observations of precipitation against a variable background is employed to examine the performance of a conventional Bayesian retrieval algorithm. For this dataset, algorithm performance is found to be poor owing to an irreconcilable conflict between the need to find matches in the dependent database versus the need to exclude inappropriate matches. It is argued that the likelihood of such conflicts increases sharply with the dimensionality of the observation space of real satellite sensors, which may utilize 9 to 13 channels to retrieve precipitation, for example.
An objective method is described for distilling the relevant information content from N real channels into a much smaller number (M) of pseudochannels while also regularizing the background (geophysical plus instrument) noise component. The pseudochannels are linear combinations of the original N channels obtained via a two-stage principal component analysis of the dependent dataset. Bayesian retrievals based on a single pseudochannel applied to the independent dataset yield striking improvements in overall performance.
The differences between the conventional Bayesian retrieval and reduced-dimensional Bayesian retrieval suggest that a major potential problem with conventional multichannel retrievals – whether Bayesian or not – lies in the common but often inappropriate assumption of diagonal error covariance. The dimensional reduction technique described herein avoids this problem by, in effect, recasting the retrieval problem in a coordinate system in which the desired covariance is lower-dimensional, diagonal, and unit magnitude.