Adaptive Transform Coding of Images Using a Mixture of Principal Components

Robert Douglas Dony, McMaster University


The optimal linear block transform for coding images is well known to be the Karhunen-Loeve transformation (KLT). However, the assumption of stationarity in the optimality condition is far from valid for images. Images are composed of regions whose local statistics may vary widely across an image. A new approach to data representation, a mixture of principal components (MPC), is developed in this thesis. It combines advantages of both principal components analysis and vector quantization and is therefore well suited to the problem of compressing images. Using the MPC representation, the author proposes a number of new transform coding methods, based on neural network methods, which optimally adapt to such local differences. The new networks are modular, consisting of a number of modules corresponding to different classes of the input data. Each module consists of a linear transformation whose bases are calculated during an initial training period. The appropriate class for a given input vector is determined by an optimal classifier. The performance of the resulting adaptive networks is shown to be superior to that of the optimal nonadaptive linear transformation, in terms of both rate-distortion and computational complexity. When applied to the problem of compressing digital chest radiographs, compression ratios of between 39:1 and 40:1 are possible without any significant loss in image quality. In addition, the quality of the images was consistently judged to be as good as or better than that obtained with the KLT at equivalent compression ratios.
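
The modular structure described above (one linear transformation per class, with a classifier selecting the module for each input vector) can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes that each image block is flattened to a row vector, that a block is assigned to the class whose basis captures the most projection energy, and it swaps the neural network training for a batch alternation of classification and SVD. The function names and parameters (fit_mpc, classify, encode, decode, n_classes, n_components) are hypothetical.

```python
import numpy as np

def classify(blocks, bases):
    """Assign each block to the class whose subspace captures the most energy.

    Assumption: this energy rule stands in for the abstract's 'optimal classifier'.
    """
    energy = np.stack([np.sum((blocks @ W) ** 2, axis=1) for W in bases], axis=1)
    return np.argmax(energy, axis=1)

def fit_mpc(blocks, n_classes=4, n_components=4, n_iters=10, seed=0):
    """Fit one orthonormal basis per class (batch stand-in for network training)."""
    rng = np.random.default_rng(seed)
    dim = blocks.shape[1]
    # Start from random orthonormal bases, each of shape (dim, n_components).
    bases = [np.linalg.qr(rng.standard_normal((dim, n_components)))[0]
             for _ in range(n_classes)]
    for _ in range(n_iters):
        labels = classify(blocks, bases)
        for k in range(n_classes):
            members = blocks[labels == k]
            if len(members) < n_components:
                continue  # too few members: keep the previous basis
            # Leading right singular vectors = principal directions of the class.
            _, _, vt = np.linalg.svd(members, full_matrices=False)
            bases[k] = vt[:n_components].T
    return bases

def encode(blocks, bases):
    """Return the class index and transform coefficients for every block."""
    labels = classify(blocks, bases)
    coeffs = np.stack([blocks[i] @ bases[k] for i, k in enumerate(labels)])
    return labels, coeffs

def decode(labels, coeffs, bases):
    """Reconstruct each block from its class basis and coefficients."""
    return np.stack([bases[k] @ c for k, c in zip(labels, coeffs)])
```

In a coder along these lines, the class index and the quantized coefficients would be transmitted for each block. Quantization and entropy coding are omitted here.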

The new networks can also be used as segmentors, with the resulting segmentation being independent of variations in illumination. In addition, the organization of the resulting class representations is analogous to the arrangement of the directionally sensitive columns in the visual cortex.