Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)


Chemical Engineering


Professor John F. MacGregor


In this thesis, various multivariate statistical regression methods are investigated for estimating process models from process input-output data. The identified models are to be used for designing model-based controllers and for experimental optimisation of multivariate processes. The following issues are explored: (i) identification of finite impulse response models for model-based control; (ii) multi-output identification for multivariate processes; (iii) recursive updating of process models for adaptive control and prediction; and (iv) experimental design in latent variables for high dimensional systems.

In the first part of the thesis, various approaches to identifying non-parsimonious finite impulse response (FIR) models are compared on the basis of closeness of fit to the true process, the robust stability provided by the resulting model, and the control performance obtained. The major conclusion, on all three criteria, is that obtaining FIR models by first identifying low order transfer function models by prediction error methods is much superior to any of the methods that identify the FIR models directly.
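The indirect route described above can be illustrated with a minimal sketch: once a low order transfer function model has been estimated, its FIR coefficients follow from the model's impulse response. The function name `fir_from_transfer_function`, the model form A(q^-1) y = B(q^-1) u with a one-sample delay, and the numerical parameter values are all assumptions made for illustration; they are not taken from the thesis, and the prediction error estimation step itself is not shown.

```python
def fir_from_transfer_function(a, b, n_coeffs):
    """Impulse response coefficients h[1..n_coeffs] of the model
        y[t] + a[0]*y[t-1] + a[1]*y[t-2] + ... = b[0]*u[t-1] + b[1]*u[t-2] + ...
    (hypothetical helper; the parameters a and b are assumed to be already estimated).
    """
    h = []
    for k in range(1, n_coeffs + 1):
        # Direct contribution of the unit impulse applied at time 0
        value = b[k - 1] if k <= len(b) else 0.0
        # Autoregressive contribution from earlier impulse response values
        for i, a_i in enumerate(a, start=1):
            if k - i >= 1:
                value -= a_i * h[k - i - 1]
        h.append(value)
    return h


# Assumed first order example: y[t] = 0.8*y[t-1] + 0.5*u[t-1]
fir = fir_from_transfer_function(a=[-0.8], b=[0.5], n_coeffs=30)
```

For this assumed first order model the coefficients decay geometrically, h[k] = 0.5 * 0.8**(k-1), so two estimated parameters determine all thirty FIR coefficients.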

In the second part, the potential of multi-output identification for multivariate processes is investigated via simulations on two process examples: a quality control example and an extractive distillation column. The identification of both parsimonious transfer function models using multivariate prediction error methods and non-parsimonious FIR models using multivariate statistical regression methods, such as two-block partial least squares (PLS2), canonical correlation regression (CCR), and reduced rank regression (RRR), is considered. The multi-output identification methods provide better results than the single-output identification methods on essentially all comparison criteria. The benefits of using multi-output identification are most obvious when only a limited amount of data is available and when the secondary output variables have better signal-to-noise ratios.

In the third part of this thesis, an improvement to the PLS algorithm is made. It is shown that only one of the X and Y matrices needs to be deflated during the sequential computation of the latent vectors. This result leads to two very fast PLS kernel algorithms. Using these improved kernel algorithms, a new and fast recursive, exponentially weighted PLS algorithm is developed. The recursive PLS algorithm provides much better performance than the recursive least squares algorithm when applied to adaptive control of a simulated 2 by 2 multivariable continuous stirred tank reactor and to the updating of a multi-output prediction model for an industrial mineral flotation circuit.
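The deflation result can be checked numerically with a small sketch. The code below is a plain NIPALS-style PLS2 written with NumPy, not the kernel algorithms developed in the thesis; the function name `pls_coefficients`, the random test data, and the choice of three components are assumptions for illustration. It computes the regression coefficients once with both X and Y deflated and once with only X deflated, so the two results can be compared.

```python
import numpy as np


def pls_coefficients(X, Y, n_components, deflate_y):
    """NIPALS-style PLS2 on centred data; returns B with Y ~ X @ B.

    Hypothetical helper: X is always deflated, Y only if deflate_y is True.
    """
    X = X.copy()
    Y = Y.copy()
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of X^T Y
        u, _, _ = np.linalg.svd(X.T @ Y, full_matrices=False)
        w = u[:, 0]
        t = X @ w                  # score vector
        tt = t @ t
        p = X.T @ t / tt           # X loadings
        q = Y.T @ t / tt           # Y loadings
        X = X - np.outer(t, p)     # deflate X
        if deflate_y:
            Y = Y - np.outer(t, q) # optionally deflate Y
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = (np.column_stack(M) for M in (W, P, Q))
    # Regression coefficients in terms of the original (centred) X
    return W @ np.linalg.solve(P.T @ W, Q.T)


# Assumed synthetic data: 50 samples, 6 inputs, 3 outputs
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
Y = X @ rng.standard_normal((6, 3)) + 0.1 * rng.standard_normal((50, 3))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

B_both = pls_coefficients(X, Y, n_components=3, deflate_y=True)
B_x_only = pls_coefficients(X, Y, n_components=3, deflate_y=False)
coefficients_match = np.allclose(B_both, B_x_only)
```

The two coefficient matrices agree because each deflated X is orthogonal to all earlier score vectors, so the product of the deflated X transpose with Y is unchanged whether or not Y is deflated.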

Finally, a design methodology similar to evolutionary operation (EVOP) and response surface methodology (RSM) is proposed for the optimisation of high dimensional systems. A variation of the PLS algorithm, called selective PLS, is developed. It can be used to analyse the process data and to select meaningful groupings of the process variables in which the EVOP/RSM experiments can be performed.
