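The PLSR method is, like PCA, a bilinear technique: it expresses each data matrix as the product of two matrices plus a residual:
$$\mathbf{X} = \mathbf{T}\mathbf{P}^{T} + \mathbf{E} \tag{1}$$
$$\mathbf{Y} = \mathbf{U}\mathbf{Q}^{T} + \mathbf{F} \tag{2}$$
where $\mathbf{P}$ and $\mathbf{Q}$ are the loadings, $\mathbf{E}$ and $\mathbf{F}$ are the residuals, and the scores ($\mathbf{T}$ and $\mathbf{U}$) are not the same as the scores found in PCA. These equations are often referred to as the outer relations.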
In PCR we performed the latent variable projection in $\mathbf{X}$ independently of whether it is relevant for the prediction of $\mathbf{Y}$. Often the principal components of $\mathbf{X}$ do not represent the directions that are most relevant for the prediction of $\mathbf{Y}$. In the Partial Least Squares regression (PLSR) scheme we instead find new latent variables for $\mathbf{X}$ that are relevant for the prediction of $\mathbf{Y}$.
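The difference between the two projections can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: the data are synthetic and hypothetical, there is a single response `y`, and the PLS-style direction is taken as proportional to $\mathbf{X}^{T}\mathbf{y}$ (the unit vector whose score has maximal covariance with `y`), which is not spelled out in the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical data: column 0 has large variance but is unrelated to y,
# column 1 has small variance and drives y.
X = np.column_stack([3.0 * rng.normal(size=n), rng.normal(size=n)])
y = X[:, 1] + 0.1 * rng.normal(size=n)

# Mean-center both blocks before projecting.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# PCR direction: first principal component of X (first right singular vector).
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
t_pca = Xc @ Vt[0]

# PLS-style direction for a single response: w proportional to X^T y.
w = Xc.T @ yc
w /= np.linalg.norm(w)
t_pls = Xc @ w

# Compare how strongly each score vector is related to y.
corr_pca = abs(np.corrcoef(t_pca, yc)[0, 1])
corr_pls = abs(np.corrcoef(t_pls, yc)[0, 1])
print(f"|corr(t_pca, y)| = {corr_pca:.3f}")
print(f"|corr(t_pls, y)| = {corr_pls:.3f}")
```

In this constructed case the first principal component follows the high-variance column, while the PLS-style direction follows the column that actually predicts `y`.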
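In addition to the outer relations we also have the inner relation, which relates the $\mathbf{X}$ and $\mathbf{Y}$ scores for each component $a$ as follows:
$$\mathbf{u}_a = b_a \mathbf{t}_a$$
where $b_a$ is a regression coefficient. Collecting all components, we can write the inner relation as:
$$\mathbf{U} = \mathbf{T}\mathbf{B}$$
where $\mathbf{B}$ is a diagonal matrix with the $b_a$ on its diagonal. Substituting this into the outer relation for $\mathbf{Y}$, we can write:
$$\mathbf{Y} = \mathbf{T}\mathbf{B}\mathbf{Q}^{T} + \mathbf{F}$$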
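The inner relation can be sketched numerically as follows. This is a minimal illustration, assuming that each coefficient is estimated by ordinary least squares of one column of the Y scores on the matching column of the X scores; the helper name `inner_relation` and the synthetic score matrices are hypothetical and serve only to show the diagonal structure.

```python
import numpy as np

def inner_relation(T, U):
    """Least-squares coefficient for each column pair (u_a, t_a).

    Assumes T and U hold the X and Y scores column by column.
    Returns the coefficient vector b and the diagonal matrix B = diag(b).
    """
    # b_a = (t_a^T u_a) / (t_a^T t_a), computed for all columns at once.
    b = np.sum(T * U, axis=0) / np.sum(T * T, axis=0)
    return b, np.diag(b)

# Synthetic scores: each column of U is a scaled column of T plus small noise.
rng = np.random.default_rng(1)
T = rng.normal(size=(20, 3))
true_b = np.array([2.0, -0.5, 1.5])
U = T * true_b + 0.01 * rng.normal(size=(20, 3))

b, B = inner_relation(T, U)
U_hat = T @ B  # matrix form of the inner relation
print(b)
```

Because the matrix is diagonal, `T @ B` simply rescales each column of `T` by its own coefficient, so each Y score is predicted from its matching X score only.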
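The main idea in PLSR is to ensure that the latent vectors in $\mathbf{X}$ have maximum relevance for $\mathbf{Y}$. We can formulate this as follows: we find a vector $\mathbf{t}$ in the column space of $\mathbf{X}$:
$$\mathbf{t} = \mathbf{X}\mathbf{w}$$
and a vector $\mathbf{u}$ in the column space of $\mathbf{Y}$:
$$\mathbf{u} = \mathbf{Y}\mathbf{q}$$
such that the squared covariance between $\mathbf{t}$ and $\mathbf{u}$ is maximized:
$$\max_{\mathbf{w},\,\mathbf{q}} \left(\mathbf{t}^{T}\mathbf{u}\right)^{2} = \max_{\mathbf{w},\,\mathbf{q}} \left(\mathbf{w}^{T}\mathbf{X}^{T}\mathbf{Y}\mathbf{q}\right)^{2}$$
for $\|\mathbf{w}\| = \|\mathbf{q}\| = 1$.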
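One way to solve this maximization for the first pair of latent vectors is a singular value decomposition: for unit-norm weight vectors, the product $\mathbf{w}^{T}\mathbf{X}^{T}\mathbf{Y}\mathbf{q}$ is largest in magnitude when the two weight vectors are the first left and right singular vectors of $\mathbf{X}^{T}\mathbf{Y}$. The sketch below assumes mean-centered data and uses this SVD route rather than any particular algorithm from these notes; the function name `first_pls_weights` and the test data are hypothetical.

```python
import numpy as np

def first_pls_weights(X, Y):
    """Unit vectors w, q maximizing (w^T X^T Y q)^2, with scores t = Xw, u = Yq."""
    # Center the columns so that t^T u is proportional to a covariance.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # The first singular pair of X^T Y gives the maximizing directions.
    left, s, right_t = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    w = left[:, 0]
    q = right_t[0, :]
    t = Xc @ w
    u = Yc @ q
    return w, q, t, u, s[0]

# Synthetic example: Y depends on the first two columns of X plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
Y = X[:, :2] @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(50, 3))

w, q, t, u, sigma1 = first_pls_weights(X, Y)
best = (t @ u) ** 2  # equals sigma1**2 up to rounding
print(best, sigma1 ** 2)
```

Any other pair of unit vectors gives a squared value no larger than `best`, which can be checked by trying random directions.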
Bjørn Alsberg 2005-02-18