The timing and order of drug treatment affect cell death.
One solution: use the concepts from PCA to reduce dimensionality.
First step: Simply apply PCA!
Dimensionality goes from \(m\) to \(N_{comp}\).
Decompose X matrix (scores T, loadings P, residuals E) \[X = TP^T + E\]
Regress Y against the scores (the scores describe the observations, so using them links X and Y for each observation):
\[Y = TB + E\]
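These two steps (decompose X, then regress Y on the scores) can be sketched with numpy and scikit-learn on synthetic data. The variable names T, P, and B follow the equations above; the data itself is purely illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))   # 50 observations, m = 10 variables
Y = rng.normal(size=(50, 2))    # 2 responses

# Step 1: decompose X into scores T and loadings P
pca = PCA(n_components=3)
T = pca.fit_transform(X)          # scores T, one row per observation
P = pca.components_.T             # loadings P
X_hat = T @ P.T + pca.mean_       # X ≈ T P^T (plus the column means); E = X - X_hat

# Step 2: regress Y against the scores (B.coef_ plays the role of B)
B = LinearRegression().fit(T, Y)
```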
The PCs for the X matrix do not necessarily capture X-variation that is important for Y.
We might miss later PCs that are important for prediction!
What if, instead of maximizing the variance explained in X, we maximize the covariance explained between X and Y?
We will find principal components for both X and Y:
\[X = T P^T + E\]
\[Y = U Q^T + F\]
Each component \(a\) is found by iterating the following steps until the scores converge (the NIPALS algorithm).

Regress the columns of \(X\) on the Y-scores \(\mathbf{u}_a\) to get the X-weights:
\[\mathbf{w}_a = \dfrac{1}{\mathbf{u}'_a\mathbf{u}_a} \cdot \mathbf{X}'_a\mathbf{u}_a\]
Normalize the weights to unit length:
\[\mathbf{w}_a = \dfrac{\mathbf{w}_a}{\sqrt{\mathbf{w}'_a \mathbf{w}_a}}\]
Regress the rows of \(X\) on the weights to get the X-scores:
\[\mathbf{t}_a = \dfrac{1}{\mathbf{w}'_a\mathbf{w}_a} \cdot \mathbf{X}_a\mathbf{w}_a\]
Regress the columns of \(Y\) on the X-scores to get the Y-weights:
\[\mathbf{c}_a = \dfrac{1}{\mathbf{t}'_a\mathbf{t}_a} \cdot \mathbf{Y}'_a\mathbf{t}_a\]
Regress the rows of \(Y\) on the Y-weights to update the Y-scores:
\[\mathbf{u}_a = \dfrac{1}{\mathbf{c}'_a\mathbf{c}_a} \cdot \mathbf{Y}_a\mathbf{c}_a\]
https://learnche.org/pid/latent-variable-modelling/projection-to-latent-structures/how-the-pls-model-is-calculated
Janes et al., Nat Rev Mol Cell Biol, 2006
R2X provides the variance explained in X:
\[ R^2X = 1 - \frac{\lVert X_{\textrm{PLSR}} - X \rVert^2 }{\lVert X \rVert^2} \]
R2Y shows the Y variance explained:
\[ R^2Y = 1 - \frac{\lVert Y_{\textrm{PLSR}} - Y \rVert^2 }{\lVert Y \rVert^2} \]
If you are trying to predict something, you should look at the cross-validated R2Y (a.k.a. Q2Y).
PCR can be assembled from sklearn.decomposition.PCA and sklearn.linear_model.LinearRegression, chained with sklearn.pipeline.Pipeline.
PLSR is available directly as sklearn.cross_decomposition.PLSRegression.
M.fit(X, Y) to train
M.predict(X) to get new predictions
PLSRegression(n_components=3) to set the number of components at setup
M.n_components = 3 after setup