There are two schools of thought when it comes to any analytics model:
- The data obtained from the experiment is accurate; the model applied to it isn't, hence the errors.
- The data obtained from the experiment has errors, but the model applied to it is accurate.
There are many regression models; for starters, we are going to discuss Ordinary Least Squares (OLS) and Total Least Squares (TLS).
- OLS : OLS follows the first school of thought .
The optimization problem is :
$ \min_{\alpha, \beta} \sum_{i=1}^{N} \left(y_i - \vec{\alpha}^{\,T}\vec{x}_i - \beta\right)^2$ .
Now we recast the problem in matrix form, i.e.
$ Y = [y_1, y_2, \dots, y_N]^T$ ,
$ X = [\,[\vec{x}_1, \vec{x}_2, \dots, \vec{x}_N]^T \;\; \vec{1}\,]$ , where each row of $X$ is one data point with a trailing 1 for the intercept,
$ \alpha = [\alpha_1, \alpha_2, \dots, \alpha_M, \beta]^T$ .
The subscripts on $Y$ and $X$ index the $N$ data points, while the subscripts on $\alpha$ index the $M$ regressors. The problem now becomes minimizing $ E = (X \alpha - Y)^T (X \alpha - Y)$. Expanding $E$ and setting $ \vec{\nabla}(E) = 0 $ we get:
$ \implies \vec{ \nabla }((X \alpha )^T(X \alpha )-(X \alpha )^T Y-Y^T (X \alpha )+Y^T Y) = 0$
$ \implies \vec{ \nabla }(\alpha ^T (X^T X) \alpha - 2 (Y^T X)\alpha + Y^T Y )= 0$
$ \implies 2(X^T X) \alpha - 2 (Y^T X)^T = 0$
$ \implies (X^T X) \alpha = X^T Y$
$ \implies \alpha = (X^T X)^{-1} ( X^T Y) \qquad (1)$
Hence, as can be seen above, equation (1) can be used to fit the regression model; a minimal numerical sketch is given below.
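The following is a hedged MATLAB sketch of equation (1); the synthetic data, the noise level, and the variable names are assumptions made purely for illustration, not part of the derivation above.
N = 50;                          % number of data points (arbitrary)
x = linspace(0, 1, N)';          % a single regressor
Y = 3*x + 2 + 0.1*randn(N, 1);   % noisy observations; assumed "true" slope 3, intercept 2
X = [x, ones(N, 1)];             % design matrix with a trailing column of ones for the intercept
alpha = (X'*X) \ (X'*Y);         % equation (1): alpha(1) estimates the slope, alpha(2) the intercept
In practice the backslash form alpha = X \ Y returns the same least-squares solution with better numerical stability than forming $X^T X$ explicitly.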
- TLS : TLS follows the second school of thought.
Optimisation problem : $ \min \sum_{i=1}^{N} \left( (y_i - \hat{y}_i)^2 + (x_i^1 - \hat{x}_i^1)^2 + (x_i^2 - \hat{x}_i^2)^2 + \dots + (x_i^M - \hat{x}_i^M)^2 \right)$ ,
subject to $ \hat{y}_i = \alpha_1 \hat{x}_i^1 + \alpha_2 \hat{x}_i^2 + \dots + \alpha_M \hat{x}_i^M$ .
Here $ x_i^j$ is the $j$-th of the $M$ regressors for data point $i$, and $ \hat{x}_i^j, \hat{y}_i$ are the noise-free values that we are trying to estimate. Let us now put the data points in matrix form :
$ Z = [X, Y] = [\vec{x^1}, \vec{x^2}, \dots, \vec{x^M}, \vec{y}]$ , where $\vec{x^j}$ is the column collecting the $j$-th regressor over all $N$ data points.
The rank of the $[X, Y]$ matrix is $(M+1)$, considering $M < N$ (more data points than variables).
The problem is thus reformulated as follows :
Find the smallest (in Frobenius norm) $\hat{Z}$ which reduces the rank of the matrix $Z + \hat{Z}$ to $M$. By the Eckart–Young theorem the optimal $\hat{Z}$ is obtained by removing the smallest singular value of $Z$; the $(M+1)$-th right singular vector then spans the null space of $Z + \hat{Z}$, and normalising its last entry to $-1$ yields both $\alpha$ and the corrected matrix $[Z + \hat{Z}]$.
Final Algorithm :
1) $ [U, S, V] = \mathrm{svd}(Z)$ .
2) $ \alpha = -V[1{:}M,\, M+1]\, /\, V[M+1,\, M+1]$ .
3) $ \hat{Z} = -Z\, V[:, M+1]\, V[:, M+1]^T$ , so that the corrected data is $Z + \hat{Z}$.
Matlab code :
[U,S,V] = svd(D(1:m1,:));            % SVD of the data matrix D = [X Y] (first m1 rows)
reg = V(:,end);                      % (M+1)-th right singular vector
reg = -reg/reg(end);                 % normalise so the last entry is -1
reg = reg(1:(end-1));                % final regression coefficients
d = D + (-D*V(:,end))*(V(:,end)');   % corrected data Z + Z_hat (last singular direction removed)
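A hedged usage sketch of the snippet above on synthetic data; the sample size, noise level, and "true" coefficients are made up for illustration, and the OLS line is included only for comparison.
N  = 200;                                               % number of samples (assumed)
M  = 2;                                                 % number of regressors (assumed)
a  = [1.5; -0.8];                                       % "true" coefficients used to generate the data
X0 = randn(N, M);                                       % noise-free regressors
D  = [X0 + 0.05*randn(N, M), X0*a + 0.05*randn(N, 1)];  % Z = [X Y] with errors in all variables
m1 = N;                                                 % fit on all rows
[U,S,V] = svd(D(1:m1,:));                               % same steps as the snippet above
reg = V(:,end);
reg = -reg/reg(end);
reg = reg(1:(end-1));                                   % TLS estimate of a
ols = D(:,1:M) \ D(:,M+1);                              % OLS on the same noisy data, for comparison
With noise in the regressors as well as in the response, OLS coefficients are biased (attenuated), while TLS is designed for exactly this errors-in-variables setting, so reg is typically closer to the generating coefficients than ols.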
Note: applying the obtained regression model directly to a new set of data can give poor results. If $X$ is first transformed to $\hat{X}$, TLS gives better results than OLS, but computing $\hat{X}$ is not possible for a new set of data, since the corresponding $Y$ values are unknown and the joint correction of $[X, Y]$ cannot be formed.
Matlab code link : https://github.com/RAAKASH/CH5440/tree/master/Assignment2 .
Reading Materials : http://people.duke.edu/~hpgavin/SystemID/CourseNotes/TotalLeastSquares.pdf