Machine Learning is a branch of Artificial Intelligence that offers many algorithms for solving various real-world problems. Building a machine learning model is not the only goal of a data scientist; deploying a well-generalized model is the target of every machine learning engineer.
Regression is one type of supervised machine learning.
Regression
Regression is a type of machine learning that helps in finding the relationship between independent and dependent variables.
In simple words, regression can be defined as a machine learning problem where we have to predict continuous values like price, rating, fees, etc. A minimal sketch in Python follows.
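Here is a minimal sketch of a regression model with scikit-learn, assuming a made-up one-feature dataset; it only illustrates that the target is a continuous value.
```python
# A minimal sketch of fitting a regression model; the tiny dataset
# below is made up for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

# One independent variable (e.g. size) and a continuous dependent
# variable (e.g. price).
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])

model = LinearRegression().fit(X, y)
print(model.predict(np.array([[6.0]])))  # predicted continuous value
```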
R Square/Adjusted R Square
R Square measures how much of the variability in the dependent variable can be explained by the model. It is the square of the correlation coefficient (R), which is why it is called R Square.
R Square is a good measure to determine how well the model fits the dependent variable. However, it does not take the overfitting problem into consideration. If your regression model has many independent variables, the model may be too complicated: it can fit the training data very well but perform badly on testing data. That is why Adjusted R Square was introduced; it penalizes additional independent variables added to the model and adjusts the metric to prevent overfitting.
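A minimal sketch of computing both metrics in Python, assuming a toy dataset and scikit-learn's LinearRegression; the adjusted value uses the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of samples and p the number of independent variables.
```python
# A minimal sketch of R Square and Adjusted R Square; the sample
# data is made up for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Toy dataset: n samples, p independent variables.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([2.0, 3.0, 5.0, 6.0, 8.0])

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

r2 = r2_score(y, y_pred)

# Adjusted R Square penalizes additional independent variables:
# adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
n, p = X.shape
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(f"R Square: {r2:.3f}")
print(f"Adjusted R Square: {adjusted_r2:.3f}")
```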
Mean Square Error (MSE)/Root Mean Square Error (RMSE)
While R Square is a relative measure of how well the model fits the dependent variable, Mean Square Error is an absolute measure of the goodness of fit.
MSE is calculated as the sum of the squared prediction errors, where the prediction error is the real output minus the predicted output, divided by the number of data points. It gives you an absolute number indicating how much your predicted results deviate from the actual values. You cannot interpret many insights from one single result, but it gives you a real number to compare against other model results and helps you select the best regression model.
Root Mean Square Error (RMSE) is the square root of MSE. It is used more commonly than MSE because, firstly, the MSE value can sometimes be too big to compare easily. Secondly, MSE is calculated from the square of the error, so taking the square root brings the metric back to the same scale as the prediction error and makes it easier to interpret.
Advantages of MSE
- The graph of MSE is differentiable, so you can easily use it as a loss function.
Disadvantages of MSE
- The value you get after calculating MSE is in the squared unit of the output. For example, if the output variable is in meters (m), then the calculated MSE is in meters squared.
- If you have outliers in the dataset, MSE penalizes them the most, making the calculated MSE bigger. In short, it is not robust to outliers, which is an advantage MAE has.
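A minimal sketch of computing MSE and RMSE in Python with NumPy; the y_true and y_pred arrays below are assumed example values, not outputs of any real model.
```python
# A minimal sketch of MSE and RMSE; the arrays are made up
# for illustration.
import numpy as np

y_true = np.array([2.0, 3.0, 5.0, 6.0, 8.0])
y_pred = np.array([2.5, 2.8, 4.6, 6.4, 7.7])

# MSE: mean of squared prediction errors (real minus predicted).
mse = np.mean((y_true - y_pred) ** 2)

# RMSE: square root of MSE, back on the same scale as the output.
rmse = np.sqrt(mse)

print(f"MSE:  {mse:.3f}")
print(f"RMSE: {rmse:.3f}")
```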
Advantages of RMSE
- The output value you get is in the same unit as the required output variable, which makes interpretation of loss easy.
Disadvantages of RMSE
- It is not as robust to outliers as MAE.
Mean Absolute Error (MAE)
Mean Absolute Error (MAE) is similar to Mean Square Error (MSE). However, instead of the sum of squared errors as in MSE, MAE takes the sum of the absolute values of the errors.
Compared to MSE or RMSE, MAE is a more direct representation of the sum of error terms. MSE gives a larger penalty to big prediction errors by squaring them, while MAE treats all errors the same.
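A minimal sketch of computing MAE in Python with NumPy; the y_true and y_pred arrays are the same assumed example values as above.
```python
# A minimal sketch of MAE; the arrays are made up for illustration.
import numpy as np

y_true = np.array([2.0, 3.0, 5.0, 6.0, 8.0])
y_pred = np.array([2.5, 2.8, 4.6, 6.4, 7.7])

# MAE: mean of absolute prediction errors; every error contributes
# in proportion to its size, so outliers are not penalized more
# heavily than other points.
mae = np.mean(np.abs(y_true - y_pred))

print(f"MAE: {mae:.3f}")
```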
Advantages of MAE
- The MAE you get is in the same unit as the output variable.
- It is more robust to outliers than MSE and RMSE.
Disadvantages of MAE
- The graph of MAE is not differentiable at zero, so to use it as a loss function we may have to apply optimization techniques that handle this, such as sub-gradient methods.
Multivariate regression
Non-Linear Regression