The **evaluation** of a model is one of the most important steps in the Machine Learning process, since it tells us how good our model is, how much it has learned from the training sample (`train`), and how it will perform on new, previously unseen data (`test` and/or `validation`).

To evaluate a model there are several sets of metrics, and which set applies depends on whether the model performs classification or regression.

A **classification model** is used to predict a category or the class of an observation. For example, we might have a model that predicts whether an email is spam (1) or not spam (0), or whether an image contains a dog, a cat, or a bird. Classification models are useful when the output variable is categorical.

Metrics that can be applied to this type of model are as follows:

- **Accuracy**. Measures the percentage of predictions that the model got right out of all the predictions it made. For example, how many emails the model classified correctly.
- **Recall**. Measures the proportion of actual positives that the model was able to identify. For example, of all the emails that really are spam, how many the model flagged as spam.
- **F1 score**. The harmonic mean of precision (the proportion of predicted positives that are actually positive) and recall. It is useful when classes are imbalanced.
- **AUC**. The area under the ROC curve. It describes the probability that the model ranks a randomly chosen positive instance higher than a randomly chosen negative one.
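As a minimal sketch of how these classification metrics might be computed, the example below uses functions from `sklearn.metrics` on a small set of labels. The arrays `y_true`, `y_pred`, and `y_score` are hypothetical values made up for illustration (1 = spam, 0 = not spam); in practice they would come from a held-out dataset and a trained model. Precision is included because the F1 score is built from it.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Hypothetical ground truth for 8 emails: 1 = spam, 0 = not spam
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical hard predictions from a classifier
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
# Hypothetical predicted probabilities of the positive class (needed for AUC)
y_score = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

accuracy = accuracy_score(y_true, y_pred)    # correct predictions / all predictions
precision = precision_score(y_true, y_pred)  # true positives / predicted positives
recall = recall_score(y_true, y_pred)        # true positives / actual positives
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall
auc = roc_auc_score(y_true, y_score)         # uses scores, not hard predictions

print(f"Accuracy:  {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"F1 score:  {f1:.4f}")
print(f"AUC:       {auc:.4f}")
```

Note that `roc_auc_score` takes the predicted scores or probabilities rather than the thresholded labels, since AUC measures how well the model ranks positives above negatives.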

A **regression model** is used to predict a continuous value. For example, we might have a regression model that predicts the price of a house based on characteristics such as its size, number of bedrooms, and location. Regression models are useful when the output variable is continuous and numeric.

Metrics that can be applied to this type of model are as follows:

- **Mean Absolute Error** (*MAE*). The mean of the absolute differences between predictions and actual values.
- **Mean Squared Error** (*MSE*). Similar to the above, but the differences are squared before they are averaged, so large errors are penalized more heavily.
- **Root Mean Squared Error** (*RMSE*). The square root of the MSE, which expresses the error in the same units as the target.
- **Coefficient of determination** ($R^2$). The proportion of the variation in the target that is predictable from the features.
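The regression metrics can be sketched in the same way. The values in `y_true` and `y_pred` below are hypothetical numbers chosen only to keep the arithmetic easy to follow. RMSE is computed here by taking the square root of the MSE with `math.sqrt`, which is one simple option among several.

```python
import math

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical actual values and model predictions
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

mae = mean_absolute_error(y_true, y_pred)  # mean of |prediction - actual|
mse = mean_squared_error(y_true, y_pred)   # mean of (prediction - actual)^2
rmse = math.sqrt(mse)                      # back in the units of the target
r2 = r2_score(y_true, y_pred)              # 1 - SS_res / SS_tot

print(f"MAE:  {mae:.4f}")
print(f"MSE:  {mse:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"R^2:  {r2:.4f}")
```

Here the errors are -1, 0, 2, and 1, so MAE is 4 / 4 = 1.0 and MSE is 6 / 4 = 1.5.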

The `scikit-learn` package makes it easy to apply these metrics to models. The documentation is available on the official scikit-learn website.
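To illustrate how a metric fits into the train/test workflow described at the start, the following sketch trains a classifier and compares its accuracy on the training and test sets. The synthetic dataset from `make_classification`, the choice of `LogisticRegression`, and the 80/20 split are all assumptions made for the example, not a recommended setup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (an assumption for illustration)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Hold out 20% of the samples as the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the model on the training sample only
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Accuracy on seen data versus previously unseen data
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))

print(f"Train accuracy: {train_acc:.4f}")
print(f"Test accuracy:  {test_acc:.4f}")
```

A training score that is much higher than the test score is a common sign that the model has overfit the training sample.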