Model Evaluation

Evaluation of a model

Evaluating a model is one of the most important steps in the Machine Learning process: it tells us how good the model is, how much it has learned from the training sample (train), and how it will perform on new, never-before-seen data (test and/or validation).
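
As a rough illustration of the train versus test distinction, the sketch below fits a model on one part of a dataset and scores it on both parts. The synthetic dataset, the decision tree, and the 75/25 split are assumptions chosen only for this example; they do not come from the lesson.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (illustrative assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hold out 25% of the rows so the model never sees them during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# An unpruned decision tree tends to memorize its training data.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# score() returns accuracy for classifiers.
train_accuracy = model.score(X_train, y_train)  # how much was learned from train
test_accuracy = model.score(X_test, y_test)     # performance on unseen data

print(f"train accuracy: {train_accuracy:.3f}")
print(f"test accuracy:  {test_accuracy:.3f}")
```

A large gap between the two numbers usually suggests that the model has memorized the training sample rather than learned patterns that generalize.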

The metrics used to evaluate a model fall into two groups, depending on whether the model performs classification or regression.

Metrics for classification models

A classification model is used to predict a category or the class of an observation. For example, we might have a model that predicts whether an email is spam (1) or not spam (0), or whether an image contains a dog, a cat, or a bird. Classification models are useful when the output variable is categorical.

Metrics that can be applied to these types of models are as follows:

  • Accuracy. Measures the percentage of predictions that the model got right out of all the predictions it made. For example, how many emails did the model classify correctly?
  • Recall. Measures the proportion of actual positives that the model was able to identify. For example, of all the emails that really are spam, how many did the algorithm flag as spam?
  • F1 score. The harmonic mean of precision (the proportion of predicted positives that are actually positive) and recall. It is useful when classes are unbalanced.
  • Area Under the Curve (AUC). Describes the probability that the model scores a randomly chosen positive instance higher than a randomly chosen negative one.
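
To make these definitions concrete, here is a minimal from-scratch sketch for a binary problem. The function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical choices for illustration. Precision, which the F1 score needs, is computed as the share of predicted positives that are truly positive.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dict."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc(y_true, scores, positive=1):
    """Fraction of (positive, negative) pairs where the positive scores
    higher; ties count as half. Quadratic time, fine for small examples."""
    pos = [s for y, s in zip(y_true, scores) if y == positive]
    neg = [s for y, s in zip(y_true, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = spam, 0 = not spam (made up for this example).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.35, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics)
print("AUC:", auc_value)
```

Note that accuracy, recall and F1 depend on the chosen threshold, while AUC is computed from the raw scores and so does not.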

Metrics for regression models

A regression model is used to predict a continuous value. For example, we might have a regression model that predicts the price of a house based on characteristics such as its size, number of bedrooms, and location. Regression models are useful when the output variable is continuous and numeric.

Metrics that can be applied to this type of model are as follows:

  • Mean Absolute Error (MAE). Average of the absolute differences between predictions and actual values.
  • Mean Squared Error (MSE). Similar to the above, but squares the differences before averaging them, so large errors are penalized more heavily.
  • Root Mean Squared Error (RMSE). The square root of the MSE, which brings the error back to the same units as the target.
  • Coefficient of determination (R²). Proportion of the variance in the target that is explained by the features.
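
The four regression metrics can be written out in a few lines of plain Python. This is a minimal sketch: the function name and the toy house prices are hypothetical, and R² is computed with the usual formula of one minus the ratio of the residual sum of squares to the total sum of squares.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 as a dict."""
    n = len(y_true)
    errors = [pred - actual for actual, pred in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    rmse = math.sqrt(mse)

    # R^2 = 1 - SS_res / SS_tot, where SS_tot measures the spread of the
    # actual values around their mean. Undefined if all targets are equal.
    mean_true = sum(y_true) / n
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((actual - mean_true) ** 2 for actual in y_true)
    r2 = 1 - ss_res / ss_tot

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Toy house prices in thousands (made up for this example).
y_true = [100, 150, 200, 250]
y_pred = [110, 140, 210, 230]

metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

In this example the single error of 20 contributes 400 of the 700 total squared error, which shows how MSE and RMSE weigh large mistakes more heavily than MAE does.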

The scikit-learn package provides these metrics as ready-made functions in its sklearn.metrics module, which makes them easy to apply to models. See the scikit-learn documentation for details.
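
A short sketch of how these functions might be called is shown below. The labels, scores and prices are made-up values, and the 0.5 threshold is an assumption. RMSE is taken as the square root of mean_squared_error so that the example does not depend on a particular scikit-learn version.

```python
import math

from sklearn.metrics import (
    accuracy_score,
    f1_score,
    mean_absolute_error,
    mean_squared_error,
    precision_score,
    r2_score,
    recall_score,
    roc_auc_score,
)

# --- Classification (toy data: 1 = spam, 0 = not spam) ---
y_true_cls = [1, 1, 1, 1, 0, 0, 0, 0]
y_score_cls = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.35, 0.7]
y_pred_cls = [1 if s >= 0.5 else 0 for s in y_score_cls]  # assumed threshold

accuracy = accuracy_score(y_true_cls, y_pred_cls)
precision = precision_score(y_true_cls, y_pred_cls)
recall = recall_score(y_true_cls, y_pred_cls)
f1 = f1_score(y_true_cls, y_pred_cls)
# AUC is computed from the scores, not from the thresholded predictions.
auc = roc_auc_score(y_true_cls, y_score_cls)

# --- Regression (toy house prices in thousands) ---
y_true_reg = [100, 150, 200, 250]
y_pred_reg = [110, 140, 210, 230]

mae = mean_absolute_error(y_true_reg, y_pred_reg)
mse = mean_squared_error(y_true_reg, y_pred_reg)
rmse = math.sqrt(mse)
r2 = r2_score(y_true_reg, y_pred_reg)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f} auc={auc:.3f}")
print(f"mae={mae:.3f} mse={mse:.3f} rmse={rmse:.3f} r2={r2:.3f}")
```

All of these functions take the true values as the first argument and the predictions (or scores, for roc_auc_score) as the second.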