A regularized linear model is a version of a linear model that includes a penalty term in its cost function to avoid overfitting and improve the model's ability to generalize to new data.
Generally speaking, a linear model (like the one we saw in the previous module) tries to find the relationship between the input variables and the output variable. However, if a linear model has too many parameters, or if the data are very noisy, the model can fit the training data too closely, overfitting it and generalizing poorly to new data.
To avoid this problem, regularized linear models add an extra term that penalizes coefficients that grow too large. These models are linear regressions like those seen in the previous module, but with the addition of a regularization term. The two types of models are:

- `Lasso` (L1 regularization): penalizes the sum of the absolute values of the coefficients. It can shrink some coefficients exactly to zero, effectively removing those features from the model.
- `Ridge` (L2 regularization): penalizes the sum of the squared coefficients. It shrinks all coefficients toward zero but does not eliminate any of them.
Both techniques attempt to limit or "penalize" the size of the coefficients in the model. Imagine that we are fitting a line to points on a graph: without a penalty, the model is free to use very large coefficients to chase every point, while the regularization term keeps the coefficients small, so the fitted line stays smoother and generalizes better.
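As a point of reference (this is the standard textbook formulation, not taken verbatim from this course), the two penalized objectives can be written as follows, where the $w_j$ are the model coefficients and $\alpha \geq 0$ is the regularization strength:

```latex
% Ridge (L2): least-squares loss plus the sum of squared coefficients
\min_{w}\ \sum_{i=1}^{n} \bigl( y_i - \hat{y}_i \bigr)^2 \;+\; \alpha \sum_{j=1}^{p} w_j^{2}

% Lasso (L1): least-squares loss plus the sum of absolute coefficients
\min_{w}\ \sum_{i=1}^{n} \bigl( y_i - \hat{y}_i \bigr)^2 \;+\; \alpha \sum_{j=1}^{p} \lvert w_j \rvert
```

Note that setting $\alpha = 0$ removes the penalty entirely and recovers ordinary least squares in both cases.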
We can easily build a regularized linear model in Python using the `scikit-learn` library and its `Ridge` class. Some of its most important parameters, and the first ones we should focus on, are:
- `alpha`: This is the regularization hyperparameter. It controls how strongly we penalize large coefficients: a higher value increases the regularization, so the model coefficients tend to be smaller, while a lower value reduces it and allows larger coefficients. The default value is 1.0, and it can take any value from 0.0 to infinity.
- `max_iter`: This is the maximum number of iterations the solver is allowed to run.
Another very important parameter is `random_state`, which controls the seed of the random number generator. This parameter is crucial for ensuring reproducibility, as in the short sketch below.
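As a quick illustration (a minimal sketch; the hyperparameter values here are arbitrary examples, not recommendations), instantiating the model with these three parameters looks like this:

```python
from sklearn.linear_model import Ridge

# alpha controls the regularization strength, max_iter caps the solver's
# iterations, and random_state fixes the seed for reproducibility (for
# Ridge, the seed is only used by the stochastic solvers "sag"/"saga").
model = Ridge(alpha=1.0, max_iter=1000, random_state=42)
```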
You can easily use `scikit-learn` to implement these models after the exploratory data analysis (EDA):
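Here is a minimal end-to-end sketch, assuming a hypothetical dataset already cleaned during the EDA, with a numeric target column named `target` (the file name, column name, and split sizes are illustrative assumptions, not part of the lesson):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical dataset: replace with the DataFrame produced by your EDA.
df = pd.read_csv("clean_data.csv")

X = df.drop(columns=["target"])
y = df["target"]

# Hold out part of the data to evaluate how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scaling matters for regularized models: the penalty treats all
# coefficients equally, so features should be on comparable scales.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fit the regularized model.
model = Ridge(alpha=1.0, max_iter=1000, random_state=42)
model.fit(X_train_scaled, y_train)

# Evaluate on unseen data.
y_pred = model.predict(X_test_scaled)
print("MSE:", mean_squared_error(y_test, y_pred))
print("R2:", r2_score(y_test, y_pred))
```

The same pipeline works for `Lasso` by swapping the import and the model class; tuning `alpha` (for example with cross-validation) is usually the next step.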