Follow the instructions below:
Once you have finished solving the exercises, be sure to commit your changes, push to your repository and go to 4Geeks.com to upload the repository link.
Can we predict which movies will or will not be a commercial success? This dataset contains part of the data available through the TMDB API: only 5,000 movies out of the total. The following resources are available:
We must load the two files and store them in two separate data structures (Pandas DataFrames). One will hold the information about the movies and the other their credits.
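As a minimal sketch of this loading step, the snippet below reads two CSV sources into two DataFrames. The in-memory buffers, the column names and the variable names `movies_df` and `credits_df` are assumptions made for illustration; in the project you would pass the real file paths or URLs to `pd.read_csv`.

```python
import io

import pandas as pd

# Tiny stand-ins for the two files (invented contents). In the project,
# pass the real file paths or URLs to pd.read_csv instead of these buffers.
MOVIES_CSV = io.StringIO(
    "title,overview\n"
    "Movie A,A marine travels to a distant moon\n"
    "Movie B,A spy receives a cryptic message\n"
)
CREDITS_CSV = io.StringIO(
    "title,cast\n"
    "Movie A,Actor One\n"
    "Movie B,Actor Two\n"
)

# One DataFrame per file: movie information on one side, credits on the other.
movies_df = pd.read_csv(MOVIES_CSV)
credits_df = pd.read_csv(CREDITS_CSV)

print(movies_df.head())
print(credits_df.head())
```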
Create a database to store the two DataFrames in separate tables. Then join the two tables with SQL (integrating it with Python) to generate a third table that unifies the information from both. The key for the join is the title of the movie (
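One way to approach this step, sketched here with SQLite from the Python standard library: each DataFrame is written to its own table, the two tables are joined on the title with SQL, and the result is saved as a third table. The toy data, the table names `movies`, `credits` and `movies_credits`, and the choice of SQLite are all assumptions, not part of the exercise statement.

```python
import sqlite3

import pandas as pd

# Toy frames standing in for the two loaded files (invented contents).
movies_df = pd.DataFrame(
    {"title": ["Movie A", "Movie B"], "overview": ["Summary A", "Summary B"]}
)
credits_df = pd.DataFrame(
    {"title": ["Movie A", "Movie B"], "cast": ["Actor One", "Actor Two"]}
)

# In-memory database; pass a file path instead to keep it on disk.
con = sqlite3.connect(":memory:")

# Store each DataFrame in its own table.
movies_df.to_sql("movies", con, index=False, if_exists="replace")
credits_df.to_sql("credits", con, index=False, if_exists="replace")

# Join on the movie title. "cast" is quoted because CAST is an SQL keyword.
query = """
    SELECT m.title, m.overview, c."cast"
    FROM movies AS m
    JOIN credits AS c ON m.title = c.title
"""
merged_df = pd.read_sql_query(query, con)

# Persist the unified result as a third table.
merged_df.to_sql("movies_credits", con, index=False, if_exists="replace")
con.close()

print(merged_df)
```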
Now, clean the generated table and leave only the following columns:
As you can see, some of the columns are JSON-formatted. From each of the JSONs, select the `name` attribute and use it to replace the `keywords` columns. For the `cast` column, select the first three names.
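The extraction described above could be sketched as follows. The helper name `extract_names`, its `limit` parameter and the sample cell values are assumptions for illustration; the only thing taken from the text is that each JSON object carries a `name` attribute.

```python
import json


def extract_names(json_text, limit=None):
    """Return the "name" of each object in a JSON array string.

    If `limit` is given, only the first `limit` names are kept.
    """
    items = json.loads(json_text)
    names = [item["name"] for item in items]
    return names if limit is None else names[:limit]


# Invented cells that mimic the shape of the JSON columns.
keywords_cell = '[{"id": 1, "name": "space war"}, {"id": 2, "name": "alien"}]'
cast_cell = (
    '[{"name": "Actor One"}, {"name": "Actor Two"}, '
    '{"name": "Actor Three"}, {"name": "Actor Four"}]'
)

print(extract_names(keywords_cell))
print(extract_names(cast_cell, limit=3))  # keep only the first three names
```

On a DataFrame, a helper like this would typically be applied column by column, for example `df["cast"].apply(lambda cell: extract_names(cell, limit=3))`, where `df` is a hypothetical name for the joined table.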
The only columns left to modify are `crew` (team) and `overview` (summary). Convert the first so that it contains only the name of the director, and convert the second to a list.
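A possible sketch of both conversions is shown below. It assumes that each crew entry is a JSON object with `job` and `name` fields and that the director is the entry whose `job` equals `"Director"`; the helper names and sample values are hypothetical, and splitting the summary on whitespace is only one way to turn it into a list.

```python
import json


def extract_director(crew_json):
    """Return the name of the first crew member whose job is "Director".

    Returns None when no director is listed.
    """
    for member in json.loads(crew_json):
        if member.get("job") == "Director":
            return member.get("name")
    return None


def overview_to_list(overview):
    """Split the summary into a list of words.

    Missing values (anything that is not a string) become an empty list.
    """
    return overview.split() if isinstance(overview, str) else []


# Invented cells that mimic the assumed shape of the columns.
crew_cell = (
    '[{"job": "Editor", "name": "Editor One"}, '
    '{"job": "Director", "name": "Director One"}]'
)
overview_cell = "A marine travels to a distant moon"

print(extract_director(crew_cell))
print(overview_to_list(overview_cell))
```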
Once we have finished processing the columns, we will remove the spaces between the words so that the recommendation model is not confused, for example, between Jennifer Aniston and Jennifer Connelly. Apply this function to the columns
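A space-removal helper of this kind can be very small. The sketch below assumes that each cell already holds a list of strings; the name `remove_spaces` is hypothetical.

```python
def remove_spaces(values):
    """Strip the spaces inside each string so a full name becomes one token."""
    return [value.replace(" ", "") for value in values]


# Two names sharing a first name now become two clearly distinct tokens.
print(remove_spaces(["Jennifer Aniston", "Jennifer Connelly"]))
# ['JenniferAniston', 'JenniferConnelly']
```

Without this step, a word-level vectorizer would treat "Jennifer" as a token shared by both people and count it as evidence of similarity.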
Finally, we will reduce our dataset by combining all of the previously converted columns into a single new column called `tags`. At first this column will hold all the elements separated by commas; we will then replace the commas with spaces. It should look something like this:
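A minimal sketch of how such a column could be built, shown on invented values rather than on the real dataset. The function name `build_tags` is hypothetical, and it assumes each converted column holds either a list of strings or a single string.

```python
def build_tags(row_parts):
    """Combine the converted column values of one movie into a single string.

    `row_parts` is a list whose items are either lists of strings or single
    strings. Elements are first joined with commas, then the commas are
    replaced with spaces, as described in the text.
    """
    tokens = []
    for part in row_parts:
        if isinstance(part, list):
            tokens.extend(part)
        elif part:  # skip empty strings and missing values
            tokens.append(part)
    comma_separated = ",".join(tokens)
    # Note: commas that were already inside a token are replaced as well.
    return comma_separated.replace(",", " ")


# Invented values for one movie: overview, keywords, cast, director.
row = [["A", "marine"], ["spacewar"], ["ActorOne", "ActorTwo"], "DirectorOne"]
print(build_tags(row))
# A marine spacewar ActorOne ActorTwo DirectorOne
```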
To solve this problem we will create our own KNN. The first thing to do is to vectorize the text following the same steps you learned in the previous lesson.
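Since the steps from the previous lesson are not repeated here, the following is only a sketch of one common approach, a bag-of-words count vector, written with the standard library so that each step is visible. The function names and sample texts are assumptions; in practice a library class such as scikit-learn's `CountVectorizer` or `TfidfVectorizer` would usually do this work.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({token for doc in documents for token in doc.lower().split()})
    return {token: index for index, token in enumerate(tokens)}


def vectorize(document, vocabulary):
    """Return the term-count vector of one document over the vocabulary."""
    counts = Counter(document.lower().split())
    # Dicts keep insertion order, so positions match the vocabulary indices.
    return [counts.get(token, 0) for token in vocabulary]


# Invented tag strings for two movies.
tags = ["space marine alien", "spy mission spy"]
vocabulary = build_vocabulary(tags)
vectors = [vectorize(text, vocabulary) for text in tags]

print(vocabulary)
print(vectors)
```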
Once you have done that, we have to choose a distance with which to compare texts. In this module we have seen a few, and the only one compatible with what we want to do is the
With this code we can see the similarity between our vectors (vector representations of the
Finally, we can design our similarity function based on the cosine distance. Our proposal is as follows:
This function returns the 5 movies most similar to the one whose title we enter. We could use it as follows:
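A minimal sketch of what a similarity function of this kind might look like, together with a usage example. It is not the original proposal: the names `cosine_similarity` and `recommend`, the dictionary-based catalog and the toy titles and tags are all assumptions, and it works with cosine similarity (1 minus the cosine distance) computed over simple word counts.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (Counter objects)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(title, tags_by_title, k=5):
    """Return the k titles whose tags are most similar to those of `title`.

    Raises KeyError if `title` is not in the catalog.
    """
    vectors = {
        name: Counter(tags.lower().split()) for name, tags in tags_by_title.items()
    }
    target = vectors[title]
    scores = [
        (cosine_similarity(target, vector), name)
        for name, vector in vectors.items()
        if name != title  # never recommend the movie itself
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return [name for _, name in scores[:k]]


# Invented catalog: title -> tags string.
catalog = {
    "Movie A": "space alien war marine",
    "Movie B": "space alien colony",
    "Movie C": "spy mission london",
    "Movie D": "space war fleet",
    "Movie E": "romance paris",
    "Movie F": "alien marine",
    "Movie G": "spy london",
}

print(recommend("Movie A", catalog))
```

Because the function always returns `k` titles, movies with zero similarity can appear at the end of the list when the catalog is small, as in this toy example.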