Recommendation Systems - Your Future with Data

Difficulty

  • intermediate

Average duration

5 hrs

Technologies

This project aims to build a supervised classification model that, based on demographic and socioeconomic data of an adult (age, education level, occupation, marital status, country of origin, etc.), predicts whether the person will earn more or less than $50,000 per year.

Based on the model's results, students must develop an interpretable recommendation system capable of suggesting strategies or changes that increase the likelihood of surpassing that income threshold.

Objectives

  • Explore census data.
  • Build socioeconomic profiles.
  • Analyze the importance and weight of social variables (education, gender, race, etc.) in economic predictions.
  • Apply recommendation system techniques.
  • Visualize and professionally communicate findings.

🌱 How to start this project

Follow these instructions:

  1. Create a new repository based on the Machine Learning project template by clicking here.
  2. Open the newly created repository in Codespace using the Codespace button extension.
  3. Once the Codespace's VSCode has finished loading, start your project by following the instructions below.

📝 Instructions

  1. Load the dataset. We will use the Adult Income Dataset, also known as "Census Income". The data was collected by the U.S. Census Bureau, and a copy is stored in this project folder as adult-census-income.csv. Alternatively, you can load it directly in your code from the following link:

    https://raw.githubusercontent.com/4GeeksAcademy/predicting-your-future-with-data/main/adult-census-income.csv

    This dataset includes variables such as:

    • Age
    • Education level
    • Marital status
    • Occupation
    • Hours worked per week
    • Gender
    • Country of origin
    • Annual income (>50K or <=50K)
  2. Data preprocessing. Handle null or incorrectly encoded values, encode categorical variables, and normalize numerical variables.

  3. Define the recommendation problem. Plan how you will structure your recommendation system:

    • What is being recommended?
    • Who is the "user" in this case?
    • What variables define a user's profile?
  4. Build the recommendation system. Use one of the following approaches:

    • Content-based filtering. Represent each user as a vector and calculate similarities between users and recommendations.

    • Collaborative filtering. Simulate a user-by-trajectory matrix (users as rows, trajectories such as education or occupation as columns), then apply k-NN, Pearson correlation, or matrix factorization.

    • Hybrid system. Combine both approaches.

  5. Test with simulated cases. Build simulated profiles of hypothetical users and observe what trajectories (education, occupation, etc.) the system would recommend to improve their estimated income.

    # Example: 25-year-old user, high school graduate, works part-time
    user_profile = {...}
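
Steps 1 and 2 could be sketched roughly as follows. This is a minimal example under stated assumptions, not the project's reference solution: it assumes pandas and scikit-learn are installed, that missing values show up as "?" in the CSV, and that the caller passes in the column names. The helper names `load_data`, `clean`, and `build_preprocessor` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

DATA_URL = (
    "https://raw.githubusercontent.com/4GeeksAcademy/"
    "predicting-your-future-with-data/main/adult-census-income.csv"
)


def load_data(path=DATA_URL):
    """Read the dataset from the local file or from the URL given in step 1."""
    return pd.read_csv(path)


def clean(df, categorical_cols):
    """Trim whitespace and map '?' (assumed missing marker) to 'Unknown'."""
    df = df.copy()
    for col in categorical_cols:
        values = df[col].str.strip().replace("?", np.nan)
        df[col] = values.fillna("Unknown")
    return df


def build_preprocessor(numeric_cols, categorical_cols):
    """Median-impute and scale numeric columns; one-hot encode categorical ones."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = OneHotEncoder(handle_unknown="ignore")
    return ColumnTransformer(
        [("num", numeric, numeric_cols), ("cat", categorical, categorical_cols)],
        sparse_threshold=0,  # always return a dense array
    )
```

Treating missing categories as their own "Unknown" level is one option among several; dropping those rows or imputing the most frequent value would also be reasonable, depending on how many values are missing.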

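One possible reading of the content-based approach in step 4, applied to a simulated profile as in step 5, is sketched below. It is an illustration under assumptions rather than the expected solution: profiles are one-hot encoded, the user is compared by cosine similarity against profiles labeled as high income, and the most common values among the nearest neighbors are suggested. The function name `recommend_changes`, its parameters, and the restriction to categorical columns are all hypothetical choices; numerical features would need scaling before being mixed in.

```python
from collections import Counter

import pandas as pd
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import OneHotEncoder


def recommend_changes(profiles, high_income, user, changeable, k=3):
    """Suggest values for `changeable` columns, based on the k high-income
    profiles closest to `user` (cosine distance on one-hot vectors).

    profiles    -- DataFrame of categorical attributes, one row per person
    high_income -- boolean Series aligned with `profiles` (True means >50K)
    user        -- dict with one value per column of `profiles`
    changeable  -- columns the user could realistically change
    """
    # Fit the encoder on everyone so the user's categories are known.
    encoder = OneHotEncoder(handle_unknown="ignore").fit(profiles)

    earners = profiles[high_income].reset_index(drop=True)
    k = min(k, len(earners))
    index = NearestNeighbors(n_neighbors=k, metric="cosine")
    index.fit(encoder.transform(earners))

    user_row = pd.DataFrame([user], columns=profiles.columns)
    _, neighbor_ids = index.kneighbors(encoder.transform(user_row))
    neighbors = earners.iloc[neighbor_ids[0]]

    suggestions = {}
    for col in changeable:
        top_value, _ = Counter(neighbors[col]).most_common(1)[0]
        if top_value != user[col]:  # only report actual changes
            suggestions[col] = top_value
    return suggestions
```

The result is a dictionary mapping each changeable attribute to a suggested value, which keeps the recommendation easy to explain: every suggestion can be traced back to the similar high-income profiles that produced it. Keep in mind that similarity in census data reflects correlation, not a guaranteed effect of making the change.
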
🚛 How to deliver this project

Once you have completed the practical case, make sure to commit your changes, push them to your repository, and go to 4Geeks.com to submit the repository link.
