
Machine Learning Final Project

Goal

4Geeks Coding Projects tutorials and exercises for people learning to code or improving their coding skills

Difficulty

beginner

Repository

Click to open

Video

Not available

Live demo

Not available

Average duration

2 hrs


  • You have reached the final project! If you look back, do it only to see how far you have come! Now the final step to the finish line.

  • We have built projects based on different business problems, from different industries, and using a variety of algorithms. Now it's time to build your own project using the algorithm that you think is right for your problem.

  • If an ML model makes a prediction in Jupyter, is anyone around to hear it? Probably not. Deploying models is the key to making them useful.

“Hard work always beats talent when talent doesn't work hard” - Tim Notke

🌱 How to start this project

  1. Create a new repository based on the machine learning project template by clicking here.
  2. Open the recently created repository on Gitpod by using the Gitpod button extension.
  3. Once Gitpod VSCode has finished opening, add or edit any necessary files or folders to make your project structure ready for deployment.
  4. Start your project following the Instructions below.

🚛 How to deliver this project

You should deliver:

  • The link to your project's GitHub repository (already deployed).

  • The link to your deployed machine learning web application.

📝 Instructions

Group formation

Groups should ideally have three members. The minimum is two.

Project Phases

1. Problem Definition

Start by defining the business problem, then translate it into a machine learning problem.

“A problem defined is a problem half solved” - Albert Einstein

2. Data collection

How will you collect the data? Is it an existing public dataset? Will you have to merge data from different sources? Maybe do some web scraping?
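If your data comes from more than one source, merging is often the first practical step. The following is a minimal sketch, assuming pandas is available; the `customers` and `orders` tables and the `customer_id` column are made up for illustration.

```python
import pandas as pd

# Hypothetical data standing in for two separate sources.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "city": ["Miami", "Madrid", "Caracas"],
})
orders = pd.DataFrame({
    "customer_id": [1, 1, 3, 4],
    "amount": [20.0, 35.5, 12.0, 50.0],
})

# An outer join keeps unmatched rows from both sides; indicator=True adds
# a "_merge" column telling which source each row came from.
merged = customers.merge(orders, on="customer_id", how="outer", indicator=True)

# Rows that exist in only one source usually deserve a closer look.
unmatched = merged[merged["_merge"] != "both"]
print(merged)
print(f"{len(unmatched)} rows did not match across sources")
```

Checking the unmatched rows early tends to reveal gaps in the data before they turn into silent missing values later on.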

This step is fundamental: in a real-life project, the data you have determines whether the problem can be solved as is, or whether you will have to convince your client that paying for more data is really needed.

3. Exploratory Data Analysis

Explore your data as much as you can to find important patterns and relationships between features. Use graphs to explain these patterns. They will be important to show in your presentation.
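A first numeric pass over the data could look like the sketch below. It assumes pandas and a made-up housing dataset in which `price` is the target column; charts can be layered on top with whatever plotting library you prefer.

```python
import pandas as pd

# Hypothetical dataset: "price" is the assumed target column.
df = pd.DataFrame({
    "rooms": [2, 3, 3, 4, 5, 5],
    "age_years": [30, 25, 20, 10, 5, 2],
    "price": [100, 150, 160, 220, 300, 310],
})

# Summary statistics and missing-value counts give a first overview.
print(df.describe())
print(df.isna().sum())

# Pearson correlation of each feature with the target, strongest first.
corr_with_target = (
    df.corr()["price"].drop("price").sort_values(key=abs, ascending=False)
)
print(corr_with_target)
```

Correlation only captures linear relationships, so it is a starting point for choosing which plots to draw rather than a replacement for them.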

4. Data Preprocessing

Clean your data to build a good model, because poor quality data will always produce faulty results. You can go back to your data preprocessing module to remember all the steps needed.

If this is a classification problem, is your data balanced? If not, consider resampling it, or make sure to pick the correct evaluation metric.
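As an illustration of the balance check, the sketch below counts class frequencies and applies naive random oversampling using only the standard library. The labels are made up, and duplicating minority rows is just one of several resampling options.

```python
import random
from collections import Counter

# Hypothetical labels: 90 negatives and 10 positives.
labels = [0] * 90 + [1] * 10
rows = list(range(len(labels)))  # stand-ins for feature rows

counts = Counter(labels)
majority = max(counts.values())
print({label: n / len(labels) for label, n in counts.items()})

# Naive random oversampling: duplicate minority rows (with replacement)
# until every class has as many rows as the majority class.
rng = random.Random(42)
balanced_rows, balanced_labels = list(rows), list(labels)
for label, n in counts.items():
    candidates = [r for r, y in zip(rows, labels) if y == label]
    extra = rng.choices(candidates, k=majority - n)
    balanced_rows.extend(extra)
    balanced_labels.extend([label] * len(extra))

print(Counter(balanced_labels))
```

If you resample, do it on the training split only; otherwise duplicated rows can leak into the test set and inflate your metric.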

Does your data have a lot of outliers? Are they normal values from your population, or should you drop them? Or, even better, impute them?
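One common way to flag outliers is the 1.5 × IQR rule. The sketch below uses only the standard library and made-up values; replacing flagged values with the median of the remaining ones is shown as one possible treatment, not a recommendation for every dataset.

```python
import statistics

# Hypothetical measurements with one suspicious value.
values = [12, 13, 13, 14, 15, 15, 16, 17, 18, 95]

# Quartiles via the standard library (default "exclusive" method).
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [v for v in values if v < lower or v > upper]

# One possible treatment: replace flagged values with the median of the rest.
median_inliers = statistics.median(v for v in values if lower <= v <= upper)
imputed = [v if lower <= v <= upper else median_inliers for v in values]
print(outliers, imputed)
```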

Will your model require normalization? Some algorithms, such as tree-based models, are insensitive to feature scale, so normalization may not be needed.
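When scaling is needed, the parameters should be learned from the training data and then reused on the test data. The sketch below standardizes a single made-up feature with the standard library; scikit-learn's `StandardScaler` follows the same fit-then-transform idea.

```python
import statistics

# Hypothetical single feature, already split into train and test parts.
train = [10.0, 20.0, 30.0, 40.0, 50.0]
test = [25.0, 60.0]

# Learn the scaling parameters from the training data only, so that no
# information from the test set leaks into preprocessing.
mean = statistics.fmean(train)
std = statistics.pstdev(train)


def standardize(xs):
    """Return z-scores using the mean and std learned from `train`."""
    return [(x - mean) / std for x in xs]


train_scaled = standardize(train)
test_scaled = standardize(test)
print(train_scaled, test_scaled)
```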

Ask yourself all these questions before training your model. Who knows, they may be the same questions you are asked at your presentation.

Remember: Garbage in, garbage out.

5. Model and results

Pick one or more algorithms to train, evaluate and tune (hyperparameter optimization). Choose the model you will be working with and save it for the deployment step.
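The train, evaluate, tune and save cycle might look like the following sketch. It assumes scikit-learn is installed and uses synthetic data, a random forest, the F1 score and the file name `model.pkl` purely as placeholders for your own choices.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data standing in for your own preprocessed dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Small, illustrative grid; real projects usually search more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="f1",
    cv=3,
)
search.fit(X_train, y_train)

# Evaluate the best model on data it has never seen.
best_model = search.best_estimator_
test_f1 = f1_score(y_test, best_model.predict(X_test))
print(search.best_params_, round(test_f1, 3))

# Persist the fitted model so the web application can load it later.
# "model.pkl" is a hypothetical file name.
with open("model.pkl", "wb") as f:
    pickle.dump(best_model, f)

# Reload it to confirm the saved file round-trips correctly.
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)
```

Keeping the test split out of the grid search means the final score reflects data the tuning process never touched.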

6. Deployment

Build a machine learning web application using your saved model. You can use Flask, Streamlit or any other tool that you know. Use Heroku or another cloud computing platform that you prefer to deploy your web application and share it with the world.
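As one possible shape for the web application, here is a minimal Flask sketch. The `/predict` route, the `features` JSON key, the `create_app` and `load_model` helpers and the `model.pkl` path are all assumptions made for illustration; any object with a scikit-learn style `predict` method would work as the model.

```python
import pickle

from flask import Flask, jsonify, request


def load_model(path="model.pkl"):
    """Load a pickled model; the default path is hypothetical."""
    with open(path, "rb") as f:
        return pickle.load(f)


def create_app(model):
    """Build a Flask app that serves predictions from a fitted model."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        payload = request.get_json(silent=True) or {}
        features = payload.get("features")
        if not isinstance(features, list) or not features:
            return jsonify({"error": "expected JSON with a 'features' list"}), 400

        prediction = model.predict([features])[0]
        # NumPy scalars are not JSON serializable; .item() converts them
        # to plain Python values.
        if hasattr(prediction, "item"):
            prediction = prediction.item()
        return jsonify({"prediction": prediction})

    return app


# To try it locally with a real model, something like:
#     create_app(load_model()).run(port=5000)
```

Passing the model into `create_app` keeps the app easy to test with a stand-in model, and only load pickle files you created yourself, since unpickling untrusted files can run arbitrary code.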

Presentation

The presentation will last 5 minutes per group, so use your time efficiently. We will review the code ourselves, so do not spend time explaining it. Focus on the important points, as if you were selling your project to your company's stakeholders. They probably won't have a technical background (though some might), so use simple words and an easy-to-understand notebook presentation. Remember that quality beats quantity.

Important points to cover in your 5-minute presentation:

  • What the business problem was.

  • How you collected the data.

  • Important patterns found in the data.

  • Which algorithm and evaluation metric you used to build your final model.

  • A demonstration of your web application working, and how it could be improved in the future.

“The secret of getting ahead is getting started.” - Mark Twain
