You have reached the final project! If you look back, do it only to see how far you have come! Now the final step to the finish line.
We have built projects based on different business problems, from different industries, and using a variety of algorithms. Now it's time to build your own project using the algorithm that you think is right for your problem.
If an ML model makes a prediction in Jupyter, is anyone around to hear it? Probably not. Deploying models is the key to making them useful.
“Hard work always beats talent when talent doesn't work hard” - Tim Notke
You should deliver:
-the link to your project's GitHub repo (with the project already deployed).
-the link to your deployed machine learning web application.
Groups should ideally consist of three people. Minimum number of members is two people.
1. Problem Definition
Start by defining the business problem and then translating it into a machine learning problem.
“A problem defined is a problem half solved” - Albert Einstein
2. Data collection
How will you collect the data? Is it an existing public dataset? Will you have to merge data from different sources? Maybe do some web scraping?
This step is fundamental because, in a real-life project, the data you have determines whether the problem can be solved as is or whether you will have to convince your client that paying for more data is really needed.
3. Exploratory Data Analysis
Explore your data as much as you can to find important patterns and relationships between features. Use graphs to explain these patterns. This will be important to show in your presentation.
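As a starting point for this exploration, here is a minimal sketch assuming pandas is installed. The column names (`tenure_months`, `monthly_charge`, `churned`) and all values are made up for illustration; replace them with your own dataset.

```python
import pandas as pd

# Toy data, invented purely for illustration.
df = pd.DataFrame({
    "tenure_months": [1, 3, 6, 12, 24, 36, 48, 60],
    "monthly_charge": [70, 65, 80, 55, 50, 45, 40, 35],
    "churned": [1, 1, 1, 0, 0, 0, 0, 0],
})

# Basic statistics for every numeric column (count, mean, std, quartiles).
summary = df.describe()

# Average feature values per target class: a quick way to spot patterns.
churn_means = df.groupby("churned")[["tenure_months", "monthly_charge"]].mean()

# Linear correlation of each feature with the target.
correlations = df.corr()["churned"].drop("churned").sort_values()

print(summary)
print(churn_means)
print(correlations)
```

If matplotlib is available, calls such as `df.plot.scatter(x="tenure_months", y="monthly_charge")` or `df["monthly_charge"].hist()` turn the same relationships into graphs for your presentation.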
4. Data Preprocessing
Clean your data to build a good model, because poor-quality data will always produce faulty results. You can go back to your data preprocessing module to review all the steps needed.
If this is a classification problem, is your data balanced? If not, consider resampling it, or make sure to pick the correct evaluation metric.
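One way to check the balance and apply a simple resampling is sketched below, using only the Python standard library. The helper names `class_ratio` and `random_oversample` are hypothetical, and the toy data is made up; libraries such as imbalanced-learn offer more complete alternatives.

```python
import random
from collections import Counter


def class_ratio(labels):
    """Return the share of each class in `labels` as a dict."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate random rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = rng.choices(pool, k=target - n)  # empty when the class is already full
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data, invented for illustration: 8 negatives and 2 positives.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

print(class_ratio(y))
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Resample only the training split, never the test split, so that duplicated rows do not leak into your evaluation. If you keep the data imbalanced instead, metrics such as precision, recall or F1 are usually more informative than accuracy.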
Does your data have a lot of outliers? Are they normal values from your population, or should you drop them? Or, even better, impute them?
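A minimal sketch of one common approach, the interquartile range (IQR) rule, with median imputation of the flagged values. The function names, the `k=1.5` multiplier and the `ages` list are assumptions chosen for illustration; whether a flagged value is really an error depends on your domain.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) limits; values outside them are treated as outliers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def impute_outliers(values, k=1.5):
    """Replace values outside the IQR limits with the median of the inliers."""
    low, high = iqr_bounds(values, k)
    inliers = [v for v in values if low <= v <= high]
    median = statistics.median(inliers)
    return [v if low <= v <= high else median for v in values]


# Toy data, invented for illustration: 240 is clearly not a plausible age.
ages = [23, 25, 27, 29, 30, 31, 33, 35, 240]
low, high = iqr_bounds(ages)
cleaned = impute_outliers(ages)

print("limits:", low, high)
print("cleaned:", cleaned)
```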
Will your model require normalization? Some algorithms, such as tree-based models, are insensitive to feature scale, so normalization may not be needed.
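If you do need it, the sketch below shows standardization (zero mean, unit variance) with the standard library. The names `fit_scaler` and `transform` and the sample values are hypothetical; in practice you would more likely use a ready-made scaler such as scikit-learn's `StandardScaler`.

```python
import statistics


def fit_scaler(train_column):
    """Learn the mean and population standard deviation from the training data only."""
    mean = statistics.fmean(train_column)
    std = statistics.pstdev(train_column)
    return mean, std or 1.0  # avoid dividing by zero on a constant column


def transform(column, mean, std):
    """Apply previously learned statistics to any split (train, validation, test)."""
    return [(v - mean) / std for v in column]


# Toy values, invented for illustration.
train = [10.0, 20.0, 30.0, 40.0]
test = [25.0, 50.0]

mean, std = fit_scaler(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)  # reuses the training statistics

print(train_scaled)
print(test_scaled)
```

The statistics are learned from the training split and reused on the test split, so no information from the test data influences the preprocessing.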
Ask yourself all these questions before training your model. Who knows, these may be the same questions you will be asked at your presentation.
Remember: Garbage in, garbage out.
5. Model and results
Pick one or more algorithms to train, evaluate and tune their hyperparameters. Choose the one you will be working with and save it for the deployment step.
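The train, tune, evaluate and save cycle could look like the sketch below, assuming scikit-learn is installed. The Iris dataset, the decision tree, the parameter grid and the `final_model.pkl` file name are placeholders chosen for illustration, not a recommendation for your problem.

```python
import os
import pickle
import tempfile

from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Placeholder dataset; use your own features and target here.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Hyperparameter tuning with 5-fold cross-validation on the training split only.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid={"max_depth": [2, 3, 4, 5], "min_samples_leaf": [1, 3, 5]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Evaluate the best configuration once on the held-out test split.
best_model = search.best_estimator_
test_accuracy = accuracy_score(y_test, best_model.predict(X_test))
print("best params:", search.best_params_)
print("test accuracy:", round(test_accuracy, 3))

# Save the chosen model for the deployment step (illustrative path).
model_path = os.path.join(tempfile.gettempdir(), "final_model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(best_model, f)

# Reload it the same way your web application would.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)
```

Load the pickle with the same scikit-learn version you used to save it, since pickled models are not guaranteed to work across versions.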
Build a machine learning web application using your saved model. You can use Flask, Streamlit or any other tool that you know. Use Heroku or another cloud computing platform that you prefer to deploy your web application and share it with the world.
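As one possible shape for such an application, here is a minimal Flask sketch that exposes a prediction endpoint. The `/predict` route, the JSON format and `PlaceholderModel` are assumptions made for illustration; in your project the placeholder would be replaced by the model you saved and loaded with pickle.

```python
from flask import Flask, jsonify, request


class PlaceholderModel:
    """Stand-in for your saved model; it only thresholds the sum of the features."""

    def predict(self, rows):
        return [1 if sum(row) > 10 else 0 for row in rows]


# In a real app, load your own saved model here instead of the placeholder.
model = PlaceholderModel()

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    """Read a JSON body with a "features" list and return the model's prediction."""
    payload = request.get_json(silent=True)
    features = payload.get("features") if isinstance(payload, dict) else None
    valid = (
        isinstance(features, list)
        and len(features) > 0
        and all(isinstance(v, (int, float)) for v in features)
    )
    if not valid:
        return jsonify(error="expected a JSON body with a 'features' list of numbers"), 400
    prediction = model.predict([features])[0]
    return jsonify(prediction=int(prediction))
```

Assuming the file is saved as `app.py` and a recent Flask version is installed, `flask --app app run` starts a local server, and a POST request with a body like `{"features": [5.1, 3.5, 1.4, 0.2]}` returns the prediction as JSON. Streamlit would replace the route with interactive input widgets, but the load-then-predict idea stays the same.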
The presentation will last 5 minutes per group, so make sure to use your time efficiently. The code will be reviewed by us, so do not waste time explaining your code. You should focus on the important points as if you were trying to sell your project to the stakeholders of your company. They probably won't have a technical background (though some might), so try using simple words and an easy-to-understand notebook presentation. Remember that quality beats quantity.
Important points we recommend covering in your 5-minute presentation:
What was the business problem?
How did you collect the data?
What important patterns did you find in the data?
What algorithm and evaluation metric did you use to build your final model?
Show your web application working and mention how it can be improved in the future.
“The secret of getting ahead is getting started.” - Mark Twain