4Geeks Coding Projects tutorials and exercises for people learning to code or improving their coding skills
Difficulty: beginner
Video: Not available
Live demo: Not available
Average duration: 2 hrs
You will not be forking this time. Please take some time to read these instructions:
Once you have finished your clustering project, commit your changes, push them to your repository, and go to 4Geeks.com to upload the repository link.
House clustering
We will create 6 housing clusters based only on the 'latitude', 'longitude', and 'medincome' columns.
Dataset link:
https://raw.githubusercontent.com/4GeeksAcademy/k-means-project-tutorial/main/housing.csv
Step 1:
Install and import the necessary libraries: pandas, sklearn (installed as the scikit-learn package), and seaborn.
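A minimal sketch of the imports this project relies on. The install command in the comment assumes you use pip; the aliases pd and sns are common conventions, not requirements.

```python
# Install once from a terminal (assumes pip is your package manager):
#   pip install pandas scikit-learn seaborn

import pandas as pd                 # data loading and dataframes
import seaborn as sns               # plotting
from sklearn.cluster import KMeans  # the clustering algorithm

print(pd.__version__, sns.__version__)
```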
Step 2:
Load the housing dataset and take a look at the first rows. Then create a new dataframe with only the 'latitude', 'longitude', and 'medincome' columns, which we will use to create our clusters.
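One possible way to approach this step is sketched below. The helper names load_housing and select_features are hypothetical, and the column names are taken as written in these instructions; the names in the actual file may differ (for example in capitalization), so check df.columns first. The small sample dataframe is made up so the sketch runs without downloading anything.

```python
import pandas as pd

DATA_URL = "https://raw.githubusercontent.com/4GeeksAcademy/k-means-project-tutorial/main/housing.csv"

# Column names as written in the instructions (assumption: adjust them
# if df.columns shows different spelling or capitalization).
FEATURES = ["latitude", "longitude", "medincome"]


def load_housing(url=DATA_URL):
    """Download the CSV into a dataframe (hypothetical helper)."""
    return pd.read_csv(url)


def select_features(df, features=FEATURES):
    """Return a copy of df holding only the clustering columns."""
    missing = [col for col in features if col not in df.columns]
    if missing:
        raise KeyError(f"Columns not found: {missing}. Check df.columns.")
    return df[features].copy()


# Made-up rows, used here only so the example runs offline.
sample = pd.DataFrame({
    "latitude": [37.88, 34.05, 32.72, 38.58],
    "longitude": [-122.23, -118.24, -117.16, -121.49],
    "medincome": [8.3, 3.9, 4.5, 5.1],
    "houseage": [41, 21, 52, 30],
})

X = select_features(sample)
print(X.head())
```

With the real data you would call `select_features(load_housing())` instead of using the sample.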
Step 3:
Instantiate the k-means algorithm with 6 clusters. Then fit it on your 3 columns and store the predicted cluster of each row in a new 'cluster' feature in your dataset. See the k-means documentation for implementation details: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
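The step above could look roughly like the following sketch. It runs on randomly generated stand-in data rather than the real housing file, and the n_init and random_state values are assumptions chosen only to make the result repeatable.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Stand-in data (made up): 60 random points in a California-like range.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "latitude": rng.uniform(32.5, 42.0, size=60),
    "longitude": rng.uniform(-124.3, -114.3, size=60),
    "medincome": rng.uniform(0.5, 15.0, size=60),
})

# 6 clusters, as the project asks; random_state fixes the initialization.
model = KMeans(n_clusters=6, n_init=10, random_state=42)

# fit_predict fits the model and returns one cluster label per row.
X["cluster"] = model.fit_predict(X[["latitude", "longitude", "medincome"]])

print(X["cluster"].value_counts().sort_index())
```

Selecting the three feature columns explicitly in the fit_predict call keeps the new 'cluster' column from being fed back in if the cell is run twice.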
Step 4:
Convert your new 'cluster' column to a 'category' type.
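A short sketch of the conversion, using a made-up 'cluster' column. Treating the labels as categories tells plotting libraries that cluster 3 is a group name, not a quantity larger than cluster 2.

```python
import pandas as pd

# Made-up cluster labels standing in for the k-means output.
df = pd.DataFrame({"cluster": [0, 2, 1, 2, 0, 5]})

# Integer labels become categorical values.
df["cluster"] = df["cluster"].astype("category")

print(df["cluster"].dtype)
```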
Step 5:
Use seaborn's relplot to visualize your new clusters.
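As an illustration, one way to draw the clusters is a scatter plot of longitude against latitude, colored by cluster. The data below is made up, and the choice of axes is an assumption; you may prefer to plot 'medincome' on one axis instead.

```python
import pandas as pd
import seaborn as sns

# Made-up points with cluster labels already assigned.
df = pd.DataFrame({
    "latitude": [37.9, 34.1, 32.7, 38.6, 36.7, 40.8],
    "longitude": [-122.2, -118.2, -117.2, -121.5, -119.8, -124.2],
    "medincome": [8.3, 3.9, 4.5, 5.1, 2.7, 3.2],
    "cluster": [0, 1, 1, 0, 2, 3],
})
df["cluster"] = df["cluster"].astype("category")

# relplot returns a FacetGrid; hue colors each point by its cluster.
grid = sns.relplot(data=df, x="longitude", y="latitude",
                   hue="cluster", kind="scatter")
grid.set_axis_labels("Longitude", "Latitude")
```

In a notebook the figure is displayed automatically; in a script you could call `grid.savefig(...)` with a file name of your choice.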
Step 6:
As always, use your notebook to experiment and make sure you are getting the results you want.
Use your app.py file to save your defined steps, pipelines or functions in the right order.
In your README file, write a brief summary.
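For app.py, the steps could be collected into a single function, as in the sketch below. The function name add_clusters, its parameters, and the demo rows are all hypothetical; with the real data you would pass in the dataframe loaded from the dataset link.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Column names as written in the instructions (assumption: verify
# them against the real file before running on it).
FEATURES = ["latitude", "longitude", "medincome"]


def add_clusters(df, n_clusters=6, random_state=42):
    """Return a copy of df with a categorical 'cluster' column."""
    out = df.copy()
    model = KMeans(n_clusters=n_clusters, n_init=10,
                   random_state=random_state)
    out["cluster"] = model.fit_predict(out[FEATURES])
    out["cluster"] = out["cluster"].astype("category")
    return out


# Made-up rows so the sketch runs without a download.
demo = pd.DataFrame({
    "latitude": [37.9, 34.1, 32.7, 38.6, 36.7, 40.8, 33.9, 35.4],
    "longitude": [-122.2, -118.2, -117.2, -121.5, -119.8, -124.2, -117.9, -119.0],
    "medincome": [8.3, 3.9, 4.5, 5.1, 2.7, 3.2, 6.0, 4.1],
})

clustered = add_clusters(demo)
print(clustered.head())
```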
Solution guide:
https://github.com/4GeeksAcademy/k-means-project-tutorial/blob/main/solution_guide.ipynb