
Pandas Exercises And Solutions

Pandas exercises and solutions using Jupyter Notebooks

The solutions below correspond to the pandas exercises given in the pandas Python lesson at 4Geeks.com, where the exercise instructions can be found.

In [ ]:
import numpy as np
import pandas as pd

np.random.seed(42)

Exercise 01

In [2]:
# From list
l = [1, 2, 3, 4, 5, 6]
serie = pd.Series(l)
print(serie)

# From NumPy array
array = np.array([1, 2, 3, 4, 5, 6])
serie = pd.Series(array)
print(serie)

# From dictionary
d = {"A": 1, "B": 2, "C": 3}
serie = pd.Series(d)
print(serie)
0    1
1    2
2    3
3    4
4    5
5    6
dtype: int64
0    1
1    2
2    3
3    4
4    5
5    6
dtype: int64
A    1
B    2
C    3
dtype: int64
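
The dictionary case above shows that the keys become the index labels of the Series. As a small sketch of the same idea, the labels can also be passed explicitly through the `index` parameter; the names `values`, `labels`, `serie_from_list` and `serie_from_dict` are illustrative and not part of the original exercise.

```python
import pandas as pd

# Same data as the dictionary example, with the labels supplied via `index`
values = [1, 2, 3]
labels = ["A", "B", "C"]
serie_from_list = pd.Series(values, index=labels)

# Reference Series built from a dictionary, as in the exercise
serie_from_dict = pd.Series({"A": 1, "B": 2, "C": 3})

# Both Series hold the same values under the same labels
print(serie_from_list.equals(serie_from_dict))
```

Passing `index` is convenient when the values and the labels come from two separate lists rather than from a single dictionary.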

Exercise 02

In [3]:
# From NumPy array
array = np.random.randint(1, 10, size = (5, 5))
dataframe = pd.DataFrame(array)
dataframe
Out[3]:
   0  1  2  3  4
0  7  4  8  5  7
1  3  7  8  5  4
2  8  8  3  6  5
3  2  8  6  2  5
4  1  6  9  1  3
In [4]:
# From dictionary
d = {
    "A": np.random.randint(10, 100, size = 5),
    "B": np.linspace(1, 10, 5),
    "C": np.random.randn(5)
}
dataframe = pd.DataFrame(d)
dataframe
Out[4]:
    A      B         C
0  64   1.00 -0.600254
1  73   3.25  0.947440
2  12   5.50  0.291034
3  60   7.75 -0.635560
4  16  10.00 -1.021552
In [5]:
# From list of tuples
t = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
dataframe = pd.DataFrame(t)
dataframe
Out[5]:
   0  1  2
0  1  2  3
1  4  5  6
2  7  8  9

Exercise 03

In [6]:
s1 = pd.Series([1, 2, 3, 4, 5])
s2 = pd.Series([4, 5, 6, 7, 8])

# Method 1
# Both Series share the same default index, so they are aligned row by row
dataframe = pd.DataFrame({"ser1": s1, "ser2": s2})
dataframe
Out[6]:
   ser1  ser2
0     1     4
1     2     5
2     3     6
3     4     7
4     5     8
In [7]:
# Method 2
dataframe = pd.concat([s1, s2], axis = 1)
dataframe
Out[7]:
   0  1
0  1  4
1  2  5
2  3  6
3  4  7
4  5  8
In [8]:
# Method 3
s1.name = "ser1"
s2.name = "ser2"

dataframe = s1.to_frame().join(s2)
dataframe
Out[8]:
   ser1  ser2
0     1     4
1     2     5
2     3     6
3     4     7
4     5     8
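
Method 2 leaves the columns named 0 and 1 because the Series have no names at that point. One possible variation, shown here only as a sketch, is to pass the `keys` parameter of `pd.concat` so that the columns are labelled during concatenation.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3, 4, 5])
s2 = pd.Series([4, 5, 6, 7, 8])

# `keys` supplies the column labels when concatenating along axis=1
dataframe = pd.concat([s1, s2], axis=1, keys=["ser1", "ser2"])
print(dataframe)
```

This gives the same column labels as Methods 1 and 3 without having to set `s1.name` and `s2.name` beforehand.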

Exercise 04

In [9]:
s1 = pd.Series([1, 2, 3, 4, 5])
s2 = pd.Series([4, 5, 6, 7, 8])

# Method 1: Using Pandas function
filtering_results = s1.isin(s2)
indices = s1[filtering_results].index

indices
Out[9]:
Index([3, 4], dtype='int64')
In [10]:
# Method 2: Using NumPy function
indices = np.where(s1.isin(s2))
indices
Out[10]:
(array([3, 4]),)
In [11]:
# Method 3: Using Python
indices = []

# Walk over (index label, value) pairs so repeated values keep their own label
for index, value in s1.items():
    if value in s2.values:
        indices.append(index)

indices
Out[11]:
[3, 4]
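
As a further alternative to the three methods above, NumPy's `np.intersect1d` can return the common values together with their positions in both inputs when `return_indices=True` is set. This is a sketch rather than part of the original solution; note that it reports positions (not index labels) and the first occurrence of each common value.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([1, 2, 3, 4, 5])
s2 = pd.Series([4, 5, 6, 7, 8])

# common: sorted values present in both arrays
# pos_s1 / pos_s2: positions of those values in s1 and s2 respectively
common, pos_s1, pos_s2 = np.intersect1d(
    s1.to_numpy(), s2.to_numpy(), return_indices=True
)

print(common)   # values shared by both Series
print(pos_s1)   # where they sit in s1
print(pos_s2)   # where they sit in s2
```

With the default integer index used here, the positions in `pos_s1` coincide with the index labels found by the other methods.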

Exercise 05

In [12]:
s1 = pd.Series([1, 2, 3, 4, 5])
s2 = pd.Series([4, 5, 6, 7, 8])

# Method 1
unique_s1 = s1[~s1.isin(s2)]
unique_s2 = s2[~s2.isin(s1)]

unique_elements = np.concatenate([unique_s1, unique_s2])
unique_elements
Out[12]:
array([1, 2, 3, 6, 7, 8])
In [13]:
# Method 2
concat = pd.concat([s1, s2])
unique_elements = concat[~concat.duplicated(keep = False)].values
unique_elements
Out[13]:
array([1, 2, 3, 6, 7, 8])
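
The elements that appear in only one of the two Series form their symmetric difference, which NumPy exposes directly as `np.setxor1d`. The following sketch is an alternative to the two methods above; unlike them, it always returns the result sorted and without repeated values.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([1, 2, 3, 4, 5])
s2 = pd.Series([4, 5, 6, 7, 8])

# Symmetric difference: values found in exactly one of the two arrays
unique_elements = np.setxor1d(s1.to_numpy(), s2.to_numpy())
print(unique_elements)
```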

Exercise 06

In [14]:
df = pd.DataFrame(np.random.rand(10, 5) * 10, columns = [f"Col {i}" for i in range(5)])
df
Out[14]:
      Col 0     Col 1     Col 2     Col 3     Col 4
0  4.951769  0.343885  9.093204  2.587800  6.625223
1  3.117111  5.200680  5.467103  1.848545  9.695846
2  7.751328  9.394989  8.948274  5.979000  9.218742
3  0.884925  1.959829  0.452273  3.253303  3.886773
4  2.713490  8.287375  3.567533  2.809345  5.426961
5  1.409242  8.021970  0.745506  9.868869  7.722448
6  1.987157  0.055221  8.154614  7.068573  7.290072
7  7.712703  0.740447  3.584657  1.158691  8.631034
8  6.232981  3.308980  0.635584  3.109823  3.251833
9  7.296062  6.375575  8.872127  4.722149  1.195942
In [15]:
df.sort_values("Col 0")
Out[15]:
      Col 0     Col 1     Col 2     Col 3     Col 4
3  0.884925  1.959829  0.452273  3.253303  3.886773
5  1.409242  8.021970  0.745506  9.868869  7.722448
6  1.987157  0.055221  8.154614  7.068573  7.290072
4  2.713490  8.287375  3.567533  2.809345  5.426961
1  3.117111  5.200680  5.467103  1.848545  9.695846
0  4.951769  0.343885  9.093204  2.587800  6.625223
8  6.232981  3.308980  0.635584  3.109823  3.251833
9  7.296062  6.375575  8.872127  4.722149  1.195942
7  7.712703  0.740447  3.584657  1.158691  8.631034
2  7.751328  9.394989  8.948274  5.979000  9.218742
In [16]:
df.sort_values(by = ["Col 2", "Col 4"])
Out[16]:
      Col 0     Col 1     Col 2     Col 3     Col 4
3  0.884925  1.959829  0.452273  3.253303  3.886773
8  6.232981  3.308980  0.635584  3.109823  3.251833
5  1.409242  8.021970  0.745506  9.868869  7.722448
4  2.713490  8.287375  3.567533  2.809345  5.426961
7  7.712703  0.740447  3.584657  1.158691  8.631034
1  3.117111  5.200680  5.467103  1.848545  9.695846
6  1.987157  0.055221  8.154614  7.068573  7.290072
9  7.296062  6.375575  8.872127  4.722149  1.195942
2  7.751328  9.394989  8.948274  5.979000  9.218742
0  4.951769  0.343885  9.093204  2.587800  6.625223
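
With random floating-point data, ties in the first sort column are very unlikely, so the second column passed to `by` rarely changes the order. The sketch below uses a small hand-made DataFrame (the names `df_small`, `group` and `score` are hypothetical) where ties do occur, and also shows that `ascending` accepts one flag per column.

```python
import pandas as pd

# Hypothetical data with repeated values in the first sort key
df_small = pd.DataFrame({
    "group": ["b", "a", "b", "a"],
    "score": [1, 3, 2, 4],
})

# Sort by `group` ascending, then break ties by `score` descending
sorted_df = df_small.sort_values(by=["group", "score"], ascending=[True, False])
print(sorted_df)
```

Like the calls above, `sort_values` returns a new DataFrame and leaves `df_small` unchanged.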

Exercise 07

In [17]:
df.columns = [f"{i}_column" for i in range(5)]
df
Out[17]:
   0_column  1_column  2_column  3_column  4_column
0  4.951769  0.343885  9.093204  2.587800  6.625223
1  3.117111  5.200680  5.467103  1.848545  9.695846
2  7.751328  9.394989  8.948274  5.979000  9.218742
3  0.884925  1.959829  0.452273  3.253303  3.886773
4  2.713490  8.287375  3.567533  2.809345  5.426961
5  1.409242  8.021970  0.745506  9.868869  7.722448
6  1.987157  0.055221  8.154614  7.068573  7.290072
7  7.712703  0.740447  3.584657  1.158691  8.631034
8  6.232981  3.308980  0.635584  3.109823  3.251833
9  7.296062  6.375575  8.872127  4.722149  1.195942
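
Assigning to `df.columns` replaces every label at once and modifies the DataFrame in place. A possible alternative, sketched here on a small hypothetical DataFrame named `df_demo`, is `DataFrame.rename`, which derives each new label from the old one and returns a new object.

```python
import pandas as pd

# Hypothetical two-column frame using the same "Col i" naming as above
df_demo = pd.DataFrame({"Col 0": [1.0, 2.0], "Col 1": [3.0, 4.0]})

# Build each new label from the old one: "Col 0" -> "0_column"
renamed = df_demo.rename(columns=lambda name: name.replace("Col ", "") + "_column")

print(list(renamed.columns))
print(list(df_demo.columns))  # the original labels are untouched
```

`rename` also accepts a dictionary, which is handy when only some of the columns need new names.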

Exercise 08

In [18]:
df.index = [f"{i}_row" for i in range(10)]
df
Out[18]:
       0_column  1_column  2_column  3_column  4_column
0_row  4.951769  0.343885  9.093204  2.587800  6.625223
1_row  3.117111  5.200680  5.467103  1.848545  9.695846
2_row  7.751328  9.394989  8.948274  5.979000  9.218742
3_row  0.884925  1.959829  0.452273  3.253303  3.886773
4_row  2.713490  8.287375  3.567533  2.809345  5.426961
5_row  1.409242  8.021970  0.745506  9.868869  7.722448
6_row  1.987157  0.055221  8.154614  7.068573  7.290072
7_row  7.712703  0.740447  3.584657  1.158691  8.631034
8_row  6.232981  3.308980  0.635584  3.109823  3.251833
9_row  7.296062  6.375575  8.872127  4.722149  1.195942
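
Once the index holds string labels, rows are selected by label with `.loc`, while `.iloc` keeps working by position. The sketch below illustrates this on a small hypothetical DataFrame (`df_demo`) that follows the same `i_row` and `i_column` naming.

```python
import pandas as pd

# Hypothetical frame with the same labelling scheme as the exercise
df_demo = pd.DataFrame(
    {"0_column": [4.9, 3.1, 7.7]},
    index=[f"{i}_row" for i in range(3)],
)

# Label-based access: a whole row, then a single cell
row = df_demo.loc["1_row"]
value = df_demo.loc["2_row", "0_column"]

# Position-based access still works regardless of the labels
same_value = df_demo.iloc[2, 0]

print(row)
print(value, same_value)
```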