Exercise Set IV - Plotting
Exercise 1
Create a simple line plot using Matplotlib. The x-axis should represent the numbers 1 through 5, and the y-axis should represent their squares. Add a title, x-axis label, and y-axis label to the plot.
Import matplotlib.pyplot and use plt.plot() for line plots. Don’t forget to add appropriate labels and display the plot with plt.show().
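One possible solution is sketched below. The function name plot_squares, the marker style, and the label texts are illustrative choices, not part of the exercise.

```python
import matplotlib.pyplot as plt


def plot_squares():
    """Plot the squares of 1 through 5 and return the Axes."""
    x = [1, 2, 3, 4, 5]
    y = [value ** 2 for value in x]  # squares of the x values

    plt.figure()
    plt.plot(x, y, marker="o")  # marker is an optional styling choice
    plt.title("Squares of the Numbers 1 to 5")
    plt.xlabel("x")
    plt.ylabel("x squared")
    return plt.gca()
```

Call plot_squares() and then plt.show() to display the figure.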
Exercise 2
Create a scatter plot using Seaborn. Generate 100 random x values between 0 and 10 and corresponding y values using the equation y = 2x + noise, where noise is random noise from a normal distribution with mean 0 and standard deviation 1. Add a title, x-axis label, and y-axis label to the plot.
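Use np.random.normal to generate the noise. For the x values, np.linspace(0, 10, 100) gives evenly spaced points; use np.random.uniform(0, 10, 100) if you want them to be random, as the exercise states.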
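A minimal sketch of one way to solve this follows. It draws the x values with np.random.uniform, which is an assumption about what "random x values" means; np.linspace(0, 10, 100) from the hint would also work and gives evenly spaced points. The helper names and the fixed seed are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def make_noisy_line(n=100, seed=0):
    """Return x in [0, 10) and y = 2x + noise, with noise ~ N(0, 1)."""
    np.random.seed(seed)  # fixed seed (an assumption) for reproducible output
    x = np.random.uniform(0, 10, n)
    noise = np.random.normal(loc=0, scale=1, size=n)
    return x, 2 * x + noise


def plot_noisy_line(x, y):
    """Draw a Seaborn scatter plot of y against x and return the Axes."""
    plt.figure()
    ax = sns.scatterplot(x=x, y=y)
    ax.set_title("y = 2x + noise")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    return ax
```

Call plot_noisy_line(*make_noisy_line()) followed by plt.show() to see the result.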
Exercise 3
Using Pandas and NumPy, create a DataFrame with two columns: A and B. Column A should contain 50 random values between 0 and 1, and column B should contain 50 random values between 1 and 2. Use Seaborn to create a scatter plot of A vs. B. Add a title to the plot.
Use np.random.rand() to generate random values and pd.DataFrame() to structure your data. Create scatter plots with sns.scatterplot().
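The hint can be turned into code roughly as follows. Shifting np.random.rand() by 1 to get values between 1 and 2 is one simple approach; the helper names and the seed are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def make_frame(n=50, seed=0):
    """Build a DataFrame with A in [0, 1) and B in [1, 2)."""
    np.random.seed(seed)  # fixed seed (an assumption) for reproducibility
    return pd.DataFrame({
        "A": np.random.rand(n),      # uniform on [0, 1)
        "B": np.random.rand(n) + 1,  # shifted to [1, 2)
    })


def plot_frame(df):
    """Scatter plot of column A against column B; returns the Axes."""
    plt.figure()
    ax = sns.scatterplot(data=df, x="A", y="B")
    ax.set_title("Scatter Plot of A vs. B")
    return ax
```

Passing data=df with column names lets Seaborn label both axes automatically.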
Exercise 4
Create a histogram using Seaborn. Generate 1000 random integer values between -50 and 50 using np.random.randint and plot the histogram with a KDE (Kernel Density Estimate) overlay. Add a title to the plot.
Use np.random.randint() to generate random integers and sns.histplot() with the kde=True parameter for the density overlay.
Exercise 5
Create a heatmap using Seaborn. Generate a 5x5 matrix of random values between 0 and 1. Use Seaborn to plot the heatmap with annotations and a coolwarm color palette. Add a title to the plot.
Use np.random.rand() to create a random matrix and sns.heatmap() with the annot=True and cmap parameters for visualization.
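As a rough sketch of the approach in the hint, the following builds the matrix and draws the annotated heatmap. The helper names, the seed, and the title text are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def make_matrix(size=5, seed=0):
    """Return a size x size matrix of uniform random values in [0, 1)."""
    np.random.seed(seed)  # fixed seed (an assumption) for reproducibility
    return np.random.rand(size, size)


def plot_heatmap(matrix):
    """Annotated heatmap with the coolwarm palette; returns the Axes."""
    plt.figure()
    ax = sns.heatmap(matrix, annot=True, cmap="coolwarm")
    ax.set_title("Heatmap of Random Values")
    return ax
```

With annot=True, Seaborn writes each cell's value inside the cell, so a 5x5 matrix produces 25 annotations.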
Exercise 6
Create a violin plot using Seaborn. Generate data for three categories (‘A’, ‘B’, ‘C’), each containing 50 random values. Plot the violin plot and label the x-axis with the categories. Add a title to the plot.
Generate random data for each category and organize it into a list structure. Use sns.violinplot() and plt.xticks() to set appropriate labels.
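One way to follow the hint is sketched below. The exercise does not say which distribution to use, so the sketch assumes normal draws with a different mean per category to make the violins visibly distinct. The helper names and the seed are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def make_groups(n=50, seed=0):
    """Return a list of three arrays, one per category."""
    np.random.seed(seed)  # fixed seed (an assumption) for reproducibility
    # Assumption: normal data with means 0, 2 and 4 for A, B and C.
    return [np.random.normal(loc=mean, scale=1, size=n) for mean in (0, 2, 4)]


def plot_violins(groups, labels=("A", "B", "C")):
    """Violin plot of the groups with category names on the x-axis."""
    plt.figure()
    ax = sns.violinplot(data=groups)
    # A list input is plotted at positions 0, 1, 2; rename those ticks.
    plt.xticks(list(range(len(labels))), list(labels))
    ax.set_xlabel("Category")
    ax.set_title("Violin Plot of Three Categories")
    return ax
```

Call plot_violins(make_groups()) followed by plt.show() to display it.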
Exercise 7
Create a dataset with two columns: Category and Values. The Category column should contain three categories (X, Y, Z), and the Values column should contain 50 random values for each category. Use Seaborn to create a box plot to compare the distributions of the three categories. Add a title to the plot.
Create repeated category labels and combine multiple random arrays. Structure the data as a DataFrame and use sns.boxplot() with appropriate column specifications.
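The steps in the hint might look like this in code. The sketch assumes uniform values from np.random.rand() since the exercise leaves the distribution open; np.repeat and np.concatenate are one way to build the repeated labels and combine the arrays. Helper names and the seed are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def make_category_frame(n_per_category=50, seed=0):
    """Long-format DataFrame with a Category and a Values column."""
    np.random.seed(seed)  # fixed seed (an assumption) for reproducibility
    names = ["X", "Y", "Z"]
    # Repeat each label n times: X, X, ..., Y, Y, ..., Z, Z, ...
    categories = np.repeat(names, n_per_category)
    # One random array per category, joined end to end.
    values = np.concatenate([np.random.rand(n_per_category) for _ in names])
    return pd.DataFrame({"Category": categories, "Values": values})


def plot_boxes(df):
    """Box plot of Values grouped by Category; returns the Axes."""
    plt.figure()
    ax = sns.boxplot(data=df, x="Category", y="Values")
    ax.set_title("Distribution of Values by Category")
    return ax
```

Call plot_boxes(make_category_frame()) followed by plt.show() to display it.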
Exercise 8
Using your own judgment, analyze the following scenarios. For each scenario, determine whether the dataset is univariate or multivariate. Justify your answer and create an appropriate plot to visualize the data. Use any tools or libraries you find suitable.
- A company tracks the monthly revenue of its top 5 products over the past year.
- A survey collects the ages of 500 individuals in a city.
- A school maintains records of students’ grades in Math, Science, and English for the current semester.
Consider whether each scenario involves one variable or multiple variables/relationships. Choose appropriate visualization techniques based on the data structure and what insights you want to reveal.
- Multivariate: The dataset contains revenue data for multiple products over time. A line plot or grouped bar chart would be appropriate.
- Univariate: The dataset contains only one variable (ages). A histogram or box plot would be appropriate.
- Multivariate: The dataset contains students’ grades in three subjects. A box plot for each subject would be appropriate.
Summary
These exercises covered essential data visualization techniques using Matplotlib and Seaborn:
- Basic Plotting: Line plots and scatter plots for relationship visualization
- Data Generation: Creating synthetic datasets with NumPy for visualization practice
- Statistical Plots: Histograms with KDE overlays for distribution analysis
- Advanced Visualizations: Heatmaps, violin plots, and box plots for complex data
- Data Analysis: Understanding univariate vs. multivariate data and choosing appropriate visualizations