Exercise Set IV - Plotting


Exercise 1

Create a simple line plot using Matplotlib. The x-axis should represent the numbers 1 through 5, and the y-axis should represent their squares. Add a title, x-axis label, and y-axis label to the plot.

Hint

Import matplotlib.pyplot and use plt.plot() for line plots. Don’t forget to add appropriate labels and display the plot with plt.show().

Solution
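
One possible solution, as a minimal sketch. It uses the object-oriented `fig, ax = plt.subplots()` interface; calling `plt.plot()`, `plt.title()`, `plt.xlabel()`, and `plt.ylabel()` as in the hint works just as well. The title and label text are illustrative choices, not prescribed by the exercise.

```python
import matplotlib.pyplot as plt

# x runs from 1 through 5; y holds the square of each x value.
x = [1, 2, 3, 4, 5]
y = [value ** 2 for value in x]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o")  # marker is optional; it highlights the five points

# Title and axis labels (wording is an illustrative choice).
ax.set_title("Squares of the Numbers 1 Through 5")
ax.set_xlabel("x")
ax.set_ylabel("x squared")

plt.show()
```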

Exercise 2

Create a scatter plot using Seaborn. Generate 100 random x values between 0 and 10 and corresponding y values using the equation y = 2x + noise, where noise is random noise from a normal distribution with mean 0 and standard deviation 1. Add a title, x-axis label, and y-axis label to the plot.

Hint

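Use np.random.uniform(0, 10, 100) to generate the random x values and np.random.normal(0, 1, 100) to generate the noise. Note that np.linspace(0, 10, 100) produces evenly spaced values rather than random ones.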

Solution
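
One possible solution, sketched under a few assumptions: the random values come from NumPy's `default_rng` generator (the legacy `np.random.uniform` and `np.random.normal` functions would work equally well), the seed is arbitrary and only there for reproducibility, and the title and label text are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Seed chosen arbitrarily so the plot is reproducible.
rng = np.random.default_rng(seed=0)

# 100 random x values drawn uniformly from [0, 10).
x = rng.uniform(0, 10, size=100)

# Noise from a normal distribution with mean 0 and standard deviation 1.
noise = rng.normal(loc=0, scale=1, size=100)
y = 2 * x + noise

fig, ax = plt.subplots()
sns.scatterplot(x=x, y=y, ax=ax)
ax.set_title("Scatter Plot of y = 2x + noise")
ax.set_xlabel("x")
ax.set_ylabel("y")

plt.show()
```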

Exercise 3

Using pandas and NumPy, create a DataFrame with two columns: A and B. Column A should contain 50 random values between 0 and 1, and column B should contain 50 random values between 1 and 2. Use Seaborn to create a scatter plot of A vs. B. Add a title to the plot.

Hint

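Use np.random.rand() to generate random values and pd.DataFrame() to structure your data. np.random.rand() returns values between 0 and 1, so add 1 to shift them into the range 1 to 2 for column B. Create scatter plots with sns.scatterplot().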

Solution
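
One possible solution, as a minimal sketch. The seed and the title text are arbitrary choices made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

np.random.seed(0)  # arbitrary seed, only for reproducibility

df = pd.DataFrame({
    "A": np.random.rand(50),      # uniform values in [0, 1)
    "B": np.random.rand(50) + 1,  # shifted by 1 to fall in [1, 2)
})

fig, ax = plt.subplots()
sns.scatterplot(data=df, x="A", y="B", ax=ax)
ax.set_title("Scatter Plot of A vs. B")

plt.show()
```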

Exercise 4

Create a histogram using Seaborn. Generate 1000 random integer values between -50 and 50 using np.random.randint and plot the histogram with a KDE (Kernel Density Estimate) overlay. Add a title to the plot.

Hint

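Use np.random.randint() to generate random integers and sns.histplot() with the kde=True parameter for the density overlay. The upper bound of np.random.randint() is exclusive, so pass 51 if the value 50 should be included.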

Solution

Exercise 5

Create a heatmap using Seaborn. Generate a 5x5 matrix of random values between 0 and 1. Use Seaborn to plot the heatmap with annotations and a coolwarm color palette. Add a title to the plot.

Hint

Use np.random.rand() to create a random matrix and sns.heatmap() with the annot=True and cmap parameters for visualization.

Solution
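
One possible solution, as a minimal sketch. The `fmt=".2f"` argument is an optional addition that rounds the annotations to two decimals; the seed and title text are arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

np.random.seed(0)  # arbitrary seed, only for reproducibility

# 5x5 matrix of uniform random values in [0, 1).
matrix = np.random.rand(5, 5)

fig, ax = plt.subplots()
# annot=True writes each cell's value into the cell; cmap selects the palette.
sns.heatmap(matrix, annot=True, fmt=".2f", cmap="coolwarm", ax=ax)
ax.set_title("Heatmap of a Random 5x5 Matrix")

plt.show()
```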

Exercise 6

Create a violin plot using Seaborn. Generate data for three categories (‘A’, ‘B’, ‘C’), each containing 50 random values. Plot the violin plot and label the x-axis with the categories. Add a title to the plot.

Hint

Generate random data for each category and organize it into a list structure. Use sns.violinplot() and plt.xticks() to set appropriate labels.

Solution
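
One possible solution, as a minimal sketch. The exercise does not say which distribution the values come from, so this sketch assumes normal distributions with different means (0, 2, and 4) to make the three violins visually distinct. The seed and title text are arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

np.random.seed(0)  # arbitrary seed, only for reproducibility

categories = ["A", "B", "C"]

# Assumption: each category is normally distributed around a different mean.
data = [np.random.normal(loc=mean, scale=1, size=50) for mean in (0, 2, 4)]

fig, ax = plt.subplots()
sns.violinplot(data=data, ax=ax)  # a list of arrays gives one violin per array

# Violins are drawn at positions 0, 1, 2; relabel them with the category names.
ax.set_xticks(range(len(categories)))
ax.set_xticklabels(categories)
ax.set_xlabel("Category")
ax.set_title("Violin Plot of Three Categories")

plt.show()
```

The equivalent pyplot call for the relabeling step is `plt.xticks(range(3), categories)`, as suggested in the hint.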

Exercise 7

Create a dataset with two columns: Category and Values. The Category column should contain three categories (X, Y, Z), and the Values column should contain 50 random values for each category. Use Seaborn to create a box plot to compare the distributions of the three categories. Add a title to the plot.

Hint

Create repeated category labels and combine multiple random arrays. Structure the data as a DataFrame and use sns.boxplot() with appropriate column specifications.

Solution
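
One possible solution, as a minimal sketch. The distribution of the values is not specified in the exercise, so the normal distributions and their parameters below are assumptions chosen to make the boxes differ. The seed and title text are arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

np.random.seed(0)  # arbitrary seed, only for reproducibility

# Repeat each label 50 times: X...X, Y...Y, Z...Z.
categories = np.repeat(["X", "Y", "Z"], 50)

# Assumption: a different normal distribution per category.
values = np.concatenate([
    np.random.normal(loc=0, scale=1.0, size=50),  # category X
    np.random.normal(loc=1, scale=1.5, size=50),  # category Y
    np.random.normal(loc=2, scale=0.5, size=50),  # category Z
])

df = pd.DataFrame({"Category": categories, "Values": values})

fig, ax = plt.subplots()
sns.boxplot(data=df, x="Category", y="Values", ax=ax)
ax.set_title("Box Plot of Values by Category")

plt.show()
```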

Exercise 8

Using your own judgment, analyze the following scenarios. For each scenario, determine whether the dataset is univariate or multivariate. Justify your answer and create an appropriate plot to visualize the data. Use any tools or libraries you find suitable.

A company tracks the monthly revenue of its top 5 products over the past year.
A survey collects the ages of 500 individuals in a city.
A school maintains records of students’ grades in Math, Science, and English for the current semester.

Hint

Consider whether each scenario involves one variable or multiple variables/relationships. Choose appropriate visualization techniques based on the data structure and what insights you want to reveal.

Solution

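Product revenue: Multivariate. The dataset records revenue for five products across twelve months, so it involves several variables (product, month, and revenue). A line plot with one line per product or a grouped bar chart would be appropriate.

Survey ages: Univariate. The dataset contains only one variable (age). A histogram or box plot would be appropriate.

Student grades: Multivariate. The dataset contains each student's grades in three subjects, which gives three variables per student. A box plot for each subject would be appropriate.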

Summary

These exercises covered essential data visualization techniques using Matplotlib and Seaborn:

Basic Plotting: Line plots and scatter plots for relationship visualization
Data Generation: Creating synthetic datasets with NumPy for visualization practice
Statistical Plots: Histograms with KDE overlays for distribution analysis
Advanced Visualizations: Heatmaps, violin plots, and box plots for complex data
Data Analysis: Understanding univariate vs. multivariate data and choosing appropriate visualizations