Data Visualisation with Matplotlib & Seaborn

Create insightful plots for exploratory data analysis — histograms, scatter plots, correlation heatmaps, and more

Data Visualisation for ML

Exploratory Data Analysis (EDA) is the process of summarising and inspecting your data, largely through plots, before modelling. Good visualisations reveal patterns, outliers, and relationships that guide feature engineering and model selection.

Why Visualise?

  • Spot outliers and anomalies
  • Understand feature distributions
  • Find correlations between variables
  • Identify class imbalances
  • Validate data cleaning steps
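
As an illustration of one item from this list, class imbalance can be checked in a few lines. This is a minimal sketch on made-up labels: the `labels` Series and its 90/10 split are assumptions for the example, not data from this lesson.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical labels: 90 negatives and 10 positives (made-up data)
labels = pd.Series([0] * 90 + [1] * 10, name="target")

# Share of rows in each class, largest class first
class_share = labels.value_counts(normalize=True)

fig, ax = plt.subplots()
class_share.plot.bar(ax=ax)
ax.set_xlabel("class")
ax.set_ylabel("share of rows")
ax.set_title("Class balance")
plt.show()
```

A bar that dwarfs the others suggests that plain accuracy may be a misleading metric for the model you train later.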

Matplotlib Basics

python
import matplotlib.pyplot as plt

# Line plot
plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
plt.xlabel('x')
plt.ylabel('y = x²')
plt.title('Quadratic Function')
plt.show()
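
The same points can also be drawn with Matplotlib's object-oriented interface, which tends to be more convenient once a figure holds more than one panel. The sketch below is one possible layout; the two-panel arrangement and figure size are illustrative choices.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [v ** 2 for v in x]  # same points as the line plot above

# One figure holding one row and two columns of axes
fig, (ax_line, ax_points) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(x, y)
ax_line.set_title("Line")

ax_points.scatter(x, y)
ax_points.set_title("Points")

# Label both panels the same way
for ax in (ax_line, ax_points):
    ax.set_xlabel("x")
    ax.set_ylabel("y = x²")

fig.tight_layout()
plt.show()
```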

Key Plot Types for ML

Distribution Plots

python
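import seaborn as sns

# Histogram: shape of one feature's distribution
# (data: 1-D array or Series of numeric values)
plt.hist(data, bins=30, edgecolor='black')
plt.show()

# Density plot: smoothed estimate of the same distribution
sns.kdeplot(data)
plt.show()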

Relationship Plots

python
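# Scatter plot: first feature column vs target
# (X: 2-D NumPy feature array, y: matching 1-D target array)
plt.scatter(X[:, 0], y, alpha=0.5)
plt.show()

# Pair plot: every numeric column against every other, coloured by class
# (df: pandas DataFrame with a 'target' column)
sns.pairplot(df, hue='target')
plt.show()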

Correlation Heatmap

python
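# Helps flag redundant (highly correlated) features during feature selection
# numeric_only=True skips text columns, which pandas 2.0+ would otherwise reject
corr = df.corr(numeric_only=True)
sns.heatmap(corr, annot=True, cmap='coolwarm', center=0)
plt.show()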
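
On wide datasets a heatmap gets hard to read, so it can help to list the strongly correlated pairs as well. The sketch below uses synthetic columns `a`, `b`, `c` and a threshold of 0.8; both are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 200
a = rng.normal(size=n)

df = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.1, size=n),  # nearly a rescaled copy of "a"
    "c": rng.normal(size=n),                     # independent of the others
})

corr = df.corr(numeric_only=True)

# Keep only the upper triangle so each pair appears once
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack()

# Pairs whose absolute correlation exceeds the (arbitrary) threshold
strong = pairs[pairs.abs() > 0.8]
print(strong)
```

Here only the pair (`a`, `b`) passes the threshold, which marks one of the two columns as a candidate for removal.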

Box Plots — Outlier Detection

python
# Points beyond the whiskers (1.5 × IQR past the quartiles by default) are drawn as outliers
sns.boxplot(x='category', y='value', data=df)
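
By default a box plot draws as outliers the points lying more than 1.5 × IQR beyond the quartiles. The same rule can be applied numerically to list those points; the `values` array below is made up for the example.

```python
import numpy as np

# Made-up measurements; 40 is far from the rest
values = np.array([10, 12, 11, 13, 12, 11, 14, 12, 13, 40])

# Quartiles and interquartile range
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = values[(values < lower) | (values > upper)]
print(outliers)  # [40]
```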

Seaborn vs Matplotlib

Seaborn                          | Matplotlib
High-level, beautiful defaults   | Low-level, full control
Statistical plots built-in       | Manual customisation
Works directly with DataFrames   | Works with arrays
Less code for complex plots      | More code, more flexibility
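
To make the comparison concrete, here is the same grouped scatter plot drawn both ways. This is a sketch on a tiny made-up DataFrame; the column names `x`, `y`, and `group` are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Tiny made-up dataset with two groups
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 4, 5, 4, 6, 7],
    "group": ["a", "a", "a", "b", "b", "b"],
})

fig, (ax_mpl, ax_sns) = plt.subplots(1, 2, figsize=(9, 3))

# Matplotlib: loop over the groups and label each one by hand
for name, part in df.groupby("group"):
    ax_mpl.scatter(part["x"], part["y"], label=name)
ax_mpl.legend(title="group")
ax_mpl.set_title("Matplotlib")

# Seaborn: one call; colours and legend come from the hue column
sns.scatterplot(data=df, x="x", y="y", hue="group", ax=ax_sns)
ax_sns.set_title("Seaborn")

fig.tight_layout()
plt.show()
```

Seaborn draws onto ordinary Matplotlib axes, so the two libraries can be mixed in one figure as shown here.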
