Create insightful plots for exploratory data analysis — histograms, scatter plots, correlation heatmaps, and more
Exploratory Data Analysis (EDA) is the process of summarising and visualising your data before modelling. Good visualisations reveal patterns, outliers, and relationships between features, which in turn guide feature engineering and model selection.
import matplotlib.pyplot as plt
# Line plot
plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
plt.xlabel('x')
plt.ylabel('y = x²')
plt.title('Quadratic Function')
plt.show()
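import seaborn as sns
# Histogram: understand a feature's distribution (data is a 1-D array of values)
plt.hist(data, bins=30, edgecolor='black')
plt.show()
# Density plot: smoothed estimate of the same distribution, drawn on its own figure
sns.kdeplot(data)
plt.show()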
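The histogram snippet above assumes a `data` array already exists. As a minimal, self-contained sketch, the following generates a synthetic right-skewed sample (an assumption standing in for a real feature) and draws it with two bin counts, since the choice of `bins` can noticeably change how a distribution looks. The seed, sample size, and output filename are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic, right-skewed sample standing in for a real feature (assumption)
rng = np.random.default_rng(seed=0)
data = rng.lognormal(mean=0.0, sigma=0.5, size=1000)

# Draw the same data with a coarse and a fine binning, side by side
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, bins in zip(axes, (10, 30)):
    counts, edges, _ = ax.hist(data, bins=bins, edgecolor="black")
    ax.set_title(f"{bins} bins")
    ax.set_xlabel("value")
    ax.set_ylabel("count")

fig.tight_layout()
fig.savefig("histogram_bins.png")  # hypothetical output filename
```

With few bins the skew is easy to miss; with more bins the long right tail becomes visible, at the cost of a noisier outline.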
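# Scatter plot: one feature (first column of X) against the target
plt.scatter(X[:, 0], y, alpha=0.5)
plt.show()
# Pair plot: every numeric feature against every other, coloured by the 'target' column
sns.pairplot(df, hue='target')
plt.show()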
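The scatter call above relies on a feature matrix `X` and a target `y` defined elsewhere. A runnable sketch under stated assumptions: `X` and `y` here are synthetic, with the target constructed to depend linearly on the first feature, and the Pearson correlation is shown in the title so the visual impression can be checked against a number. The output filename is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data (assumption): two features, target depends linearly on the first
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] + rng.normal(scale=1.0, size=200)

# Pearson correlation between the first feature and the target
r = np.corrcoef(X[:, 0], y)[0, 1]

fig, ax = plt.subplots()
ax.scatter(X[:, 0], y, alpha=0.5)  # alpha reveals dense regions where points overlap
ax.set_xlabel("feature 0")
ax.set_ylabel("target")
ax.set_title(f"Feature 0 vs target (r = {r:.2f})")
fig.savefig("scatter_feature_target.png")  # hypothetical output filename
```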
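# Correlation heatmap: highly correlated features are candidates to drop during feature selection
corr = df.corr(numeric_only=True)  # skip non-numeric columns, which would otherwise raise an error
sns.heatmap(corr, annot=True, cmap='coolwarm', center=0)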
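A heatmap is read by eye, but the same correlation matrix can also be filtered programmatically. The sketch below is one possible way to list strongly correlated feature pairs; the DataFrame, its column names, and the 0.9 threshold are all illustrative assumptions, not values taken from the text.

```python
import numpy as np
import pandas as pd

# Synthetic DataFrame (assumption): "b" is nearly a rescaled copy of "a"
rng = np.random.default_rng(seed=0)
n = 500
a = rng.normal(size=n)
df = pd.DataFrame({
    "a": a,
    "b": 2.0 * a + rng.normal(scale=0.1, size=n),
    "c": rng.normal(size=n),                  # independent noise
    "label": rng.choice(["x", "y"], size=n),  # non-numeric column
})

# numeric_only=True leaves the string column out of the matrix
corr = df.corr(numeric_only=True)

# Keep the upper triangle only, so each pair appears once and the diagonal is dropped
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna()

threshold = 0.9  # illustrative cut-off, tune for your data
redundant = pairs[pairs.abs() > threshold]
print(redundant)
```

Pairs that exceed the threshold carry largely the same information, so one feature of each pair is a candidate for removal.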
# Box plot: compare the distribution of 'value' across categories and spot outliers
sns.boxplot(x='category', y='value', data=df)
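By default, box plots draw values lying more than 1.5 times the interquartile range (IQR) beyond the quartiles as individual outlier points. The following sketch applies that rule by hand to a small made-up DataFrame; the column names mirror the snippet above, while the data and the helper `iqr_outliers` are hypothetical.

```python
import pandas as pd

# Small made-up dataset (assumption); the 40 in category A is deliberately extreme
df = pd.DataFrame({
    "category": ["A"] * 6 + ["B"] * 6,
    "value": [10, 11, 12, 13, 14, 40, 20, 21, 22, 23, 24, 25],
})

def iqr_outliers(values: pd.Series, whis: float = 1.5) -> pd.Series:
    """Return the values lying more than whis * IQR outside the quartiles."""
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - whis * iqr, q3 + whis * iqr
    return values[(values < lower) | (values > upper)]

# Apply the rule separately to each category
outliers = {
    name: iqr_outliers(group).tolist()
    for name, group in df.groupby("category")["value"]
}
print(outliers)
```

Listing the flagged values alongside the plot makes it easier to decide whether they are data errors or genuine extremes.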
| Seaborn | Matplotlib |
|---|---|
| High-level, beautiful defaults | Low-level, full control |
| Statistical plots built-in | Manual customisation |
| Works directly with DataFrames | Works with arrays |
| Less code for complex plots | More code, more flexibility |
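To make the comparison concrete, here is a rough sketch that draws the same box plot with both libraries. The DataFrame and the output filename are made up for illustration. Seaborn takes the DataFrame and column names directly, whereas with Matplotlib the values are split into one array per category by hand.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data (assumption)
df = pd.DataFrame({
    "category": ["A"] * 5 + ["B"] * 5,
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 3.0, 4.0, 5.0, 6.0, 7.0],
})

fig, (ax_sns, ax_mpl) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Seaborn: pass the DataFrame and column names; grouping is handled for you
sns.boxplot(x="category", y="value", data=df, ax=ax_sns)
ax_sns.set_title("seaborn")

# Matplotlib: build one array per category, then label the ticks manually
groups = {name: g.to_numpy() for name, g in df.groupby("category")["value"]}
ax_mpl.boxplot(list(groups.values()))
ax_mpl.set_xticks(range(1, len(groups) + 1))
ax_mpl.set_xticklabels(list(groups.keys()))
ax_mpl.set_title("matplotlib")

fig.tight_layout()
fig.savefig("boxplot_comparison.png")  # hypothetical output filename
```

The Matplotlib half needs more lines, but every step (grouping, tick positions, labels) is explicit and can be customised.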