Image by Author | Ideogram
This Python tutorial covers practical step-by-step examples of visualizing data contained in NumPy arrays, a common Python data structure for handling large datasets efficiently.
The tutorial showcases different types of data visualizations using a popular plotting library: matplotlib. This library provides intuitive tools to plot, customize, and interpret data, making it easier to draw insights from NumPy arrays. If you are looking for hands-on examples to build a quick foundation for visualizing data in Python, this tutorial is for you.
Tutorial Examples
To run the examples below, you’ll only need to import two libraries at the start of your Python script or program: NumPy and matplotlib.
import numpy as np
import matplotlib.pyplot as plt
Let’s now dive into the examples, each modeled on a real-world scenario.
Visualizing 1D Data: Stock Prices Over Time
Our first example visualizes daily stock prices over a month (30 days) using a simple line plot.
# Array of daily stock prices (30 elements)
stock_prices = np.array([102.5, 105.2, 103.8, 101.9, 104.7, 106.3, 107.1, 105.5,
                         108.2, 109.0, 107.8, 106.5, 108.9, 109.5, 110.2, 109.8,
                         111.5, 112.3, 110.9, 113.1, 111.8, 114.2, 113.5, 115.0,
                         114.7, 116.2, 115.8, 117.5, 116.9, 118.1])
# Day numbers 1 to 30 for the x-axis, derived from the length of the price array
days = np.arange(1, len(stock_prices) + 1)
# Plot the array in a line plot
plt.plot(days, stock_prices)
plt.xlabel('Day')
plt.ylabel('Price ($)')
plt.title('Stock Prices Over Time')
plt.show()
The above code creates the data to plot: days, a NumPy array containing the days of the month (used for the x-axis of the plot), and stock_prices, containing the values to represent (y-axis). When these two are passed as arguments to plt.plot(), matplotlib builds a simple line plot by default. The calls to plt.xlabel(), plt.ylabel(), and plt.title() are optional and add axis labels and a plot title. This simple approach is well suited to time series data contained in a NumPy array.
Output: [Figure: line plot of the 30 daily stock prices, titled 'Stock Prices Over Time']
Alternatively, for experimentation, you can generate the 1D array of stock prices randomly, as follows:
days = np.arange(1, 31)
stock_prices = np.random.normal(100, 5, size=days.shape)
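Note that independent draws from a normal distribution scatter around 100 with no day-to-day continuity. If a series that drifts from one day to the next is preferred, one possible approach is a random walk built with np.cumsum. The sketch below is only an illustration: the starting price of 100, the average daily change of 0.5, and the standard deviation of 1.5 are assumed values, and the seed is fixed just to make the figure reproducible.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator so that the same series is produced on every run
rng = np.random.default_rng(seed=42)

days = np.arange(1, 31)
# Hypothetical daily price changes (illustrative mean and standard deviation)
daily_changes = rng.normal(0.5, 1.5, size=days.shape)
# Random walk: start from 100 and accumulate the daily changes
stock_prices = 100 + np.cumsum(daily_changes)

plt.plot(days, stock_prices)
plt.xlabel('Day')
plt.ylabel('Price ($)')
plt.title('Simulated Stock Prices (Random Walk)')
plt.show()
```

Because each price is the previous price plus a small change, the resulting line looks more like a price history than a cloud of unrelated values.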
Visualizing Two 1D Data Arrays: Height vs. Weight
Suppose we have two variables collected from 100 individuals: their height in cm and their weight in kg, each stored in a separate NumPy array. If we want to visualize these two variables jointly, for instance to look for a correlation, a scatter plot is the natural choice.
This example randomly generates two arrays, height and weight, of size 100 each. It then calls matplotlib’s plt.scatter() function to draw a scatter plot from both arrays.
# Random heights drawn from a normal distribution with mean 170 and standard deviation 10
height = np.random.normal(170, 10, 100)
# Random weights drawn from a normal distribution with mean 70 and standard deviation 8
weight = np.random.normal(70, 8, 100)
plt.scatter(height, weight)
plt.xlabel('Height (cm)')
plt.ylabel('Weight (kg)')
plt.title('Height vs. Weight')
plt.show()
Output: [Figure: scatter plot of weight (kg) against height (cm), titled 'Height vs. Weight']
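Since the two arrays in this example are generated independently, the scatter plot should show no real trend between height and weight. To see what a correlated pair looks like, weight can be derived from height plus some noise. The following is a minimal sketch under assumed values: the slope of 0.6 kg per cm and the noise standard deviation of 5 are purely illustrative, and np.corrcoef is used to report the sample correlation in the title.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator so that the same points are produced on every run
rng = np.random.default_rng(seed=0)

height = rng.normal(170, 10, 100)
# Hypothetical linear relationship plus random noise (illustrative coefficients)
weight = 70 + 0.6 * (height - 170) + rng.normal(0, 5, 100)

# Pearson correlation coefficient between the two arrays
r = np.corrcoef(height, weight)[0, 1]

plt.scatter(height, weight)
plt.xlabel('Height (cm)')
plt.ylabel('Weight (kg)')
plt.title(f'Height vs. Weight (r = {r:.2f})')
plt.show()
```

With this setup the points line up along an upward diagonal band, which is the visual signature of a positive correlation.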
Visualizing a 2D Array: Temperatures Across Locations
Suppose you have collected temperature recordings over a range of 10 equidistant latitudes and 10 equidistant longitudes in a rectangular area. Instead of using 1D NumPy arrays, it is more appropriate to store these data in a single 2D NumPy array. The example below shows how to visualize this “data grid” of temperatures using a heatmap: a type of visualization that maps data values to colors in a defined color scale. For simplicity, the temperature data are generated randomly, following a uniform distribution with values between 15°C and 30°C.
# 2D data grid: 10x10 temperature recordings over a rectangular area
temperatures = np.random.uniform(low=15, high=30, size=(10, 10)) # Temperature in °C
plt.imshow(temperatures, cmap='hot', interpolation='nearest')
plt.colorbar(label="Temperature (°C)")
plt.title('Temperature Heatmap')
plt.show()
Note that plt.imshow() is used to create the heatmap, given the 2D array to visualize, a color map (in the example, 'hot'), and an interpolation method, which matters when the data granularity and the image resolution differ. With interpolation='nearest', each array element is drawn as a solid-colored cell, without smoothing between neighbors.
Output: [Figure: 10x10 heatmap titled 'Temperature Heatmap', with a colorbar labeled 'Temperature (°C)']
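By default, the axes of such a heatmap show row and column indices rather than geographic coordinates. One way to label them with latitudes and longitudes is the extent and origin arguments of plt.imshow(). The sketch below assumes hypothetical bounds for the rectangular area (longitudes from -5 to 5 and latitudes from 35 to 45 degrees) and assumes that row 0 of the array holds the southernmost latitude; neither assumption comes from the example above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator so that the same grid is produced on every run
rng = np.random.default_rng(seed=1)
temperatures = rng.uniform(low=15, high=30, size=(10, 10))  # Temperature in °C

# Hypothetical bounds of the rectangular area, in degrees
lon_min, lon_max = -5.0, 5.0
lat_min, lat_max = 35.0, 45.0

# extent is (left, right, bottom, top); origin='lower' draws row 0 at the bottom
image = plt.imshow(temperatures, cmap='hot', interpolation='nearest',
                   extent=[lon_min, lon_max, lat_min, lat_max], origin='lower')
plt.colorbar(label='Temperature (°C)')
plt.xlabel('Longitude (°)')
plt.ylabel('Latitude (°)')
plt.title('Temperature Heatmap')
plt.show()
```

The data and colors are unchanged; only the tick labels and the vertical orientation of the grid differ from the default rendering.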
Visualizing Multiple 1D Arrays: Financial Time Series
Back to the first example about stock prices, let’s suppose we now have three different financial stocks and want to visualize their daily evolution jointly in a simple plot. If each stock time series is contained in a 1D array of equal size, the process is not very different from what we did earlier.
days = np.arange(1, 31)
stock_A = np.random.normal(100, 5, size=days.shape)
stock_B = np.random.normal(120, 10, size=days.shape)
stock_C = np.random.normal(90, 8, size=days.shape)
plt.plot(days, stock_A, label="Stock A")
plt.plot(days, stock_B, label="Stock B")
plt.plot(days, stock_C, label="Stock C")
plt.xlabel('Day')
plt.ylabel('Price ($)')
plt.title('Stock Prices Over Time')
plt.legend()
plt.show()
The difference? We invoke plt.plot() three consecutive times, once for each stock time series. This does not generate three plots: each call adds a line to the same figure, and plt.legend() uses the label arguments to identify each one. The figure is displayed by the final instruction, plt.show(); everything in the previous instructions adds elements to the resulting visualization.
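Output: [Figure: line plot titled 'Stock Prices Over Time' with three lines labeled Stock A, Stock B, and Stock C]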
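The idea that successive calls accumulate on one figure can be made explicit with matplotlib's object-oriented interface, where the figure and its axes are created first and every line is added to that axes object. The following is a minimal sketch of the same plot written that way; storing the series in a dictionary and looping over it is a choice made here for brevity, and the seed is fixed only for reproducibility.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator so that the same series are produced on every run
rng = np.random.default_rng(seed=7)
days = np.arange(1, 31)

# One 1D array per stock, keyed by the label shown in the legend
stocks = {
    "Stock A": rng.normal(100, 5, size=days.shape),
    "Stock B": rng.normal(120, 10, size=days.shape),
    "Stock C": rng.normal(90, 8, size=days.shape),
}

# Create one figure with a single axes, then add each line to that axes
fig, ax = plt.subplots()
for name, prices in stocks.items():
    ax.plot(days, prices, label=name)

ax.set_xlabel('Day')
ax.set_ylabel('Price ($)')
ax.set_title('Stock Prices Over Time')
ax.legend()
plt.show()
```

After the loop, ax.lines contains three line objects, one per stock, all belonging to the same axes.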
Wrapping Up
Through four insightful examples of varying complexity, this tutorial has illustrated how to easily visualize different types of data contained in NumPy arrays using several visualization methods, from simpler tools like the line plot to more sophisticated approaches like heatmaps.
Iván Palomares Carrascosa is a leader, writer, speaker, and adviser in AI, machine learning, deep learning & LLMs. He trains and guides others in harnessing AI in the real world.