AITC Wiki

0405 Histograms and Binnings

Histograms, Binnings, and Density

A simple histogram can be a great first step in understanding a dataset. Earlier, we saw a preview of Matplotlib’s histogram function (see Comparisons, Masks, and Boolean Logic), which creates a basic histogram in one line, once the normal boilerplate imports are done:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
 
data = np.random.randn(1000)
plt.hist(data);
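
The bar heights in such a plot are simply counts per bin. As a minimal sketch, assuming the default of 10 equal-width bins and a fixed random seed (the seed is an addition here, used only for reproducibility), the same numbers can be computed without drawing anything by calling np.histogram:

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed so the run is repeatable
data = np.random.randn(1000)

# With no bins argument, np.histogram uses 10 equal-width bins
# spanning the range from data.min() to data.max().
counts, edges = np.histogram(data)

print(counts)  # number of samples falling in each bin
print(edges)   # bin boundaries: one more entry than there are bins
```

Every sample lands in exactly one bin, so the counts add up to the number of samples.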

Two-Dimensional Histograms and Binnings

Just as we create histograms in one dimension by dividing the number line into bins, we can also create histograms in two dimensions by dividing points among two-dimensional bins. We’ll take a brief look at several ways to do this here. We’ll start by defining some data, an x and y array drawn from a multivariate Gaussian distribution:

mean = [0, 0]
cov = [[1, 1], [1, 2]]
x, y = np.random.multivariate_normal(mean, cov, 10000).T
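
The covariance matrix gives x a variance of 1, y a variance of 2, and a covariance of 1 between them, so the point cloud is stretched along a diagonal. A quick sanity check, sketched here with a fixed seed (an assumption added only for reproducibility), is to compare the sample covariance against cov:

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed so the run is repeatable
mean = [0, 0]
cov = [[1, 1], [1, 2]]

# multivariate_normal returns an array of shape (10000, 2);
# transposing lets us unpack the two columns as x and y.
x, y = np.random.multivariate_normal(mean, cov, 10000).T
print(x.shape, y.shape)

# np.cov treats x and y as two variables and returns a 2x2 matrix
# that should be close to cov for a sample of this size.
sample_cov = np.cov(x, y)
print(sample_cov)
```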

plt.hist2d: Two-dimensional histogram

One straightforward way to plot a two-dimensional histogram is to use Matplotlib’s plt.hist2d function:

plt.hist2d(x, y, bins=30, cmap='Blues')
cb = plt.colorbar()
cb.set_label('counts in bin')
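
To make concrete what "counts in bin" means, the binning can be written out by hand. This is only an illustrative sketch, not how Matplotlib is implemented: it assumes 30 evenly spaced bins per axis spanning the data range, and the names nbins, ix, iy, and counts are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumption: seeded generator for repeatability
x, y = rng.multivariate_normal([0, 0], [[1, 1], [1, 2]], 10000).T

nbins = 30
# nbins bins need nbins + 1 edges along each axis.
xedges = np.linspace(x.min(), x.max(), nbins + 1)
yedges = np.linspace(y.min(), y.max(), nbins + 1)

# Bin index of each point along each axis. Bins are half-open
# [left, right), except that the last bin also includes its right edge,
# which is what the clip takes care of for the maximum value.
ix = np.clip(np.searchsorted(xedges, x, side='right') - 1, 0, nbins - 1)
iy = np.clip(np.searchsorted(yedges, y, side='right') - 1, 0, nbins - 1)

# Accumulate one count per point into its (ix, iy) cell.
counts = np.zeros((nbins, nbins), dtype=int)
np.add.at(counts, (ix, iy), 1)

print(counts.shape, counts.sum())
```

np.add.at is used rather than `counts[ix, iy] += 1` because the latter would count each repeated index pair only once.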

Just as with plt.hist, plt.hist2d has a number of extra options to fine-tune the plot and the binning, which are nicely outlined in the function docstring. Further, just as plt.hist has a counterpart in np.histogram, plt.hist2d has a counterpart in np.histogram2d, which can be used as follows: