Computation on Arrays: Broadcasting
We saw in the previous section how NumPy’s universal functions can be used to vectorize operations and thereby remove slow Python loops. Another means of vectorizing operations is to use NumPy’s broadcasting functionality. Broadcasting is simply a set of rules for applying binary ufuncs (e.g., addition, subtraction, multiplication) on arrays of different sizes.
Introducing Broadcasting
Recall that for arrays of the same size, binary operations are performed on an element-by-element basis:
import numpy as np
a = np.array([0, 1, 2])
b = np.array([5, 5, 5])
a + b
Broadcasting allows these types of binary operations to be performed on arrays of different sizes. For example, we can just as easily add a scalar (think of it as a zero-dimensional array) to an array:
a + 5
We can think of this as an operation that stretches or duplicates the value 5 into the array [5, 5, 5], and adds the results.
The advantage of NumPy’s broadcasting is that this duplication of values does not actually take place, but it is a useful mental model as we think about broadcasting.
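The claim that no duplication takes place can be checked directly. The following is a minimal sketch, assuming np.broadcast_to (which returns a read-only view) is an acceptable stand-in for what happens inside a + 5; the variable names are illustrative only.

```python
import numpy as np

a = np.array([0, 1, 2])

# View the scalar 5 as if it had the same shape as `a`.
five = np.broadcast_to(5, a.shape)

print(five)                  # [5 5 5]
print(five.strides)          # (0,): every element refers to the same memory
print(five.flags.writeable)  # False: the broadcast view is read-only

# Adding the view gives the same result as adding the scalar directly.
result = a + five
print(np.array_equal(result, a + 5))  # True
```

A stride of zero means that stepping along the axis never moves in memory, so the three apparent copies of 5 are a single stored value.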
We can similarly extend this to arrays of higher dimension. Observe the result when we add a one-dimensional array to a two-dimensional array:
M = np.ones((3, 3))
M
M + a
Here the one-dimensional array a is stretched, or broadcast, across the rows of M (that is, along the first dimension) in order to match the shape of M.
While these examples are relatively easy to understand, more complicated cases can involve broadcasting of both arrays. Consider the following example:
a = np.arange(3)
b = np.arange(3)[:, np.newaxis]
print(a)
print(b)
a + b
Rules of Broadcasting
Broadcasting in NumPy follows a strict set of rules to determine the interaction between the two arrays:
- Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
- Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.
- Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.
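The three rules can be written out as a short function that works on shapes alone. This is only a sketch: broadcast_shape is a hypothetical helper written for illustration, not part of NumPy, and it handles exactly two shapes.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast shape of two shapes, or raise ValueError.

    Hypothetical helper for illustration; not a NumPy function.
    """
    # Rule 1: pad the shorter shape with ones on its left side.
    ndim = max(len(shape_a), len(shape_b))
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for size_a, size_b in zip(padded_a, padded_b):
        if size_a == size_b:
            result.append(size_a)
        elif size_a == 1:
            # Rule 2: the size-1 dimension is stretched to match.
            result.append(size_b)
        elif size_b == 1:
            result.append(size_a)
        else:
            # Rule 3: sizes disagree and neither is 1.
            raise ValueError(
                f"shapes {tuple(shape_a)} and {tuple(shape_b)} are not compatible"
            )
    return tuple(result)

print(broadcast_shape((3, 3), (3,)))   # (3, 3)
print(broadcast_shape((3,), (3, 1)))   # (3, 3)
print(broadcast_shape((2, 3), (3,)))   # (2, 3)

# Cross-check against the shape NumPy itself produces.
numpy_shape = (np.arange(3) + np.arange(3)[:, np.newaxis]).shape
print(broadcast_shape((3,), (3, 1)) == numpy_shape)  # True
```

Working on shapes rather than arrays makes it easy to check by hand whether an operation will broadcast before running it.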
To make these rules clear, let’s consider a few examples in detail.
Broadcasting example 1
Let’s look at adding a two-dimensional array to a one-dimensional array:
M = np.ones((2, 3))
a = np.arange(3)
Let’s consider an operation on these two arrays. The shapes of the arrays are
M.shape = (2, 3)
a.shape = (3,)
We see by rule 1 that the array a has fewer dimensions, so we pad it on the left with ones:
M.shape -> (2, 3)
a.shape -> (1, 3)
By rule 2, we now see that the first dimension disagrees, so we stretch this dimension to match:
M.shape -> (2, 3)
a.shape -> (2, 3)
The shapes match, and we see that the final shape will be (2, 3):
M + a
Broadcasting example 2
Now let’s take a look at an example in which the two arrays are not compatible:
M = np.ones((3, 2))
a = np.arange(3)
This is just a slightly different situation than in the first example: the matrix M is transposed.
How does this affect the calculation? The shapes of the arrays are
M.shape = (3, 2)
a.shape = (3,)
Again, rule 1 tells us that we must pad the shape of a with ones:
M.shape -> (3, 2)
a.shape -> (1, 3)
By rule 2, the first dimension of a is stretched to match that of M:
M.shape -> (3, 2)
a.shape -> (3, 3)
Now we hit rule 3: the sizes in the second dimension (2 and 3) disagree and neither is equal to 1, so these two arrays are incompatible, as we can observe by attempting this operation:
M + a
Note the potential confusion here: you could imagine making a and M compatible by, say, padding a’s shape with ones on the right rather than the left.
But this is not how the broadcasting rules work!
That sort of flexibility might be useful in some cases, but it would lead to potential areas of ambiguity.
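The sketch below shows the failure and one way to see the ambiguity. The square array S is an assumption added for illustration: with a square shape, left-side and right-side padding are both valid but give different answers, so an implicit choice between them could not be safe.

```python
import numpy as np

M = np.ones((3, 2))
a = np.arange(3)

# Shapes (3, 2) and (3,) are incompatible under the left-padding rule.
try:
    M + a
except ValueError as err:
    print("broadcast failed:", err)

# With a square array, both paddings work but produce different results.
S = np.ones((3, 3))
left = S + a[np.newaxis, :]    # left-side padding: what NumPy does implicitly
right = S + a[:, np.newaxis]   # right-side padding: must be requested explicitly

print(np.array_equal(S + a, left))   # True
print(np.array_equal(left, right))   # False
```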
If right-side padding is what you’d like, you can do this explicitly by reshaping the array (we’ll use the np.newaxis keyword introduced in The Basics of NumPy Arrays):
a[:, np.newaxis].shape
M + a[:, np.newaxis]
Also note that while we’ve been focusing on the + operator here, these broadcasting rules apply to any binary ufunc.
For example, here is the logaddexp(a, b) function, which computes log(exp(a) + exp(b)) with more precision than the naive approach:
np.logaddexp(M, a[:, np.newaxis])
For more information on the many available universal functions, refer to Computation on NumPy Arrays: Universal Functions.
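To make the precision claim concrete, here is a small sketch comparing logaddexp with the naive expression. The input value -1000 is an assumption chosen so that exp underflows to zero; other inputs will show smaller differences.

```python
import numpy as np

x = np.array([-1000.0, -1000.0])

# Naive approach: exp(-1000) underflows to 0.0, so the log is -inf.
with np.errstate(divide="ignore"):
    naive = np.log(np.exp(x[0]) + np.exp(x[1]))

# logaddexp stays in log space and returns -1000 + log(2).
stable = np.logaddexp(x[0], x[1])

print(naive)   # -inf
print(stable)  # approximately -999.3069

# Like any binary ufunc, logaddexp follows the broadcasting rules.
out = np.logaddexp(np.ones((3, 2)), np.arange(3)[:, np.newaxis])
print(out.shape)  # (3, 2)
```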
Broadcasting in Practice
Broadcasting operations form the core of many examples we’ll see throughout this book. We’ll now take a look at a couple of simple examples of where they can be useful.
Centering an array
In the previous section, we saw that ufuncs allow a NumPy user to remove the need to explicitly write slow Python loops. Broadcasting extends this ability. One commonly seen example is centering an array of data. Imagine you have an array of 10 observations, each of which consists of 3 values. Using the standard convention (see Data Representation in Scikit-Learn), we’ll store this in a 10 × 3 array:
X = np.random.random((10, 3))
We can compute the mean of each feature using the mean aggregate across the first dimension:
Xmean = X.mean(0)
Xmean
And now we can center the X array by subtracting the mean (this is a broadcasting operation):
X_centered = X - Xmean
To double-check that we’ve done this correctly, we can check that the centered array has near-zero mean:
X_centered.mean(0)
To within machine precision, the mean is now zero.
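The same pattern needs a small adjustment when centering along the other axis. As a sketch (the seeded generator and the variable names are assumptions for illustration), subtracting each observation’s own mean requires the means to have shape (10, 1) rather than (10,), which keepdims=True provides:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
X = rng.random((10, 3))

# Per-feature centering: (10, 3) - (3,) broadcasts directly.
X_col_centered = X - X.mean(axis=0)

# Per-observation centering: X.mean(axis=1) has shape (10,), which does not
# broadcast against (10, 3). keepdims=True keeps the result as (10, 1).
row_mean = X.mean(axis=1, keepdims=True)
X_row_centered = X - row_mean

print(row_mean.shape)                                 # (10, 1)
print(np.allclose(X_col_centered.mean(axis=0), 0.0))  # True
print(np.allclose(X_row_centered.mean(axis=1), 0.0))  # True
```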
Plotting a two-dimensional function
One place that broadcasting is very useful is in displaying images based on two-dimensional functions. If we want to define a function z = f(x, y), broadcasting can be used to compute the function across the grid:
# x and y have 50 steps from 0 to 5
x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 50)[:, np.newaxis]
z = np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)
We’ll use Matplotlib to plot this two-dimensional array (these tools will be discussed in full in Density and Contour Plots):
%matplotlib inline
import matplotlib.pyplot as plt
plt.imshow(z, origin='lower', extent=[0, 5, 0, 5],
           cmap='viridis')
plt.colorbar();
The result is a compelling visualization of the two-dimensional function.
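For comparison, the grid that broadcasting builds implicitly can also be constructed explicitly. The sketch below uses np.meshgrid as one such alternative; the names xx, yy, and z_grid are illustrative, and the check confirms that both routes give the same values.

```python
import numpy as np

x = np.linspace(0, 5, 50)                  # shape (50,)
y = np.linspace(0, 5, 50)[:, np.newaxis]   # shape (50, 1)

# Broadcasting combines (50,) and (50, 1) into a (50, 50) grid.
z = np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)

# The same grid built from full (50, 50) coordinate arrays.
xx, yy = np.meshgrid(x, y.ravel())
z_grid = np.sin(xx) ** 10 + np.cos(10 + yy * xx) * np.cos(xx)

print(z.shape)                  # (50, 50)
print(np.allclose(z, z_grid))   # True
```

The broadcasting version never materializes the two full coordinate arrays, which matters more as the grid grows.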