6. Numpy

The Introduction to Python section introduced basic Python data types such as strings, lists, tuples, and dictionaries. In this chapter we begin introducing some more advanced data types available in special Python packages such as numpy and pandas.

We start in this chapter with numpy, Python’s basic numerical computing module. It provides a foundation for many other Python tools, such as pandas. The numpy library is vast, with many useful parts; we will consider only a small part.

We will begin by learning how to create, access, and update the basic numpy data structure, the n-dimensional array (ndarray). We will also learn how to add, subtract, and multiply arrays using vectorized arithmetic operations, that is, operations that apply elementwise to all the elements of an array. Many of numpy's mathematical functions are vectorized in the same way. What we learn about operations on numbers will carry over to Boolean conditions, conditions that are True or False for the individual elements of an array. Applying a Boolean condition to an array results in another array of the same shape in which each cell contains True or False. We will learn to use such Boolean arrays to extract the portions of an array that satisfy a condition, allowing for high-level queries and manipulations of the data.
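As a preview of these ideas, here is a minimal sketch of vectorized arithmetic and Boolean indexing. The array names and values are hypothetical, chosen only for illustration; later sections treat each step in more detail.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Vectorized arithmetic: each operation applies elementwise.
total = a + b        # array([11, 22, 33, 44])
product = a * b      # array([10, 40, 90, 160])

# Many mathematical functions are vectorized as well.
roots = np.sqrt(a)   # square root of every element

# A Boolean condition on an array yields a Boolean array of the same shape.
mask = a > 2         # array([False, False,  True,  True])

# Indexing with a Boolean array extracts the elements where the mask is True.
selected = b[mask]   # array([30, 40])

# Boolean indexing can also be used to update elements in place.
c = a.copy()
c[c > 2] = 0         # array([1, 2, 0, 0])
```

Note that `mask` is built from `a` but used to index `b`: a Boolean array can select from any array of the same shape, which is what makes query-style selection possible.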

An immediate payoff from our brief survey of numpy is that all the principles for computing with numpy arrays will carry over with minor modifications to computing with pandas DataFrames in the next module. The methods we use with numpy, then, provide the foundation for data manipulation with a wide variety of non-numerical data.

Contents