6. Data

The Introduction to Python section introduced basic Python data types such as strings, lists, tuples, and dictionaries. In this section we introduce some more advanced data types provided by two special Python packages, numpy and pandas.

To motivate the new types, we work our way through an introduction to arrays and an extended example with pandas, using the PUMS Census data. The PUMS data is a treasure house of up-to-date social science data on the U.S. population. Additionally, since the data from past censuses is available, it is the principal starting point for analyzing U.S. demographic trends.

The data presents a number of computational challenges, beginning with the format it is stored in. But there are some pre-existing Python tools that give you a useful entry point. The most important of these is the pandas data frame (the DataFrame type), a flexible table-like structure very similar to the data structure of the same name in R.
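As a first taste of what "table-like" means, here is a minimal sketch that builds a small data frame from a numpy array and two plain Python lists. The column names and values are invented for illustration and are not taken from the PUMS data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records, made up for illustration; NOT actual PUMS data.
ages = np.array([34, 51, 27])

# Each dictionary key becomes a column name; each value becomes a column.
df = pd.DataFrame(
    {
        "state": ["CA", "TX", "NY"],
        "age": ages,
        "income": [52000.0, 61000.0, 48000.0],
    }
)

# A column is selected by name, much like looking up a dictionary key.
mean_age = df["age"].mean()

# Rows are selected with a boolean condition on a column.
over_30 = df[df["age"] > 30]

print(df)
print(mean_age)
print(over_30)
```

Each column holds values of a single type, while different columns may hold different types, which is what makes the structure a convenient fit for survey records.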