12.8. Social security card holders

Go to the following website

http://www.ssa.gov/oact/babynames/numberUSbirths.html

and select the births table there, copy it to your clipboard, then read it into pandas:

>>> import pandas as pd
>>> tt = pd.read_clipboard()
>>> tt
<class 'pandas.core.frame.DataFrame'>
Int64Index: 133 entries, 0 to 132
Data columns (total 4 columns):
birth      133  non-null values
Male       133  non-null values
Female     133  non-null values
Total      133  non-null values
dtypes: int64(1), object(3)
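
If no clipboard is available (for example on a remote machine), roughly the same result can be obtained by pasting the copied text into a string and parsing it. The following is a minimal sketch, assuming the copied text is whitespace-separated; the names `pasted` and `tt_from_text` are made up for illustration, and only the first two rows of the table are shown.

```python
import io

import pandas as pd

# Hypothetical stand-in for the copied table text (first two rows only).
pasted = """birth Male Female Total
1880 118,400 97,605 216,005
1881 108,285 98,858 207,143
"""

# Split on runs of whitespace, as read_clipboard does by default.
tt_from_text = pd.read_csv(io.StringIO(pasted), sep=r"\s+")

# birth is parsed as an integer; the comma-formatted counts stay as strings.
first_male = tt_from_text.loc[0, "Male"]
```

This mirrors the summary above: one integer column and three columns that are not yet numeric.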

The first few rows should look like this:

>>> tt.head()
   birth     Male   Female     Total
0    1880  118,400   97,605  216,005
1    1881  108,285   98,858  207,143
2    1882  122,032  115,698  237,730
3    1883  112,481  120,066  232,547
4    1884  122,741  137,589  260,330
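
Note that Male, Female and Total contain thousands separators, which is why the summary reports them as object rather than int64. One way to turn them into integers is sketched below. The frame is rebuilt by hand from the five rows shown above so that the example runs without a clipboard; the check `totals_match` is an addition for illustration.

```python
import pandas as pd

# The five rows shown above, with the counts as comma-formatted strings.
tt = pd.DataFrame({
    "birth": [1880, 1881, 1882, 1883, 1884],
    "Male": ["118,400", "108,285", "122,032", "112,481", "122,741"],
    "Female": ["97,605", "98,858", "115,698", "120,066", "137,589"],
    "Total": ["216,005", "207,143", "237,730", "232,547", "260,330"],
})

# Strip the commas, then convert each count column to integers.
for col in ["Male", "Female", "Total"]:
    tt[col] = tt[col].str.replace(",", "", regex=False).astype("int64")

# Sanity check: Male + Female should equal Total in every row.
totals_match = (tt["Male"] + tt["Female"] == tt["Total"]).all()
```

Alternatively, since read_clipboard forwards extra keyword arguments to the pandas text parser, passing thousands="," at read time should make this conversion unnecessary.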

Stop: make sure that what you get is not multi-indexed, like this:

<class 'pandas.core.frame.DataFrame'>
MultiIndex: 134 entries, (birth , Male , Female ) to (2012 , 2,009,637, 1,921,563)
Data columns (total 1 columns):
Year of    134  non-null values
dtypes: object(1)

with a head that looks like this:

>>> tt.head()
                        Year of
birth  Male    Female     Total
1880   118,400 97,605   216,005
1881   108,285 98,858   207,143
1882   122,032 115,698  237,730
1883   112,481 120,066  232,547

If you got this, it’s because your copy-and-paste included the two-line column name:

Year of
birth

The pandas read_clipboard function interprets this as an extra layer of indexing. Just exclude the words “Year of” from your highlighting when you copy, and you’ll get something much closer to what we want.
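
Rather than inspecting the printed summary by eye, the check can also be done in code. This is a small sketch; the helper name `is_multi_indexed` and the two hand-built frames `good` and `bad` are hypothetical stand-ins for a correct paste and a paste that included “Year of”.

```python
import pandas as pd


def is_multi_indexed(frame):
    """Return True when the frame's row index is a MultiIndex."""
    return isinstance(frame.index, pd.MultiIndex)


# Stand-in for a correct paste: four ordinary columns, default integer index.
good = pd.DataFrame({
    "birth": [1880],
    "Male": ["118,400"],
    "Female": ["97,605"],
    "Total": ["216,005"],
})

# Stand-in for a bad paste: three values pushed into the index,
# leaving a single data column named "Year of".
bad = pd.DataFrame(
    {"Year of": ["Total", "216,005"]},
    index=pd.MultiIndex.from_tuples(
        [("birth", "Male", "Female"), ("1880", "118,400", "97,605")]
    ),
)

good_is_multi = is_multi_indexed(good)
bad_is_multi = is_multi_indexed(bad)
```

If `is_multi_indexed(tt)` comes back True for your own frame, copy the table again without the “Year of” line and re-run read_clipboard.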