Social scientists increasingly face very large data sets, available on the internet and elsewhere, without suitable tools for working with them. Many end up using spreadsheet programs for their data-processing tasks and spend hours clicking around or copying and pasting, then repeat the whole process for each new data file. Not only is this a waste of time, but it often makes it hard to reproduce the steps that led to a particular result, which undermines the value of the work.

This course will show you how to use freely available computing tools for the **scripting** language **Python** to work with your data more powerfully and effectively. The course touches on many topics of theoretical interest in the emerging field of data science, such as social networks and data visualization, but the focus is on manipulating data so that you can tailor it to the needs of your particular project. The course targets social science students and assumes no prior programming knowledge. Although many of the techniques are relevant to linguistics, economics, and geography, the course focuses on techniques that are applicable to a wide range of data sources, including images, social network data, web pages, and blogs.

Topics covered include:

  • Python Basics
  • Searching for patterns in text and web data (regular expressions)
  • Extracting information from big data sets (Government data)
  • Constructing social networks from data (visualizing social groups)
  • Connecting to your stat package (Python data frames)
  • Visualizing similarity relations
  • Visualizing quantitative relationships on maps


The course will use two required texts, Python for Dummies (Stef Maruch and Aahz Maruch) and Data Wrangling with Python for Data Analysis (Jacqueline Kazil and Katharine Jarmul), but will also make heavy use of online course notes and freely available Python software.


Prerequisites and Grading

There are no course prerequisites, and no knowledge of programming will be assumed. Students should have upper division standing, some openness to acquiring computational skills, and some knowledge of what counts as interesting data in their own social science.

Grading will be based on exercises, quizzes, a midterm, and a final project.

San Diego State University
