Python for Social Scientists
San Diego State University

Department of Linguistics and Oriental Languages



Intro Textbook website

Textbook2 website

Course outline

Online book draft

Text Analytics Certificate & Minor

Linguistics 572

Python for Social Scientists




Increasingly, social scientists face far larger data sets, available on the internet and elsewhere, without suitable tools for dealing with them. Many end up using spreadsheet programs for their data-processing tasks and spend hours clicking around or copying and pasting, and then repeating the process for other data files. Not only is this a waste of time, but it often leaves you unable to reproduce the steps that produced a particular result, which greatly diminishes the value of your work.

This course will show you how to use computing tools freely available via the **scripting** language **Python** to use your data more powerfully and effectively. The course touches on many topics of theoretical interest in the emerging field of data science, such as social networks and data visualization, but the focus is on manipulating data so that you can tailor it to the needs of your particular project. The course targets social science students and assumes no prior programming knowledge. Although many of the techniques are relevant to linguistics, economics, and geography, the course focuses on techniques that are applicable to a wide range of data sources, including images, social network data, web pages, and blogs.

Topics covered include

  • Python Basics
  • Searching for patterns in text and web data (regular expressions)
  • Extracting information from big data sets (Government data)
  • Constructing social networks from data (visualizing social groups)
  • Connecting to your stat package (Python data frames)
  • Visualizing similarity relations
  • Visualizing quantitative relationships on maps
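
As a small taste of the first two topics, here is a minimal sketch of searching text for a pattern with a regular expression and counting the matches. The sample posts, the pattern, and the function name count_hashtags are all made up for illustration and are not taken from the course materials.

```python
import re
from collections import Counter

# Hypothetical sample data standing in for blog posts or web text.
posts = [
    "Voting turnout is up in #SanDiego this year. #election",
    "New census tables are out! #census #SanDiego",
    "Why does nobody cite the #census documentation?",
]

# For this sketch, a hashtag is '#' followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")


def count_hashtags(texts):
    """Return a Counter of lowercased hashtags found in the given texts."""
    counts = Counter()
    for text in texts:
        # findall returns only the captured group, i.e. the tag without '#'.
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts


tag_counts = count_hashtags(posts)
for tag, n in tag_counts.most_common():
    print(tag, n)
```

The same few lines work unchanged on three posts or three million, and rerunning the script repeats every step exactly, which is the kind of reproducibility that pointing and clicking in a spreadsheet does not give you.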


The course will use two required texts, Python for Dummies (Stef Maruch and Aahz Maruch) and Data Wrangling with Python for Data Analysis (Jacqueline Kazil and Katharine Jarmul), but will also make heavy use of online course notes and freely available Python software.


Full Course outline

Prerequisites and Grading

No course prerequisites. No knowledge of programming will be assumed. Upper division standing. Some openness to acquiring computational skills. Some knowledge of what counts as interesting data in your own social science.

Grading will be based on exercises, quizzes, a midterm, and a final project.

Place and Time

Tu Th 1100-1215 AH 3150

Contact Info

Mailing address:
gawron at mail dot sdsu dot edu
Department of Linguistics and Oriental Languages
San Diego State University
5500 Campanile Drive
San Diego, CA 92182-7727
Telephone: (619) 594-0252
Office location: SHW, room 238
Office hours: TuTh 12:30-1:30, Tu 3:30-4:30, Th 09:30-10:30

Unix | Computational Linguistics Lab