
Department of Linguistics and Oriental Languages


Tu Sec Syllabus

Wed Sec Syllabus

Main Textbook website

Optional Textbook2

Tu Course outline

Wed Course outline

Online book draft

Text Analytics Certificate & Minor

Linguistics 572

Python for Social Scientists




Increasingly, social scientists find themselves facing ever larger data sets, available on the internet and elsewhere, without suitable tools to deal with them. Many end up using spreadsheet programs for their data-processing tasks and spend hours clicking around or copying and pasting, and then repeating the process for other data files. Not only is this a waste of time, but it often leaves you unable to reproduce the steps that got you a particular result, which makes your work hard to verify or build on.

This course will show you how to use computing tools freely available via the scripting language Python to use your data more powerfully and effectively. The course touches on many topics of theoretical interest in the emerging field of data science, such as social networks and data visualization, but the focus is on manipulating data so that you can tailor it to the needs of your particular project. The course targets social science students and will assume no prior programming knowledge. Although many of the techniques are relevant to linguistics, economics, and geography, the course focuses on techniques that are applicable to a wide range of data sources, including images, social network data, web pages, and blogs.

Topics covered include

  • Python Basics
  • Computing with arrays (NumPy)
  • Data analysis with data frames (Pandas)
  • Network analysis (NetworkX)
  • Visualizing similarity relations
  • Searching for patterns in text and web data (regular expressions)
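
As a small taste of the last topic, here is a minimal sketch of pattern searching with regular expressions. It is not taken from the course materials: the sample posts, the `HASHTAG` pattern, and the `count_hashtags` helper are all made up for illustration, and it uses only Python's standard library.

```python
import re
from collections import Counter

# Hypothetical sample text, invented for this illustration (not course data).
posts = [
    "Loving the new #python course! #datascience",
    "Networks are everywhere #DataScience",
    "No tags here.",
]

# A hashtag is taken to be '#' followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")


def count_hashtags(texts):
    """Return a Counter of lowercased hashtags found in the given texts."""
    counts = Counter()
    for text in texts:
        # findall returns the captured group, i.e. the tag without '#'.
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts


tag_counts = count_hashtags(posts)
print(tag_counts.most_common())  # [('datascience', 2), ('python', 1)]
```

The same few lines work unchanged on three posts or three million, and rerunning the script reproduces the result exactly, which is the kind of repeatability that is hard to get from pointing and clicking in a spreadsheet.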


The course will use one required text, Python Data Science Handbook (Jake VanderPlas), but will also make heavy use of online course notes and freely available Python software. For introductory material, a secondary optional textbook is recommended: A Whirlwind Tour of Python (Jake VanderPlas). Links to notebooks keyed to both textbooks will be provided in the course outline.



Prerequisites and Grading

No course prerequisites. No knowledge of programming will be assumed. Expected: upper division standing, some openness to acquiring computational skills, and some knowledge of what counts as interesting data in your own social science.

Grading will be based on exercises, quizzes, a midterm, and a final project.


Tu 1600-1840

Wed 1600-1840

Contact Info

Email: gawron at mail dot sdsu dot edu
Mailing address:
Department of Linguistics and Oriental Languages
San Diego State University
5500 Campanile Drive
San Diego, CA 92182-7727
Telephone: (619) 594-0252
Office location: SHW, room 238
