Python for Social Science Syllabus

Course Outline

Linguistics 572

The table below gives an approximate schedule of classes, assignments, and lectures for this course.

Columns: Day | Reading | Assignment | Lecture | Background

Wed Jan 19 Book Draft: Table of Contents, 1. Preface, 2.1-2.3 What is this course about? What is Python? Why Python? 2.4 Install Python (Anaconda) or run in the cloud? Assignment: Running Python Notebook. Introductory remarks, Jupyter notebook demo (notebook). Python Data Science HB (PDSHB): Chapter 1. IPython. PDSHB Notebook Index. (Notebooks contain the text of PDSHB plus executable code snippets.)
Wed Jan 26 Book Draft: 3.1-3.3 Python types I, More Python types II. Running Python assignment due. Notebook demo (notebook), Python types notebook, Python types I: Strings, numbers. Python types II: Sequences, dictionaries, sets. VanderPlas: Whirlwind Tour of Python: Chapter 7: Notebooks.
Wed Feb 02 Book Draft: If-then statements, Boolean results, loops, list comprehensions. Assignment: Python types assignment due. Loops, conditional clauses notebook, Programming notebook. VanderPlas: Whirlwind Tour of Python: Chapters 8, 9, 12: Notebooks.
Wed Feb 09 Book Draft: 5.1 Importing, 5.2 Namespaces, 5.3 Block structure, 5.4 Functions and function parameters, import, namespaces, classes. Solution to running Python, Solution to Python types. Notebooks: Functions, Programming, Harder programming.
Wed Feb 16 Book Draft: 4.6 Functions, 4.7 Functions. Functions assignment due. Notebooks: Functions, Sets, Set operations, Set example, Iterators and generators, Climate change problem (containers), DNA string (containers, coding), DNA translation (dictionary codebook). VanderPlas: Whirlwind Tour of Python: Chapter 9: Notebooks.
Wed Feb 23 Book Draft: Numpy: 6.1-6.4. Functions assignment solution. Notebooks: Intro to numpy: arrays, tables, slicing, arithmetic with arrays, arrays versus lists, Boolean arrays and Boolean indexing, fancy indexing. More nitty-gritty on Boolean arrays (from PDSHB). Numpy tools: a broader survey of numpy capabilities (from Hands-On Machine Learning). In-class Boolean notebook, Numpy broadcasting (notes/examples). Python Data Science HB (PDSHB): Chapter 2. Numpy. PDSHB Notebook Index.
Wed Mar 02 In-class Boolean notebook solutions. Notebooks: Edited version of numpy.ipynb suitable as a study tool.
Wed Mar 09 Book Draft: Intro to pandas and pandas data frames 6.4-6.8, Pandas tutorial. Numpy assignment due, Midterm Study notebook, Midterm Study answers. Lecture: Mid-Semester Review. Tools: Pandas Intro (HOML). Pandas notebook I, Pandas notebook II. Python Data Science HB (PDSHB): Chapter 3. Pandas. PDSHB Notebook Index.
Wed Mar 16 Midterm 2022. Numpy assignment solution. Notebooks: Pivot tables and merges in Pandas, Covid analysis example, Census data example.
Wed Mar 23 Book Draft: Introducing Regular Expressions, Reading in and tokenizing text data. NLTK book. Text processing pipeline, Unicode basics. Final project suggestions. This is when your midterm notebook will be due. Regular expressions notebook, WordNet, Unicode, Text processing notebook. VanderPlas: Whirlwind Tour of Python: Chapter 15: Notebooks. NLTK Book ch. 3.
Wed Mar 30 Holiday
Wed Apr 06 Book Draft: Chap. 7: Classification of text. Regression, Chap. 7: Linear classifiers, SVM classification, Applying linear classifiers to text: Movie review example. Pandas assignment due. Notebooks: Regression. Linear Classifiers (SVMs). Iris data classification (sklearn). Classifying movie reviews (NLTK); precision, recall, etc. Sklearn/Insult classification. Python Data Science HB (PDSHB): Chapter 5. Machine learning. PDSHB Notebook Index.
Wed Apr 13 Book Draft: Chap. 9: Social networks intro, Gephi demo. Project suggestions revisited, Midterm answers for your midterm version. Midterm answers with more extensive annotation. Regular Expressions assignment deadline moved one week. Social Networks lecture slides, and new Using networkx notebook, Centrality experiments, Assortativity notebook. Facebook ego networks.
Wed Apr 20 Book Draft: Chap. 7: Regression. Regression, Regular Expressions assignment due, Classification assignment moved one week. Matplotlib Intro, 03_Classification (HOML). Regression, Regression and classification. NLTK Book ch. 3
Wed Apr 27 Book Draft: Chap. 8: Visualization. Classification assignment due. Notebooks: Review of plotting basics, Box and violin plots, Color: Visualizing multidimensional data: Parallel coordinate plots, Color: Correlation heat maps, Geographic visualization, NY Times Covid analysis, Geographic visualization problem. Mandelbrot and others. Linear Mapping examples.
Wed May 04 Complete solution for classification assignment. Final project due 5/11. Data bias.
Last class day