12.2. Read and create a file

This assignment requires you to write a program that reads a file, extracts some information from it, and saves that information in another file. Your program will be called extract_title_words.py.

The name of the file to read will be a command-line argument. Your program must be able to read whichever file is passed in as a command-line argument, perform the same information extraction task on it, and write the results for each input file to its own output file.
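As a minimal sketch of picking up the file name from the command line, assuming the standard sys.argv list is used: the helper name get_input_path and the usage message are hypothetical, not part of the assignment, and a literal list stands in for sys.argv so the example can be run on its own.

```python
import sys


def get_input_path(argv):
    """Return the file name passed as the only command-line argument.

    get_input_path is a hypothetical helper name used for illustration.
    """
    # argv[0] is the script name; argv[1] is the first real argument.
    if len(argv) != 2:
        raise SystemExit("usage: python extract_title_words.py INPUT_FILE")
    return argv[1]


# In the real program you would pass sys.argv here instead of a literal list.
print(get_input_path(["extract_title_words.py", "practice_file.txt"]))
```

Checking the number of arguments first means a missing file name produces a short usage message rather than an IndexError.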

Here are download links to the two files you are going to read in:

  1. pride_and_prejudice.txt
  2. practice_file.txt

Here are some examples of what your program will do. If you call your program this way:

cmd > python extract_title_words.py practice_file.txt

your program will loop through every word in practice_file.txt and write every capitalized word to another file named practice_file_caps.txt, with duplicates removed. On the other hand, if you call it this way:

cmd > python extract_title_words.py pride_and_prejudice.txt

it writes every capitalized word of Pride and Prejudice to a file named pride_and_prejudice_caps.txt, again with duplicates removed.
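Both examples follow the same naming pattern: the output file name is the input file name with _caps added before the extension. One possible way to build that name is sketched below, assuming os.path.splitext is used; the function name caps_output_name is hypothetical.

```python
import os


def caps_output_name(input_name):
    """Build the output file name, e.g. 'x.txt' becomes 'x_caps.txt'.

    caps_output_name is a hypothetical helper name used for illustration.
    """
    # splitext("practice_file.txt") gives ("practice_file", ".txt").
    stem, ext = os.path.splitext(input_name)
    return stem + "_caps" + ext


print(caps_output_name("practice_file.txt"))        # practice_file_caps.txt
print(caps_output_name("pride_and_prejudice.txt"))  # pride_and_prejudice_caps.txt
```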

Extra credit: Write the capitalized words to the output file in the order of their first occurrence in the input file (still with no duplicates).
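A set on its own does not remember the order in which items were added, so the extra credit needs a little more bookkeeping. The sketch below shows one possible approach, not the only one: a list keeps the order while a set records what has already been seen. The function name unique_in_order and the sample words are made up for illustration.

```python
def unique_in_order(words):
    """Return the words with duplicates removed, keeping first-occurrence order.

    unique_in_order is a hypothetical helper name used for illustration.
    """
    seen = set()   # fast membership test for words already kept
    ordered = []   # preserves the order of first occurrence
    for word in words:
        if word not in seen:
            seen.add(word)
            ordered.append(word)
    return ordered


print(unique_in_order(["It", "Elizabeth", "It", "Darcy", "Elizabeth"]))
# ['It', 'Elizabeth', 'Darcy']
```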

Here are some hints, notebooks, and online text material that will help in figuring out how to do the assignment:

  1. Read the entire chapter The Anatomy of a Python program to be clear about which parts a working program needs.

  2. Reading, Writing Files, concepts

  3. Files and texts

  4. Loops, functions

  5. Looping through the words in a file (first example).

  6. How to check if a word is capitalized. Use the string method istitle to determine if a string is capitalized:

    >>> 'King'.istitle()
    True
    >>> 'king'.istitle()
    False
    
  7. The Sets container introduction.
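To show how hints 6 and 7 fit together, here is a minimal sketch that collects the capitalized words of a string in a set. It assumes words are separated by whitespace and works on a made-up sample string rather than a file; the function name capitalized_words is hypothetical.

```python
def capitalized_words(text):
    """Return the set of capitalized words in text.

    capitalized_words is a hypothetical helper name used for illustration.
    """
    found = set()
    # split() with no arguments breaks the text on any run of whitespace.
    for word in text.split():
        if word.istitle():
            found.add(word)  # adding a word twice leaves only one copy
    return found


sample = "The King met the king and The Queen"
print(sorted(capitalized_words(sample)))  # ['King', 'Queen', 'The']
```

Note that istitle is strict: 'KING'.istitle() is False because the letters after the first are not lowercase. Splitting on whitespace also leaves punctuation attached, so 'King,' and 'King' would count as different words.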