Maximum Entropy modeling assignment (Due Mar. 14, 2017)

  1. Here are the main components of the assignment. They should all be downloaded and placed in a single directory on your local machine, which we will call $WORKING_DIR.
    1. XML data file
    2. XML data file backup
    3. Top level feature extractor. This is the file you will edit.
    4. Module loaded by the top level feature extractor. This must be present in the same directory in order for the feature extractor to work.
    5. Training and testing code. Uses NLTK
    6. Assignment slides: Background directly related to this assignment.
    7. Max entropy lecture slides
    Instructions for using these modules are given below.
  2. Here is what you need to email me by the deadline date above (gawron@mail.sdsu.edu):
    1. The edited version of call_extract_event.py that gave you the best score you achieved on the word sense disambiguation task.
    2. The output from call_maxent.py on your best scoring run. The trainer outputs a lot of stuff, starting with something like:
      Marks-MacBook-Pro:max_entropy_module gawron$ python -i call_maxent.py senseval-hard.evt 100
      Reading data...
        Senses: HARD1 HARD2 HARD3
      Splitting into test & train...
      Training classifier...
        ==> Training (100 iterations)
      
            Iteration    Log Likelihood    Accuracy
            ---------------------------------------
                   1          -1.09861        0.087
      
      
      and ending with something like:
      Testing classifier...
      Accuracy: 0.8756
      Total: 410
      
      
      Label                 Precision      Recall
      _____________________________________________
      HARD1                     0.873       1.000
      HARD2                     0.944       0.515
      HARD3                     0.833       0.125
      
      
      Label                  Num Corr
      HARD1                337
      HARD2                17
      HARD3                5
      
      
        -2.418 get_VB==True and label is 'HARD3'
        -2.212 look_NN==True and label is 'HARD1'
        -2.121 hardest_JJ==True and label is 'HARD3'
        -2.020 find_VB==True and label is 'HARD2'
             ....
      
      The final bit is the list of the most informative features in the trained model.

      Cut and paste all this output and send it.
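The precision and recall figures in the tables above are computed per label. A minimal check, where the predicted count (386) is inferred from the sample numbers (337 / 0.873) rather than printed by the trainer:

```python
# Per-label precision and recall, using HARD1's numbers from the sample output.
correct = 337    # HARD1 examples classified as HARD1 ("Num Corr")
predicted = 386  # all examples classified as HARD1 (inferred, not in the output)
actual = 337     # all HARD1 examples in the test split
precision = correct / float(predicted)
recall = correct / float(actual)
print(round(precision, 3), recall)  # 0.873 1.0
```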

    3. Answer the following questions by writing up a few sentences in a Word or PDF document.
      1. How does the performance of the Max Ent classifier compare with that of the Naive Bayes classifier? (See the output at the end of your training output; also see the NLTK Naive Bayes classifier.) Does the relative performance of the two classifiers change after modifying the feature extractor? Which classifier is better? Include a discussion of computational efficiency.
      2. How does the best Naive Bayes classifier differ from the best Max Ent classifier in the top 20 most informative features?
      3. Execute the following code after training and testing:
        >>> (feats, label) = test[0]
        >>> pdist = nb_classifier.prob_classify(feats)
        >>> pdist.prob('HARD3')
        
        The first line sets the variables feats and label to the extracted feature dictionary and class of the first example in the test set. It is instructive to look at these and see what you've got. The variable label is the correct class label, or word sense, for that example. It should correspond to one of the classes found when the data was loaded in at the beginning of your training output. The second line of code applies the classifier to that example and returns probabilities for all three classes in the form of an object called pdist, which has a method prob that returns the probability of a given class.

        What kind of probability is being returned? Is it:

        1. P(label | feats): the probability of the class 'HARD3' given the feature set; or
        2. P(feats | label): the probability of feature set given the label 'HARD3'; or
        3. P(feats, label): the joint probability of the feature set and label?
        Hint: Compute pdist.prob for all three classes.
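The hint can be worked through with toy numbers (the values below are illustrative, not from the real model): conditional probabilities P(label | feats) must sum to 1 over the three senses, while joint probabilities P(feats, label) sum only to P(feats).

```python
# Illustrative joint probabilities P(feats, label) for one fixed feature set:
joint = {'HARD1': 0.16, 'HARD2': 0.03, 'HARD3': 0.01}

p_feats = sum(joint.values())  # P(feats), by marginalizing over the labels

# Conditioning on feats renormalizes, so the three values sum to 1:
cond = {label: p / p_feats for label, p in joint.items()}
print(round(sum(cond.values()), 6))  # 1.0
```

So summing pdist.prob over all three classes tells you immediately which kind of probability you are looking at.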
      4. What feature or features are causing the probability of one class to be so high for the feature set in test[7]?
    4. For more background and info on maxent models, see Le Zhang's (U. Edinburgh) max ent page.
    5. For this assignment you need a Python package that is NOT part of the standard Python distro. It is called nltk. A Google search on the string "nltk" will direct you to the nltk home page, or you can go to:
        NLTK home page
      If at all possible you should stick to Python 2.6 or 2.7. Among the optional Python packages needed for some portions of nltk, you will need numpy.

      You will not need to install the NLTK data after installing the software. So your install needs are:

      1. NLTK software: Follow platform specific directions on NLTK download page
      2. Numpy: Follow platform specific directions on NLTK download page

        NOTE!: Version 1.6.1 of numpy works with Python 2.7.3 and with the latest version of nltk (2.0.4). See the NLTK website for numpy links.

      3. NLTK data: To do this, follow the directions on the NLTK data page. Note: Installing the data is definitely optional for this assignment. You can get all the data you need from the XML corpus below.
    6. To check for nltk access, start up Python and proceed as follows:
      [gawron@ngram ~]$ python
      Python 2.5.1 (r251:54863, Jul 10 2008, 17:24:48) 
      [GCC 4.1.2 20070925 (Red Hat 4.1.2-33)] on linux2
      Type "help", "copyright", "credits" or "license" for more information.
      >>> import nltk
      

    Running your maxent Classifier

    1. To train and test your classifier you need to give it some labeled data.

      The training corpus is an xml file with example sentences using the word hard. This XML file contains about 4000 events. Each event is a sentence in which the word hard occurred with one of three senses.

      1. "HARD1": difficult to do
      2. "HARD2": potent (as in "the hard stuff")
      3. "HARD3": physically resistant to denting, bending, or scratching

      In the xml file an event looks something like this:

        <senseval_instance word="hard-a" sense="HARD1" position="20">
        ``_`` he_PRP may_MD lose_VB all_DT popular_JJ support_NN ,_, but_CC 
        someone_NN has_VBZ to_TO kill_VB him_PRP to_TO defeat_VB him_PRP 
        and_CC that_DT 's_VBZ hard_JJ to_TO do_VB ._. ''_''
        </senseval_instance>
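One way to see what each instance contains is to parse one with the standard library. This sketch is illustrative only (the real senseval-hard.xml wraps thousands of such elements in one document, so you would parse the whole file rather than a single string):

```python
import xml.etree.ElementTree as ET

# A single instance, trimmed from the sample above.
instance = '''<senseval_instance word="hard-a" sense="HARD1" position="20">
he_PRP may_MD lose_VB all_DT popular_JJ support_NN and_CC
that_DT 's_VBZ hard_JJ to_TO do_VB
</senseval_instance>'''

elem = ET.fromstring(instance)
sense = elem.get('sense')                       # the labeled word sense
tokens = elem.text.split()                      # word_TAG tokens
words = [t.rsplit('_', 1)[0] for t in tokens]   # strip the POS tags
print(sense, words[:3])  # HARD1 ['he', 'may', 'lose']
```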
        

      To disambiguate these sentences you need to turn the XML into a form the max ent trainer can use. Essentially you have to turn the raw data into feature dictionaries that represent each example with a set of binary yes/no features. So the steps are:

        1. Feature extraction Map raw data into features:
          XML file → EVENT file
        2. Training Train and test the maxent model on the event file
          EVENT file → maxent model

        These steps look like this on the command line:

        1. python -i call_extract_event.py senseval-hard.xml
          
            XML file → EVENT file
        2. python -i call_maxent.py senseval-hard.evt 3
          
            EVENT file → maxent model

        Let's take these steps in turn:

        1. feature extraction: When you execute call_extract_event.py, the xml file is read in and another file, called the event file, is created. This file is named senseval-hard.evt. In the course of this assignment you will experiment with different versions of feature extraction, which will produce different versions of senseval-hard.evt, but you should never change the original data, senseval-hard.xml. To help you out in case you accidentally change senseval-hard.xml, there is an exact copy (called senseval-hard.xml.saf).
        2. The second step above trains a max ent model with the script call_maxent.py. The second argument on the call_maxent.py line is the number of training iterations to run (here 3). This argument is optional, with a default value of 50; since 50 iterations can take a while, you should pass in a lower value when debugging. When training seriously, the right value is about 100.

        Preparing training data for your classifier

        The data in your event file, the data you train your classifier on, is not raw text data. It is a representation of the features of the raw data that you think are important for doing sense disambiguation.

        The XML file you have been given consists of some 4000 training instances. The event file you create also consists of 4000 examples: each begins with a line consisting solely of the string "BEGIN EVENT" and ends with a line consisting solely of "END EVENT". The second line of the event tells you which of the three senses of the word 'hard' is being used in this event. After the sense come the features of this example. In the sample event file you have been given, an example looks like this:

        BEGIN EVENT
        HARD1
        soft_JJ         0
        more_RBR        0
        new_JJ          0
        many_JJ         0
        has_VBZ         1
        stick_NN        0
        imagine_VB      0
        ,_,             1
        (_(             0
        even_RB         0
        were_VBD        0
        ...
        END EVENT
        
        This event representation says that in this example (with sense HARD1), the word has occurred, and a comma occurred, and the words soft, more, new, many, stick, imagine, left parenthesis, even and were did not occur (the event representation includes still other present and absent words, omitted here).

        This event representation is what is passed to the maxent module, which then builds a statistical model capturing which word features are the best predictors of senses of hard.

        The event file is computed by running a feature extraction function, which creates a Python dictionary for each event in the training set. The keys of the dictionary are the features, and the values are the feature values.

        The only features used in the default feature extractor you start with (in call_extract_event.py) are word features, which tell you when a word is present in a context. Thus, the function you should focus on is extract_vocab, which chooses which words will be features. Currently, it just uses the top 100 most frequent words in the XML corpus. It ignores the most common words of English, using the following list:
            stopwords = [ 'I',    'a',    'an',    'are',    'as',    'and',
                          'be',    'com',   'how',  'is',    'it',    'of',    'or',
                          'that',    'the',  'this',    'to',    'was',    'what',
                          'when',   'where',    'who',    'will',    'with',
                          'www']
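The selection extract_vocab performs can be sketched with collections.Counter. The corpus below is a tiny stand-in for the tokenized XML data, and the stopword list is abbreviated:

```python
from collections import Counter

# Stand-in for the tokenized corpus and an abbreviated stopword list.
stopwords = {'the', 'to', 'and', 'is', 'of', 'a'}
corpus = ['hard', 'to', 'do', 'the', 'hard', 'stuff', 'is', 'hard', 'work']

# Count everything except stopwords, then keep the most frequent 100 as features.
counts = Counter(w for w in corpus if w not in stopwords)
vocab = [w for w, _ in counts.most_common(100)]
print(vocab[0], 'to' in vocab)  # hard False
```

Raw frequency is a weak criterion: a word can be frequent overall yet occur equally often with all three senses, in which case it discriminates nothing. That is the kind of improvement to aim for.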
        
        You should modify extract_vocab to choose more informative features, and improve the test accuracy you are shown after training when you run call_maxent.py.

        The first thing you should do is get a sense for how hard the disambiguation task is. Do this by running the two commands call_extract_event.py and call_maxent.py as above. This will recreate your event file and run the maxent trainer on it.

        During feature extraction the extraction code collects and prints out information about the statistical distribution of the three senses. It looks like this:

        HARD1                3455 0.797
        HARD2                 502 0.116
        HARD3                 376 0.087
        
        This tells you that about 80% of the examples in the training set use the most common sense, HARD1. This gives you a baseline performance score: a sense classifier that always guessed the most common sense would be right about 80% of the time.
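The baseline can be checked directly from the counts printed above:

```python
# Sense counts from the extraction output above.
counts = {'HARD1': 3455, 'HARD2': 502, 'HARD3': 376}

total = sum(counts.values())                     # 4333 examples
baseline = max(counts.values()) / float(total)  # always guess HARD1
print(round(baseline, 3))  # 0.797
```

Your classifier only earns its keep to the extent that its test accuracy beats this number.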

      1. List of links
        1. NLTK home page
        2. NLTK download page
        3. NLTK data page.