Package viterbi :: Module viterbi_search

Module viterbi_search


Code implementing a Python version of the Viterbi algorithm for HMMs, which computes the most probable path through the HMM, given a sequence of observations.

Path probabilities through the HMM are computed using two probability models, implemented as dictionaries: a transition probability model (A) and an emission probability model (B). The transition prob:

      A[s1][s2]

returns the probability of transitioning to s2 given that you are at state s1 (Pr(s2|s1)). The emission prob:

     B[o][s]

is the probability of observing a string o given you are at state s (Pr(o|s)). Note that A and B are both dictionaries of dictionaries, but A is keyed by state, while B is keyed by observation. Symbols not in the HMM alphabet will produce errors (a KeyError).
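The two-dictionary layout can be sketched with a toy weather machine (hypothetical states and values, not one of the module's machines):

```python
# Toy transition model A: A[s1][s2] = Pr(s2 | s1), keyed by STATE.
A = {'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
     'Sunny': {'Rainy': 0.4, 'Sunny': 0.6}}

# Toy emission model B: B[o][s] = Pr(o | s), keyed by OBSERVATION first.
B = {'walk':  {'Rainy': 0.1, 'Sunny': 0.6},
     'shop':  {'Rainy': 0.4, 'Sunny': 0.3},
     'clean': {'Rainy': 0.5, 'Sunny': 0.1}}

print(A['Rainy']['Sunny'])   # Pr(Sunny | Rainy) -> 0.3
print(B['walk']['Sunny'])    # Pr(walk | Sunny)  -> 0.6

# A symbol outside the alphabet produces a KeyError:
try:
    B['jog']['Sunny']
except KeyError:
    print('unknown observation')
```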

To run the decoder:

>>> decode_line('313', A3, B3, states3, start3)

This runs with Machine 3, the ice cream/weather machine of Chapter 6 of Jurafsky & Martin, with the input diagrammed in Fig. 6.10. The first argument is the input and the next four define the HMM:

   -- A3:  the state transition probs
   -- B3:  the observation probs
   -- states3: the set of states of the machine
   -- start3: the start state

An optional sixth Boolean argument, predict, toggles HMM emission prob predicting; its default value is False. With predicting turned on, emission probs for the state one is coming FROM are used, which means one had to transition to that state before seeing the emission that mattered there. So one had to "predict" right (i.e., guess). The only machine defined in this file that runs with predicting turned off is Machine 1:

>>> decode_line('httt', A1, B1, states1, start1, True)

With predicting turned off, emission probs for the state one is going to are used, which means one sees the emission probs and the emission at the same time, and no predicting is necessary to choose the state with the highest emission prob for the current emission.

This file includes three example HMMs. The first is a simple coin flipping HMM, which, with predicting turned on, tries to "guess" the result of the next coin flip in a sequence of coin flips by transitioning at each observation to either a head-predicting state ('H') or a tail-predicting state ('T'). The observation alphabet consists of the lower case letters 'h' and 't'.

Machine 2 is the machine used for the word decoder in the first edition of Jurafsky and Martin, Ch. 7, Fig. 9, which gives path probabilities for decoding a sequence of phones as a sequence of words. Each state is a pair of a phone and a word. The emission probs in B2 are always either 1.0 or 0.0: 1.0 for any input phone that matches the phone that state represents, state[0], and 0.0 for all others. For example, state ('dh', 'the') has probability 1.0 for 'dh' and probability 0.0 for all other phones. The transition probs, A2, are of the form:

   A2[(phone1, word1)][(phone2, word2)]

and represent the probability of the second (phone, word) state given the first. The table includes cross-word transitions. These are computed by combining the probability models in viterbi.pron_probs and viterbi.bigram_probs.

Functions

decode(Stringseq, A, B, states, s0, predict=False)
    Takes a list of strings, Stringseq, each element a word according to the HMM defined by A, B, states, and s0.
    Returns: a 4-tuple of trellis (the Viterbi table), back (the backtrace table), the best probability path through the HMM (list), and nice_path_string (a pretty string representation of the best path, suitable for printing).

make_transition_dict(bigram_prob_dict, pron_prob_dict, states)
    Turns dictionaries representing bigram probs and pronunciation models into transition_dict, an HMM transition function.
    Returns: dictionary of dictionaries.

switch_transition_function_to_log_probs(A, states)

switch_emission_function_to_log_probs(B, states)

print_viterbi_table(viterbi, stringseq, states, start, precision=6)
    stringseq may be a list.

print_info(s_from, s_to, A, emission_probs, trellis)

decode_line(line, A, B, states, start, predict=False)
Variables
  neg_infinity = -inf
  back = []
  trellis = []
  start1 = 's0'
Start state for HMM 1, the head-tails machine.
  states1 = ['s0', 'H', 'T']
States list for HMM 1, the head-tails machine.
  A1 = {'H': {'H': -1.0, 'T': -1.0, 's0': -inf}, 'T': {'H': -1.0...
Transition probs for HMM 1, the head-tails machine.
  B1 = {'h': {'H': 0.0, 'T': -inf, 's0': -1.0}, 't': {'H': -inf,...
Observation probs for HMM 1, in the head-tails machine.
  start2 = ('start', 'start')
The starting state for HMM 2, which is the HMM implicit in J&M, Figure 7.10.
  phones = set(['aa', 'ax', 'ay', 'd', 'dh', 'iy', 'n'])
  B2 = {'aa': {('aa', 'I'): 0.0, ('aa', 'on'): 0.0, ('ax', 'the'...
  A2 = {('aa', 'I'): {('aa', 'I'): -inf, ('aa', 'on'): -14.37697...
  states2 = [('d', 'need'), ('iy', 'need'), ('n', 'need'), ('ax'...
The states for HMM 2, which is the HMM implicit in J&M, Figure 7.10.
  start3 = 'start0'
  states3 = ['start0', 'HOT', 'COLD']
  A3 = {'COLD': {'COLD': -0.736965594166, 'HOT': -1.32192809489,...
  B3 = {'1': {'COLD': -1.0, 'HOT': -2.32192809489, 'start0': -in...
  Stringseq = ['aa', 'n', 'iy', 'dh', 'ax']
  phone = 'ax'
  phonedic = {('aa', 'I'): -inf, ('aa', 'on'): -inf, ('ax', 'the...
  state = ('start', 'start')
Function Details

decode(Stringseq, A, B, states, s0, predict=False)


Takes a list of strings, Stringseq, each element a word according to the HMM defined by A, B, states, and s0. Returns path, a list of states of length len(Stringseq)+1, representing the highest prob HMM path that accepts Stringseq, as well as the two tables built in the Viterbi computation. Note: implements the pseudocode of J&M, Figure 7.09.

trellis is the table the Viterbi algorithm fills in. It is an array of length T+1 (the length of the input, plus t=0). Each element is a dictionary with states as keys: trellis[t][s] is the Viterbi score of state s at time t (a log prob).

back (for "backtrace") will be an array with the same dimensions:

     back[t][s]

is the best state to have come FROM to get to s at t.
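The trellis/back recurrence described above can be sketched as follows. This is a minimal standalone sketch, not the module's decode(); it uses toy base-2 log-prob tables in the spirit of HMM 1 and the no-predict convention (emission prob of the state being entered):

```python
NEG_INF = float('-inf')

def viterbi_sketch(obs, A, B, states, start):
    # trellis[t][s]: best log prob of any path ending in s after t observations
    trellis = [{s: (0.0 if s == start else NEG_INF) for s in states}]
    # back[t][s]: best state to have come FROM to reach s at time t
    back = [{s: None for s in states}]
    for t, o in enumerate(obs, start=1):
        trellis.append({})
        back.append({})
        for s in states:
            best_prev, best_score = None, NEG_INF
            for s_prev in states:
                score = trellis[t - 1][s_prev] + A[s_prev][s] + B[o][s]
                if score > best_score:
                    best_prev, best_score = s_prev, score
            trellis[t][s] = best_score
            back[t][s] = best_prev
    # read the best path off the backtrace, ending at the best final state
    last = max(states, key=lambda s: trellis[-1][s])
    path = [last]
    for t in range(len(obs), 0, -1):
        path.append(back[t][path[-1]])
    return trellis, back, list(reversed(path))

# Toy head-tails machine (values are base-2 log probs):
A = {'s0': {'s0': NEG_INF, 'H': -1.0, 'T': -1.0},
     'H':  {'s0': NEG_INF, 'H': -1.0, 'T': -1.0},
     'T':  {'s0': NEG_INF, 'H': -1.0, 'T': -1.0}}
B = {'h': {'s0': NEG_INF, 'H': 0.0, 'T': NEG_INF},
     't': {'s0': NEG_INF, 'H': NEG_INF, 'T': 0.0}}

trellis, back, path = viterbi_sketch('ht', A, B, ['s0', 'H', 'T'], 's0')
print(path)   # ['s0', 'H', 'T']
```

Note that the returned path has length len(obs)+1, matching the trellis dimensions described above.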

Parameters:
  • Stringseq - a list of strings representing the sequence of input observations
  • A - Transition prob table of HMM
  • B - Observation prob table of HMM
  • states - A list of states for HMM
  • s0 - Start state of HMM
Returns: a 4-tuple of trellis (the Viterbi table), back (the backtrace table), the best probability path through the HMM (list), and nice_path_string (a pretty string representation of the best path, suitable for printing).

make_transition_dict(bigram_prob_dict, pron_prob_dict, states)


Turns dictionaries representing bigram probs and pronunciation models into transition_dict, an HMM transition function such that:

  transition_dict[(phone_from,word_from)][(phone_to,word_to)] = prob

This dictionary represents the kind of global phone to phone transition probs shown in J&M, Figure 7.08, including cross-word transitions.

This transition probability across words is estimated as:

  Pr(end | word_from,phone_from) *
  Pr(phone_to | start, word_to) *
  Pr(word_to | word_from)

or equivalently:

  pron_prob_dict[word_from][phone_from][end] *
  pron_prob_dict[word_to][start][phone_to] *
  bigram_prob_dict[word_from][word_to]

Thus the resulting HMM transition table has no start word and end word states.
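The three-factor product above can be sketched with hypothetical component probabilities (not taken from the module's data); since the tables store base-2 log probs, the product becomes a sum of logs:

```python
import math

# Hypothetical component probabilities:
p_end    = 1.0    # Pr(end | word_from, phone_from)
p_start  = 0.7    # Pr(phone_to | start, word_to)
p_bigram = 0.002  # Pr(word_to | word_from)

# Cross-word transition prob: product of the three factors.
prob = p_end * p_start * p_bigram

# In log space the product is a sum, which is what the table stores.
log_prob = math.log2(p_end) + math.log2(p_start) + math.log2(p_bigram)

assert math.isclose(2 ** log_prob, prob)
```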

Parameters:
  • bigram_prob_dict - A dictionary of dictionaries representing bigram probs.
  • pron_prob_dict - A dictionary of dictionaries with word keys, representing pronunciation probs.
  • states - the list of states for the current HMM.
Returns: dictionary of dictionaries

print_viterbi_table(viterbi, stringseq, states, start, precision=6)


stringseq may be a list (e.g., a phone list); '#' is added.

Precision specifies the precision for floats. Log probs are converted back to probs, meaning high precision may be needed to get non-zero output. A precision level of 9 is required to reproduce the table in J&M, Fig. 7.10.


Variables Details

A1

Transition probs for HMM 1, the head-tails machine, stored as base-2 log probs. Transitions to the start state always have prob 0 (-inf). Transitions to either of the other states always have prob 0.5 (-1.0).
Value:
{'H': {'H': -1.0, 'T': -1.0, 's0': -inf},
 'T': {'H': -1.0, 'T': -1.0, 's0': -inf},
 's0': {'H': -1.0, 'T': -1.0, 's0': -inf}}

B1

Observation probs for HMM 1, the head-tails machine, stored as base-2 log probs. The start state has unbiased observation probs. The other two states have observation prob 1.0 for heads and tails respectively.
Value:
{'h': {'H': 0.0, 'T': -inf, 's0': -1.0},
 't': {'H': -inf, 'T': 0.0, 's0': -1.0}}
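A quick standalone check of the log encoding these tables use (base 2, as the values above imply):

```python
import math

# -1.0 encodes prob 0.5, 0.0 encodes prob 1.0, -inf encodes prob 0
assert math.log2(0.5) == -1.0
assert math.log2(1.0) == 0.0
assert 2 ** -1.0 == 0.5           # converting back to a probability
assert 2 ** float('-inf') == 0.0  # impossible events recover prob 0
```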

B2

Value:
{'aa': {('aa', 'I'): 0.0,
        ('aa', 'on'): 0.0,
        ('ax', 'the'): -inf,
        ('ay', 'I'): -inf,
        ('d', 'need'): -inf,
        ('dh', 'the'): -inf,
        ('end', 'I'): -inf,
        ('end', 'need'): -inf,
...

A2

Value:
{('aa', 'I'): {('aa', 'I'): -inf,
               ('aa', 'on'): -14.3769797176,
               ('ax', 'the'): -inf,
               ('ay', 'I'): -inf,
               ('d', 'need'): -inf,
               ('dh', 'the'): -12.5600097067,
               ('end', 'I'): 0.0,
               ('end', 'need'): -inf,
...

states2

The states for HMM 2, which is the HMM implicit in J&M, Figure 7.10.
Value:
[('d', 'need'),
 ('iy', 'need'),
 ('n', 'need'),
 ('ax', 'the'),
 ('iy', 'the'),
 ('n', 'the'),
 ('dh', 'the'),
 ('n', 'on'),
...

A3

Value:
{'COLD': {'COLD': -0.736965594166,
          'HOT': -1.32192809489,
          'start0': -inf},
 'HOT': {'COLD': -1.73696559417,
         'HOT': -0.51457317283,
         'start0': -inf},
 'start0': {'COLD': -2.32192809489,
            'HOT': -0.321928094887,
...

B3

Value:
{'1': {'COLD': -1.0, 'HOT': -2.32192809489, 'start0': -inf},
 '2': {'COLD': -1.32192809489, 'HOT': -1.32192809489, 'start0': -inf},
 '3': {'COLD': -3.32192809489, 'HOT': -1.32192809489, 'start0': -inf}}
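Converting the B3 log probs back to probabilities (a standalone check, with values copied from above) shows each state's emission distribution sums to one:

```python
import math

# B3 emission log probs (base 2) for the ice cream/weather machine
B3 = {'1': {'COLD': -1.0,           'HOT': -2.32192809489},
      '2': {'COLD': -1.32192809489, 'HOT': -1.32192809489},
      '3': {'COLD': -3.32192809489, 'HOT': -1.32192809489}}

for state in ('COLD', 'HOT'):
    # 2**logp converts back to a probability; each column is a distribution
    total = sum(2 ** B3[o][state] for o in B3)
    assert math.isclose(total, 1.0, rel_tol=1e-9)
```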

phonedic

Value:
{('aa', 'I'): -inf,
 ('aa', 'on'): -inf,
 ('ax', 'the'): 0.0,
 ('ay', 'I'): -inf,
 ('d', 'need'): -inf,
 ('dh', 'the'): -inf,
 ('end', 'I'): -inf,
 ('end', 'need'): -inf,
...