Module viterbi_search

Code implementing a Python version of the Viterbi algorithm for HMMs,
which computes the highest probability path through the HMM, given a
sequence of observations.
Path probabilities through the HMM are computed using TWO probability
models, implemented as dictionaries. The first is a transition
probability model (A) and the second is the emission probability
model (B). The transition prob:

    A[s1][s2]

returns the probability of transitioning to s2 given that you are at
state s1 (Pr(s2|s1)). The emission prob:

    B[o][s]

is the probability of observing a string o given you are at state s
(Pr(o|s)). Note A and B are both dictionaries of dictionaries, but A
is a dictionary with state keys, and B is a dictionary with
observation keys. Symbols not in the HMM alphabet will produce
errors.
To run the decoder:

    >>> decode_line('313', A3, B3, states3, start3)

This runs with Machine 3, the ice cream/weather machine of Chapter 6,
Jurafsky & Martin, with the input diagrammed in Fig 6.10. The first
argument is the input and the next four define the HMM:

    -- A3: the state transition probs
    -- B3: the observation probs
    -- states3: the set of states of the machine
    -- start3: the start state
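The two tables can be probed directly. A minimal sketch using
abbreviated Machine 3 values listed later in this file (probs are log
base 2; entries elided in the listing are simply left out here, not
guessed):

```python
from math import inf

# Abbreviated Machine 3 tables (log base 2 probs), copied from this
# module's variable listing.
A3 = {'COLD':   {'COLD': -0.736965594166, 'HOT': -1.32192809489,  'start0': -inf},
      'HOT':    {'COLD': -1.73696559417,  'HOT': -0.51457317283,  'start0': -inf},
      'start0': {'COLD': -2.32192809489,  'HOT': -0.321928094887}}
B3 = {'1': {'COLD': -1.0,           'HOT': -2.32192809489, 'start0': -inf},
      '3': {'COLD': -3.32192809489, 'HOT': -1.32192809489, 'start0': -inf}}

# A3[s1][s2] is Pr(s2|s1); B3[o][s] is Pr(o|s); exponentiate to recover probs.
print(2 ** A3['start0']['HOT'])  # Pr(HOT | start0) ~ 0.8
print(2 ** B3['3']['HOT'])       # Pr('3' | HOT)    ~ 0.4
```

Looking up a symbol missing from B (or a state missing from A) raises
KeyError, which is the "symbols not in the HMM alphabet will produce
errors" behavior noted above.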
An optional sixth Boolean argument, predict, toggles HMM emission prob
prediction. With predicting turned on (predict=True), emission probs
for the state one is coming FROM are used, which means one had to
transition to that state before seeing the emission that mattered
there. So one had to "predict" right (i.e., guess). Per the function
signatures, the default value for this argument is False. The only
machine defined in this file that runs with predicting turned on is
Machine 1:

    >>> decode_line('httt', A1, B1, states1, start1, True)

With predicting turned off (the default), emission probs for the state
one is going to are used, which means one sees the emission probs and
the emission at the same time, and no predicting is necessary to
choose the state with the highest emission prob for the current
emission.
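The difference can be spelled out for a single transition. The helper
below is an illustrative sketch of the two scoring rules just
described, not code from this module; it uses the Machine 1 values
listed later in this file:

```python
from math import inf

def step_score(prev_score, A, B, o, s_from, s_to, predict):
    """Score one Viterbi transition s_from -> s_to on observation o.

    With predict=True the emission prob of the state one is coming
    FROM is used; with predict=False, that of the state one is going
    TO. Probs are logs, so terms are added.
    """
    s_emit = s_from if predict else s_to
    return prev_score + A[s_from][s_to] + B[o][s_emit]

# Machine 1 (head-tails) values: transitioning s0 -> H on observing 'h'.
A1 = {'s0': {'H': -1.0, 'T': -1.0, 's0': -inf}}
B1 = {'h': {'H': 0.0, 'T': -inf, 's0': -1.0}}
print(step_score(0.0, A1, B1, 'h', 's0', 'H', predict=False))  # -1.0 + 0.0 = -1.0
print(step_score(0.0, A1, B1, 'h', 's0', 'H', predict=True))   # -1.0 + -1.0 = -2.0
```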
This file includes three example HMMs. The first is a simple coin
flipping HMM, which, with predicting turned on, tries to
"guess" the result of the next coin flip in a sequence of coin
flips by transitioning at each observation to either a head-predicting
state ('H') or a tail-predicting state ('T'). The observation alphabet
consists of the lower case letters 'h' and 't'.
Machine 2 is the machine used for the word decoder in the first
edition of Jurafsky and Martin, Ch. 7, Fig. 9, which gives path
probabilities for decoding a sequence of phones as a sequence of
words. A state is a pair of a phone and a word. The emission probs in
B2 are always either 1.0 or 0.0: 1.0 for any input phone that matches
the phone that state represents, state[0], and 0.0 for all others. For
example, state ('dh','the') has probability 1.0 for "dh" and
probability 0 for all other phones. The transition probs, A2, are of
the form:

    A2[phone1][phone2]

and represent the probability of phone2 given phone1. A2 includes
cross-word transitions. These are computed by combining the
probability models in viterbi.pron_probs and viterbi.bigram_probs.
Functions

decode(Stringseq, A, B, states, s0, predict=False)
    Takes a list of strings, Stringseq, each element a word, according
    to the HMM defined by A, B, states, and s0.

make_transition_dict(bigram_prob_dict, pron_prob_dict, states)
    Turn dictionaries representing bigram probs and pronunciation
    models into transition_dict, an HMM transition function.
    Returns: dictionary of dictionaries.

switch_transition_function_to_log_probs(A, states)

switch_emission_function_to_log_probs(B, states)

print_info(s_from, s_to, A, emission_probs, trellis)

decode_line(line, A, B, states, start, predict=False)
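The two switch_*_to_log_probs helpers are listed without descriptions.
Judging from the table values in this file (e.g. -1.0 for prob 0.5,
-inf for prob 0), probs are stored as log base 2. A sketch of that
conversion under that assumption (this is not the module's code):

```python
from math import log2, inf

def to_log_probs(table):
    """Map every prob in a dict-of-dicts table to log base 2 (0 -> -inf)."""
    return {k1: {k2: (log2(p) if p > 0 else -inf) for k2, p in row.items()}
            for k1, row in table.items()}

# Plain-prob version of Machine 1's start-state transition row.
A = {'s0': {'H': 0.5, 'T': 0.5, 's0': 0.0}}
print(to_log_probs(A))  # {'s0': {'H': -1.0, 'T': -1.0, 's0': -inf}}
```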
Variables

neg_infinity = -inf

back = []

trellis = []

start1 = 's0'
    Start state for HMM 1, the head-tails machine.

states1 = ['s0', 'H', 'T']
    States list for HMM 1, the head-tails machine.

A1 = {'H': {'H': -1.0, 'T': -1.0, 's0': -inf}, 'T': {'H': -1.0...
    Transition probs for HMM 1, the head-tails machine.

B1 = {'h': {'H': 0.0, 'T': -inf, 's0': -1.0}, 't': {'H': -inf, ...
    Observation probs for HMM 1, the head-tails machine.

start2 = ('start', 'start')
    The starting state for HMM 2, which is the HMM implicit in J&M,
    Figure 7.10.

phones = set(['aa', 'ax', 'ay', 'd', 'dh', 'iy', 'n'])

B2 = {'aa': {('aa', 'I'): 0.0, ('aa', 'on'): 0.0, ('ax', 'the'...

A2 = {('aa', 'I'): {('aa', 'I'): -inf, ('aa', 'on'): -14.37697...

states2 = [('d', 'need'), ('iy', 'need'), ('n', 'need'), ('ax'...
    The states for HMM 2, which is the HMM implicit in J&M, Figure 7.10.

start3 = 'start0'

states3 = ['start0', 'HOT', 'COLD']

A3 = {'COLD': {'COLD': -0.736965594166, 'HOT': -1.32192809489, ...

B3 = {'1': {'COLD': -1.0, 'HOT': -2.32192809489, 'start0': -in...

Stringseq = ['aa', 'n', 'iy', 'dh', 'ax']

phone = 'ax'

phonedic = {('aa', 'I'): -inf, ('aa', 'on'): -inf, ('ax', 'the...

state = ('start', 'start')
decode(Stringseq, A, B, states, s0, predict=False)

Takes a list of strings, Stringseq, each element a word, according to
the HMM defined by A, B, states, and s0. Returns path, a list of
states of length len(Stringseq)+1, representing the highest prob HMM
path that accepts Stringseq, as well as the two tables built in the
Viterbi computation. Note: implements the pseudocode of J&M, Figure
7.09.

trellis is the table the Viterbi algorithm fills in. Therefore,
trellis will be an array of length T+1 (length of input + t=0). Each
element will be a dictionary with states as keys: trellis[t][s] is the
Viterbi score of state s at time t [a log prob]. back (for
"backtrace") will be an array with the same dimensions: back[t][s] is
the best state to have come FROM to get to s at t.

Parameters:
    Stringseq - a list of strings representing the sequence of input
        observations
    A - Transition prob table of HMM
    B - Observation prob table of HMM
    states - A list of states for HMM
    s0 - Start state of HMM

Returns: a 4-tuple of trellis (the Viterbi table), back (the backtrace
table), the best probability path through the HMM (list), and
nice_path_string (a pretty string representation of the best path,
suitable for printing).
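The trellis/back recurrence described above can be sketched
independently of this module's decode. The following is an
illustration of the technique (non-predicting variant, log probs), not
the module's implementation; it is exercised with the Machine 1 values
listed in this file:

```python
from math import inf

def viterbi(obs, A, B, states, s0):
    """Fill trellis[t][s] and back[t][s], then read off the best path."""
    trellis = [{s: (0.0 if s == s0 else -inf) for s in states}]
    back = [{}]
    for o in obs:
        prev = trellis[-1]
        col, bcol = {}, {}
        for s in states:
            # Best predecessor: max over previous states of score + transition.
            best = max(states, key=lambda p: prev[p] + A[p][s])
            col[s] = prev[best] + A[best][s] + B[o][s]
            bcol[s] = best
        trellis.append(col)
        back.append(bcol)
    # Backtrace from the best final state.
    path = [max(states, key=lambda s: trellis[-1][s])]
    for t in range(len(obs), 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return trellis, back, path

# Machine 1 (head-tails) values from this file:
A1 = {'H':  {'H': -1.0, 'T': -1.0, 's0': -inf},
      'T':  {'H': -1.0, 'T': -1.0, 's0': -inf},
      's0': {'H': -1.0, 'T': -1.0, 's0': -inf}}
B1 = {'h': {'H': 0.0, 'T': -inf, 's0': -1.0},
      't': {'H': -inf, 'T': 0.0, 's0': -1.0}}
trellis, back, path = viterbi('ht', A1, B1, ['s0', 'H', 'T'], 's0')
print(path)  # ['s0', 'H', 'T']
```

As described above, the returned path has length len(obs)+1, since the
start state occupies t=0.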
make_transition_dict(bigram_prob_dict, pron_prob_dict, states)

Turn dictionaries representing bigram probs and pronunciation models
into transition_dict, an HMM transition function such that:

    transition_dict[(phone_from,word_from)][(phone_to,word_to)] = prob

This dictionary represents the kind of global phone-to-phone
transition probs shown in J&M, Figure 7.08, including cross-word
transitions.

The transition probability across words is estimated as:

    Pr(end | word_from, phone_from) *
    Pr(phone_to | start, word_to) *
    Pr(word_to | word_from)

or equivalently:

    pron_prob_dict[word_from][phone_from][end] *
    pron_prob_dict[word_to][start][phone_to] *
    bigram_prob_dict[word_from][word_to]

Thus the resulting HMM transition table has no start word and end word
states.

Parameters:
    bigram_prob_dict - A dictionary of dictionaries representing
        bigram probs.
    pron_prob_dict - A dictionary of dictionaries with word keys,
        representing pronunciation probs.
    states - the list of states for the current HMM.

Returns: dictionary of dictionaries
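The across-word estimate above can be spelled out with toy values. The
dictionaries and numbers below are hypothetical; only their shapes
follow the formula:

```python
# Hypothetical toy models; shapes follow the formula above.
pron_prob_dict = {
    'need': {'d':     {'end': 1.0},           # Pr(end | need, d)
             'start': {'n': 1.0}},            # Pr(n | start, need)
    'the':  {'start': {'dh': 0.9, 'd': 0.1},
             'ax':    {'end': 0.8}},
}
bigram_prob_dict = {'the': {'need': 0.01}}    # Pr(need | the)

def cross_word_prob(phone_from, word_from, phone_to, word_to):
    """Pr(end|word_from,phone_from) * Pr(phone_to|start,word_to) * Pr(word_to|word_from)."""
    return (pron_prob_dict[word_from][phone_from]['end']
            * pron_prob_dict[word_to]['start'][phone_to]
            * bigram_prob_dict[word_from][word_to])

print(cross_word_prob('ax', 'the', 'n', 'need'))  # 0.8 * 1.0 * 0.01 ~ 0.008
```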
print_viterbi_table(viterbi, stringseq, states, start, precision=6)

stringseq may be a list (indicating a phone list), in which case '#'
is added. precision specifies the precision for floats. Log probs are
converted back to probs, meaning high precisions may be needed to get
non-zero output. A precision level of 9 is required to reproduce the
table in J&M, Fig. 7.10.
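To see why high precision matters once log probs are converted back to
probs: a hypothetical trellis entry with log base 2 prob -19.8 is
about 1.1e-06, which vanishes at low precision:

```python
logp = -19.8  # a hypothetical log base 2 prob from deep in a trellis
print(f"{2 ** logp:.6f}")  # '0.000001' -- survives at precision 6
print(f"{2 ** logp:.3f}")  # '0.000'    -- rounds to zero at precision 3
```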
A1

Transition probs for HMM 1, the head-tails machine. Transitions to the
start state always have prob 0. Transitions to either of the other
states always have prob 0.5.

Value:
    {'H':  {'H': -1.0, 'T': -1.0, 's0': -inf},
     'T':  {'H': -1.0, 'T': -1.0, 's0': -inf},
     's0': {'H': -1.0, 'T': -1.0, 's0': -inf}}

B1

Observation probs for HMM 1, the head-tails machine. The start state
has unbiased observation probs. The other 2 states have observation
probs 1.0 for heads and tails respectively.

Value:
    {'h': {'H': 0.0, 'T': -inf, 's0': -1.0},
     't': {'H': -inf, 'T': 0.0, 's0': -1.0}}
B2

Value:
    {'aa': {('aa', 'I'): 0.0,
            ('aa', 'on'): 0.0,
            ('ax', 'the'): -inf,
            ('ay', 'I'): -inf,
            ('d', 'need'): -inf,
            ('dh', 'the'): -inf,
            ('end', 'I'): -inf,
            ('end', 'need'): -inf,
            ...

A2

Value:
    {('aa', 'I'): {('aa', 'I'): -inf,
                   ('aa', 'on'): -14.3769797176,
                   ('ax', 'the'): -inf,
                   ('ay', 'I'): -inf,
                   ('d', 'need'): -inf,
                   ('dh', 'the'): -12.5600097067,
                   ('end', 'I'): 0.0,
                   ('end', 'need'): -inf,
                   ...
states2

The states for HMM 2, which is the HMM implicit in J&M, Figure 7.10.

Value:
    [('d', 'need'),
     ('iy', 'need'),
     ('n', 'need'),
     ('ax', 'the'),
     ('iy', 'the'),
     ('n', 'the'),
     ('dh', 'the'),
     ('n', 'on'),
     ...
A3

Value:
    {'COLD': {'COLD': -0.736965594166,
              'HOT': -1.32192809489,
              'start0': -inf},
     'HOT': {'COLD': -1.73696559417,
             'HOT': -0.51457317283,
             'start0': -inf},
     'start0': {'COLD': -2.32192809489,
                'HOT': -0.321928094887,
                ...

B3

Value:
    {'1': {'COLD': -1.0, 'HOT': -2.32192809489, 'start0': -inf},
     '2': {'COLD': -1.32192809489, 'HOT': -1.32192809489, 'start0': -inf},
     '3': {'COLD': -3.32192809489, 'HOT': -1.32192809489, 'start0': -inf}}
phonedic

Value:
    {('aa', 'I'): -inf,
     ('aa', 'on'): -inf,
     ('ax', 'the'): 0.0,
     ('ay', 'I'): -inf,
     ('d', 'need'): -inf,
     ('dh', 'the'): -inf,
     ('end', 'I'): -inf,
     ('end', 'need'): -inf,
     ...