Package viterbi :: Module read_probs

Module read_probs


Classes
  DictDispatcher
Functions
 
read_probs_file(filename, debug=False)
Read either (a) a file of 2-parameter probabilities or (b) a pronunciation dictionary file.
Variables
  word_re = re.compile(r'(\w+):$')
  bigram_re = re.compile(r'(#|\w+)\s+(\w+)\s+(\d?\.\d+)$')
  blank_re = re.compile(r'\w*$')
  start_re = re.compile(r'(#)')
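
The following is a minimal sketch of what the patterns above accept. The sample lines are invented for illustration, and it is an assumption (not confirmed by this page) that lines are stripped of surrounding whitespace and tested with match() rather than search().

```python
import re

# Patterns copied from the Variables list above.
word_re = re.compile(r'(\w+):$')
bigram_re = re.compile(r'(#|\w+)\s+(\w+)\s+(\d?\.\d+)$')
start_re = re.compile(r'(#)')

# Hypothetical sample lines, already stripped of indentation.
header = word_re.match("cat:")
entry = bigram_re.match("k ae 0.95")
initial = bigram_re.match("# k 1.0")

print(header.group(1))    # cat
print(entry.groups())     # ('k', 'ae', '0.95')
print(initial.groups())   # ('#', 'k', '1.0')

# The first field of an entry may be "#", which start_re also matches.
print(bool(start_re.match(initial.group(1))))   # True

# bigram_re requires a decimal point, so a bare integer is rejected.
no_decimal = bigram_re.match("k ae 1")
print(no_decimal is None)   # True
```

Because the probability field is written as \d?\.\d+, values such as "1.0" and ".5" are accepted, while "1" is not.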
Function Details

read_probs_file(filename, debug=False)


Read either, in case (a), a file of 2-parameter probabilities with lines of the form:

  p1 p2 prob

Or, in case (b), a file containing a pronunciation dictionary with lines of the form:

  word:
    phon1 phon2 prob
        ...
    phoni phonj prob

In case (a), return a 2-parameter probability dictionary:

  D[w1][w2] = probability of w2 given w1

In case (b), return a 3-parameter probability dictionary:

  D[w][phon1][phon2] = probability of phon2 given phon1 when
                       pronouncing word w
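
As a rough illustration of the two return shapes, the sketch below builds both kinds of dictionary from in-memory lines. It is not the module's implementation: parse_probs_lines is a hypothetical helper, the sample data is invented, and the choice to treat a "word:" header as the switch into case (b) is an assumption.

```python
import re

word_re = re.compile(r'(\w+):$')
bigram_re = re.compile(r'(#|\w+)\s+(\w+)\s+(\d?\.\d+)$')

def parse_probs_lines(lines):
    """Hypothetical helper: build a nested probability dict from lines."""
    probs = {}
    current = probs  # case (a): entries go straight into the top level
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        header = word_re.match(line)
        if header:
            # case (b): later entries belong to this word's sub-dictionary
            current = probs.setdefault(header.group(1), {})
            continue
        entry = bigram_re.match(line)
        if entry is None:
            raise ValueError("unrecognized line: %r" % raw)
        first, second, prob = entry.groups()
        current.setdefault(first, {})[second] = float(prob)
    return probs

# Case (a): D[w1][w2]
two_param = parse_probs_lines(["# the 0.25", "the cat 0.5"])
print(two_param["the"]["cat"])        # 0.5

# Case (b): D[w][phon1][phon2]
three_param = parse_probs_lines(["cat:", "  # k 1.0", "  k ae 0.75", "", "dog:", "  # d 1.0"])
print(three_param["cat"]["k"]["ae"])  # 0.75
```

Sharing one loop for both cases keeps the sketch short; the real function may distinguish the formats differently.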

The resulting dictionary is written to a file with the same base name as filename but with a ".py" extension. By default, input files are assumed to have a ".txt" extension.
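
The output naming rule can be sketched as follows. output_path_for is a hypothetical name, and using os.path.splitext is an assumption about how the base name is derived; this page does not show how the module actually computes the path or formats the dictionary it writes.

```python
import os

def output_path_for(filename):
    """Hypothetical helper: swap the input extension for ".py"."""
    base, _ext = os.path.splitext(filename)
    return base + ".py"

print(output_path_for("bigrams.txt"))         # bigrams.py
print(output_path_for("data/pron_dict.txt"))  # data/pron_dict.py
```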