|
This module contains code for turning different file representations of FSTs into Python FST representations: a list of dictionaries (keyed by upper-language characters) whose values are dictionaries (keyed by lower-language characters). These can then be used by the transducer generation and analysis code in fsa_recognizer.fta_rec.
The top-level function is read_fst_file.make_fst_from_file, which takes a compiler instance (particular to the format of the file being read) and a filename argument, and returns an FST.
Implemented formats include AT&T Graphviz ("dot") files and Xerox xfst ".net" files.
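The nested layout described above can be pictured with a small hand-built example. This is only a sketch: the two-state machine, the use of integer state numbers as transition targets, and the helper name next_states are assumptions made for illustration, not taken from the module.

```python
# Marker used for undefined transitions, as mentioned in the FST assumptions.
UNDEFINED = ("und",)

# Hypothetical FST: a list indexed by state number; each state maps an
# upper-language char to a dictionary keyed by lower-language chars.
fst = [
    {"a": {"b": (1,), "a": UNDEFINED}},   # state 0 (initial)
    {"a": {"b": UNDEFINED, "a": (1,)}},   # state 1
]


def next_states(fst, state, upper, lower):
    """Look up the transition for the pair (upper, lower) in a given state."""
    return fst[state][upper][lower]
```

A lookup such as next_states(fst, 0, "a", "b") walks the three levels in order: state, upper-language char, lower-language char.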
|
|||
FstCompiler Class for compiling FST representations in various file formats into equivalent raw Python dictionaries such that: |
|||
DotFstCompiler Code for compiling an AT&T Graphviz ("dot") representation of an FST into a Python FST rep. |
|||
XfstCompiler Code for compiling a Xerox xfst representation of an FST into a Python FST rep. |
|
|||
|
|||
|
|||
a set (of pairs of chars) |
|
||
|
|||
list of characters |
|
||
|
|
|||
alphabet =
|
|||
graph_type_re = re.compile(r'digraph')
|
|||
blank_line_re = re.compile(r'\s
|
|||
node_dec_re = re.compile(r'\s
|
|||
node_name_re = re.compile(r'\s
|
|||
node_connection_re = re.compile(r'\s
|
|||
label_re = re.compile(r'
|
|||
char_interval_re = re.compile(r'
|
|||
att_val_re = re.compile(r'\s
|
|||
att_val_re2 = re.compile(r'\s
|
|||
flags_re = re.compile(r'Flags:
|
|||
net_dec_re = re.compile(r'Net:')
|
|||
sigma_dec_re = re.compile(r'Sigma:
|
|||
size_dec_re = re.compile(r'Size:\s
|
|||
arity_dec_re = re.compile(r'Arity:\s
|
|||
state_desc_re = re.compile(r'
|
|
Open source file Assumptions for the dot file:
Assumptions for FST:
|
We assume states are string representations of ints and that the state numbered 0 is always the initial state. We create an FST whose state dictionaries are all totally defined for the set of feasible pairs, filling in the gaps in raw_dictionary with transitions to ('und',). We sort the dictionary items according to the individual compiler's cmp function, assuming that the start state will be ranked first. We assume every state function in raw_dic is a dictionary. Two keys are special:
All other keys are upper-language chars whose values are dictionaries whose keys are lower-language chars: raw_dic[state][upper][lower] = the set of states transitioned to in the state C{state} when upper-language char C{upper} is paired with lower-language char C{lower} |
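The gap-filling step described above might look roughly like the following. It is a minimal sketch under stated assumptions: the function name totalize is hypothetical, states are ordered by their integer value rather than by a compiler-specific cmp function, and any special keys are simply ignored because only feasible-pair lookups are performed.

```python
UNDEFINED = ("und",)


def totalize(raw_dic, feasible_pairs):
    """Return a list of state dictionaries defined for every feasible pair.

    raw_dic[state][upper][lower] is assumed to be a set of target states;
    missing entries are filled with a transition to ('und',).
    """
    fst = []
    # Assumption: sort states by integer value so that "0" comes first.
    for state in sorted(raw_dic, key=int):
        transitions = raw_dic[state]
        total = {}
        for upper, lower in feasible_pairs:
            targets = transitions.get(upper, {}).get(lower)
            filled = tuple(sorted(targets)) if targets else UNDEFINED
            total.setdefault(upper, {})[lower] = filled
        fst.append(total)
    return fst
```

With raw_dic = {"1": {}, "0": {"a": {"b": {"1"}}}} and the pairs ("a", "b") and ("c", "c"), state 0 keeps its transition on a:b, and every other pair in both states points to ('und',).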
Assumptions for FST:
Search the raw_dic representation of the FST, adding any new pairs found to the initial mapping.
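A search of this kind can be sketched as below. The function name collect_pairs and its parameters are hypothetical; in particular, the names of the special keys are not given in the text, so they are passed in by the caller rather than hard-coded.

```python
def collect_pairs(raw_dic, initial_pairs=(), special_keys=()):
    """Return the set of (upper, lower) char pairs used in raw_dic.

    initial_pairs seeds the result; special_keys (assumption: supplied by
    the caller) names per-state keys that are not upper-language chars.
    """
    pairs = set(initial_pairs)
    for transitions in raw_dic.values():
        for upper, lowers in transitions.items():
            if upper in special_keys:
                continue  # not a character transition
            for lower in lowers:
                pairs.add((upper, lower))
    return pairs
```

The result is a set of pairs of chars, so a pair that occurs in several states is recorded only once.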
|
Expand a char interval representation like a..z into a list of characters.
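One way to do this expansion is shown below. It is a sketch only: the function name, the ".." separator handling, and the error behaviour are assumptions, and the module's own version may differ.

```python
def expand_char_interval(interval, sep=".."):
    """Expand an interval such as 'a..z' into a list of single characters."""
    first, _, last = interval.partition(sep)
    if len(first) != 1 or len(last) != 1:
        raise ValueError("expected an interval like 'a..z', got %r" % interval)
    # Walk the code points from first to last, inclusive.
    return [chr(code) for code in range(ord(first), ord(last) + 1)]
```

For example, expand_char_interval("a..e") gives ['a', 'b', 'c', 'd', 'e']. In this sketch an interval whose end precedes its start yields an empty list.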
|
|
node_connection_re
|
state_desc_re
|
Generated by Epydoc 3.0.1 on Wed Feb 4 22:00:34 2009 | http://epydoc.sourceforge.net |