Linguistics 581

Morphology and Finite-State Transducers

Morphological analysis: Finding morphological constituents

  1. angrily = angry + ly
  2. proven = prove + en
  3. * haven = have + en
  4. ducks = duck + pl
  5. ducks = duck + 3rdsg
  6. ground = ground (N, sg)
  7. ground = grind + pst
  8. undeniability = un + deny + able + ity

Part of speech: a class of words that share many properties (more later). Examples: nouns, verbs.

Inflection vs. derivation

  1. duck vs. ducks, cat vs. cats, ox vs. oxen: All nouns, or almost all, have plural forms (possible exceptions: equipment, apparatus, furniture, infantry)
  2. walk vs. walking, talk vs. talking, smoke vs. smoking: All verbs have -ing forms (gerund, present participle)
  3. Spanish verb amar ("to love")
Inflection: a morphological alternation common to all members of a part of speech:
    walking = walk + ing
    Form = Stem + Suffix

Sound system versus spelling

  1. English consonants
  2. English vowels

Allomorphy: plural s

  1. fox: foxes, /f aa k s/ + /ax z/
  2. dog: dogs, /d ao g/ + /z/
  3. duck: ducks, /d ah k/ + /s/
  4. lily: lilies, /l ih l iy/ + /z/
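The choice among the three allomorphs depends on the final sound of the stem. A minimal sketch, assuming ARPAbet-style symbols like those in the list above; the sets SIBILANTS and VOICELESS and the function names are illustrative assumptions, not part of the course materials:

```python
# Stem-final sounds that trigger each allomorph (ARPAbet-style symbols).
SIBILANTS = {"s", "z", "sh", "zh", "ch", "jh"}
VOICELESS = {"p", "t", "k", "f", "th"}  # voiceless non-sibilants


def plural_allomorph(stem_phones):
    """Pick the plural allomorph from the stem's final phoneme."""
    last = stem_phones[-1]
    if last in SIBILANTS:
        return ["ax", "z"]   # fox -> foxes
    if last in VOICELESS:
        return ["s"]         # cup -> cups
    return ["z"]             # dog -> dogs (voiced sounds, including vowels)


def pluralize(stem):
    """Take a space-separated phoneme string and append the allomorph."""
    phones = stem.split()
    return " ".join(phones + plural_allomorph(phones))


print(pluralize("f aa k s"))  # fox
print(pluralize("d ao g"))    # dog
print(pluralize("k ah p"))    # cup
```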

In this chapter we deal with spelling. This means we are concerned with spelling rules instead of phonological rules for allomorphy. We operate on orthographic representations, not phonetic representations.

Spelling rules: plural
  Orthographic singular   Phonology             Orthographic plural
  teepee                  /t iy p iy/ + /z/     teepees
  lily                    /l ih l iy/ + /z/     lilies
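The contrast in the table (same sound, different spelling) can be sketched as an orthographic rule. This is a deliberately partial sketch that covers only the two patterns shown; the function name spell_plural is an assumption made for illustration:

```python
VOWEL_LETTERS = set("aeiou")


def spell_plural(noun):
    """Spell the regular plural for the two patterns in the table."""
    # Consonant letter + y: replace y with ie before s (belly -> bellies).
    if len(noun) > 1 and noun.endswith("y") and noun[-2] not in VOWEL_LETTERS:
        return noun[:-1] + "ies"
    # Otherwise just add s (teepee -> teepees).
    return noun + "s"


print(spell_plural("teepee"))
print(spell_plural("belly"))
```

Both plurals end in the sound /z/, but only the second needs a spelling change.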

Productive ending: [s] (the morpheme s, with its phonologically predictable allomorphs) versus irregular forms. Notice that many of the irregular forms are not formed by affixation.
Regular
  1. ducks = duck + PL
  2. lilies = lily + PL
  3. foxes = fox + PL
  4. hogs = hog + PL
  5. houses = house + PL
  6. cups = cup + PL
  7. bellies = belly + PL

Irregular
  1. oxen = ox + PL
  2. children = child + PL
  3. deer = deer + PL
  4. mice = mouse + PL
  5. geese = goose + PL
  6. men = man + PL
  7. cacti = cactus + PL

For example, we say that all plural forms share the morphological feature +PL. The plural forms deer, men, mice, and geese, which are not formed by affixation, share the feature +PL with forms like foxes and ducks, which are. The singular forms deer, man, mouse, goose, fox, and duck all share the morphological feature +SG.

We assume that the category (part of speech) of the stem is also a morphological feature.
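One way to picture this is a lookup that returns the same features no matter how the plural is formed. A minimal sketch, assuming a toy lexicon built from the forms listed above; the names IRREGULAR_PLURALS, REGULAR_STEMS, and analyze are hypothetical:

```python
# Irregular plurals are listed; they are not derived by adding s.
IRREGULAR_PLURALS = {
    "oxen": "ox", "children": "child", "deer": "deer", "mice": "mouse",
    "geese": "goose", "men": "man", "cacti": "cactus",
}
# Regular stems that take a plain s with no spelling change.
REGULAR_STEMS = {"duck", "hog", "cup", "house"}
SINGULARS = REGULAR_STEMS | set(IRREGULAR_PLURALS.values())


def analyze(form):
    """Return every (stem, category, number) analysis of a noun form."""
    analyses = []
    if form in SINGULARS:
        analyses.append((form, "+N", "+SG"))
    if form in IRREGULAR_PLURALS:
        analyses.append((IRREGULAR_PLURALS[form], "+N", "+PL"))
    if form.endswith("s") and form[:-1] in REGULAR_STEMS:
        analyses.append((form[:-1], "+N", "+PL"))
    return analyses


print(analyze("mice"))   # irregular plural
print(analyze("ducks"))  # regular plural
print(analyze("deer"))   # singular and plural look the same
```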

Parsing versus Recognition

Morphological recognition accepts or rejects forms: it decides only whether a form is a well-formed word.

Morphological parsing produces a morphological analysis (stem first, followed by the category of the stem, followed by all affixes).
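The difference between the two tasks shows up in what they return. A minimal sketch, assuming a hand-built table of analyses in place of a real parser; the table ANALYSES, the tag spellings, and the non-word "foxs" are illustrative assumptions:

```python
# Toy table of analyses, written stem + category + affix features.
ANALYSES = {
    "ducks": ["duck+N+PL", "duck+V+3SG"],
    "ground": ["ground+N+SG", "grind+V+PST"],
    "proven": ["prove+V+EN"],
}


def parse(form):
    """Parsing: return every analysis of the form (possibly none)."""
    return ANALYSES.get(form, [])


def recognize(form):
    """Recognition: answer only yes or no."""
    return bool(parse(form))


print(recognize("ducks"), parse("ducks"))
print(recognize("foxs"), parse("foxs"))
```

An ambiguous form such as ducks or ground is accepted once by the recognizer but receives two analyses from the parser.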

Morphotactic recognition

Morphotactics is the syntax of morphemes: what order they come in, what kind of units they make.

A basic morphotactic fact about affixes is where they attach with respect to the stem.

Plural -s is a suffix, un- is a prefix. There are also morphotactic facts about what kinds of things affixes attach to: The affix -able attaches to a verb and produces an adjective. The affix -ity attaches to an adjective and produces a noun.
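The two suffix facts above can be checked mechanically by tracking the category after each attachment. A minimal sketch, assuming a toy stem lexicon; STEM_CATEGORY, SUFFIX_RULES, and derive_category are hypothetical names, and prefixes such as un- are left out:

```python
# Toy lexicon: stem -> category.
STEM_CATEGORY = {"deny": "V", "prove": "V"}
# Suffix -> (category it attaches to, category it produces).
SUFFIX_RULES = {"able": ("V", "A"), "ity": ("A", "N")}


def derive_category(stem, suffixes):
    """Return the category of stem + suffixes, or None if ill-formed."""
    cat = STEM_CATEGORY.get(stem)
    if cat is None:
        return None
    for suffix in suffixes:
        rule = SUFFIX_RULES.get(suffix)
        if rule is None or rule[0] != cat:
            return None  # the suffix cannot attach to this category
        cat = rule[1]
    return cat


print(derive_category("deny", ["able", "ity"]))  # deny + able + ity
print(derive_category("deny", ["ity", "able"]))  # wrong order
```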

Using FSAs to do recognition
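A minimal sketch of an FSA recognizer for noun number, assuming the input has already been split into morphemes. The state names, the morpheme classes, and the small word lists are assumptions made for illustration:

```python
REG_NOUN = {"fox", "duck", "hog", "cup"}
IRREG_SG_NOUN = {"goose", "mouse", "ox"}
IRREG_PL_NOUN = {"geese", "mice", "oxen"}

# (state, morpheme class) -> next state
TRANSITIONS = {
    ("q0", "reg-noun"): "q1",
    ("q0", "irreg-sg-noun"): "q2",
    ("q0", "irreg-pl-noun"): "q2",
    ("q1", "plural-s"): "q2",
}
ACCEPTING = {"q1", "q2"}


def morpheme_class(morpheme):
    """Map a morpheme to the arc label it can travel on."""
    if morpheme in REG_NOUN:
        return "reg-noun"
    if morpheme in IRREG_SG_NOUN:
        return "irreg-sg-noun"
    if morpheme in IRREG_PL_NOUN:
        return "irreg-pl-noun"
    if morpheme == "s":
        return "plural-s"
    return None


def accepts(morphemes):
    """Run the automaton; accept if it ends in an accepting state."""
    state = "q0"
    for morpheme in morphemes:
        state = TRANSITIONS.get((state, morpheme_class(morpheme)))
        if state is None:
            return False  # no arc for this morpheme from this state
    return state in ACCEPTING


print(accepts(["duck", "s"]))
print(accepts(["goose", "s"]))
```

Because only regular nouns lead to the state that has a plural-s arc, forms such as goose + s are rejected without listing them anywhere.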

Finite-state transducers

We introduce finite-state transducers, an augmentation of FSAs in which there are two tapes: the machine relates a string on one tape to a string on the other.
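A minimal sketch of a transducer that reads a lexical tape (letters plus feature symbols) and writes an intermediate tape. The boundary symbols ^ (morpheme boundary) and # (word boundary), the state names, and the function transduce are assumptions made for illustration:

```python
# (state, input symbol) -> (next state, output string)
ARCS = {
    ("stem", "+N"): ("number", ""),      # category symbol writes nothing
    ("number", "+SG"): ("final", "#"),
    ("number", "+PL"): ("final", "^s#"),
}
FINAL_STATES = {"final"}


def transduce(lexical):
    """Map a list of lexical-tape symbols to an intermediate string."""
    state, output = "stem", []
    for symbol in lexical:
        if state == "stem" and len(symbol) == 1 and symbol.isalpha():
            output.append(symbol)        # identity arc: letter to letter
            continue
        arc = ARCS.get((state, symbol))
        if arc is None:
            return None                  # no arc: the input is rejected
        state, piece = arc
        output.append(piece)
    return "".join(output) if state in FINAL_STATES else None


print(transduce(list("fox") + ["+N", "+PL"]))
print(transduce(list("fox") + ["+N", "+SG"]))
```

Each arc pairs an input symbol with an output string, which is the only thing that distinguishes this machine from a plain recognizer.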

Relating the intermediate (morphotactic) representation to the surface form (spelling rules).

The problem

  1. beg + ing = begging: Consonant doubling
  2. make + ing = making: e-deletion
  3. watch + s = watches: e-insertion
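The problem is that plain concatenation of stem and suffix gives the wrong spelling in each of these cases. A small sketch, assuming the stems beg, make, and watch; the list EXAMPLES is only an illustration:

```python
# (stem, suffix, correct surface spelling)
EXAMPLES = [
    ("beg", "ing", "begging"),   # consonant doubling
    ("make", "ing", "making"),   # e-deletion
    ("watch", "s", "watches"),   # e-insertion
]

for stem, suffix, expected in EXAMPLES:
    naive = stem + suffix        # what concatenation alone produces
    print(f"{stem} + {suffix}: naive {naive!r}, correct {expected!r}")
```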

E-insertion: insert an e after a morpheme ending in x, s, or z and before a word-final s. (Covering watches requires extending the rule to stems ending in ch or sh.)

Chomsky/Halle style "rewrite" rules:
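The E-insertion rule stated in words above can be written in this notation as follows. This is a reconstruction from the prose statement, assuming ^ marks a morpheme boundary and # marks the end of the word:

```latex
\[
\varepsilon \;\rightarrow\; e \;/\; \{x,\, s,\, z\}\;\texttt{\^{}}\;\underline{\hspace{1em}}\; s\,\#
\]
```

Read left to right: the empty string is rewritten as e in the context where x, s, or z plus a morpheme boundary precedes and a word-final s follows.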

This can be modeled with an FST.
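As a stand-in for the transducer, the same mapping from intermediate to surface form can be sketched with a regular-expression rewrite. This is not the FST itself. It assumes ^ marks a morpheme boundary and # the end of the word, and it follows the rule only for x, s, and z; the function name to_surface is hypothetical:

```python
import re


def to_surface(intermediate):
    """Apply E-insertion, then erase the boundary symbols."""
    # x^, s^ or z^ directly before word-final s: the boundary becomes e.
    with_e = re.sub(r"([xsz])\^(?=s#)", r"\1e", intermediate)
    # Any boundary symbols that remain have no surface spelling.
    return with_e.replace("^", "").replace("#", "")


print(to_surface("fox^s#"))
print(to_surface("duck^s#"))
```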

Interpreting the rule:

Using the E-insertion rule when parsing:
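When parsing, the rule runs in the other direction: a surface form may correspond to several intermediate forms, and the lexicon decides among them. A minimal sketch, assuming ^ marks a morpheme boundary, # marks the end of the word, and a toy lexicon; all names here are hypothetical:

```python
LEXICON = {"fox", "duck"}  # toy list of noun stems


def candidate_intermediates(surface):
    """List the intermediate forms that could spell out as `surface`."""
    candidates = [surface + "#"]                      # no suffix at all
    if surface.endswith("s"):
        candidates.append(surface[:-1] + "^s#")       # plain s suffix
        if (len(surface) >= 3 and surface.endswith("es")
                and surface[-3] in "xsz"):
            candidates.append(surface[:-2] + "^s#")   # undo E-insertion
    return candidates


def parse_plural(surface):
    """Keep only the plural candidates whose stem is in the lexicon."""
    parses = []
    for candidate in candidate_intermediates(surface):
        if candidate.endswith("^s#") and candidate[:-3] in LEXICON:
            parses.append(candidate[:-3] + "+N+PL")
    return parses


print(candidate_intermediates("foxes"))
print(parse_plural("foxes"))
```

For foxes, the candidate with stem foxe is generated but discarded, because foxe is not in the lexicon.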