Scope & History of Computational Linguistics

* Practical Applications
* Linguistic Theory

Two Dimensions of variation

* System Functionality
    * Semantic decision/classification systems
    * Knowledge extraction systems
    * Information search systems
    * Command systems
    * Dialog systems
    * Translation systems
* System Modality
    * Text
    * Speech
    * Multi-modal (primarily visual +)

Applications

* Semantic decision/classification systems
    * News article classification (profit/earnings article, terrorist attack article)
    * Sentiment detection (good/bad review; left/right political orientation)
    * Spam detection
    * Complaint classification
* Knowledge extraction systems (ontology building)
    * Named entity extraction (persons, organizations, places, events)
    * Clustering terms into categories
    * Hierarchy construction (finding the term "fruit" for the "orange", "apple", "grape", ... cluster)
    * Relation discovery
      * place/superplace
      * event time, temporal event/event relations
      * other part-whole relations (structural component, chemical constituent) [medical, chemical weapon detection]
* Information search systems
    * Database query
      * Bill Woods's Moonrocks system (70s)
      * HP NL system (80s)
      * NLI Intellect System (Ginsparg, 90s)
      * Microsoft NL DB toolkit
    * Information Retrieval (IR) [find and display relevant information containers]
      * Document retrieval
      * Web search
        * Why this is not just simple document retrieval: Google
      * Text segmentation
    * Information Extraction (IE) [manipulate information into canonical forms]
      * Text summarization
        * Find the most important paragraphs or sentences
        * Try to generate an abstract
        * Interacts with translation: Do this in funny languages!
        * Get help from the world! Marti Hearst: Find papers citing this paper. The text around the citation will often be a summary. Works best in science papers.
      * Document database query
        * Doc DB: Find the query-relevant paragraph or cluster of sentences (litigation support)
      * Online help system query
* Command-oriented systems
    * Online desktop
    * Text editor
    * Unix shell
    * Robot [SRI's Flakey, with speech interface]
    * Simulated armed forces (user = commander)
    * Kitchen appliances
* Dialog systems
    * Intelligent tutors (language, reading, math, etc.)
    * Student modeling
      * Behavior, skill, and preference modeling
      * Error analysis
* Translation systems
    * MT systems: not very good and not very widely used, except perhaps on the Web (the "Translate this page" button; try typing "Poisson frit" or "ryba smazona" into Google and compare the translation with the original)
    * Translator's assistant: a resource for a human translator
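
The summarization item above ("find the most important paragraphs or sentences") can be illustrated with a minimal sketch. It assumes one simple heuristic that these notes do not prescribe: score each sentence by the average corpus frequency of its content words and keep the top scorers. The function names, the tiny stopword list, and the naive sentence splitter are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration only.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "are",
             "it", "this", "that", "for", "on"}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def content_words(sentence):
    """Lowercased word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]

def summarize(text, n=1):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole text (assumed importance signal).
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n])]

if __name__ == "__main__":
    doc = ("Parsing is hard. Statistical parsing uses treebank counts "
           "for parsing. The weather was nice.")
    print(summarize(doc, n=2))
```

A real extraction system would use a proper tokenizer and stronger importance cues (position, cue phrases, or the citation contexts mentioned above); the sketch only shows the select-and-rank shape of the task.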

Theoretical Applications

* Theory-specification tool
    * Zellig Harris's Transformational Grammar and the Linguistic String Project (Transformational and Discourse Analysis Project; Sager, NYU String Grammar project)
    * Chomskyan Transformational Grammars and ATNs (Woods, Moonrocks system)
    * Lexical Functional Grammar (LFG, Ron Kaplan, Xerox PARC system)
    * Generalized Phrase Structure Grammar (GPSG, HP system)
    * Head-Driven Phrase Structure Grammar (HPSG, HP system, Verbmobil English grammar at Stanford)
* Theoretical modeling
    * Processing models
      * Syntax
        * Garden path models (Bever 1970, MacDonald 1993, and Jurafsky 1996)
          * The horse raced past the barn fell.
          * The complex houses married and single students and their families.
          * The student forgot the solution was in the back of the book.
        * Parse preference models (attachment preferences modeled; Pereira 1985 models the "minimal attachment" and "right association" of Kimball 1973)
        * "Shallow" parsing (finite-state) models of human processing (Church 1980, Ramshaw and Marcus 1995, Argamon et al. 1998, Munoz et al. 1999, Chapter 10, our text)
      * Semantics
        * The astronomer married the star.
        * The movie director married a star. (Reder 1983, Uszkoreit 1990)
      * Speech recognition systems as crude phonological processing models
    * Language acquisition models
      * Unsupervised learning systems
      * Segmentation (Michael Brent)
      * Derivational morphology (Watkinson and Manandhar)
      * PDP models
    * Language change models (simulation)
      * Briscoe 2000: biased learning creates linguistic change

History


* Foundational insights (40s and 50s)
    * McCulloch-Pitts neuron: a simplified computational model of a neuron
    * Shannon (1948): automata for language, incorporating probabilistic models
    * Chomsky (1956): formal language theory
    * Sound spectrograph (Koenig et al. 1946): foundation for instrumental phonetics.
    * First machine speech recognizers (Bell Labs, Davis et al. 1952)
    * Shannon-Weaver information theory: Noisy channel model
* Two camps (57-70)
    * AI
      * Beginnings of Artificial Intelligence as a field (John McCarthy, Marvin Minsky, Claude Shannon, Nathaniel Rochester)
      * Newell and Simon's Logic Theorist and General Problem Solver: computable models of reasoning and logic
        * Subjects speak aloud as they solve problems
        * Problem solving modeled with a production-system where the productions (or reasoning steps) correspond to the steps human reasoners took for those kinds of problems.
      * George Miller and Donald Broadbent: importing computational ideas into psychology.
      * Bayesian method and optical character recognition: Using probabilistic methods on recognition problems (see Chapter 5, history section, our text)
    * Zellig Harris project (mentioned above)
    * Brown Corpus
* Four paradigms (70-83)
    * Stochastic: HMMs in speech recognition (Jelinek, Bahl and Mercer at IBM)
    * Logic-based programming (Prolog):
      * Q-systems and metamorphosis grammars (Colmerauer)
      * Prolog, Definite Clause Grammars (Pereira and Warren 1980)
      * Unification grammar (Kay, Bresnan and Kaplan)
    * Natural language understanding (Winograd; serious attention to semantics)
      * Winograd's SHRDLU, blocks world
      * Yale school: Scripts, plans, goals (Schank and Abelson, Wilensky, Lehnert). Story and text understanding.
    * Discourse-modeling (Grosz, Sidner, Perrault, Allen, Cohen). Discourse as plans guided by intentions and beliefs. Communicative acts as steps in plans.
* Empiricism and Finite-state models redux (83-93)
    * Finite-state models
      * phonology and morphology (Kaplan and Kay 1981, Koskenniemi, Karttunen)
      * syntax (Church 1980)
    * Probabilistic models
      * Speech recognition work at IBM
      * Part of speech tagging (history section, chapter 8)
        * utter, direct: Adj, V
        * walk, pilot, sneer, help: N, V
        * hard: Adj, Adv
      * Probabilistic parsing (history section, chapter 12)
* The field comes together (94-99)
    * Spread of probabilistic methods to all kinds of problems
    * Commercial ventures using speech, some NLP
    * The web
    * Some lessened emphasis on theoretical work
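
The noisy channel model listed under the foundational insights, which underlies much of the probabilistic work in the later periods, is commonly written as the decoding rule below. The symbols are notation assumed here rather than taken from these notes: $W$ is a candidate source message (for example a word sequence) and $O$ is the observed, noise-corrupted signal.

$$
\hat{W} \;=\; \operatorname*{arg\,max}_{W} P(W \mid O)
\;=\; \operatorname*{arg\,max}_{W} \frac{P(O \mid W)\,P(W)}{P(O)}
\;=\; \operatorname*{arg\,max}_{W} P(O \mid W)\,P(W)
$$

The second step is Bayes' rule, and the denominator drops out because $P(O)$ is the same for every candidate $W$. $P(O \mid W)$ plays the role of the channel model and $P(W)$ the source (language) model.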

Reasons for the expansion of Computational Linguistics in recent years

* Success of Speech recognition
* Increase in computing power and storage capacity of machines
* Availability of online text and speech resources in unprecedented quantities
* Success of statistical methods
* World-wide web (applications and data)
* Increasing demand for translation
* Globalization of information, standards, software
* Availability of venture capital (no longer so true!)