Computational Linguistics

Scope & History of Computational Linguistics

Practical Applications
Linguistic Theory

Two Dimensions of variation

System Functionality

Semantic decision/classification systems

Knowledge extraction systems

Information Extraction systems

Command systems

Dialog systems

Translation systems

System Modality

Text

Speech

Multi-modal (primarily visual +)

Applications

Semantic decision/classification systems

News article classification (profit/earnings article, terrorist attack article)
Sentiment detection (good/bad review; left/right political orientation)
Spam detection
Complaint classification
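Spam detection and sentiment detection are both instances of assigning one of a few labels to a text. As a minimal sketch of how such a decision system might work, the block below uses a Naive Bayes classifier with add-one smoothing. This is one common technique, not a method named in these notes; the function names and the four-document training set are hypothetical and purely illustrative.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Collect class counts and per-class word counts from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest add-one-smoothed log probability."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(class_counts):
        # Log prior: fraction of training documents with this label.
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, for illustration only.
TRAINING = [
    ("win cash prize now", "spam"),
    ("cheap prize offer now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lecture notes for monday", "ham"),
]
MODEL = train_naive_bayes(TRAINING)
```

With this toy model, `classify("cash prize offer", MODEL)` selects the spam label and `classify("notes for monday", MODEL)` selects the ham label. The same code would serve for sentiment or complaint classification given differently labeled training text.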

Knowledge extraction systems (Ontology Building)

Named entity extraction (persons, organizations, places, events)
Clustering terms into categories
Hierarchy construction (finding the term "fruit" for the "orange", "apple", "grape", ... cluster)
Relation discovery: place/superplace; event time, temporal event/event relations; other part-whole relations (structural component, chemical constituent) [Medical, chemical weapon detection]

Information Search Systems

Database query

Bill Woods's LUNAR ("Moonrocks") system (70s)
HP NL system (80s)
NLI Intellect System (Ginsparg, 90s)
Microsoft NL DB toolkit

Information Retrieval (IR) [find and display relevant information containers]

Document retrieval
Web search
Why this is not just simple document retrieval: Google
Text segmentation

Information Extraction (IE) [manipulate information into canonical forms]

Text Summarization

Find the most important paragraphs or sentences
Try to generate an abstract
Interacts with translation: Do this in funny languages!
Get help from the world! Marti Hearst: Find papers citing this paper. The text around the citation will often be a summary. Works best in science papers.
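The first strategy above, finding the most important sentences, can be sketched as a simple frequency-based extractive summarizer: sentences whose words are frequent in the document as a whole are assumed to be more central. This is only an illustrative baseline under that assumption; the function names, the naive sentence splitter, and the scoring rule are hypothetical choices, not something these notes prescribe.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, n_sentences=1, stopwords=frozenset()):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    # Document-wide word frequencies, ignoring any supplied stopwords.
    freq = Counter(w for words in tokenized for w in words if w not in stopwords)

    def score(words):
        content = [w for w in words if w not in stopwords]
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in content) / len(content) if content else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokenized[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])  # restore document order
    return [sentences[i] for i in keep]


SAMPLE = "Parsing is hard. Parsing with grammars is slow. Cats sleep."
SUMMARY = summarize(SAMPLE, n_sentences=1)
```

On the sample text the first sentence scores highest, because "parsing" and "is" each occur twice in the document. Generating a true abstract, the second strategy in the list, needs language generation and is well beyond a sketch like this.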

Document database query

Doc DB: Find the query-relevant paragraph or cluster of sentences (Litigation support)

Online help system query

Command-oriented Systems

Online desktop
Text editor
Unix shell
Robot [SRI's Flakey, with speech interface]
Simulated armed forces (user = commander)
Kitchen appliances

Dialog Systems

Intelligent Tutors (Language, Reading, Math, etc.)
Student Modeling
Behavior, skill, and preference modeling
Error analysis

Translation Systems

MT systems: Not very good, not very widely used, except perhaps on the Web ("Translate this page" button). Type "Poisson frit" to Google; type "ryba smazona" to Google (compare the translation with the original).
Translator's assistant: a resource for a human translator

Theoretical Applications

Theory-specification tool
  Zellig Harris Transformational Grammar and the Linguistic String Project (Transformational and Discourse Analysis Project; Sager, NYU String Grammar project)
  Chomskyan Transformational Grammars and ATNs (Woods, Moonrock system)
  Lexical Functional Grammar (LFG; Ron Kaplan, Xerox PARC system)
  Generalized Phrase Structure Grammar (GPSG; HP system)
  Head-Driven Phrase Structure Grammar (HPSG; HP system, Verbmobil English grammar at Stanford)
Theoretical modeling
  Processing models
    Syntax
      Garden path models (Bever 1970, MacDonald 1993, and Jurafsky 1996)
        The horse raced past the barn fell.
        The complex houses married and single students and their families.
        The student forgot the solution was in the back of the book.
      Parse preference models (attachment preferences modeled; Pereira 1985 models "minimal attachment" and "right association" of Kimball 1973)
      "Shallow" parsing (finite-state) models of human processing (Church 1980, Ramshaw and Marcus 1995, Argamon et al. 1998, Munoz et al. 1999; Chapter 10, our text)
    Semantics
      The astronomer married the star. The movie director married a star. (Reder 1983, Uszkoreit 1990)
    Speech recognition systems as crude phonological processing models
  Language Acquisition models
    Unsupervised learning systems
      Segmentation (Michael Brent)
      Derivational Morphology (Watkinson and Manandhar)
    PDP models
  Language change models (simulation)
    Briscoe 2000: biased learning creates linguistic change

History

Foundational insights (40s and 50s)
  McCulloch-Pitts neuron: a simplified computational model of a neuron
  Shannon (1948): automata for language, incorporating probabilistic models
  Chomsky (1956): formal language theory
  Sound spectrograph (Koenig et al. 1946): foundation for instrumental phonetics
  First machine speech recognizers (Bell Labs, Davis et al. 1952)
  Shannon-Weaver information theory: Noisy channel model
Two camps (57-70)
  AI
    Beginnings of Artificial Intelligence as a field (John McCarthy, Marvin Minsky, Claude Shannon, Nathaniel Rochester)
    Newell and Simon: Logic Theorist and General Problem Solver; computable models of reasoning and logic
      Subjects speak aloud as they solve problems
      Problem solving modeled with a production system whose productions (or reasoning steps) correspond to the steps human reasoners took for those kinds of problems
    George Miller and Donald Broadbent: importing computational ideas into psychology
    Bayesian method and optical character recognition: using probabilistic methods on recognition problems (see Chapter 5, history section, our text)
  Zellig Harris project (mentioned above)
  Brown Corpus
Four paradigms (70-83)
  Stochastic: HMMs in speech recognition (Jelinek, Bahl, and Mercer at IBM)
  Logic-based programming (Prolog)
    Q-systems and metamorphosis grammars (Colmerauer)
    Prolog, Definite Clause Grammars (Pereira and Warren 1980)
    Unification grammar (Kay, Bresnan and Kaplan)
  Natural language understanding (Winograd, serious attention to semantics)
    Winograd SHRDLU, blocks world
    Yale school: Scripts, plans, goals (Schank and Abelson, Wilensky, Lehnert). Story and text understanding.
  Discourse modeling (Grosz, Sidner, Perrault, Allen, Cohen). Discourse as plans guided by intentions and beliefs. Communicative acts as steps in plans.
Empiricism and Finite-state models redux (83-93)
  Finite-state models
    phonology and morphology (Kaplan and Kay 1981, Koskenniemi, Karttunen)
    syntax (Church 1980)
  Probabilistic models
    Speech recognition work at IBM
    Part of speech tagging (history section, chapter 8)
      utter, direct: adj, v
      walk, pilot, sneer, help: N, V
      hard: adj, adv
    Probabilistic parsing (history section, chapter 12)
The field comes together (94-99)
  Spread of probabilistic methods to all kinds of problems
  Commercial ventures using speech, some NLP
  The web
  Some lessened emphasis on theoretical work
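
The noisy channel model listed under the foundational insights, and the Bayesian and HMM work that followed it, share one decoding rule. As a sketch in notation assumed here rather than taken from these notes, let o be the observed signal (acoustics, a misspelled string, scanned characters) and w a candidate source word or word sequence. The decoder picks the most probable source, and Bayes' rule splits that choice into a channel model P(o | w) and a source (language) model P(w):

```latex
\hat{w} \;=\; \arg\max_{w} P(w \mid o)
        \;=\; \arg\max_{w} \frac{P(o \mid w)\, P(w)}{P(o)}
        \;=\; \arg\max_{w} P(o \mid w)\, P(w)
```

The denominator P(o) drops out in the last step because it is the same for every candidate w.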

Reasons for the expansion of Computational Linguistics in recent years

Success of Speech recognition
Increase in computing power and storage capacity of machines
Availability of online text and speech resources in unprecedented quantities
Success of statistical methods
World-wide web (applications and data)
Increasing demand for translation
Globalization of information, standards, software
Availability of venture capital (no longer so true!)