When Corpus Meets Theory
James Pustejovsky
TSD 2002
September 10, 2002
Models and Data
Talk Outline
Goals for Language Modeling
The Role of Corpus in Theory
Disambiguation
Selection discovery
Clustering
Category modification and formation
Grammar induction
The Role of Theory in Corpus
Goals of Language Modeling
Statistically informed models improve application performance
Speech
Search
Clustering
Parsing
Machine translation
Summarization
Question answering
Theory Drives the Model
The corpus behavior of words is determined by their type.
You can’t find what you can’t model.
But, you don’t want to find only what you model!
Theory allows a model of reality, but …
Corpus brings reality to the model.
Language Modeling with Generative Lexicon
Selection integrates paradigmatics and syntagmatics
Models the relationship between selectional contexts
Coercion in complex types (Dot Objects)
All major categories behave functionally
Qualia structure models much of this behavior
Semantic Types are differentiated and ranked:
Grammatical behavior follows (generally) from type
Quine’s Gambit in Corpora
Co-occurrence reveals surface relations.
Paradigmatics is first order.
Syntagmatics is first order.
LSA and other techniques create non-superficial associations.
Model Bias is necessary to create decision procedures
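The point about surface co-occurrence can be made concrete with a minimal sketch. The toy corpus, the window size, and the function name `cooccurrence_counts` are illustrative assumptions, not material from the talk. Raw counts show which word pairs are attested, but a zero count alone cannot separate "unattested so far" from "ruled out by type", which is where model bias comes in.

```python
from collections import Counter

def cooccurrence_counts(sentences, window=2):
    """Count ordered word pairs that occur within `window` tokens of each other."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, left in enumerate(tokens):
            # Pair each token with the next `window` tokens to its right.
            for right in tokens[i + 1 : i + 1 + window]:
                counts[(left, right)] += 1
    return counts

# Hypothetical three-sentence corpus, for illustration only.
toy_corpus = [
    "the man fell",
    "the rock fell",
    "the man died",
]
counts = cooccurrence_counts(toy_corpus)

# Attested pairs get a positive count; unseen pairs default to zero.
print(counts[("man", "died")], counts[("rock", "died")])
```

A Counter returns zero for any pair it has never seen, so the unseen pair ("rock", "died") looks the same as any other gap in the data. A typed model is needed to say that this gap is systematic.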
Example: Complex Types
Recognizing Selection
1. a. The man fell/died.
   b. The rock fell/!died.
2. a. John forced/!convinced the door to open.
   b. John forced/convinced the guests to leave.
3. a. John poured milk into/!on his coffee.
   b. John poured milk into/on the bowl.
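The contrasts in the fall/die and force/convince examples can be sketched as a type check against a small hierarchy. Everything below is an assumption made for illustration: the type names, the hierarchy, the verb requirements, and the helper names `is_subtype` and `selects`. It is not the Generative Lexicon formalism itself, and it does not cover the preposition contrast in the pour examples.

```python
# Toy type hierarchy, child -> parent (illustrative assumption).
TYPE_PARENT = {
    "human": "animate",
    "animate": "physical",
    "artifact": "physical",
    "natural_object": "physical",
    "physical": "entity",
}

# Hypothetical lexical typing of the nouns in the examples.
NOUN_TYPE = {
    "man": "human",
    "guests": "human",
    "rock": "natural_object",
    "door": "artifact",
}

# Type each verb requires of the argument under test (assumed values).
VERB_SELECTS = {
    "fall": "physical",
    "die": "animate",
    "force": "entity",
    "convince": "animate",
}

def is_subtype(sem_type, ancestor):
    """Walk up the hierarchy until `ancestor` is found or the root is passed."""
    while sem_type is not None:
        if sem_type == ancestor:
            return True
        sem_type = TYPE_PARENT.get(sem_type)
    return False

def selects(verb, noun):
    """True if the noun's type satisfies the verb's selectional requirement."""
    return is_subtype(NOUN_TYPE[noun], VERB_SELECTS[verb])

for verb, noun in [("die", "man"), ("die", "rock"),
                   ("convince", "door"), ("convince", "guests")]:
    print(verb, noun, selects(verb, noun))
```

Under these assumed types, the pairs marked with "!" in the examples ("rock died", "convinced the door") fail the check and their counterparts pass.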
Modeling Paradigmatic Systems
Integrating Selection into Grammars
Qualia are used to create new types:
They are generative coherence relations between types.
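One way to picture qualia creating new types is as a record with the four qualia roles of Generative Lexicon (formal, constitutive, telic, agentive), where a new type is derived by filling in a role on an existing one. This is a minimal sketch under stated assumptions: the class `SemType`, the helper `derive`, and the liquid/beverage/coffee values are hypothetical illustrations, not the representation used in the talk.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class SemType:
    """A semantic type with the four qualia roles; unfilled roles are None."""
    name: str
    formal: Optional[str] = None        # what kind of thing it is
    constitutive: Optional[str] = None  # what it is made of
    telic: Optional[str] = None         # what it is for
    agentive: Optional[str] = None      # how it comes about

def derive(base, name, **qualia):
    """Create a new type from `base`, inheriting its roles and adding new ones."""
    return replace(base, name=name, **qualia)

# Hypothetical chain: a natural type gains a purpose, then an origin.
liquid = SemType(name="liquid", formal="liquid")
beverage = derive(liquid, "beverage", telic="drink")
coffee = derive(beverage, "coffee", agentive="brew")

print(coffee)
```

The frozen dataclass keeps each type immutable, so deriving "beverage" leaves "liquid" unchanged and each derived type is a distinct value that inherits the roles already filled.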
Qualia Structure