Dependency Parsing by Belief Propagation
David A. Smith (JHU → UMass Amherst)
Jason Eisner (Johns Hopkins University)
Outline
Edge-factored parsing
Dependency parses
Scoring the competing parses: Edge features
Finding the best parse
Higher-order parsing
Throwing in more features: Graphical models
Finding the best parse: Belief propagation
Experiments
Conclusions
Word Dependency Parsing
He reckons the current account deficit will narrow to only 1.8 billion in September.
Raw sentence
Part-of-speech tagging
He reckons the current account deficit will narrow to only 1.8 billion in September.
PRP VBZ DT JJ NN NN MD VB TO RB CD CD IN NNP .
POS-tagged sentence
Word dependency parsing
slide adapted from Yuji Matsumoto
Word dependency parsed sentence
He reckons the current account deficit will narrow to only 1.8 billion in September .
[Dependency tree over the sentence: ROOT attaches to the main verb; arc labels include SUBJ, S-COMP, SPEC, MOD, and COMP]
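A dependency parse like the one above can be stored compactly as a head array: for each token, the index of its head (0 for the artificial ROOT) plus an arc label. A minimal Python sketch, using a shortened sentence; the particular heads and labels chosen here are one plausible analysis for illustration, not the talk's exact tree:

```python
# Sketch: a dependency parse stored as a head array (0 = artificial ROOT).
# Heads and labels below are illustrative, not taken from the slide's figure.

tokens = ["He", "reckons", "the", "deficit", "will", "narrow"]
heads  = [2, 0, 4, 6, 6, 2]          # 1-based head index for each token
labels = ["SUBJ", "ROOT", "SPEC", "SUBJ", "MOD", "S-COMP"]

def is_tree(heads):
    """Check the head array encodes a tree: every token reaches ROOT (0)
    without revisiting a node, i.e. there are no cycles."""
    for i in range(1, len(heads) + 1):
        seen, j = set(), i
        while j != 0:
            if j in seen:
                return False
            seen.add(j)
            j = heads[j - 1]
    return True

for w, h, l in zip(tokens, heads, labels):
    head_word = "ROOT" if h == 0 else tokens[h - 1]
    print(f"{w} --{l}--> {head_word}")

assert is_tree(heads)
```

The tree check matters because edge-factored parsers score individual arcs but must return a structure in which every word has exactly one head and the arcs form a tree.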
What does parsing have to do with belief propagation?
loopy belief propagation
Great ideas in NLP: Log-linear models (Berger, della Pietra, della Pietra 1996; Darroch & Ratcliff 1972)
In the beginning, we used generative models.
p(A) * p(B | A) * p(C | A,B) * p(D | A,B,C) * …
each choice depends on a limited part of the history
but which dependencies to allow?
what if they’re all worthwhile?
p(D | A,B,C)?
… p(D | A,B) * p(C | A,B,D)?
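The generative factorization above is just the chain rule: each variable is generated conditioned on (part of) the history, and the product is automatically a normalized distribution. A tiny sketch over binary variables, with invented conditional distributions:

```python
from itertools import product

# Chain-rule factorization p(A,B,C,D) = p(A) * p(B|A) * p(C|A,B) * p(D|A,B,C)
# over binary variables. The conditional distributions are invented.

def p_A(a):          return 0.6 if a == 0 else 0.4
def p_B(b, a):       return 0.7 if b == a else 0.3
def p_C(c, a, b):    return 0.9 if c == (a ^ b) else 0.1
def p_D(d, a, b, c): return 0.5   # D made uniform here, for brevity

def joint(a, b, c, d):
    # Each factor conditions only on earlier variables ("limited history").
    return p_A(a) * p_B(b, a) * p_C(c, a, b) * p_D(d, a, b, c)

# Because every factor is a proper conditional distribution, the joint
# sums to 1 with no global normalizer needed.
total = sum(joint(*cfg) for cfg in product([0, 1], repeat=4))
assert abs(total - 1.0) < 1e-12
```

The catch the slide raises is exactly this rigidity: each factor must pick one conditioning set in advance, and with limited training data you cannot simply condition everything on everything.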
Solution: Log-linear (max-entropy) modeling
Features may interact in arbitrary ways
Iterative scaling keeps adjusting the feature weights until the model agrees with the training data.
which dependencies to allow? (given limited training data)
p(A,B,C,D,…) = (1/Z) · exp Σᵢ θᵢ fᵢ(A,B,C,D,…)
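The log-linear (max-entropy) idea can be sketched concretely: score a configuration by exponentiating a weighted sum of features, which may overlap and interact arbitrarily, then divide by a global normalizer Z summed over all configurations. The features and weights below are invented for illustration:

```python
import math
from itertools import product

# Sketch of a log-linear (max-entropy) model over four binary variables
# A, B, C, D. Features and weights are illustrative, not from the talk.

def features(a, b, c, d):
    # Overlapping, arbitrarily interacting features -- no need to pick
    # a single conditioning order as in the generative factorization.
    return {
        "A_and_B":     a and b,
        "B_eq_C":      b == c,
        "D_with_A_C":  d and a and c,
    }

weights = {"A_and_B": 1.5, "B_eq_C": 0.7, "D_with_A_C": -2.0}

def unnorm(a, b, c, d):
    return math.exp(sum(weights[k] * v for k, v in features(a, b, c, d).items()))

# Z sums the unnormalized scores over every configuration: the (1/Z) factor.
Z = sum(unnorm(*cfg) for cfg in product([0, 1], repeat=4))

def p(a, b, c, d):
    return unnorm(a, b, c, d) / Z

total = sum(p(*cfg) for cfg in product([0, 1], repeat=4))
assert abs(total - 1.0) < 1e-12
```

Unlike the chain-rule product, the unnormalized scores do not sum to one on their own, which is why the explicit (1/Z) appears; iterative scaling (or any gradient method) then fits the weights so the model's expected feature counts match those in the training data.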