Joint Modeling for Entity Analysis
Greg Durrett
UC Berkeley
Tuesday, November 25, 2014
12:30 p.m., Conference Room 5A
Abstract:
Many core NLP tasks require reasoning about the entities in a document: who is mentioned, what are they doing, and what else do we know about them? We present a joint model of three core tasks in the entity analysis stack: coreference resolution (within-document clustering), named entity recognition (coarse semantic typing), and entity linking (matching to Wikipedia entities). Our model is formally a structured conditional random field. It builds on top of highly accurate component models for each task: we demonstrate that the coreference component performs well due to a feature set that captures key linguistic phenomena in a simple, uniform way. Factors in the joint model then represent cross-task interactions, such as the constraint that coreferent mentions have the same semantic type. On the ACE 2005 and OntoNotes datasets, we achieve state-of-the-art results for all three tasks. Moreover, joint modeling improves performance on all tasks over the component models in isolation.
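To make the cross-task factor idea concrete, here is a minimal toy sketch (not the talk's actual model, features, or inference procedure) of how a joint factor encouraging type agreement between coreferent mentions can override a weak local NER decision. All scores, type labels, and the agreement bonus below are invented for illustration.

```python
from itertools import product

TYPES = ["PER", "ORG"]

# Hypothetical local (log-space) NER scores for two coreferent mentions.
# Mention 0 (say, "Washington") weakly prefers ORG in isolation;
# mention 1 (say, "George Washington") strongly prefers PER.
local_scores = [
    {"PER": 0.4, "ORG": 0.6},
    {"PER": 2.0, "ORG": 0.1},
]

# Joint factor: a bonus when coreferent mentions share a semantic type,
# standing in for the CRF's cross-task consistency factors.
AGREEMENT_BONUS = 1.0

def joint_score(assignment):
    """Sum the per-mention local scores plus the agreement factor."""
    score = sum(local_scores[i][t] for i, t in enumerate(assignment))
    if assignment[0] == assignment[1]:
        score += AGREEMENT_BONUS
    return score

# Exhaustive inference over the tiny assignment space (2 mentions x 2 types).
best = max(product(TYPES, repeat=2), key=joint_score)
# Jointly, mention 0 flips to PER: agreeing with the confident mention 1
# outweighs its own weak local preference for ORG.
```

In the real model, inference is over full structured outputs for all three tasks rather than brute-force enumeration, but the effect is the same: cross-task factors let confident decisions in one task correct uncertain ones in another.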
Bio:
Greg Durrett is a PhD candidate at UC Berkeley, advised by Dan Klein. He works on a range of topics in statistical natural language processing including coreference resolution, morphological analysis, and syntactic parsing.