Sunday, May 28, 2006, 11:15-12:15
Room 309
--------------------------------------------------------------------------------
Ronen Feldman
Bar
Title: Improving Self-Supervised Relation Extraction from the Web
Abstract:
Web extraction systems attempt to use the immense amount of unlabeled
text on the Web to create large lists of entities and
relations. Unlike traditional IE methods, Web extraction systems
do not label every mention of the target entity or relation, instead
focusing on extracting as many different instances as possible while
keeping the precision of the resulting list reasonably high. SRES is a
self-supervised Web relation extraction system that learns powerful
extraction patterns from unlabeled text, using short descriptions of
the target relations and their attributes. SRES automatically generates
the training data needed for its pattern-learning component. The
performance of SRES is further enhanced by classifying its output
instances using the properties of the extracted patterns. The features
we use for classification and the trained classification model are
independent of the target relation, as we demonstrate in a series
of experiments. We also compare the performance of SRES to the
performance of the state-of-the-art KnowItAll system, and to the
performance of its pattern learning component, which uses a simpler
and less powerful pattern language than SRES.