Seminar – Managing Information on the Web
Tova Milo, Spring 2011
Seminar Information
The seminar focuses on managing,
analyzing, sharing, and integrating data and applications across multiple
sources, either on the Internet or at enterprises. This topic has received much
attention in the database, AI, Web, IR and verification communities. We shall
read recent papers in this area, focusing on several specific issues, and then
explore possible future directions. A tentative list of topics and papers
follows.
Papers
· Data Exchange, Extraction and Integration
1. Evaluating Entity Resolution Results, David Menestrina, Steven Whang, Hector Garcia-Molina, VLDB 2010 (Barak Cohen 2/3)
2. Automatic Rule Refinement for Information Extraction, Bin Liu, Laura Chiticariu, Vivian Chu, H. Jagadish, Frederick Reiss, VLDB 2010 (Rinat Pichker 9/3)
3. Exploiting Content Redundancy for Web Information Extraction, Pankaj Gulhane, Rajeev Rastogi, Srinivasan Sengamedu, Ashwin Tengli, VLDB 2010 (Evgeny Budilovsky 16/3)
4. Entity Resolution with Evolving Rules, Steven Whang, Hector Garcia-Molina, VLDB 2010 (Eran Kravitz 30/3)
5. MapMerge: Correlating Independent Schema Mappings, Bogdan Alexe, Mauricio Hernandez, Lucian Popa, Wang-Chiew Tan, VLDB 2010
· Web, Recommendations and Social Networks
1. Active Knowledge: Dynamically Enriching RDF Knowledge Bases by Web Services, Nicoleta Preda, Fabian Suchanek, Gjergji Kasneci, Thomas Neumann, Wenjun Yuan, Gerhard Weikum, SIGMOD 2010 (Amit Somech 6/4)
2. Human-Assisted Graph Search: It's Okay to Ask Questions, A. Parameswaran, A. Das Sarma, H. Garcia-Molina, N. Polyzotis, J. Widom, to appear in VLDB 2011 (Bar Avidan 27/4)
3. Load-Balanced Query Dissemination in Democratic Communities, Emiran Curtmola, Alin Deutsch, K.K. Ramakrishnan, Divesh Srivastava, SIGMOD 2010 (Ofir Weisse 4/5)
· Provenance
1. TRAMP: Understanding the Behavior of Schema Mappings through Provenance, Boris Glavic, Gustavo Alonso, Renée Miller, Laura Haas, VLDB 2010 (Yaron Margalit 11/5)
2. Querying Data Provenance, Grigoris Karvounarakis, Zachary Ives, Val Tannen, SIGMOD 2010 (Hila Cohen 18/5)
3. Efficient Querying and Maintenance of Network Provenance at Internet-Scale, Wenchao Zhou, Micah Sherr, Tao Tao, Xiaozhou Li, Boon Thau Loo, Yun Mao, SIGMOD 2010
· Cloud Computing
1. Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing), Jens Dittrich, Jorge Quiane, Alekh Jindal, Yagiz Kargin, Vinay Setty, Jörg Schad, VLDB 2010 (Alexandra Shpindovsky 1/6)
2. MRShare: Sharing Across Multiple Queries in MapReduce, Tomasz Nykiel, Michalis Potamias, Chaitanya Mishra, George Kollios, Nick Koudas, VLDB 2010 (Itay Maoz 9/6)
· Probabilistic Data
1. Querying Probabilistic Information Extraction, Daisy Zhe Wang, Michael Franklin, Minos Garofalakis, Joseph Hellerstein, VLDB 2010
2. Lineage Processing over Correlated Probabilistic Databases, Bhargav Kanagal (University of Maryland), Amol Deshpande (University of Maryland), SIGMOD 2010
3. Evaluation of Probabilistic Threshold Queries in MCDB, Luis Perez (Rice University), Subi Arumugam (University of Florida), Christopher Jermaine (Rice University), SIGMOD 2010
4. MCDB-R: Risk Analysis in the Database