Seminar – Managing Information on the Web
Tova Milo, Winter 2010
Seminar Information
The seminar focuses on managing,
analyzing, sharing, and integrating data and applications across multiple
sources, either on the Internet or within enterprises. This topic has received much
attention in the database, AI, Web, IR, and verification communities. We shall
read recent papers in this area, focusing on several specific issues, and then
explore possible future directions. A tentative list of topics and papers
follows.
Papers
· Probabilistic Data
1. Scalable Probabilistic Databases with Factor Graphs and MCMC, Michael Wick, Andrew McCallum, Gerome Miklau, VLDB 2010 [Boris Kostenko 3/11]
2. Querying Probabilistic Information Extraction, Daisy Zhe Wang, Michael Franklin, Minos Garofalakis, Joseph Hellerstein, VLDB 2010
3. Lineage Processing over Correlated Probabilistic Databases, Bhargav Kanagal (University of Maryland), Amol Deshpande (University of Maryland), SIGMOD 2010
4. Evaluation of Probabilistic Threshold Queries in MCDB, Luis Perez (Rice University), Subi Arumugam (University of Florida), Christopher Jermaine (Rice University), SIGMOD 2010
5. MCDB-R: Risk Analysis in the Database
· Query Processing
1. Why Not?, Adriane Chapman, H. V. Jagadish, SIGMOD 2009 [Alon Vekker 17/11]
2. How to ConQueR Why-Not Questions, Quoc Trung Tran, Chee-Yong Chan, SIGMOD 2010 [Slava Novgorodov 24/11]
· Web, Recommendations and Social Networks
1. Active Knowledge: Dynamically Enriching RDF Knowledge Bases by Web Services, Nicoleta Preda, Fabian Suchanek, Gjergji Kasneci, Thomas Neumann, Wenjun Yuan, Gerhard Weikum, SIGMOD 2010 [Ohad Greenshpan 8/12]
2. Multiple Features Fusion for Social Media Applications, Bin Cui, Anthony Tung, Ce Zhang, Zhe Zhao, SIGMOD 2010 [Ori Folger 15/12]
3. Recsplorer: Recommendation Algorithms based on Precedence Mining, Aditya Parameswaran, Georgia Koutrika, Benjamin Bercovitz, Hector Garcia-Molina, SIGMOD 2010 [Rubi Boim 22/12]
4. Load-Balanced Query Dissemination in Democratic Communities, Emiran Curtmola, Alin Deutsch, K.K. Ramakrishnan, Divesh Srivastava, SIGMOD 2010
· Data Exchange, Extraction and Integration
1. Towards The Web of Concepts: Extracting Concepts from Large Datasets, Aditya Parameswaran, Hector Garcia-Molina, Anand Rajaraman, VLDB 2010 [Tom Yam 5/1]
2. MapMerge: Correlating Independent Schema Mappings, Bogdan Alexe, Mauricio Hernandez, Lucian Popa, Wang-Chiew Tan, VLDB 2010
3. Evaluating Entity Resolution Results, David Menestrina, Steven Whang, Hector Garcia-Molina, VLDB 2010
4. Entity Resolution with Evolving Rules, Steven Whang, Hector Garcia-Molina, VLDB 2010
5. Exploiting Content Redundancy for Web Information Extraction, Pankaj Gulhane, Rajeev Rastogi, Srinivasan Sengamedu, Ashwin Tengli, VLDB 2010
6. Automatic Rule Refinement for Information Extraction, Bin Liu, Laura Chiticariu, Vivian Chu, H. V. Jagadish, Frederick Reiss, VLDB 2010
· DBs and Flash
1. FlashStore: High Throughput Persistent Key-Value Store, B. Debnath, S. Sengupta, J. Li, VLDB 2010 [Aviad Zuck 12/1]
· Provenance
1. Efficient Querying and Maintenance of Network Provenance at Internet-Scale, Wenchao Zhou, Micah Sherr, Tao Tao, Xiaozhou Li, Boon Thau Loo, Yun Mao, SIGMOD 2010
2. Querying Data Provenance, Grigoris Karvounarakis, Zachary Ives, Val Tannen, SIGMOD 2010
3. TRAMP: Understanding the Behavior of Schema Mappings through Provenance, Boris Glavic, Gustavo Alonso, Renée Miller, Laura Haas, VLDB 2010
· Cloud Computing
1. MRShare: Sharing Across Multiple Queries in MapReduce, Tomasz Nykiel, Michalis Potamias, Chaitanya Mishra, George Kollios, Nick Koudas, VLDB 2010
2. Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing), Jens Dittrich, Jorge Quiane, Alekh Jindal, Yagiz Kargin, Vinay Setty, Jörg Schad, VLDB 2010