Seminar – Advanced Topics in Web Data Management

 

Tova Milo, Daniel Deutch, 2014/15

 

Meetings: Wednesdays 16-18, Schreiber 309

 


Seminar Information

 

The seminar focuses on managing, analyzing, sharing, and integrating data and applications on the web. Areas of interest include crowdsourcing, data exploration, Big Data, probabilistic data, and data provenance. We shall read recent papers in these areas, focusing on several specific issues, and then explore possible future directions. A tentative list of papers appears below.

 

 

Meetings

 

Fall semester

 

12.11 Slava

Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing.

Chong Sun, Narasimhan Rampalli, Frank Yang, AnHai Doan

 

19.11 Eleanor

Discovering Queries based on Example Tuples.

Yanyan Shen; Kaushik Chakrabarti; Surajit Chaudhuri; Bolin Ding; Lev Novik. SIGMOD'14

26.11 No seminar (Data Science meeting 17:00-20:00)

 

3.12 Yuval


A Formal Approach to Finding Explanations for Database Queries

Sudeepa Roy; Dan Suciu, SIGMOD 2014

 

10.12 Yael

Anish Das Sarma, Aditya G. Parameswaran, Hector Garcia-Molina, Alon Y. Halevy:
Crowd-powered find algorithms. ICDE 2014: 964-975

17.12 Amir


SeeDB: Automatically Generating Query Visualizations

Manasi Vartak, Samuel Madden, Aditya Parameswaran, Neoklis Polyzotis

24.12 No seminar (Data Science meeting 17:00-20:00)

 

31.12 Anna
NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulation.

Sumit Gulwani; Mark Marron. SIGMOD'14

 

7.1 No seminar

 

14.1 Moria

Constructing an Interactive Natural Language Interface for Relational Databases.

Fei Li, H. V. Jagadish, PVLDB 8(1): 73-84 (2014)

 

21.1 Eyal

Indexing for Interactive Exploration of Big Data Series.
Kostas Zoumpatianos; Stratos Idreos; Themis Palpanas. SIGMOD'14

 

Spring Semester

 

18.3 Pierre Bourhis on Provenance Circuits for trees and tree-like instances

 

Monday 30.3 Yuval+Amir

 

Monday 20.4 17:00 Data Science Meeting

 

6.5 Yael

 

Thursday 7.5 DB-IR Day at MSR Herzliya

 

13.5 Eyal+Eleanor

 

20.5 Anna+Moria

 

10.6 Ahmad+Slava

 

 

Papers

 

 

CROWDSOURCING

Anish Das Sarma, Aditya G. Parameswaran, Hector Garcia-Molina, Alon Y. Halevy:
Crowd-powered find algorithms. ICDE 2014: 964-975

Aditya G. Parameswaran, Stephen Boyd, Hector Garcia-Molina, Ashish Gupta, Neoklis Polyzotis, Jennifer Widom:
Optimal Crowd-Powered Rating and Filtering Algorithms. PVLDB 7(9): 685-696 (2014)

Ju Fan, Meiyu Lu, Beng Chin Ooi, Wang-Chiew Tan, Meihui Zhang:

A hybrid machine-crowdsourcing system for matching web tables. ICDE 2014

Chong Sun, Narasimhan Rampalli, Frank Yang, AnHai Doan:
Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing. PVLDB 2014: 1529-1540

DATA EXPLORATION

SeeDB: Automatically Generating Query Visualizations (VLDB'14 demo)
Manasi Vartak, Samuel Madden, Aditya Parameswaran, Neoklis Polyzotis

Discovering Queries based on Example Tuples. Yanyan Shen; Kaushik Chakrabarti; Surajit Chaudhuri; Bolin Ding; Lev Novik. SIGMOD'14

Interactive Data Exploration Using Semantic Windows. Alexander Kalinin;
Ugur Cetintemel; Stan Zdonik. SIGMOD'14

Explore-by-Example: An Automatic Query Steering Framework for Interactive
Data Exploration. Kyriaki Dimitriadou; Olga Papaemmanouil; Yanlei Diao. SIGMOD'14

Knowing When You’re Wrong: Building Fast and Reliable Approximate
Query Processing Systems.
Sameer Agarwal; Henry Milner; Ariel Kleiner; Ameet Talwalkar; Michael Jordan; Samuel Madden; Barzan Mozafari; Ion Stoica. SIGMOD'14

A Formal Approach to Finding Explanations for Database Queries
Sudeepa Roy; Dan Suciu. SIGMOD'14

Indexing for Interactive Exploration of Big Data Series.
Kostas Zoumpatianos; Stratos Idreos; Themis Palpanas. SIGMOD'14

NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulation.

Sumit Gulwani; Mark Marron. SIGMOD'14

 

 

 

MISCELLANEOUS

Nurzhan Bakibayev, Tomás Kociský, Dan Olteanu, Jakub Zavodny: Aggregation and Ordering in Factorised Databases. PVLDB 6(14): 1990-2001 (2013)

Wenfei Fan, Floris Geerts, Frank Neven: Making Queries Tractable on Big Data with Preprocessing. PVLDB 6(9): 685-696 (2013)

 

Floris Geerts, Grigoris Karvounarakis, Vassilis Christophides, Irini Fundulaki: Algebraic structures for capturing the provenance of SPARQL queries. ICDT 2013: 153-164