Seminar – Advanced Topics in Web Data Management
Tova Milo, Daniel Deutch, 2018/19
Meetings: Tuesdays 10-12
Seminar Information
The seminar focuses on managing, analyzing, sharing, and integrating data and
applications on the web. Areas of interest include crowdsourcing, data
exploration, Big Data, probabilistic data, and data provenance. We shall read
recent papers in this area, focusing on several specific issues, and then
explore possible future directions. A tentative list of papers appears below.
Schedule (Sem B)
19/3: EDBT Rehearsals: Slava, Brit, Amit
2/4: ICDE Rehearsals: Yuval, Naama, Tomer, Amit
30/4: Tomer, Slava, Naama
21/5: Shay, Kathy
28/5: Gefen, Ori
11/6: Uri, Yuval
Schedule (Sem A)
4/11 Ori and Tomer W. Subjective Knowledge Base Construction Powered By Crowdsourcing and Knowledge Base. Hao Xin, Rui Meng, Lei Chen, SIGMOD'18
11/11 Slava and Shay RC-Index: Diversifying Answers to Range Queries, Yue Wang, Alexandra Meliou, Gerome Miklau, VLDB'18
18/11 Amit and Tomer H. The Case for Learned Index Structures, Tim Kraska, Alex Beutel, Ed Chi, Jeff Dean, Neoklis Polyzotis, SIGMOD'18
25/11 No meeting
2/12 Gefen and Dvir Scalable Semantic Querying of Text, Xiaolan Wang, Aaron Feng, Behzad Golshan, Alon Halevy, George Mihaila, Hidekazu Oiwa, Wang-Chiew Tan, VLDB'18
9/12 Hanuka
16/12 Talk by Raymond Ng
23/12 Naama and Nave Are Key-Foreign Key Joins Safe to Avoid when Learning High-Capacity Classifiers?, VLDB'18
30/12 Yuval and Shevach ProvChain: A Blockchain-based Data Provenance Architecture in Cloud Environment with Enhanced Privacy and Availability, X Liang, S Shetty, D Tosh, C Kamhoua, SoCC 2017
6/1 Kathy and Rony Navigating the Data Lake with Datamaran: Automatically Extracting Structure from Log Datasets, Yihan Gao, Silu Huang, Aditya Parameswaran, SIGMOD'18
Papers
Advanced query processing
The Case for Learned Index Structures, Tim Kraska, Alex Beutel, Ed Chi, Jeff Dean, Neoklis Polyzotis, SIGMOD'18
FastQRE: Fast Query Reverse Engineering, Dmitri Kalashnikov, Laks V.S. Lakshmanan, Divesh Srivastava, SIGMOD'18
Navigating the Data Lake with Datamaran: Automatically Extracting Structure from Log Datasets, Yihan Gao, Silu Huang, Aditya Parameswaran, SIGMOD'18
Bias in OLAP Queries: Detection, Explanation, and Removal, Babak Salimi, Johannes Gehrke, Dan Suciu, SIGMOD'18
The Vadalog System: Datalog-based Reasoning for Knowledge Graphs. Luigi Bellomarini, Emanuel Sallinger, Georg Gottlob, VLDB'18
LevelHeaded: A Unified Engine for Business Intelligence and Linear Algebra Querying, Christopher Aberger, Andrew Lamb, Kunle Olukotun, Christopher Ré, ICDE'18
Cleaning, dependencies, and entity resolution
Explaining Repaired Data with CFDs, Joeri Rammelaere, Floris Geerts, VLDB'18
Efficient Discovery of Approximate Dependencies, Sebastian Kruse, Felix Naumann, VLDB'18
Parallel Reasoning of Graph Functional Dependencies, Wenfei Fan, Xueli Liu, Yingjie Cao, ICDE'18
Discovering Graph Functional Dependencies, Wenfei Fan, Chunming Hu, Xueli Liu, Ping Lu, SIGMOD'18
Entity Matching with Active Monotone Classification, Yufei Tao, PODS'18 (best paper)
Semantics
Scalable Semantic Querying of Text, Xiaolan Wang, Aaron Feng, Behzad Golshan, Alon Halevy, George Mihaila, Hidekazu Oiwa, Wang-Chiew Tan, VLDB'18
Seeping Semantics: Linking Datasets using Word Embeddings for Data Discovery. Raul Castro Fernandez, Essam Mansour, Abdulhakim Qahtan, Ahmed Elmagarmid, Ihab Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang, ICDE'18
Crowdsourcing
Task Relevance and Diversity as Worker Motivation in Crowdsourcing, Julien Pilourdault, Sihem Amer-Yahia, Senjuti Basu Roy, Dongwon Lee, ICDE'18
Knowledge Base Enhancement via Data Facts and Crowdsourcing, Linnan Jiang, Lei Chen, Zhao Chen, ICDE'18
Incentive-Based Entity Collection using Crowdsourcing, Chengliang Chai, Ju Fan, Guoliang Li, ICDE'18
Worker Recommendation for Crowdsourced Q&A Services: A Triple-Factor Aware Approach, Zheng Liu, Lei Chen, VLDB'18
Subjective Knowledge Base Construction Powered By Crowdsourcing and Knowledge Base. Hao Xin, Rui Meng, Lei Chen, SIGMOD'18
Provenance
Muller, Dietrich, Grust, VLDB 2018
DfAnalyzer: Runtime Dataflow Analysis of Scientific Applications using Provenance, V Silva, D De Oliveira, P Valduriez, VLDB 2018
ProvChain: A Blockchain-based Data Provenance Architecture in Cloud Environment with Enhanced Privacy and Availability, X Liang, S Shetty, D Tosh, C Kamhoua, VLDB 2017