Seminar – Advanced Topics in Web Data Management
Tova Milo, Daniel Deutch, 2015/16
Meetings: Tuesdays 18-20, Kaplun 324
Seminar Information
The seminar focuses on managing, analyzing, sharing, and integrating data and applications on the web. Areas of interest include crowdsourcing, data exploration, Big Data, probabilistic data and data provenance. We shall read recent papers in this area, focusing on several specific issues, and then explore possible future directions. A tentative list of papers is enclosed.
Schedule (Sem. B)
15.3 Slava, Nave
29.3 Yizhak
5.4 Ahmad, Amit
3.5 Yuval, Amir
10.5 Brit, Shevah
24.5 Efrat, Yonatan, Matan
7.6 Chai, Eyal, Elian
Schedule (Sem. A)
27.10 Slava, "Argonaut: Macrotask Crowdsourcing for Complex Data Processing"
3.11 Tomer, "TransactiveDB: Tapping into Collective Human Memories"
10.11 Amit, "Efficient Top-K SimRank-based Similarity Join"
24.11 Yizhak, "Preference-aware Integration of Temporal Data"
1.12 Ahmad, "Scaling Up Crowd-Sourcing to Very Large Datasets: A Case for Active Learning"
8.12 Amir, "Association Rules with Graph Patterns"
15.12 Nave, "Linearized and Single-Pass Belief Propagation"
22.12 Yehonatan, "Incremental Knowledge Base Construction Using DeepDive"
29.12 Brit, "The Importance of Being Expert: Efficient Max-Finding in Crowdsourcing"
Matan, "Worker Skill Estimation in Team-Based Tasks"
12.1 Elian, "Spark SQL: Relational Data Processing in Spark"
Shevah, "JetScope: Reliable and Interactive Analytics at Cloud Scale"
"Rethinking Data-Intensive Science Using Scalable Analytics Systems"
Papers
CROWDSOURCING
Argonaut: Macrotask Crowdsourcing for Complex Data Processing
Daniel Haas, Jason Ansel, Lydia Gu, Adam Marcus, VLDB 2015 (Industrial track)
TransactiveDB: Tapping into Collective Human Memories
Michele Catasta, Alberto Tonon, Djellel Eddine Difallah, Gianluca Demartini, Karl Aberer, Philippe Cudré-Mauroux, VLDB 2015
Worker Skill Estimation in Team-Based Tasks
Habibur Rahman, Saravanan Thirumuruganathan, Senjuti Basu Roy, Sihem Amer-Yahia, Gautam Das, VLDB 2015
Scaling Up Crowd-Sourcing to Very Large Datasets: A Case for Active Learning
Barzan Mozafari, Purna Sarkar, Michael Franklin, Michael Jordan, Sam Madden, VLDB 2015
Hear the Whole Story: Towards the Diversity of Opinion in Crowdsourcing Markets
Ting Wu, Lei Chen, Pan Hui, Chen Zhang, Weikai Li, VLDB 2015
The Importance of Being Expert: Efficient Max-Finding in Crowdsourcing
Aris Anagnostopoulos, Luca Becchetti, Adriano Fazzone, Ida Mele, Matteo Riondato, SIGMOD 2015
GRAPH PROCESSING
Association Rules with Graph Patterns
Wenfei Fan, Xin Wang, Yinghui Wu, Jingbo Xu, VLDB 2015
Efficient Top-K SimRank-based Similarity Join
Wenbo Tao, Minghe Yu, Guoliang Li, VLDB 2015
Efficient Enumeration of Maximal k-Plexes
Devora Berlowitz, Sara Cohen, Benny Kimelfeld, SIGMOD 2015
PROVENANCE
Dynamic Provenance for SPARQL Updates
Harry Halpin, James Cheney, ISWC 2014
Linearized and Single-Pass Belief Propagation
Wolfgang Gatterbauer, Stephan Günnemann, Danai Koutra, Christos Faloutsos, VLDB 2015
Answering Why-not Questions on Reverse Top-k Queries
Yunjun Gao, Qing Liu, Gang Chen, Baihua Zheng, Linlin Zhou, VLDB 2015
INFORMATION INTEGRATION
Preference-aware Integration of Temporal Data
Bogdan Alexe, Mary Roth, Wang-Chiew Tan, VLDB 2015
Enriching Data Imputation with Extensive Similarity Neighbors
Shaoxu Song, Aoqian Zhang, Lei Chen, Jianmin Wang, VLDB 2015
Incremental Knowledge Base Construction Using DeepDive
Jaeho Shin, Sen Wu, Feiran Wang, Christopher De Sa, Ce Zhang, Christopher Ré, VLDB 2015
A Declarative Framework for Linking Entities
Douglas Burdick, Ronald Fagin, Phokion Kolaitis, Lucian Popa, Wang-Chiew Tan, ICDT 2015 (Best paper award)
JetScope: Reliable and Interactive Analytics at Cloud Scale
Eric Boutin, Paul Brett, Xiaoyu Chen, Jaliya Ekanayake, Tao Guan, Anna Korsun, Zhicheng Yin, Nan Zhang, Jingren Zhou, VLDB 2015
Rethinking Data-Intensive Science Using Scalable Analytics Systems
Frank Austin Nothaft, Matt Massie, Timothy Danford, Zhao Zhang, Uri Laserson, Carl Yeksigian, Jey Kottalam, Arun Ahuja, Jeff Hammerbacher, Michael Linderman, Michael J. Franklin, Anthony D. Joseph, David A. Patterson, SIGMOD 2015 (Industrial)
MISCELLANEOUS
Spark SQL: Relational Data Processing in Spark
Michael Armbrust, Reynold S. Xin, Cheng Lian, Yin Huai, Davies Liu, Joseph K. Bradley, Xiangrui Meng, Tomer Kaftan, Michael J. Franklin, Ali Ghodsi, Matei Zaharia, SIGMOD 2015
Mining Subjective Properties on the Web
Immanuel Trummer, Alon Halevy, Hongrae Lee, Sunita Sarawagi, Rahul Gupta, SIGMOD 2015
Making Queries Tractable on Big Data with Preprocessing
Wenfei Fan, Floris Geerts, Frank Neven, VLDB 2013
PVLDB 6(9): 685-696 (2013)