CS525. Advanced Topics in Database Systems
Large-Scale Data Management



Tentative Schedule: The schedule may change slightly as the course progresses.


Week
Day
Topic(s)
Readings
Comments
Presenter
Week 1
01/10/2013
No class on Jan 10. Our first meeting is on Jan 15.



Week 2 01/15/2013
Introduction, Course Logistics
http://wiki.apache.org/hadoop/
 
Instructor
01/17/2013
MapReduce Framework/Hadoop


Instructor
Week 3 01/22/2013
Hadoop Ecosystem I:
Pig Language

Instructor
01/24/2013
Hadoop Ecosystem II:
Hive & Hadoop Streaming

Instructor
Week 4
01/29/2013
A Comparison of Join Algorithms for Log Processing in MapReduce (SIGMOD '10)


Ya Liu &
Hao Zhou
01/31/2013
ReStore: Reusing Results of MapReduce Jobs (VLDB '12)


Vijay Sukhadeve &
Jun Fan
Week 5
02/05/2013
Nova: Continuous Pig/Hadoop Workflows (SIGMOD '11)


Karim Ibrahim &
Anh Pham
02/07/2013 HaLoop: Efficient Iterative Data Processing on Large Clusters (VLDB '10)


Carl Erhard &
Zahid Mian
Week 6
02/12/2013 Ricardo: Integrating R and Hadoop (SIGMOD '10)


Yuguan Li &
Luyang Zhang
02/14/2013 HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads (VLDB '09)


Jiefeng He &
Xiaolu Xiong
Week 7
02/19/2013 MRShare: Sharing Across Multiple Queries in MapReduce (VLDB '10)


Xiaolan Wang &
Pengfei Tang
02/21/2013 Hadoop Analytics & HBase - Part 1


Instructor
Week 8
02/26/2013 Hadoop Analytics & HBase - Part 2

Instructor
02/28/2013 Provenance for Generalized Map and Reduce Workflows (CIDR '11)

Yue Lu &
Pei Zhang
Week 9
03/05/2013 Break

03/07/2013
Week 10
03/12/2013 Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) (VLDB '10)

Jason Kost &
Skyler Whorton
03/14/2013 Overview of Parallel and Distributed Databases



Instructor
Week 11
03/19/2013 Cancelled (Instructor attending EDBT conference)



03/21/2013 Cancelled (Instructor attending EDBT conference)


Week 12
03/26/2013 Column-Oriented Storage Techniques for MapReduce. PVLDB 4(7): 419-429 (2011)


Yuguan Li &
Luyang Zhang
03/28/2013 A Platform for Scalable One-Pass Analytics Using MapReduce. In SIGMOD, pages 985–996, 2011.


Karim Ibrahim &
Anh Pham
Week 13
04/02/2013 Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters. In SIGMOD, pages 1029–1040, 2007.


Jiefeng He &
Xiaolu Xiong
04/04/2013 Starfish: A Self-Tuning System for Big Data Analytics. In CIDR, pages 261–272, 2011.


Carl Erhard &
Zahid Mian
Week 14
04/09/2013 Adaptive MapReduce using situation-aware mappers. EDBT 2012: 420-431


Ya Liu &
Hao Zhou
04/11/2013 Efficient Parallel Set-Similarity Joins Using MapReduce. In SIGMOD, pages 495–506, 2010.


Xiaolan Wang &
Pengfei Tang
Week 15
04/16/2013 Extending Map-Reduce for Efficient Predicate-Based Sampling. In ICDE, pages 486–497, 2012.


Jason Kost &
Skyler Whorton
04/18/2013 Early Accurate Results for Advanced Analytics on MapReduce. Proc. VLDB Endow., 5(10):1028–1039, 2012.


Yue Lu &
Pei Zhang
Week 16
04/23/2013 Only Aggressive Elephants are Fast Elephants. PVLDB, 5(11):1591–1602, 2012


Vijay Sukhadeve &
Jun Fan
04/25/2013 Start of Project 5 Demos