CS585/DS503: Big Data Management

Course Site

         http://web.cs.wpi.edu/~cs585/f17

Class Meetings

         Semester: Fall 2017
         NEW!!! Room: Washburn Bldg: WB229
         Date/Time: Monday/Wednesday at 4:00pm - 5:20pm.
     

Professor:
          Prof. E. Rundensteiner
         Location: FULLER LABS 135
         Email address: rundenst (at) cs.wpi.edu  
         Office Hours: Mon/Wed at 5:20pm-6:00pm (right after class). Check Canvas for adjustments, if any.
         Contacting Me:  If you email me, do make sure to use "CS585" in the subject line.
         I will do my best to respond to you within 24 hours, when possible.


Graduate Assistants:
        Wen Liu :     wliu3 (at) wpi.edu
         Office Hour Times: Tuesday 2:15 - 3:15PM and Thursday 1:30 - 2:30PM.
         Location: Data Science Innovation Lab (AK013) in Atwater Kent (basement level).


Course Overview (Catalog Info)
Emerging applications, from science, engineering, and business to leisure, generate and collect data at unprecedented speed, scale, and complexity, and that data must be managed and analyzed efficiently. This course is designed to introduce students to the emerging techniques and infrastructures developed for big data management. It is for students interested in understanding the ins and outs of big database systems, those interested in getting a solid foundation in the general area of data-intensive processing, those dealing with large-scale data management and analysis in the broader sense, and those interested in database and information systems research and in conducting an MS thesis or a dissertation on a data-related topic. Topics covered include but are not limited to distributed database systems, MapReduce infrastructure, Spark, HBase, NoSQL databases, and cloud-based computing. Query processing, optimization, access methods, storage layouts, and scalable analytics techniques developed on these infrastructures may be covered. Students are expected to engage in hands-on projects using one or more of these technologies.


Course Objectives
Objectives of this course include:
   1-  Learn about state-of-the-art techniques in data management systems that you can apply to your future research and practical work.
   2-  Practice reading, reviewing, and presenting technical papers, which is known to be an essential skill for professionals.
   3-  Work on hands-on projects with different big data infrastructures.


Coursework
The course is organized as a series of seminars presented by the instructor and students. The instructor will present lectures covering state-of-the-art techniques in various topics. Students, typically in teams, will also present papers on relevant big data topics. Students, again in teams, will work on several course projects. A project typically involves implementing some of the techniques covered in class, modifying and extending these techniques, or performing a comparative study of alternative techniques. However, projects do not have to be limited to the covered material. Instead, students are invited to be creative in exploring new, innovative big data technologies. A good project could possibly result in a publishable paper.


Prerequisites

Students are expected to have a strong background in and knowledge of relational database management systems. Prior courses in databases, e.g., CS542, CS4432, or equivalent courses, are strongly recommended. Students are also expected to have strong programming skills in languages such as Java or C++.

 
WPI E-System
In addition to this website, the course is also available at canvas.wpi.edu. Grades, assignments, and lecture slide decks will all be linked in Canvas.


Discussion Board
Please use the discussion board available at canvas.wpi.edu for any course-related discussion and exchange of emails. In addition, in rare cases, the instructor may contact you via the dedicated mailing list for "cs585".