CS585/DS503: Big Data Management


Course Site

         http://web.cs.wpi.edu/~cs585/s19

Class Meetings

         Semester: Spring 2019
         Room: Fuller Labs 320 (NEW)
         Date/Time: Mondays, 6:00pm - 8:50pm
     

Professor:
          Prof. E. Rundensteiner
         Location: FULLER LABS 135
         Email address: rundenst (at) wpi.edu  
         Office Hours: Mondays, right after class (see CANVAS)
         Contacting Me:  If you email me, do make sure to use "CS585" in the subject line.
         I will NOT be able to respond to your inquiry otherwise.


Graduate Assistants:
        TA and Office Hours: See CANVAS.    
         Office Hour Location: Data Science Innovation Lab (AK013) in Atwater Kent (basement level).


Course Overview (Catalog Info)
Emerging applications from science, engineering, and business to leisure generate and collect data at unprecedented speed, scale, and complexity, and this data needs to be managed and analyzed efficiently. This course is designed to introduce students to the emerging techniques and infrastructures developed for big data management. It is for students interested in understanding the ins and outs of big database systems, those interested in getting a solid foundation in the general area of data-intensive processing, those dealing with large-scale data management and analysis in the broader sense, and those interested in database and information systems research or in conducting an MS thesis or a dissertation on a data-related topic. Topics covered include but are not limited to distributed database systems, MapReduce infrastructure, Spark, HBase, NoSQL databases, and cloud-based computing. Query processing, optimization, access methods, storage layouts, and scalable analytics techniques developed on these infrastructures may be covered. Students are expected to engage in hands-on projects using one or more of these technologies.


Course Objectives
Objectives of this course include:
   1-  Learn about state-of-the-art techniques in data management systems that you can apply to your future research and practical work.
   2-  Practice reading, reviewing, and presenting technical papers, an essential skill for professionals.
   3-  Work on hands-on projects with different big data infrastructures.


Coursework
The course is organized as a series of seminars presented by the instructor and students. The instructor will present lectures covering state-of-the-art techniques in various topics. Students, typically in teams, will also present papers on relevant big data topics. Students, again in teams, will work on several course projects. A project typically involves implementing some of the techniques covered in class, modifying and extending these techniques, or performing a comparative study of alternative techniques. However, projects do not have to be limited to the covered material. Instead, students are invited to be creative in exploring new, innovative big data technologies. A good project may result in a publishable paper.


Prerequisites

Students are expected to have a strong background in relational database management systems. Prior courses in databases, e.g., CS542, CS4432, or equivalent courses, are strongly recommended. Students are also expected to have strong programming skills in languages such as Java or C++.

 
WPI E-System
In addition to this website, the course is also available at canvas.wpi.edu. Grades, assignments, and lecture slide decks will be posted in CANVAS.


Discussion Board
Please use the discussion board at canvas.wpi.edu for course-related discussion and exchange of emails. In addition, in rare cases, the instructor may contact you via email.