CS585/DS503: Big Data Management

SPRING 2021: PLEASE GO TO CANVAS ABOUT ACCESS TO THIS COURSE.



Course Site

         http://web.cs.wpi.edu/~cs585/s21

Class Meetings

         Semester: Spring 2021
         Date/Time: Mondays at 6:00pm - 8:50pm.
     
Professor:
          Prof. E. Rundensteiner
         Location: FULLER LABS 135 (during normal times)
         Email address: rundenst (at) wpi.edu  
         Office Hours: See CANVAS.
         Contacting Me: If you email me, please include "CS585" in the subject line;
         otherwise I may miss your inquiry.


Graduate Assistants:
        TA and Office Hours: See CANVAS.    
         Office Hour Location: Online, as will be announced on CANVAS.


Course Overview (Catalog Info)
Emerging applications in science, engineering, business, and leisure generate and collect data at unprecedented speed, scale, and complexity, and this data needs to be managed and analyzed efficiently. This course is designed to introduce students to the emerging techniques and infrastructures developed for big data management. It is for students interested in understanding the ins and outs of big database systems, those interested in getting a solid foundation in the general area of data-intensive processing, those dealing with large-scale data management and analysis in the broader sense, and those interested in database and information systems research or in conducting an MS thesis or a dissertation on a data-related topic. Topics covered include, but are not limited to, distributed database systems, MapReduce infrastructure, Spark, HBase, NoSQL databases, and cloud-based computing. Query processing, optimization, access methods, storage layouts, and scalable analytics techniques developed on these infrastructures may be covered. Students are expected to engage in hands-on projects using one or more of these technologies.


Course Objectives
Objectives of this course include:
   1-  Learn about state-of-the-art techniques in data management systems that you can apply to your future research and practical work.
   2-  Practice reading, reviewing, and presenting technical papers, an essential skill for professionals.
   3-  Work on hands-on projects with different big data infrastructures.


Coursework
The course is organized as a series of seminars presented by the instructor and students. The instructor will present lectures covering state-of-the-art techniques in various topics. Students, typically in teams, will also present papers on relevant big data topics. Students, again in teams, will work on several course projects. A project typically involves implementing some of the techniques covered in class, modifying and extending these techniques, or performing a comparative study of alternative techniques. However, projects do not have to be limited to the covered material. Instead, students are invited to be creative in exploring new, innovative big data technologies. A good project may result in a publishable paper.


Prerequisites

Students are expected to have a strong background in relational database management systems. Prior courses in databases, e.g., CS542, CS4432, or equivalent courses, are strongly recommended. Students are also expected to have strong programming skills in languages such as Java or C++.

 
WPI E-System
In addition to this website, the course is also available at canvas.wpi.edu. Grades, assignments, and lecture slide decks will be posted in CANVAS.


Discussion Board
Please use the discussion board at canvas.wpi.edu for course-related discussion and message exchange. In rare cases, the instructor may also contact you via email.