Worcester Polytechnic Institute (WPI)

Computer Science Department
------------------------------------------

CS548 Knowledge Discovery and Data Mining 
Schedule of Classes - Spring 2016

PROF. CAROLINA RUIZ 

WARNING: Changes to this schedule may be made during the course of the semester. 
------------------------------------------

WEEK, DATE, DUE, TOPIC, READINGS (Tan, Steinbach, and Kumar's textbook)
Week 1 (Jan. 19)
Topic: Introduction to KDD & Data Mining
Topic: Data & Data Preparation
  • Concepts, instances, attributes
  • Data preprocessing
  • Attribute selection
Readings: Chp. 1, 2
Week 2 (Jan. 26)
Topic: Data & Data Preparation (cont.)
  • Data integration
  • Data warehousing & OLAP
  • Dimensionality reduction
Readings: Chp. 3, Appendix B.1
Week 3 (Feb. 2). Due: Project 1
Topic: Mining process
  • Training and testing
  • Cross validation
  • Performance evaluation
  Project 1 discussion and Test 1
Readings: Sect. 4.5
Week 4 (Feb. 9)
Topic: Classification
  • Decision trees
  Showcase: Decision Trees
Readings: Sect. 4.1-4.4
Week 5 (Feb. 16)
Topic: Numeric Predictions
  • Linear regression
  • Model trees
  • Regression trees
  Showcase: Model and Regression Trees
Readings: Appendix D, and all numeric prediction materials in Ruiz' lecture notes
Week 6 (Feb. 23)
Topic: Association Analysis
  • Association rules
Readings: Sect. 6.1-6.3, 6.7-6.9
Week 7 (Mar. 1). Due: Project 2
Topic: Association Analysis (cont.)
  • Association rules
  Project 2 discussion and Test 2
  Showcase: Association Rules
Readings: Chp. 6
Mar. 8: Spring Break (this date may be needed to make up for weather-related cancellations)
Week 8 (Mar. 15)
Topic: Cluster Analysis
  • Partitioning methods
  • Hierarchical methods
  • Density-based methods
  Showcase: Clustering I
Readings: Chp. 8
Week 9 (Mar. 22). Due: Project 3
Topic: Cluster Analysis (cont.)
  • Grid-based methods
  • Model-based methods
  Project 3 discussion and Test 3
  Showcase: Clustering II
Readings: Chp. 8
Week 10 (Mar. 29)
Topic: Anomaly Detection
  • Model-based methods
  • Proximity-based methods
  • Density-based methods
  Showcase: Anomaly Detection
Readings: Chp. 10
Week 11 (Apr. 5). Due: Project 4
Topic: Advanced topics
  • Visualization
  • Text mining
  Project 4 discussion and Test 4
  Showcase: Text Mining
Readings: Sect. 3.3
  all visualization materials in Ruiz' lecture notes
  text mining materials marked with ** in Ruiz' lecture notes
Week 12 (Apr. 12)
Topic: Advanced topics (cont.)
  • Sequence mining
  • Multimedia data mining
  Showcase: Sequence Mining
Readings: Sect. 7.4
Week 13 (Apr. 19). Due: Project 5
Topic: Advanced topics (cont.)
  • Web mining
  • Industrial applications of data mining
  • Scientific applications of data mining
  Project 5 presentations and Test 5
  Showcase: Web Mining
Readings: all web mining materials in Ruiz' lecture notes
Week 14 (Apr. 26)
Topic: Project 5 presentations and discussion (cont.)
Week 15 (May 3)
Topic: To be announced