WPI Worcester Polytechnic Institute

Computer Science Department
------------------------------------------

CS548 Knowledge Discovery and Data Mining 
Schedule of Classes - Fall 2018

PROF. CAROLINA RUIZ 

WARNING: Changes to this schedule may be made during the course of the semester. 
------------------------------------------

WEEK   DATE   DUE   TOPIC   READINGS (Tan, Steinbach, Kumar's Textbook)
1 Aug. 28 & 30   Introduction to KDD & Data Mining
Data & Data Preprocessing
  • Concepts, instances, attributes
  • Data sampling
  • Missing values
    Readings: Chp. 1, 2
    2 Sept. 4
    No class on Sept. 6
      Data & Data Preprocessing (cont.)
  • Attribute discretization
  • Dimensionality reduction: Feature Selection
    Readings: Chp. 2
    3 Sept. 11 & 13   Project 1 & Test 1
    Data & Data Preprocessing (cont.)
  • Dimensionality reduction: Feature Extraction
  • Project 1 discussion and Test 1
    Readings: Online Appendix B.1
    4 Sept. 18 & 20   Classification
  • Decision trees
    Showcase: Decision Trees
    Numeric Predictions
  • linear regression
  • model trees
  • regression trees
    Showcase: Model and Regression Trees
    Readings: Sect. 3.1-3.3, Online Appendix D, and all numeric prediction materials on Ruiz' lecture notes
    5 Sept. 25 & 27 HW Model construction and evaluation
  • Training and Testing
  • Cross validation
  • Performance evaluation
    Readings: Sect. 3.4-3.8
    6 Oct. 2 & 4 HW Model Comparison
    Experimental Design
    Deep Learning Networks
  • Neural Networks
  • Deep Learning
    Readings: Sect. 3.9, 4.7-4.8
    7 Oct. 9 & 11   Project 2 & Test 2, HW
    Deep Learning Networks (cont.)
  • Neural Networks
  • Deep Learning
    Showcase: Neural Networks and Deep Learning
    Project 2 discussion and Test 2
    Readings: Sect. 4.7-4.8
      Oct. 16 & 18   Semester Break  
    8 Oct. 23 & 25   Cluster Analysis
  • partitioning methods
  • hierarchical methods
  • density-based methods
    Showcase: Clustering
    Readings: Chp. 7
    9 Oct. 30 & Nov. 1   Project 3 & Test 3, HW
    Cluster Analysis (cont.)
  • grid-based methods
  • model-based methods
    Project 3 discussion and Test 3
    Readings: Chp. 7
    10 Nov. 6 & 8 HW Anomaly Detection
  • statistical approaches
  • proximity-based approaches
  • density-based approaches
  • reconstruction-based approaches
    Showcase: Anomaly Detection
    Readings: Chp. 9
    11 Nov. 13 & 15   Project 4 & Test 4, HW
    Advanced topics
  • Text mining
    Showcase: Text Mining
    Project 4 discussion and Test 4
    Readings: text mining materials marked with ** on Ruiz' lecture notes
      Nov. 20 HW Advanced topics (cont.)
  • Sequence mining
    12 Nov. 27 & 29   Advanced topics (cont.)
  • Data Visualization
  • Multimedia data mining
    Showcase: Data Visualization
    Readings: Sect. 7.4 and all visualization materials on Ruiz' lecture notes
    13 Dec. 4 & 6   Project 5 & Test 5
    Advanced topics (cont.)
  • Web mining
  • Ethical and Societal Implications of Data Mining
    Showcase: Web Mining
    Project 5 presentations and Test 5
    Readings: all web mining materials on Ruiz' lecture notes
    14 Dec. 11 & 13   Project 5 presentations and discussion (cont.)