WPI Worcester Polytechnic Institute

Computer Science Department


------------------------------------------

CS 525M KNOWLEDGE DISCOVERY AND DATA MINING  
PROJECT 6 - Web Mining. Fall 2001

PROF. CAROLINA RUIZ 

DUE DATE: This project is due on Tuesday Dec. 11, 2001 at 1 pm. 
------------------------------------------


PROJECT DESCRIPTION

The purpose of this project is to find patterns in web access data, with the goal of predicting which pages of a website a typical user will visit based on the other pages of the same website that the person has already visited. For this project, the Microsoft Anonymous Web Data will be used. This dataset is available at the UCI KDD Repository.

PROJECT ASSIGNMENT

The precise prediction task is described in the "classification/collaborative filtering task" that accompanies the dataset. For this prediction task, you are allowed to employ any of the data mining techniques that we have studied during the semester, or (better yet!) a combination of them. As usual, the more ideas you explore and the more robust your experimentation is, the better your grade on the project will be.
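
As a starting point for thinking about the prediction task, the following is a minimal sketch of one possible baseline: scoring unvisited pages by how often they co-occur with the pages a user has already visited. It is not the method used by the dataset's authors, and it is not a required approach. The session data, the page names, and the function names build_cooccurrence and recommend are all hypothetical and are used only for illustration; the real dataset would first have to be parsed into per-user lists of visited pages.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(sessions):
    """Count page visits and, for each ordered pair (a, b), users who visited both."""
    page_counts = Counter()
    pair_counts = defaultdict(Counter)
    for pages in sessions:
        unique = set(pages)  # each page counts once per user
        page_counts.update(unique)
        for a, b in permutations(unique, 2):
            pair_counts[a][b] += 1
    return page_counts, pair_counts


def recommend(visited, page_counts, pair_counts, top_n=3):
    """Rank unvisited pages by the sum over visited pages a of count(a, b) / count(a)."""
    visited_set = set(visited)
    scores = Counter()
    for a in visited_set:
        if a not in pair_counts:
            continue  # page never seen in the training sessions
        for b, n in pair_counts[a].items():
            if b not in visited_set:
                scores[b] += n / page_counts[a]
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, _ in ranked[:top_n]]


# Hypothetical toy sessions (one list of visited pages per user).
sessions = [
    ["home", "products", "support"],
    ["home", "products"],
    ["home", "support", "downloads"],
    ["products", "downloads"],
]

page_counts, pair_counts = build_cooccurrence(sessions)
print(recommend(["home"], page_counts, pair_counts))
```

Each term count(a, b) / count(a) is the confidence of the one-item association rule a => b, so this baseline can be compared directly against the association-rule, classification, or clustering techniques studied during the semester.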

The dataset is also accompanied by references to Breese, Heckerman, and Kadie's work on this dataset, and you're encouraged to read their paper and/or the Microsoft Technical Report that is available on the dataset's webpage.

Students are free to work individually on this project or in groups of two. If you decide to work with another student in the class on this project, you are required to let me know by email by Thursday, Dec. 6th (midnight).

The following are guidelines for the analysis of the data:


REPORT AND DUE DATE