
WARNING:
Changes to this schedule may be made during the course of the semester.

See below:
- Sign up for a showcase topic of your interest.
Check space availability for each topic in the schedule below.
- Work together with the group of students assigned to the same topic
to identify a real-world application of the data mining topic
you are assigned to.
- Discuss your chosen data mining application with the professor
at least 2 weeks in advance of the presentation.
You need to get the professor's approval of your selected application
before you start preparing your presentation.
- The team should investigate the application in depth and
prepare and deliver a 15-minute in-class presentation describing this application in as much detail as possible, focusing on its data mining aspects.
- Your presentation should contain the following sections:
- A cover page with the following title and subtitle,
replacing the parts in angle brackets with the information for your particular showcase:
CS548 Spring 2016 <Data Mining Technique> Showcase by
< students' names >
Showcasing work by < application authors or company > on
<"Title or name of the application you are showcasing" >
- A list of references and resources that you used for your presentation.
This should be included right after the cover page.
If you used articles and research papers, include the full reference,
not just a link to the articles.
For this, follow the IEEE citation style formatting rules.
Use this style to reference books, journal articles, conference articles, online references, and other published or unpublished work.
The richer your set of references, the better.
- A detailed description of the application.
- Email the following materials to the professor
at least 48 hours in advance of your class presentation.
- Your presentation slides.
Please name your presentation slides as follows:
CS548S16_Showcase_<Data Mining Technique>.<file extension>
If at all possible, please send us the slides in an editable format (e.g., pptx) so that we can make small edits if needed.
- A short description of your application (3-4 sentences) to be included
in this webpage under "Short Description" in your showcase entry below.
- Rehearse your oral presentation to make sure it is polished,
transitions between speakers work well, and the full presentation
stays within the time allowed (15 minutes).
-
Feb. 9: Decision Trees
- Students: Brandon Boos and Yi Jiang.
- Application Topic/Title:
A decision tree method for building energy demand modeling
- Short description:
Buildings are a substantial consumer of energy worldwide, and their consumption has been steadily increasing. This paper applies a decision tree approach to modeling building energy consumption data. The resulting decision tree can enable building designers and owners to make energy-conscious choices based on data.
- Slides:
CS548S16_Showcase_Decision_Trees
-
Feb. 16: Model and Regression Trees
- Students: Thanaporn "March" Patikorn, Yanran Ma, Boya Zhou.
- Application Topic/Title:
"Real Time Head Pose Estimation with Random Regression Forests"
- Short description:
This paper introduces a system that can detect head pose in real time. The detection algorithm uses random regression forests, which combine several regression trees to make predictions. Since images do not have typical features for tree construction, the trees were trained from sets of randomly generated rules that maximize information gain.
- Slides:
CS548S16_Showcase_Model_and_Regression_Trees
-
Mar. 1: Association Rules
- Students: Yuting Liang, Shijie Jiang, Zheng Nie.
- Application Topic/Title: Web usage mining to improve the design of an e-commerce website: OrOliveSur.com
- Short description:
This paper describes the application of association rules in web usage mining for optimizing the design of the e-commerce website OrOliveSur.com.
- Slides:
CS548S16_Showcase_Association_Rules
-
Mar. 15: Clustering I
- Students: Yuxuan Xia, Shanhao Li, Xiaozhou Zou.
- Application Topic/Title:
The application of hierarchical cluster analysis and non-negative matrix factorization to European atmospheric monitoring site classification
- Short description:
This paper assesses whether the two Level II atmospheric monitoring sites, named Auchencorth and Harwell, are good representatives of all atmospheric monitoring sites in the UK. To achieve this goal, it performs hierarchical cluster analysis on the 4-year averaged monthly-diurnal ozone concentration dataset and sorts the sites within a cluster by how much the air condition is influenced by human activity.
- Slides:
CS548S16_Showcase_Clustering_I
-
Mar. 22: Clustering II
- Students: Alexander Witt, Chengle (Gary) Zhang.
- Application Topic/Title:
"Active authentication using scrolling behaviors"
- Short description:
In this showcase, we present a research paper that evaluates the application of different classification and clustering techniques to the task of active user authentication. The work used temporal data indicating when a user scrolled or resized a window within a web browser while reading a read-only PDF document. The authors found that k-means clustering performed better than any of the classification approaches they evaluated.
- Slides:
CS548S16_Showcase_Clustering_II
-
Mar. 29: Anomaly Detection
- Students: Max Levine, Jie Gao, Jeff Bibeau.
- Application Topic/Title:
"Outlier detection for patient monitoring and alerting"
- Short description:
Preventable medical errors often go unnoticed in a clinical setting. To alert health care professionals to these errors, a series of action-specific anomaly detection models was generated. These models employed conditional probability and support vector machines for anomaly detection. This detection scheme was reviewed by health care professionals and deemed helpful in finding medical errors.
- Slides:
CS548S16_Showcase_Anomaly_Detection
-
Apr. 5: Text Mining
- Students:
Haiyan Liang, Huayi Zhang
- Application Topic/Title:
"Mine Your Own Business: Market-Structure Surveillance Through Text Mining"
- Short description:
This paper proposes a solution by which marketing
researchers can listen to consumers' ongoing
discussions over the Web with the goal of converting
online discussions to market-structure insights.
The paper uses text mining to overcome the difficulties
involved in extracting and quantifying the wealth of
online data that consumers generate, and it uses network
analysis tools to convert the mined relationships
into co-occurrence among brands or between brands
and terms.
- Slides:
CS548S16_Showcase_Text_Mining
-
Apr. 12: Sequence Mining
- Students: Michaela Kachadoorian, Scott Judson, Daniel Duhaney.
- Application Topic/Title:
"Using consumer behavior data to reduce energy consumption in smart homes"
- Short description:
Widespread adoption of smart home technology requires that smart homes be able to learn the habits and patterns of the occupants without explicit programming. The paper we present in this showcase uses sequential pattern mining of smart home event data to learn the occupants' behavior and make recommendations that will reduce energy use while maintaining comfort.
- Slides:
CS548S16_Showcase_Sequence_Mining
-
Apr. 19: Web Mining
- Students: Cansu Sen, Nichole Etienne, Cut Famelia.
- Application Topic/Title:
"A Practical Approach for Content Mining of Tweets"
- Short description:
In this showcase, we cover areas of web mining related to the social media network Twitter. Twitter has been used to gain real-world insights to promote healthy behaviors. The purpose of the showcased project was to describe a practical approach to analyzing the content of Tweets and to illustrate an application of the approach to the topic of physical activity. The approach included five steps: (1) selecting keywords to gather an initial set of Tweets to analyze; (2) importing data; (3) preparing data; (4) analyzing data; and (5) interpreting data.
- Slides:
CS548S16_Showcase_Web_Mining