

The common themes of the research projects in our group are data mining and knowledge discovery in databases. Knowledge discovery is the process of finding general patterns or principles that summarize or explain a set of "observations". Very large databases have become the norm, so it is no longer feasible for people to mine the data "by hand" for interesting patterns. Automated tools are therefore needed to help extract these patterns. Application domains include astronomical data from the Hubble telescope, consumer preference data collected by credit card companies, medical histories, genomic data, and web usage data.

The knowledge discovery process in databases consists of several steps that can be grouped as follows:
- Data Integration: Collecting the target data observations from the different data sources, removing noise from the observations, and integrating them into an appropriate format.
- Data Mining: Applying a concrete algorithm to find useful and novel patterns in the integrated data.
- Evaluation: Interpreting mined patterns, evaluating them according to usefulness/interestingness criteria, and possibly using visualization tools to aid in understanding the patterns graphically.
Our research projects concentrate mainly on the data mining stage of the knowledge discovery process, though some also address the data integration and pattern evaluation stages.
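As a rough illustration of the three stages listed above, the following minimal sketch runs a toy pipeline over market-basket data. It is not one of the group's tools: the function names (`integrate`, `mine_pairs`, `evaluate`), the sample records, and the use of plain support as the interestingness criterion are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def integrate(sources):
    """Data integration: merge sources, drop noise, normalize the format."""
    transactions = []
    for source in sources:
        for record in source:
            # Normalize case and whitespace; discard blank items (noise).
            items = {item.strip().lower() for item in record if item and item.strip()}
            if items:
                transactions.append(frozenset(items))
    return transactions


def mine_pairs(transactions, min_support):
    """Data mining: find item pairs whose support is at least min_support."""
    counts = Counter()
    for transaction in transactions:
        for pair in combinations(sorted(transaction), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


def evaluate(patterns, top_k):
    """Evaluation: rank patterns by support (a stand-in interestingness measure)."""
    return sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical data from two sources, including one noisy (empty) record.
store_a = [["Bread", "milk"], ["bread", "Milk", "eggs"], ["", " "]]
store_b = [["milk", "eggs"], ["bread", "milk"]]

transactions = integrate([store_a, store_b])
patterns = mine_pairs(transactions, min_support=0.5)
ranked = evaluate(patterns, top_k=2)
print(ranked)
```

Real systems replace each stage with something far more elaborate (for example, a full frequent-itemset algorithm in the mining stage), but the division of labor between the stages stays the same.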
PROJECTS
- Our Novel Data and Sequence Mining Algorithms and Tools
- Association Rule Mining
- Efficient Mining of Association Rules
- Mining Association Rules over Complex Data
- Sequence Mining
- Using Background Knowledge in Data Mining
- Data Mining for Genetic Analysis
- Motif- and Expression-Based Classification of DNA
- Mining Genetic Polymorphisms for Patterns in Human Diseases
- Mining Distance-Based Association Rules for Gene Expression
- Data Mining for Medical Data Analysis
- Data Mining for Electronic Commerce
- Association Rules for Recommender Systems
- Collaborative and Content-Based Filtering using Association Rules
- Collaborative and Content-Based Filtering using Neural Networks
- Data Mining on Other Application Domains
- Web Metasearch
- Evaluation of Data Mining Tools
MEMBERS
Faculty
- Prof. Carolina Ruiz
ruiz@cs.wpi.edu
Office: FL 232
Phone Number: (508) 831-5640
- Other affiliate faculty: Professors Sergio A. Alvarez, Julia Krushkal, Elizabeth Ryder, Matthew Ward.
Graduate Students (Current and Former)
- Stuart Floyd
- Shivin Misra
- Yuan Gao
- Dharmesh Thakkar
- John Hayward
- Senthil K Palanisamy
- Zachary Stoecker-Sylvia
- Keith A. Pray
- Jonathan Freyberger
- Maged El-Sayed
- Parameshvyas Laxminarayan
- Aleksandar Icev
- Wendy Kogel
- Michael Sao Pedro
- Christopher Shoemaker
- Weiyang Lin
- Aparna Varde
- Ali Benamara
- Ji Chen
- Wenhong Fan
- Kavita Kanetkar
- Rohan Parulekar
- Larisa Orlova
- Geraldine Rosario
Undergraduate Students (Current and Former)
- Jonathan Rudolph
- Piotr Mardziel
- Michael McCowan
- James Martineau
- Eduardo Paredes
- Iavor N. Trifonov
- Takeshi Kawato
- Cindy Leung
- Sam Holmes
- John Baird
- Jay Farmer
- Rebecca Gougian
- Ken Monterio
- Paul Young
- Zachary Stoecker-Sylvia
- Kristin Blitsch
- Ben Lucas
- Sarah Towey
- Wendy Kogel
- Brooke LeClair
- Christopher St. Yves
- Brian Murphy
- David Phu
- Ian Pushee
- Frederick Tan
- Daniel Doyle
- Jared Judecki
- James Lund
- Bryan Padovano
- Christopher Cole
- Michael Ciman
- John Gulbrandsen
- Tara Halwes
- Christopher Martino
- Matthew Berube
- Anna Novikov
- Amy Kao
- Dana Rock
PUBLICATIONS
Publications
COURSES
Courses
RESOURCES
Resources
