List of currently open MQP projects
I currently work mainly on two areas of database research: (1) Data Integration
[Wikipedia Link] and (2) Health Information Technology [Wikipedia Link],
focusing primarily on health records [Wikipedia Link].
If you are excited about any of these topics, or if you have an idea of your
own and would like my advice or would like me to be your advisor, come and see me.
Details of projects in the two areas, along with some additional miscellaneous
projects, are listed below.
- Data Integration: We explore data integration solutions where XML
is used as the data model for integrating data from different sources. Our
focus at WPI DB research is on query processing and optimization in such
a distributed setting, view maintenance, etc. People who are interested in
this project must have taken DB I and DB II (or equivalent). A good understanding
of algorithms and operating systems is essential, and knowledge of distributed
systems would help.
- Health Information Technology: Our project is based on the premise
that Health IT has the potential to address several of the issues that we face in
health care, including avoiding medication errors, ensuring access to
information, etc. Faculty members from other departments are also involved in
this project, including the Management Department at WPI and UMass Medical School.
- Miscellaneous Projects of Interest
Some other miscellaneous projects of interest to me are below.
- Hidden Web
Consider a site like amazon.com. Its data about books, CDs, etc. is hidden
behind a form interface. We would like to get to this hidden data, which
can then be indexed, searched and integrated with other data sources.
The goal of this project is to research and implement techniques for getting to
this hidden data, and to develop a software tool that does the same.
- Database for Environmental Science Researchers (co-advised with Stanley
Selkow and Betsy Colburn (Harvard Forest)). This project is scheduled to finish
in May 2008, so it may not be offered next year.
A vernal pond is a body of water which lasts through much of the summer but dries
by autumn. Its ecological importance is that fish cannot survive in a vernal
pond, and a number of animals (including certain frogs and salamanders) lay their
eggs and pass their early stages of life only in vernal ponds. Researchers at the
US Environmental Protection Agency and others throughout the Northeast need a
centralized database into which school groups, consultants, regulators,
naturalists, and scientists can put data on the vernal ponds they observe (physical
dimensions, hydrologic characteristics, biota, etc.). This project will produce an
online database which will accumulate the data from many sources.
- Projects on processing XML streaming data (think RSS feeds).
List of completed MQPs
- XPath to SQL translation
Consider XML documents stored in relational databases. This project involves
efficient translation of XPath queries to SQL queries, and also translation of
XML updates to SQL updates. This work has been done by several students: Rich Nordin,
Jennifer Schweers, Rich Tamalavitch, Rich Omar, Greg Labonte, Aaron McDewitt.
- ER to XML Schema translation
Consider designing an XML schema for an application. One comes up with an ER schema,
and then translates this ER schema to an XML schema. This project developed a tool to
convert an ER schema to an XML schema. The tool was developed by Min Son and Jinho Kang.
- SAX-Lite
SAX-Lite is part of our research, where we build a lightweight version of a SAX
parser that is especially useful for XML stream processing. The SAX-Lite parser
will not be a conformant SAX parser (it will not even check whether the XML
document is well-formed). Instead, it will be a simple lexical analyzer that
takes in a stream and emits SAX tokens such as begin element, end
element, text, etc. As part of this project, you will need to implement
SAX-Lite, and compare its performance against other ideas used for
boosting XML streaming performance, such as SIX (Stream Index) from the
University of Washington. Completed by Nikhil Sreenath in Spring 2007.
- Co-advised projects: eBay data mining project; location-based grocery shopping
recommender system
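
To give a feel for the SAX-Lite idea described above (a lexical analyzer that emits begin element, end element and text tokens without checking well-formedness), here is a minimal sketch in Python. It is only an illustration and not the code from the completed MQP: the function name sax_lite_tokens, the token names and the chunk-based input are all assumptions, and attributes, comments and CDATA sections are deliberately not interpreted.

```python
from typing import Iterable, Iterator, Tuple

# (kind, value); the kinds "begin", "end" and "text" are hypothetical names.
Token = Tuple[str, str]


def sax_lite_tokens(chunks: Iterable[str]) -> Iterator[Token]:
    """Yield begin/end/text tokens from a stream of text chunks.

    No well-formedness checks are made: end tags are never matched
    against begin tags, and attributes are simply dropped.
    """
    buf = ""
    for chunk in chunks:
        buf += chunk
        while buf:
            if buf[0] == "<":
                close = buf.find(">")
                if close == -1:
                    break  # tag is incomplete; wait for the next chunk
                body = buf[1:close].strip()
                buf = buf[close + 1:]
                if not body or body.startswith(("?", "!")):
                    continue  # skip empty tags, declarations, comments
                if body.startswith("/"):
                    yield ("end", body[1:].strip())
                elif body.endswith("/"):
                    # Self-closing tag: emit a begin/end pair.
                    name = body[:-1].split()[0]
                    yield ("begin", name)
                    yield ("end", name)
                else:
                    yield ("begin", body.split()[0])
            else:
                lt = buf.find("<")
                if lt == -1:
                    break  # text may continue in the next chunk
                text, buf = buf[:lt], buf[lt:]
                if text.strip():
                    yield ("text", text)
    # Flush trailing text left over when the stream ends.
    if buf.strip() and not buf.startswith("<"):
        yield ("text", buf)
```

For example, list(sax_lite_tokens(["<a><b>hi", "</b></a>"])) yields the begin tokens for a and b, a text token for "hi", and then the two end tokens, even though the input arrives split across chunks. An unbalanced input such as "</x><y>" is tokenized without complaint, which is the trade-off the project description points out.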