Matthew Ward's Thesis Project Topics
- Visual Data Mining:
Data mining involves exploring databases to discover data relationships
which are not explicitly stored within the databases. Traditional techniques
involve statistical analysis, clustering, and pattern matching. Many
efforts are underway to integrate visualization into this process. This
project involves examining the benefits that would result from using
multivariate data visualization in conjunction with analytic methods
to explore databases.
- Perceptual Benchmarking:
There have been a large number of techniques developed over time for the
display of multivariate data, yet little work has been done on evaluating
the relative effectiveness of each technique. It is conjectured that the
usefulness of each method depends both on the characteristics of the data
(size, number of parameters) and the perception task at hand
(detect/classify, patterns/clusters/anomalies). This project involves the
development of benchmarks for comparing the effectiveness of visualization
techniques and running experiments with human subjects to assess the
quality of the benchmarks.
- Extensions to XmdvTool:
XmdvTool
is a package developed by myself and Allen Martin (MS '95) for the
visualization and exploration of multivariate data using a variety of
projection and interaction methods. Additional work on this tool can focus on
a number of areas, including the integration of animation, specialized tools
for each of the projection methods (similar to work done by Jeff LeBlanc MS '91
and Rajeev Tipnis MS '92), tailoring visualizations based on data semantics,
an intelligent data configuration front-end, and customizable "smart" probes.
A current NSF grant is focused on extending the visualization techniques in
XmdvTool to handle very large, hierarchical data sets.
- Extensions to MAVIS:
MAVIS
is a program written by Chris Bentley (MS '96) which uses a statistical
technique known as Multidimensional Scaling (MDS) to display multivariate data.
MDS is an iterative refinement method for positioning n-dimensional data into
a lower dimensional space, and MAVIS animates this process in 1-D, 2-D, or 3-D.
It supports numerous ways of visualizing the evolution over time via animation
and flow visualization, and provides numerous ways of controlling the process.
Additional work on this tool could include incorporating and comparing other
dimensionality reduction techniques, adding a clustering capability to allow
hierarchical processing, and experimenting with other flow visualization
methods.
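To make the idea of iterative refinement concrete, here is a minimal sketch of MDS as plain gradient descent on the squared difference between original and projected distances. It is not the algorithm used in MAVIS; the function names (pairwise_distances, stress, mds_steps) and the parameters (rate, iterations, seed) are assumptions made for illustration. Each yielded layout corresponds to one frame that an animation could draw.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length coordinate tuples."""
    return [[math.dist(p, q) for q in points] for p in points]


def stress(target, layout):
    """Sum of squared differences between target and layout distances."""
    current = pairwise_distances(layout)
    n = len(layout)
    return sum((current[i][j] - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def mds_steps(data, out_dim=2, iterations=200, rate=0.05, seed=0):
    """Yield the initial layout and then one refined layout per iteration."""
    rng = random.Random(seed)
    target = pairwise_distances(data)
    n = len(data)
    # Start from random positions in the low-dimensional space.
    layout = [[rng.uniform(-1.0, 1.0) for _ in range(out_dim)] for _ in range(n)]
    yield [tuple(p) for p in layout]
    for _ in range(iterations):
        moves = [[0.0] * out_dim for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(layout[i], layout[j])
                if d == 0.0:
                    continue  # coincident points give no usable direction
                # Positive error: the pair is too far apart, so pull i toward j.
                error = d - target[i][j]
                for k in range(out_dim):
                    moves[i][k] -= rate * error * (layout[i][k] - layout[j][k]) / d
        for i in range(n):
            for k in range(out_dim):
                layout[i][k] += moves[i][k]
        yield [tuple(p) for p in layout]


if __name__ == "__main__":
    # Four 3-D points (a unit square in the z = 0 plane), projected to 2-D.
    data = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
    target = pairwise_distances(data)
    frames = list(mds_steps(data))
    print("initial stress:", stress(target, frames[0]))
    print("final stress:  ", stress(target, frames[-1]))
```

Setting out_dim to 1, 2, or 3 gives the three display dimensionalities mentioned above. A fixed step size is the simplest choice and may settle in a local minimum.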
- Visualizing Nominal Data:
Most visualization techniques cater to ordinal data, i.e. data with values that
have an order associated with them. This is because most graphical attributes
are ordinal in nature (e.g. size, position, intensity). However, a great deal
of data is of a nominal type, such as categorical data. The problem with
visualizing this in traditional ways is that the ordinal nature of the
graphical attribute may introduce errors in the interpretation of the
visualization. This project will focus on exploring methods for visually
presenting nominal data which minimize or eliminate the distortion or error
introduced by the graphical mapping/perception process.
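The distortion can be seen in a small sketch: when nominal values are placed along an axis, the arbitrary ordering chosen decides how "far apart" two categories appear. The function names and the example categories below are hypothetical and serve only to illustrate the problem.

```python
def axis_positions(order):
    """Map each category to an axis position given an arbitrary ordering."""
    return {category: index for index, category in enumerate(order)}


def apparent_gap(positions, a, b):
    """Distance a viewer would perceive between two categories on the axis."""
    return abs(positions[a] - positions[b])


if __name__ == "__main__":
    # Two equally valid orderings of the same nominal values.
    first = axis_positions(["bus", "car", "train"])
    second = axis_positions(["bus", "train", "car"])
    # The perceived gap between "bus" and "train" depends only on the ordering.
    print(apparent_gap(first, "bus", "train"))
    print(apparent_gap(second, "bus", "train"))
```

Neither gap reflects anything in the data, which is exactly the interpretation error described above.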
- Extensions to XSauci:
XSauci is a package developed by myself, Dave Nedde (MS '91), and Maureen
Higgins (MS BB '92) for the display of information regarding genetic sequences
(John Rasku, MS '93, used similar methods for analyzing shapes). Techniques
supported include correlation images (which plot a matrix showing matching
sequence elements) and density/distribution charts. Additional work on this
tool can focus on integrating more elaborate matching algorithms (which can
deal with fuzzy matching, substitutions, and gaps), tying in some quantitative
analysis tools (e.g. dynamic programming, statistical methods), and enhancing
the visual presentation of the data.
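As a rough illustration of a correlation image, the sketch below builds a matrix with a 1 wherever an element of one sequence exactly matches an element of the other, and renders it as text. It is not XSauci code; the function names and the sample sequences are assumptions, and only exact matching is shown (no fuzzy matching, substitutions, or gaps).

```python
def correlation_image(seq_a, seq_b):
    """Return a matrix with 1 where seq_a[i] equals seq_b[j], else 0."""
    return [[1 if a == b else 0 for b in seq_b] for a in seq_a]


def render(matrix, seq_a, seq_b):
    """Draw the matrix as text: '*' marks a match, '.' marks a mismatch."""
    lines = ["  " + seq_b]
    for symbol, row in zip(seq_a, matrix):
        lines.append(symbol + " " + "".join("*" if cell else "." for cell in row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical short sequences; shared runs show up as diagonal streaks.
    seq_a = "GATTACA"
    seq_b = "GATCACA"
    print(render(correlation_image(seq_a, seq_b), seq_a, seq_b))
```

Runs of matching elements appear as diagonal lines, which is what makes this kind of plot useful for spotting similar regions by eye.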
Matthew O. Ward (matt@cs.wpi.edu)