
Animating Multidimensional Scaling to Visualize Large N-Dimensional Data Sets

Chris L. Bentley
Computer Science Department
Worcester Polytechnic Institute


Goal: Present enhancements to Multidimensional Scaling (MDS) that extend the technique to large data sets (tex2html_wrap_inline219 points) with many dimensions (up to 50).


What is N-dimensional data?

We are familiar with two-dimensional data defined by coordinate pairs (x,y). The Cartesian plane makes the axes explicit and depicts the entire data space. We can infer the coordinate values of each point from the axes.

A two-column table makes the values explicit, but leaves us to imagine the data space.

N-dimensional data consists of points in a space with N coordinate axes

In table form, N-dimensional data is an mxn matrix, with m data points and n = N dimensions. Each row i defines a data point and is an n-vector. Each column is an axis or dimension.
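
As a small illustration of the table form, the sketch below stores a data set as a list of rows. The values and the names `data`, `point_2`, and `axis_1` are made up for this example and do not come from the slides.

```python
# Hypothetical example data: m = 4 points, n = 3 dimensions.
data = [
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.4],
    [6.3, 3.3, 6.0],
    [5.8, 2.7, 5.1],
]

m = len(data)       # number of rows = number of data points
n = len(data[0])    # number of columns = number of dimensions

point_2 = data[2]                   # row i = 2 is one data point (an n-vector)
axis_1 = [row[1] for row in data]   # column j = 1 is one axis (dimension)
```

Reading along a row gives one point; reading down a column gives every point's value on one axis.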

Analysing N-Dimensional Data

How do we find information in an mxn matrix of numbers?

Visualizing N-Dimensional Data

How do we picture data with tex2html_wrap177 dimensions? There are four basic approaches

Reducing Dimensionality

Goal: to reduce mxn matrix to mxp, where tex2html_wrap178

Several methods can be used:

Multidimensional Scaling: Central Idea

Is a perfect 2D configuration always possible? No.
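
One way to see why the answer is no: four points that are all at distance 1 from each other exist in 3D (the corners of a regular tetrahedron), but no four points in the plane are mutually equidistant. The sketch below checks a single attempted 2D layout, chosen here purely for illustration, and measures how far its distances fall from the target.

```python
import math
from itertools import combinations

# Target: all six pairwise distances equal 1 (a regular tetrahedron in 3D).
target = 1.0

# One attempted 2D layout (an assumption for this example): an equilateral
# triangle with side 1, plus the fourth point at the triangle's centroid.
layout = [
    (0.0, 0.0),
    (1.0, 0.0),
    (0.5, math.sqrt(3) / 2),
    (0.5, math.sqrt(3) / 6),
]

# Absolute error of each of the six pairwise 2D distances.
errors = [abs(math.dist(p, q) - target) for p, q in combinations(layout, 2)]
worst_error = max(errors)
```

The three triangle sides match the target, but the three distances to the centroid are too short. Other layouts move the error around without removing it.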



Minimizing Stress

Figure shows the vectors acting on point A, and the cumulative vector along which point A will be moved
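
The update described by the figure can be sketched as follows. This is a minimal illustration under assumptions, not the author's implementation: the function names, the `step` parameter, and the choice to weight each vector by the raw distance error are all hypothetical.

```python
import math

def update_vector(a, others, targets):
    """Sum the vectors that pull or push point `a` toward its target distances.

    a       : (x, y) position of point A in the 2D display
    others  : list of (x, y) positions of the other points
    targets : target (N-dimensional) distance from A to each other point
    """
    total_x, total_y = 0.0, 0.0
    for (bx, by), target in zip(others, targets):
        dx, dy = bx - a[0], by - a[1]
        current = math.hypot(dx, dy)
        if current == 0.0:
            continue  # coincident points give no direction; skip them
        # Positive error: A is too far from B, so pull A toward B.
        # Negative error: A is too close to B, so push A away from B.
        error = current - target
        total_x += error * dx / current
        total_y += error * dy / current
    return total_x, total_y

def move_point(a, others, targets, step=0.1):
    """Move A a fraction `step` of the way along the cumulative vector."""
    vx, vy = update_vector(a, others, targets)
    return a[0] + step * vx, a[1] + step * vy
```

For example, `move_point((0.0, 0.0), [(2.0, 0.0)], [1.0], step=0.5)` moves A halfway toward the position where its distance to the other point would match the target.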

Measuring MDS
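
The slides do not preserve the formula used here. As an assumption, the sketch below uses one common normalized stress measure (in the style of Kruskal's stress-1, applied directly to the target distances): the square root of the summed squared distance errors divided by the summed squared 2D distances. The function name `stress` and its argument layout are hypothetical.

```python
import math
from itertools import combinations

def stress(layout, target):
    """Normalized stress of a 2D layout against target distances.

    layout : list of (x, y) display positions
    target : target[i][j] is the N-dimensional distance between points i and j

    Assumes at least two points that do not all coincide; otherwise the
    denominator is zero.
    """
    squared_error = 0.0
    squared_total = 0.0
    for i, j in combinations(range(len(layout)), 2):
        d = math.dist(layout[i], layout[j])
        squared_error += (d - target[i][j]) ** 2
        squared_total += d ** 2
    return math.sqrt(squared_error / squared_total)
```

A value of 0 means every 2D distance equals its target. Larger values mean the display distorts the original distances more.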

Problems With MDS

Solution #1: Subset-MDS

How to extend MDS so that it can be used with large data sets?


Subset-MDS: Selecting Distances

Subset-MDS could select k random distances for each point

Or, Subset-MDS could retain the closest k/2 and the farthest k/2 distances for each point
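
The two selection strategies above can be sketched as follows. The function names and the dictionary representation of one point's distances are assumptions made for this example.

```python
import random

def random_subset(distances_from_i, k, rng=random):
    """Pick k of the other points uniformly at random.

    distances_from_i : dict mapping other-point index -> distance from point i
    rng              : the random module or a random.Random instance
    """
    return sorted(rng.sample(sorted(distances_from_i), k))

def near_far_subset(distances_from_i, k):
    """Keep the k/2 closest and the k/2 farthest other points.

    If k is odd, k // 2 points are taken from each end (k - 1 in total).
    """
    ranked = sorted(distances_from_i, key=distances_from_i.get)
    if k < 2 or k > len(ranked):
        raise ValueError("k must be between 2 and the number of other points")
    half = k // 2
    return sorted(ranked[:half] + ranked[-half:])
```

Either function returns the indices of the points whose distances to point i are retained. Only those distances would then contribute to the update of point i.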

Subset-MDS vs. MDS

Solution #2: Animation + Randomness

Random Perturbation

Random noise can be added to 2D display point coordinates

Or point updates can be scaled so that points overshoot their low-stress positions
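
Both kinds of perturbation can be sketched in a few lines. The function names, the uniform noise distribution, and the default `scale` of 1.5 are assumptions chosen for illustration; the slides do not specify them.

```python
import random

def jitter(point, amount, rng=random):
    """Add uniform random noise in [-amount, amount] to each 2D coordinate."""
    return (point[0] + rng.uniform(-amount, amount),
            point[1] + rng.uniform(-amount, amount))

def overshoot(point, update, scale=1.5):
    """Apply an update vector multiplied by `scale`.

    With scale > 1 the point travels past the position the plain update
    would have reached.
    """
    return (point[0] + scale * update[0], point[1] + scale * update[1])
```

For example, `overshoot((0.0, 0.0), (1.0, 2.0))` lands at `(1.5, 3.0)` instead of `(1.0, 2.0)`.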

It may be necessary to interpolate between successive positions
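
A minimal sketch of such interpolation, assuming simple linear interpolation and a fixed number of animation frames (the slides do not say which scheme is used; the name `interpolate` is hypothetical):

```python
def interpolate(old, new, frames):
    """Return `frames` positions stepping linearly from `old` to `new`.

    The last returned position equals `new`; `old` itself is not included.
    """
    return [
        (old[0] + (new[0] - old[0]) * t / frames,
         old[1] + (new[1] - old[1]) * t / frames)
        for t in range(1, frames + 1)
    ]
```

Drawing each intermediate position in turn lets the viewer follow a point across a large jump instead of seeing it disappear and reappear elsewhere.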



Future Work


Matthew Ward
Thu Oct 10 13:28:20 EDT 1996