CS525D - Data Visualization - Spring, 2014
Prof. Matthew Ward
FL231, 831-5671, matt@cs.wpi.edu
Office Hours: Tuesday at 2, Thursday at 1, Friday at 10,
others by appointment
Overview:
Visualization is the graphical communication of data and information for the
purposes of presentation, confirmation, and exploration. For thousands of
years, images have been used to convey numbers, concepts, and relationships
using techniques such as maps, icons, graphs, and other visual forms. In the
past 2 decades, visualization has evolved into a discipline, drawing from such
fields as graphics, human-computer interaction, perceptual psychology, and
art.
The goal of this course is to expose students to the field of data
visualization and familiarize them with the stages of the visualization
pipeline, including data modeling, mapping data attributes to graphical
attributes, existing visualization techniques, tools, and paradigms, perceptual
issues, and evaluating the effectiveness of visualizations for specific data,
task, and user types.
Textbook:
Interactive Data Visualization: Foundations, Techniques, and Applications, by
M. Ward, G. Grinstein, and D. Keim. ISBN 9781568814735.
Additional Resources: All documents for the course will be made available
at the website http://www.cs.wpi.edu/~matt/courses/cs525D/. Also, I have an
extensive library of books and conference proceedings on visualization (see the
list below). If there is any topic that you'd like to delve deeper into, or
for which you'd like clarification or alternate viewpoints, feel free to borrow
anything from my collection for a week or two.
Assignments:
Each week the assignment will consist of several components, each with an
expected amount of time you should dedicate:
- Reading: Each week we will focus on a different chapter of the book. I
would expect the reading to require 2-3 hours per week.
- Programming Project: Each student, either working alone or in pairs, will be
responsible for selecting data that is of interest to them and designing and
developing a visualization tool to display and explore that type of data.
You should be prepared to discuss in class your data set and the approaches to
visualization you are considering or developing (see schedule below).
By the end of the term you should create a report (5-10 pages) and a poster
about your project. I expect each student to spend 3-5 hours a week on their
project.
- Research Project: Each student, working individually, is responsible for
researching a topic
in visualization, reviewing 2 or more papers on the topic, and creating a
report and 10-15 minute talk on the topic. In the ideal situation, this topic
would be related to your programming project (either on the same kind of data
or using a technique that you are planning to implement). The talks will be
presented throughout the term (see schedule below). I expect this activity
to take between 5 and 10 hours of total effort.
- Tool Testing: Each student, working individually, should identify an existing
tool for data
visualization and create a web page to describe the tool and show examples of
its use. When you are finished with the website, send me a copy in a zip file
and I'll add it to the course website. It is due by the end of the semester.
I expect this activity to take between 5 and 10 hours of total effort.
- Book Suggestions: My co-authors and I are working on the second edition
of the textbook for this course, and would like each of you to contribute to
it! Each week you should do one or more of the following (up to 2 hours of
work per week):
- Identify a paper that the book has not referenced (especially a recent one)
that is appropriate for the topic of the chapter. If possible, indicate
where in the chapter it could be referenced and briefly explain why you think
it is appropriate.
- Write a new exercise or project for the end of the chapter.
- Write pseudo-code for an algorithm that would be appropriate for the
chapter (if applicable).
- Write code to implement an algorithm from the chapter, using either Java
or C++ (like those in the book appendix), or a language of your choice (e.g.,
Processing). Don't just reimplement one of the algorithms already programmed
in the book.
Additionally, if you spot any typos or grammar issues, please let me know.
Exams:
There will be no exams given for this course.
Term Project:
The steps of the programming project are as follows:
- Select some socially relevant data set or information source as
a focus for visual analysis. Confirm your topic with Prof. Ward.
- Locate 1 to 3 papers that present methods for visualizing
this kind of data. Summarize and include references to them in your project
report.
- Design or extend a visualization to allow exploration of your
data/information. You are not allowed to just use Excel! There should be some
programming involved.
- Explore your dataset and identify a modest number of "interesting"
features in the data.
- Write a short (between 5 and 10 pages, single spaced) paper
describing the data, the papers you read related to visualizing this type of
data, the process you followed in developing your visualization, the methods
used for exploration, and the things you discovered. Include screen shots
and relevant references.
- Create a poster describing your data and how you visualized it. Show
more than one view and, if possible, more than one data set.
This project is due by the start of our last class. We will hold a poster
session for all to see what everyone has been doing.
Grading:
Your grade will be roughly computed as follows:
- Programming Project: 60%
- Research Project: 15%
- Tool Testing: 15%
- Book Suggestions: 10%
For each part, I will assign a letter grade, and your final
grade will be a weighted average of these grades. The rough grading will be
as follows:
check/B = met expectations, check+/A = exceeded expectations, check-/C =
did not meet expectations, X/F = insufficient to earn a passing grade. Late
assignments without prior permission may have a negative impact on the grade.
I will grade the programming project based on all aspects of the work,
including the progress reports (presentations), report, and poster.
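As a rough illustration only, the weighted average described above might be computed as in the sketch below. The weights come from the list above; the grade-point values (A = 4, B = 3, C = 2, F = 0) and the component names are assumptions made for the example, not an official scale for this course.

```python
# Assumed grade-point scale; the syllabus names only the letters, not point values.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "F": 0.0}

# Component weights as listed in the Grading section.
WEIGHTS = {
    "programming_project": 0.60,
    "research_project": 0.15,
    "tool_testing": 0.15,
    "book_suggestions": 0.10,
}


def weighted_grade(letters):
    """Return the weighted grade-point average for a dict of component letters."""
    missing = set(WEIGHTS) - set(letters)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * GRADE_POINTS[letters[name]] for name in WEIGHTS)


# Hypothetical student: exceeded expectations on the programming project,
# met them on the research project and tool testing, fell short on book suggestions.
example = {
    "programming_project": "A",
    "research_project": "B",
    "tool_testing": "B",
    "book_suggestions": "C",
}
print(round(weighted_grade(example), 2))
```

Because the programming project carries 60% of the weight, it dominates the result; the final letter is still a judgment call, since the grade is only "roughly" computed this way.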
Academic Honesty:
Copying the work of others and turning it in as your own is considered academic
dishonesty, and is strictly forbidden in this class. Violators of this policy
will receive a 0 grade for the assignment, and the incident will be reported to
the department chair and the Dean of Students' Office.
Facilities:
You can use whatever computer you have at your disposal, as long as
your programs can be demonstrated on a machine on campus.
Software Resources:
OpenGL, Java2D, Java3D, Processing, or X can be used for software
development. Basically whatever language you used in your graphics course
will do. In most cases,
you can get by with 2-dimensional graphics, though for some types of
visualization, 3-D is essential. When you turn in your assignments, please
include instructions for compiling and executing the program. I may decide
to instead have you demonstrate the programs in action if it is too
time-consuming for me to figure out how to build and run them.
It may also be possible to build your assignments using an existing
visualization tool as a base (you are expected to add code to these). Some
visualization tools that you can download and test include:
- XmdvTool -
http://davis.wpi.edu/xmdv
- SpiralGlyphics -
http://davis.wpi.edu/~matt/projects/SpiralGlyphics/
- OpenDX -
http://www.opendx.org
- Prefuse -
http://prefuse.org
- DeVise -
http://www.cs.wisc.edu/~devise/
- VTK -
http://public.kitware.com/VTK/
- CViz -
http://www.alphaworks.ibm.com/tech/cviz
- VolVis -
http://www.cs.sunysb.edu/~vislab/volvis_home.html
- extra points for finding others (other than WPI-developed)
Books Available from Prof. Ward:
- Bartz, Dirk, Visualization in Scientific Computing '98, Springer, 1998.
- Bederson, Ben, and Shneiderman, Ben. The Craft of Information
Visualization, Morgan Kaufmann, 2002.
- Berthold, Michael, and Hand, David, Intelligent Data Analysis (2nd edition),
Springer, 2003.
- Brown, Judith, et al., Visualization: Using computer graphics to explore
data and present information, Wiley and Sons, 1995.
- Card, Stuart, et al. Readings in Information Visualization, 1999.
- Chen, Chaomei. Information Visualization and Virtual Environments.
Springer, 1999.
- Chen, Chaomei et al., Handbook of Data Visualization, Springer, 2008.
- Cleveland, William, Visualizing Data, Hobart Press, 1993.
- Di Battista, Giuseppe et al., Graph Drawing, Prentice Hall, 1999.
- Diehl, Stephan, Software Visualization, Springer, 2007.
- Fayyad, Usama, et al. Information Visualization in Data Mining and
Knowledge Discovery. Morgan-Kaufmann, 2002.
- Few, Stephen, Show Me the Numbers, Analytics Press, 2004.
- Friendly, Michael, Visualizing Categorical Data, SAS Publishing, 2000.
- Grave, Michael, et al., Visualization in Scientific Computing, Springer-Verlag,
1994.
- Hagen, Hans, et al., Scientific Visualization - Dagstuhl '97, IEEE CS Press, 2000.
- Harris, Robert. Information Graphics, a Comprehensive Illustrated
Reference, Oxford University Press, 1999.
- Keller, Peter, and Keller, Mary. Visual Cues: Practical Data
Visualization. IEEE Press, 1993.
- Kerren, Andreas, et al. Information Visualization: Human-Centered Issues and
Perspectives, Springer, 2008.
- Kosslyn, Stephen. Elements of Graph Design, W.H. Freeman, 1994.
- Lichtenbelt, Barthold, et al. Introduction to Volume Rendering.
Prentice-Hall, 1998.
- Mullet, Kevin, and Darrell Sano, Designing Visual Interfaces, Prentice
Hall, 1995.
- Nelson, Gregory, et al. Visualization in Scientific Computing. IEEE
CS Press, 1990.
- Nelson, Gregory, et al. Scientific Visualization: Overviews,
Methodologies, Techniques. IEEE CS Press, 1997.
- Post, Fritz et al., Data Visualization: the state of the art, Kluwer,
2003.
- Schroeder, Will, et al. The Visualization Toolkit (2nd edition).
Prentice-Hall, 1998.
- Soukup, Tom, and Davidson, Ian, Visual Data Mining, Wiley, 2002.
- Spence, Robert. Information Visualization. Addison-Wesley, 2001.
- Stasko, John, et al., Software Visualization, MIT Press, 1998.
- Telea, Alexandru, Data Visualization Principles and Practice, AK Peters,
2008.
- Thalmann, Daniel, Scientific Visualization and Graphics Simulation, Wiley, 1990.
- Thomas, James, and Cook, Kristin. Illuminating the Path: the Research
and Development Agenda for Visual Analytics, IEEE CS Press, 2005.
- Tufte, Edward. The Visual Display of Quantitative Information.
Graphics Press, 1983.
- Tufte, Edward. Envisioning Information. Graphics Press, 1990.
- Tufte, Edward. Visual Explanations. Graphics Press, 1997.
- Tufte, Edward. Beautiful Evidence, Graphics Press, 2006.
- Ware, Colin. Information Visualization: Perception for Design.
Morgan-Kaufmann, 1999.
- Wilkinson, Leland, The grammar of graphics (2nd edition), Springer, 2005.
- Woolman, Matt. Digital Information Graphics, Watson Guptill Publishers,
2002.
- Proceedings of IEEE Visualization Conference. 1990 - present.
- Proceedings of IEEE Symposium on Information Visualization. 1995 -
present.
- Proceedings of IEEE Symposium on Visual Analytics Science and Technology,
2006-present.
- Proceedings of International Conference on Information Visualization.
1999, 2005.
- Proceedings of the Eurographics Visualization Symposium. 2003, 2004.
- Proceedings of Volume Visualization and Graphics Symposium. 1998, 2000,
2002.
- Proceedings of Parallel Visualization and Graphics Symposium. 1999.
- Proceedings of Parallel and Large-Data Visualization and Graphics
Symposium. 2001.
Tentative Schedule:
- January 21:
- introduction and foundations
- January 28:
- data models and preprocessing
- February 4:
- perceptual issues (Data Sets Chosen and Approved)
- February 11:
- visualization frameworks and taxonomies (Research 1)
- February 18:
- spatial data visualization techniques (Projects A)
- February 25:
- geovisualization techniques (Projects B)
- March 4:
- non-spatial data visualization techniques (Research 2)
- March 11:
- Term Break, no class
- March 18:
- trees and graphs (Projects A)
- March 25:
- text visualization techniques (Projects B)
- April 1:
- interaction concepts (Research 3)
- April 8:
- interaction techniques (Projects A)
- April 15:
- designing effective visualizations (Projects B)
- April 22:
- evaluating visualizations (Research 4)
- April 29:
- future directions (Posters)
- May 6:
- Snow Day
Every programming project group should decide on their datasets and have
them approved by me by week 3. Thus you should plan to e-mail me your
preliminary choice (and maybe a second choice) by February 1.
The programming project teams will be divided into Group A and B, and each
will present their progress 3 times during the term. I will assign teams to
groups once I have approved the dataset choices. For the research projects, there
will be 4 sessions for people to do their 10-15 minute talks. I will seek
volunteers for week 4, and assign the rest at random.
Some Data Sources:
- Tons of data the government collects -
http://www.data.gov
- National Center for Health Statistics -
http://www.cdc.gov/nchs/datawh/ftpserv/ftpdata/ftpdata.htm
- National Archive of Criminal Justice Data -
http://www.icpsr.umich.edu/NACJD/
- StatLib at CMU -
http://lib.stat.cmu.edu/
- Links to more statistics datasets -
http://it.stlawu.edu/~rlock/maa51/data.html
- Everything about baseball -
http://www.baseball1.com/
- Weather data -
http://www.ncdc.noaa.gov/oa/climate/climatedata.html
- UC Irvine KDD Archive -
http://kdd.ics.uci.edu/
- Inter University Consortium for Political and Social Research -
http://www.icpsr.umich.edu/
- InfoVis and Vis conference contest data sets - conference web sites
- KDD Cup Data Discovery Challenges -
http://kdd.org/kddcup/index.php
- A nice collection of data and information visualization challenges -
http://romain.vuillemot.net/blog/data-and-information-visualization-challenges-contests-calendar/
- Border bouncing data from NVAC - see Prof. Ward
Matthew Ward
2014-01-27