Worcester Polytechnic Institute (WPI)

  Faculty Candidate COLLOQUIUM

A Database Server for Scientific Data Management

 

Mohamed Eltabakh
Purdue University, West Lafayette, Indiana

Computer Science Faculty Candidate

Abstract:

The growth of scientific information and the increasing automation of data collection have made databases integral to many scientific disciplines including life sciences, physics, meteorology, earth and atmospheric sciences, and chemistry. These sciences pose new data management challenges to current database system technologies. In this talk, I will focus on the following two challenges:

Annotation Management: Annotations and provenance information are important metadata that go hand-in-hand with scientific data. Annotating scientific data is a vital mechanism for sharing knowledge and building an interactive and collaborative environment among scientists. A major challenge is how to efficiently manage large volumes of annotations at various granularities, e.g., cell-, column-, and row-level annotations, along with their corresponding data items.

Complex Dependencies Involving Real-world Activities: The cycle of processing scientific data and generating new results is complex and may involve sequences of activities external to the database system, e.g., wet-lab experiments, instrument readings, and manual measurements. These external activities may take an inherently long time to prepare and conduct. Hence, updating a database value may render parts of the database inconsistent until the external activities are executed and their results are reflected back in the database. A major challenge is how to integrate these external activities within the database engine and make the intermediate results instantly available for querying while maintaining the consistency of the data inside the database.

I will present various techniques and algorithms that extend the capabilities and functionalities of the database engine to address the above challenges. The proposed extensions enable scientific data to be stored and processed within its natural habitat: the database system.

______

Mohamed Eltabakh is a Ph.D. candidate at Purdue University, West Lafayette, Indiana. His Ph.D. advisors are Walid G. Aref and Ahmed K. Elmagarmid. Mohamed’s research focuses on extending the functionalities of current database systems to cope with the requirements and challenges of emerging applications. Mohamed was a summer intern at Google in 2007 and at Microsoft Research in 2008. He is a student member of the ACM and the IEEE. For more information and a list of publications, see http://www.cs.purdue.edu/~meltabak.

 

Host: Prof. Michael Gennert

Refreshments will be served.

 

Last modified: 03/16/2010
 
