Database
Conversion Tool
CS542
Fall 2003
Project
Proposal
Lisa Choy
Randy Chong
Joshua Lee
Design Overview
The Database Conversion Tool program will allow a user to convert an Oracle database to a mySQL database and vice-versa. The user will be able to perform a live conversion or a dormant conversion. The live conversion will extract the information from the source database and import it into the destination database. The dormant conversion will only generate scripts that a database administrator can use to import the database into the destination database at any time, perhaps during off-peak system usage.
The program can be used from the command line or a GUI. With the GUI, the database administrator will be able to review all of the SQL statements that will be used to generate the information in the destination database.
The program will assume that an empty database has been created on the destination database, since it is not guaranteed to have privileges to create a new database.
Class Overview
Database Objects:
This is a group of classes to represent each of the supported database objects. Each object will hold information such as names, values, and the SQL statements required to create itself in the database. These classes will include a Table and a View class.
Reader.
The Reader accesses the source database and generates the Database Objects. Additionally, it dumps the data from the source database into delimited data files.
Writer.
The Writer generates the SQL statements to create each of the database objects in the destination databases format. Additionally, in dormant convert mode, it generates a script of SQL statements used to create all of the objects in the destination database.
Converter.
The Converter is the main/controller class for the program. It can be interfaced via command line or a GUI. It passes database access information to the Reader and Writer classes and controls program flow from a high level.
Exporter.
This class takes the SQL statements in the Database Objects and executes them in the destination database. It also inserts the data from the data files generated by the Reader into the destination database.
GUI Classes:
This is a group of classes that will represent different components of the GUI.
Program Overview
The program will invoke the Converter via the command line or GUI. The user will input access information for both databases and the type of conversion into either interface. Then, the Converter invokes the Reader. After the Reader finishes processing the source database, the Converter then passes the Database Objects to the Writer, which generates SQL statements to create each object in the destination database. Once the Writer is finished, the Converter will store all of the queries generated by the Writer in files for each Database Object, e.g. all of the table creation queries will be stored in a tables file. If running in dormant mode, conversion is complete leaving the user a script of SQL statements to generate all of the database objects, as well as a set of data files that can be used with a bulk loader.
If running in live conversion mode, the program now performs the conversion. If the user invoked the Converter via the command line, then the Converter will automatically invoke the Exporter. If the user invoked the Converter with the GUI, then he or she will be able to browse and edit the queries that have been generated by the Writer before invoking the Exporter from the GUI. The Exporter will then run all of the SQL statements to create the database objects. When this is complete, each of the data files will be loaded.
Environment
The project work will be implemented using the Oracle 9.2.0.1.0 and mySQL 3.23.57 databases available on the WPI ccc server cluster. Team members will use their existing Oracle accounts created for this course. In addition, a database named 542Conv has been created on the mySQL server for the team's use on this project.
The database conversion program will be written in Java, using the Java 2 Standard Edition SDK, version 1.4.0_03. To connect to the Oracle database, we will use the Oracle SQLJ JDBC drivers included in the classes12.zip file from Oracle. To connect to the mySQL database, we will use the mySQL Connector/J JDBC driver from mySQL version 3.0.8.
We have developed small prototypes using the software tools listed above. The prototypes query the database metadata in both Oracle and mySQL, returning the names of the tables in the database. This prototyping proves the feasibility of connecting to both Oracle and mySQL from one application running in this environment. However, the team plans on doing more prototyping early in the development of the software in order to validate the feasibility of working with the metadata for each type of database object the program will operate on.
Testing
The project team intends to test the database conversion tool at both a detailed unit level and an overall functionality level. This should ensure that the program performs as specified, as well as allowing the team to document any issues that there is insufficient time to resolve in the final version of the software.
Unit level testing of individual classes and functions will occur continuously throughout the development of the program by all team members. This will be an informal process using relatively small datasets and simple schemas. The team may use different test database setups in our individual Oracle accounts, in order to cover as many possible situations as possible, and may share some of the contents of the mySQL database.
A written test plan will be created to test the overall functionality of the program. This will include testing of all of the functionality contained in the design documents submitted with the final report. The results of running this test on the submitted program will be included in the final report, as well.
Ideally, this verification of the design of the program will be performed using two large datasets from a specific domain. One dataset will be from an application using an Oracle server, preferably containing some of the constructs that present a challenge when converting to mySQL. The other dataset should be from an application using a mySQL server. The team is still conducting research aimed at locating two specific and appropriate sets of data.
If the team can not find two appropriate sets of data for the design verification, SQL scripts will be created that set up tables, views, assertions, constraints, etc appropriate for testing specific aspects of the program's functionality. A minimum amount of effort will be used in creating the content stored in these databases, as the structure is more important. At the very least, a data generator that creates random values based on database data types will be used to populate test databases. This may be extended to specialized domains if time is available. The team will also investigate whether an open source program for generating data is available.
Background
Material
The team will need to study Java and specifically, the use of JDBC for obtaining database metadata information. The JDBC SDK includes a DatabaseMetadata class. Additional research and prototyping will be required to determine how well the selected drivers and databases implement this class. If the DatabaseMetadata class is not implemented well, we will need to design appropriate queries to access the metadata that we need.
The team will also need to research Oracle and mySQL's support of various datatypes and database objects such as triggers, assertions, key constraints, etc. To date, our research has shown that the version of mySQL which we will be using for this project may not support foreign key constraints. A workaround for this and possibly other limitations will need to be explored.
The following list of online manuals and tutorials will be used to gain the required background needed to complete the project.
Oracle
Oracle JDBC Driver
http://otn.oracle.com/software/tech/java/sqlj_jdbc/index.html
Oracle Documentation
http://download-west.oracle.com/docs/cd/A91202_01/901_doc/appdev.901/a88876/toc.htm
Notes on Oracle from Stanford
http://www-db.stanford.edu/~ullman/fcdb/oracle.html
mySQL
mySQL Database documentation and tutorial
http://www.mysql.com/doc/en/index.html
mySQL Connector/J driver
http://www.mysql.com/downloads/api-jdbc-stable.html
Driver documentation:
http://www.mysql.com/documentation/connector-j/index.html
Java Documentation
Java 2 Platform, Standard Edition, v 1.4.1 API Specification
http://java.sun.com/j2se/1.4.1/docs/api/
JDBC Tutorial
http://developer.java.sun.com/developer/onlineTraining/Database/JDBC20Intro/JDBC20.html
Project
Deliverables
Intermediate deliverables include this project proposal and a progress report. The final deliverables include source code as well as a compiled executable program. The executable program will only be guaranteed to run correctly in the environment specified in this document. Scripts used for creating test databases and for design verification will be supplied. A final report and design documentation will be provided.
Project Schedule
Week Ending |
Tasks Completed |
10/04/03 |
Investigate and test tools (drivers, compilers, mySQL database access) for development environment |
10/08/03 |
Project Proposal |
10/16/03 |
Research/Prototype support for database objects/data types. Research/Prototype DatabaseMetadata class support in JDBC drivers. |
10/23/03 |
Design/Code Database Objects classes Design/Code Reader class |
10/30/03 |
Design/Code Writer class Prepare progress report |
11/06/03 |
Design/Code Exporter class Design/Code Converter class |
11/13/03 |
Generate/Acquire test datasets |
11/20/03 |
Design/Code GUI classes |
12/04/03 |
Run final functionality test, document issues Setup up databases for In-class demonstration |
12/11/03 |
Prepare Final Report |