BCB4003/BCB503

Biological and Biomedical Database Mining

Syllabus A term / Fall 2011

Prof. Carolina Ruiz

WARNING: Small changes to this syllabus may be made during the semester.


This course will investigate computational techniques for discovering patterns in and across complex biological and biomedical sources, including genomic and proteomic databases, clinical databases, digital libraries of scientific articles, and ontologies. Techniques covered will be drawn from several areas including sequence mining, statistical natural language processing and text mining, and data mining.



BCB4003/BCB503: MTuThF 1:00-1:50 pm during A term (Aug. 25th - Oct. 13th)
BCB503: 2 hours per week (time TBD) after A term.

Room: SL11


Prof. Carolina Ruiz

Office: FL 232
Phone Number: (508) 831-5640
Office Hours: Tuesdays 2-3 pm, or by appointment.


No textbook will be required. Readings will be assigned throughout the term/semester.


BCB4003 Recommended Background: CS 2102, CS 2223, MA 2610 or MA 2611, and one or more biology courses.

BCB503 Prerequisite: Strong programming skills, an undergraduate or graduate course in algorithms, an undergraduate course in statistics, and one or more undergraduate biology courses.


BCB4003 Grades:
3 Problem Sets / Projects (25% each): 75%
Final Exam: 20%
Class Participation: 5%


BCB503 Grades:
3 Problem Sets / Projects (15% each): 45%
1 Final Project: 25%
Midterm Exam: 15%
Advanced Readings: 10%
Class Participation: 5%

Your final grade will reflect your own work and achievements during the course. Any type of cheating will be penalized with an NR/F grade for the course and will be reported to the WPI Judicial Board in accordance with the Academic Honesty Policy.


Students are expected to read the material assigned for each class in advance and to participate in class discussions. Class participation will count for 5% of the students' final grades.


There will be a total of 3 individual projects for BCB4003, and 4 individual projects for BCB503. Each problem set / project may include programming, assigned readings, and theoretical problems. For most of the projects, you can choose any of the following systems:

  • Matlab. Available through the CCC. The Statistics Toolbox and the Bioinformatics Toolbox are particularly useful.
  • Weka. Weka is a machine-learning/data-mining environment. It provides a large collection of Java-based mining algorithms, data preprocessing filters, and experimentation capabilities. Weka is open source software issued under the GNU General Public License. For more information on the Weka system, and to download the system and its documentation, see Weka's webpage (http://www.cs.waikato.ac.nz/ml/weka/). You should download and use the latest Developer Version (currently weka-3-7-4) of the system.
  • Your own code. You can use Python (for Python tutorials, see its documentation), other scripting languages, or any high-level programming language to implement your own programs and scripts to complement the functionality of the systems above.

Detailed descriptions of the problem sets and projects will be posted to the course webpage at the appropriate times during the term/semester. An in-class presentation of each of the assignments will be required.


There are two mailing lists for this class:

reaches the professor and the BCB4003 students, and

reaches the professor and the BCB503 students.

During A term, messages and announcements will be sent to both mailing lists.


Announcements will be posted on the class web pages and/or sent to the class mailing lists, so you are required to check your email and the class web pages frequently.


Policy on Americans with Disabilities Act accommodations:

The federal Office of Civil Rights strongly suggests that faculty include a statement in their syllabus that states that accommodations are available for students with disabilities, the correct procedure for receiving the accommodations and that you are willing to provide the accommodations. WPI's Office of Disability Services developed the following statement:

If you need course adaptations or accommodations because of a disability, or if you have medical information to share with me, please make an appointment with me as soon as possible. If you have not already done so, students with disabilities who believe that they may need accommodations in this class are encouraged to contact the Office of Disability Services (ODS) as soon as possible to ensure that such accommodations are implemented in a timely fashion. This office is located in the West St. House (157 West St), (508) 831.4908.






WPI Worcester Polytechnic Institute