BCB4003/503 CS4803/583

Biological and Biomedical Database Mining

Syllabus A term / Fall 2013

Prof. Carolina Ruiz

WARNING: Small changes to this syllabus may be made during the semester.


This course will investigate computational techniques for discovering patterns in and across complex biological and biomedical sources, including genomic and proteomic databases, clinical databases, digital libraries of scientific articles, and ontologies. Techniques covered will be drawn from several areas including sequence mining, statistical natural language processing and text mining, and data mining.


  • A term (Aug 29 - Oct 17):
    All Undergraduate and Graduate sections combined.
    Time: Tuesdays and Fridays 1:00-2:50 pm.
    Room: SL411

  • B term (Oct 28 - Dec 20):
    Graduate sections only.
    Time: 2 hours per week (time to be determined).


Prof. Carolina Ruiz

Office: FL 232
Phone Number: (508) 831-5640
Office Hours:

Thursdays 1-2 pm. Email me if you'd like to schedule a meeting on a different day or at a different time.


    No textbook will be required. Readings will be assigned throughout the term/semester.


BCB4003/CS4803 Recommended Background: CS 2102, CS 2223, MA 2610 or MA 2611, and one or more biology courses.

BCB503/CS583 Prerequisite: Strong programming skills, an undergraduate or graduate course in algorithms, an undergraduate course in statistics, and one or more undergraduate biology courses.


BCB4003/CS4803 (undergraduate sections):

Problem Sets 75%
Final Exam 20%
Class Participation 5%


BCB503/CS583 (graduate sections):

Problem Sets 50%
Final Project 25%
Midterm Exam 15%
Advanced Readings 5%
Class Participation 5%

Your final grade will reflect your own work and achievements during the course. Any type of cheating will be penalized with an NR/F grade for the course and will be reported to the WPI Judicial Board in accordance with the Academic Honesty Policy.


Students are expected to read the material assigned for each class in advance and to participate in class discussions. Class participation will count for 5% of the students' final grades.


There will be several individual problem sets. In addition, students in the graduate sections will work on a final project. Each problem set / project may include programming, experimental work, assigned readings, and theoretical problems. For most of the projects, you can choose any of the following systems:

  • Matlab. Available through the CCC. The Statistics Toolbox and the Bioinformatics Toolbox are particularly useful.

  • Weka. Weka is a machine-learning/data-mining environment. It provides a large collection of Java-based mining algorithms, data preprocessing filters, and experimentation capabilities. Weka is open source software issued under the GNU General Public License. For more information on the Weka system, to download it, and to get its documentation, see Weka's webpage (http://www.cs.waikato.ac.nz/ml/weka/). You should download and use the latest Developer Version (currently weka-3-7-10) of the system.

  • Your own code. You can use
    • Python (for Python tutorials, see its documentation),
    • R (for R manuals, follow its Manuals link),
    • or other scripting languages, or any high-level programming language, to implement your own programs and scripts that complement the functionality of the systems above.

Detailed descriptions of the problem sets and projects will be posted to the course webpage at the appropriate times during the term/semester. An in-class presentation of each of the assignments will be required.


The mailing list for this course is

This mailing list reaches the professor and all students in the class.


Announcements will be posted on the web pages and/or the class mailing list, so you are required to check your email and the class web pages frequently.


The federal Office of Civil Rights strongly suggests that faculty include in their syllabus a statement that says accommodations are available for students with disabilities, describes the correct procedure for receiving them, and affirms that the instructor is willing to provide them. WPI's Office of Disability Services developed the following statement:

"If you need course adaptations or accommodations because of a disability, or if you have medical information to share with me that may impact your performance or participation in this course, please make an appointment with me as soon as possible. If you have approved accommodations, please go to the Exam Proctoring Center (EPC) in Morgan Hall to pick up Letters of Accommodation. If you have not already done so, students with disabilities who need to utilize accommodations in this class are encouraged to contact the Office of Disability Services (ODS) as soon as possible to ensure that such accommodations are implemented in a timely fashion. This office can be contacted via email: DisabilityServices@wpi.edu, via phone: (508) 831-4908, or in person: 137 Daniels Hall."




WPI Worcester Polytechnic Institute