SELECTED DATA/TEXT SOURCES, ONTOLOGIES, AND SYSTEMS
Data / Text / Information Sources
- NCBI
The National Center for Biotechnology Information advances science and health by providing access to biomedical and genomic information.
- GenBank
GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences.
- PubMed
PubMed comprises more than 23 million citations for biomedical articles from MEDLINE and life science journals. Citations may include links to full-text articles from PubMed Central or publisher web sites.
- OMIM
OMIM® (Online Mendelian Inheritance in Man®) is a comprehensive, authoritative, and timely compendium of human genes and genetic phenotypes.
- UniProt
The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information. (See "what we provide" and "site tour".)
- Registry of Standard Biological Parts
The Registry is a continuously growing collection of genetic parts that can be mixed and matched to build synthetic biology devices and systems.
- EMBL
The EMBL Nucleotide Sequence Database (also known as EMBL-Bank) constitutes Europe's primary nucleotide sequence resource.
- Worm Database
Online bioinformatics database of the biology and genome of the model organism Caenorhabditis elegans (C. elegans) and related nematodes.
- Saccharomyces Genome Database
SGD™ is a scientific database of the molecular biology and genetics of the yeast Saccharomyces cerevisiae, which is commonly known as baker's or budding yeast.
- Medical/Clinical Datasets:
- PhysioNet
Research resource for complex physiologic signals.
- UCI's Cardiotocography Data Set
Ontologies
- Gene Ontology
The Gene Ontology project is a major bioinformatics initiative with the aim of standardizing the representation of gene and gene product attributes across species and databases.
- PAMGO
PAMGO extends Gene Ontology to include terms describing various processes related to microbe-host interactions.
- See Trends in Microbiology (July 2009, Vol. 17, Issue 7) for articles about uses and extensions of Gene Ontology in the microbial domain.
Information Source Integration, Platforms, and Existing Software
- GQuery: Global cross-database NCBI search
Simultaneously search multiple life sciences databases at the National Center for Biotechnology Information (NCBI). (Formerly known as "Entrez"?)
- VBI Genome Browser
The VBI Genome Browser is a tool that allows viewing of genomic data that adheres to the Genomics Unified Schema (GUSDB) data storage standard.
- GeneCards
GeneCards is a searchable, integrated database of human genes that provides concise genomic, proteomic, transcriptomic, genetic and functional information on all known and predicted human genes.
- eTBLAST
eTBLAST is a unique search engine for searching biomedical literature that lets you input an entire paragraph and returns MEDLINE abstracts that are similar to it.
- iHOP
Information Hyperlinked over Proteins. Gene-centric search engine.
- EBIMed
EBIMed is a web application that combines information retrieval and extraction from MEDLINE.
- GoPubMed
Clusters documents based on the Gene Ontology and MeSH.
- Textpresso
The Textpresso project serves the biological and biomedical research community by providing: (1) full-text literature searches of model organism research and subject-specific articles at individual sites; (2) text classification and mining of biomedical literature for database curation; (3) linking of biological entities in PDF and online journal articles to online databases.
- MeSH
U.S. National Library of Medicine's Medical Subject Headings.
- ABNER: A Biomedical Named Entity Recognizer
ABNER is a software tool for molecular biology text analysis.
- The Stanford Natural Language Processing Group
Their research has resulted in state-of-the-art technology for robust, broad-coverage natural-language processing in many languages. These technologies include a part-of-speech tagger; a high-performance probabilistic parser; a competition-winning biological named entity recognition system; and algorithms for processing Arabic, Chinese, and German text.
- ISI Web of Knowledge
ISI Web of Knowledge is an online academic database provided by Thomson Scientific's Institute for Scientific Information. It provides access to many databases and other resources.
- W3C
The World Wide Web Consortium (W3C) is an international community where member organizations, a full-time staff, and the public work together to develop web standards.
- Prof. Kellis' Algorithms for Computational Biology course (MIT)
- Profs. Alterovitz's, Kellis', and Ramoni's Bioinformatics and Proteomics course (MIT)
- Prof. Yemini's Computational Genomics course (Columbia Univ.)
- Prof. Mneimneh's Computational Biology course (Hunter College)
- Prof. Moran's Algorithms in Computational Biology course (Technion)
- Prof. Subramanian's From Sequence to Structure: An Introduction to Computational Biology course (Rice Univ.)
- Rosalind is a joint project between the University of California at San Diego and Saint Petersburg Academic University along with the Russian Academy of Sciences.