Overview
Disease Ontology is a controlled medical vocabulary developed at the Bioinformatics Core Facility in collaboration with the NUgene Project at the Center for Genetic Medicine. It was designed to facilitate the mapping of diseases and associated conditions to particular medical codes such as ICD9CM, SNOMED and others. This mapping is useful in efforts like the NUgene Project because it allows requests for particular tissue types to be mapped quickly and with high fidelity to a set of ICD9 codes, which can then be used to retrieve appropriate samples from the tissue bank. Without such a mapping, clinicians are forced to search manually through ICD9CM coding booklets to find all applicable codes matching their request. Given the complex organization of ICD9CM and the difficulty of the manual process, codes, and therefore tissue samples, are often overlooked. In one sample case, an early version of the Disease Ontology doubled concept coverage while reducing the overall misclassification error percentage. Eventually, we envision that the Disease Ontology can also be used to associate model organism phenotypes with human disease and to support medical record mining.
Disease Ontology is implemented as a directed acyclic graph (DAG) and utilizes the Unified Medical Language System (UMLS) as its immediate source vocabulary to access medical ontologies such as ICD9CM. Using this standard, much of the process of updating the ontology can be handled by UMLS, freeing resources for clinicians to pursue more urgent tasks. For situations where the graph needs to be edited directly, the open source graph editor DAGEdit can be used. Because the Disease Ontology is stored in Open Biomedical Ontologies (OBO) format, it can be viewed and manipulated with DAGEdit and any other OBO-compliant tool. The screenshot of the Disease Ontology was taken using DAGEdit and shows version 3 of the Disease Ontology.
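As a rough illustration of what storing a DAG in OBO format means in practice, the sketch below reads a few OBO-style [Term] stanzas and checks that their is_a links contain no cycle. The sample text, the EX: identifiers, the term names, and the function names parse_obo_terms and is_acyclic are all assumptions made for this example; they are not taken from the Disease Ontology files, and the parser handles only the id, name, and is_a tags.

```python
from collections import defaultdict

# Hypothetical OBO-style stanzas; identifiers and names are invented for illustration.
SAMPLE_OBO = """\
[Term]
id: EX:0001
name: disease

[Term]
id: EX:0002
name: metabolic disease
is_a: EX:0001

[Term]
id: EX:0003
name: example diabetes concept
is_a: EX:0002
"""


def parse_obo_terms(text):
    """Return {term_id: {"name": str, "is_a": [parent_ids]}} from [Term] stanzas."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are collected.
            current = {"name": "", "is_a": []} if line == "[Term]" else None
            continue
        if current is None or ":" not in line:
            continue
        tag, _, value = line.partition(":")
        value = value.strip()
        if tag == "id":
            terms[value] = current
        elif tag == "name":
            current["name"] = value
        elif tag == "is_a":
            # Drop a trailing "! comment" if one is present.
            current["is_a"].append(value.split("!")[0].strip())
    return terms


def is_acyclic(terms):
    """Kahn's algorithm: repeatedly remove terms whose parents are all removed."""
    remaining = {
        term: {p for p in info["is_a"] if p in terms}
        for term, info in terms.items()
    }
    children = defaultdict(set)
    for term, parents in remaining.items():
        for parent in parents:
            children[parent].add(term)
    ready = [term for term, parents in remaining.items() if not parents]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in children[node]:
            remaining[child].discard(node)
            if not remaining[child]:
                ready.append(child)
    # Every term is removed only if no cycle exists.
    return removed == len(terms)


terms = parse_obo_terms(SAMPLE_OBO)
print(sorted(terms))
print(is_acyclic(terms))
```

A real OBO file carries many more tags (synonyms, cross-references, definitions), so a dedicated OBO parser would be the better choice for anything beyond a quick inspection.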
As a graph, the Disease Ontology can be thought of as a subset of UMLS. It fills a niche in the medical ontology world as a lightweight ontology offering context-free concept identifiers designed specifically to facilitate mapping to medical billing codes. Other ontologies such as SNOMED and MESH lack these features.
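To make the mapping idea concrete, here is a minimal sketch of how a graph-structured vocabulary can expand a single disease request into a set of billing codes: start at the requested concept, walk to every more specific concept beneath it, and collect the codes attached along the way. The concept identifiers, the placeholder codes (CODE-A and so on), and the function name codes_for_request are hypothetical and chosen only for illustration; this is not the Disease Ontology's actual data or lookup procedure.

```python
from collections import deque

# Hypothetical child -> parents links; a concept may have more than one parent.
PARENTS = {
    "EX:0002": ["EX:0001"],
    "EX:0003": ["EX:0002"],
    "EX:0004": ["EX:0002", "EX:0001"],
}

# Hypothetical concept -> billing code mapping; the codes are placeholders.
CODES = {
    "EX:0002": ["CODE-A"],
    "EX:0003": ["CODE-B", "CODE-C"],
    "EX:0004": ["CODE-C"],
}


def codes_for_request(concept, parents=PARENTS, codes=CODES):
    """Collect codes attached to a concept and to all of its descendants."""
    # Invert the child -> parents links so the graph can be walked downward.
    children = {}
    for child, parent_ids in parents.items():
        for parent in parent_ids:
            children.setdefault(parent, []).append(child)

    seen = {concept}
    queue = deque([concept])
    found = set()
    while queue:
        node = queue.popleft()
        found.update(codes.get(node, []))
        for child in children.get(node, []):
            if child not in seen:  # a node with two parents is visited once
                seen.add(child)
                queue.append(child)
    return sorted(found)


print(codes_for_request("EX:0002"))
print(codes_for_request("EX:0003"))
```

Requesting a broad concept returns the codes of every narrower concept as well, which is the step that is easy to miss when codes are looked up by hand.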
The previous version of Disease Ontology (v2.1) is based almost entirely on ICD9CM, with additional concepts included that are useful for mapping common disease requests. It is a lightweight ontology containing 19136 concept nodes and is currently available for download. The newest version, Disease Ontology 3 (revision 21), is based primarily on freely available UMLS vocabularies (including ICD9) and is currently under development.
Communication
We understand the importance of community input in creating a vocabulary that is useful for research and representation. Public suggestions and debate through our communication tools (email, mailing list, forums and news) are encouraged. Input from domain experts in clinical practice as well as from molecular biologists is necessary to improve and extend the structure and content of the ontology.
Future Development
The ontology will be continually refined in the following ways:
* integration of term and re-organizational requests from the research community
* mapping to other ontologies (MESH, ICD10, etc.)
* yearly versioning information from UMLS
Disease Ontology Version and Downloads
The first version of Disease Ontology was released in August 2003 and, due to various problems, is no longer supported. Future releases of the Disease Ontology will be backwards compatible.
Version 1.0 Disease Ontology (no longer supported)
Version 2.1 Disease Ontology
DAGEdit (viewing tool)
Projects Using Disease Ontology
NUgene
NUgene collects and stores genetic (DNA) samples along with associated healthcare information from patients of Northwestern-affiliated hospitals and clinics. It is currently the only study of its kind in Chicago and one of a few in the nation. This resource is available to scientists to conduct groundbreaking genetic research.
Contacts
Rex Chisholm
- Director, Center for Genetic Medicine, Northwestern University
- Project leader
Warren Kibbe
- Director of Bioinformatics, Northwestern University
- Questions relating to the Bioinformatics Facility
John Osborne
- Senior Bioinformatics Analyst, Northwestern University
- Technical questions regarding the Disease Ontology
Wendy Wolf
- NUgene Project Director
- Questions regarding NUgene adoption of Disease Ontology
Michael Doyle
- Informatics Lead, NUgene Project
Angela Ou Doyle
- IT Coordinator, Center for Genetic Medicine
Maureen Smith
- Clinical Director, NUgene Project
Julie Zhu
- Associate Director of Bioinformatics
