
MED 267: Modeling Clinical Data and Knowledge for Computation

Instructor: Michael Hogarth, M.D.
Quarter: Spring 2024
Units: 4

Overview: Representing and storing data consistently and unambiguously is the critical first step toward reusing it for purposes such as patient care, research, and outcome analysis. This course describes several data persistence information models and shows how to represent biomedical data so that it can be reused.

Objectives: Upon completion of this course, students will be able to:

  • Explain why data standardization is important
  • Explain the use of various types of information persistence models (relational, graph, Bigtable) and their advantages and disadvantages
  • Explain the difference between classification and terminology systems
  • Explain the purpose and usage of prevailing healthcare classification systems (CPT, ICD-9, ICD-10)
  • Explain the usage, purpose, strengths, and weaknesses of available biomedical terminological systems (MeSH, RxNorm, SNOMED CT, LOINC, UMLS)
  • Decide which terminology systems to use to encode the data in a particular domain
  • Develop a simple domain ontology (application ontology) using Protégé
  • Implement this simple domain ontology using a graph database (Neo4j or Amazon Neptune)

Class organization: In-class lectures, hands-on practice, discussion, and student presentations


  • Database (data persistence) information models
  • Standardized health terminologies and classifications
  • Application ontology building
  • Graph information system standards (SPARQL)


  • Class reading materials will be distributed before each session as necessary
  • A laptop computer will be required for the course