MED 267: Modeling Clinical Data and Knowledge for Computation

Instructor: Michael Hogarth, M.D.
Quarter: Spring 2026
Units: 4

Overview: Representing and storing data consistently and unambiguously is the critical first step toward reusing it for purposes such as patient care, research, and outcomes analysis. This course describes several information models for data persistence and shows how to represent biomedical data so that it can be reused.

Objectives: Upon completion of this course, students will be able to:

Explain why data standardization is important
Explain the use of various types of information persistence models (relational, graph, Bigtable) and their advantages and disadvantages
Explain the difference between classification and terminology systems
Explain the purpose and usage of prevailing healthcare classification systems (CPT, ICD-9, ICD-10)
Explain the usage, purpose, strengths, and weaknesses of available biomedical terminology systems (MeSH, RxNorm, SNOMED CT, LOINC, UMLS)
Decide which terminology systems to use to encode the data in a particular domain
Develop a simple domain ontology (application ontology) using Protégé
Implement this simple domain ontology using a graph database (Neo4j or Amazon Neptune)

Class organization: In-class lectures, hands-on practice, discussion, and student presentations

Topics:

Database (data persistence) information models
Standardized health terminologies and classifications
Application ontology building
Graph information system standards (SPARQL)

Materials:

Class reading materials will be distributed before each session as necessary
A laptop computer is required for the course