
Abstract: Found in Translation: Transforming Rules-Based Diabetes Phenotyping Algorithms into Reproducible Diabetes Cohorts from Real-World Data

Erin M. Tallon, MS, RN1; Mark A. Clements, MD, PhD2; Chi-Ren Shyu, PhD, FAMIA1

1University of Missouri, Institute for Data Science and Informatics, Columbia, MO; 2Children’s Mercy Hospital, Kansas City, MO

Abstract

Consistent, reproducible identification of electronic medical record (EMR)-based cohorts of individuals with diabetes is needed to make research findings comparable across studies and diverse patient populations. In this work, we develop and implement a rigorous, well-defined process for using criteria from the SUrveillance, PREvention, and ManagEment of Diabetes Mellitus (SUPREME-DM) algorithm to identify individuals with diabetes from EMR data.

Introduction

Designed to facilitate diabetes research across 11 integrated health systems, the SUrveillance, PREvention, and ManagEment of Diabetes Mellitus algorithm (SUPREME-DM) is a rules-based electronic medical record (EMR) clinical phenotyping algorithm used to identify individuals with diabetes of any type (type 1 diabetes, type 2 diabetes, or diabetes of rare or uncertain type), excluding gestational diabetes.1 However, to date, SUPREME-DM has not been fully translated into criteria that can be reproducibly applied to nationwide, multi-site EMR datasets.

Methods

The December 2020 version of Cerner’s Coronavirus Disease 2019 (COVID-19) database contains longitudinal EMR data for 490,373 patients with suspected or confirmed COVID-19 who were seen or admitted at 87 U.S.-based health systems. We translated expert-defined rules from SUPREME-DM to raw code in Cerner’s HealtheDataLab data analytics environment to extract a well-defined cohort of patients with diabetes of any type, excluding gestational diabetes. Using an approach based on the Method to Acquire Delivery Date Information from Electronic health records (MADDIE) algorithm,2 we identified 40,824 pregnancy delivery dates for 33,158 individuals in the COVID-19 database. Diabetes data (e.g., elevated blood glucose and hemoglobin A1c [HbA1c] results) documented during periods of pregnancy were excluded when constructing a SUPREME-DM cohort of individuals with diabetes. We used an online conversion tool (ICD10Data.com) to map ICD-10-CM codes in the database to ICD-9 codes specified by SUPREME-DM. We also developed comprehensive lists of laboratory (e.g., Logical Observation Identifiers Names and Codes [LOINC]) codes for blood glucose and HbA1c results, as well as a complete listing of generic and brand-name diabetes medications, to identify additional diabetes criteria outlined in SUPREME-DM.
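The pregnancy-exclusion step described above can be illustrated with a minimal sketch. It is not the code used in this work: the 280-day look-back window before each delivery date, the tuple layout of the lab events, and the function names are all assumptions made for illustration, since the abstract does not state how pregnancy periods were bounded.

```python
from datetime import date, timedelta

# Assumption: a pregnancy period is approximated as the 280 days that end on
# each delivery date. The abstract does not specify the window actually used.
PREGNANCY_WINDOW = timedelta(days=280)


def in_pregnancy(event_date, delivery_dates, window=PREGNANCY_WINDOW):
    """Return True if event_date falls inside any assumed pregnancy window."""
    return any(d - window <= event_date <= d for d in delivery_dates)


def exclude_pregnancy_events(events, deliveries):
    """Drop events documented during an assumed pregnancy period.

    events: list of (patient_id, event_date, description) tuples (hypothetical layout)
    deliveries: dict mapping patient_id -> list of delivery dates
    """
    return [
        event
        for event in events
        if not in_pregnancy(event[1], deliveries.get(event[0], []))
    ]


# Hypothetical records, for illustration only.
events = [
    ("p1", date(2020, 3, 1), "HbA1c result"),    # inside p1's assumed window
    ("p1", date(2018, 1, 1), "HbA1c result"),    # well before the pregnancy
    ("p2", date(2020, 3, 1), "glucose result"),  # p2 has no delivery dates
]
deliveries = {"p1": [date(2020, 6, 1)]}

kept = exclude_pregnancy_events(events, deliveries)
```

In this sketch, a patient with several delivery dates is handled by checking each window in turn, so an event is dropped if it falls within any one of them.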

Results

Using the criteria developed through this work, we analyzed encounter, diagnosis, laboratory, and medication data to identify a well-defined cohort of 109,970 individuals from Cerner’s COVID-19 database who met SUPREME-DM criteria for having diabetes of any type. We then identified a cohort of 314,798 individuals without diabetes, defined by the absence of all diabetes-related criteria developed through this work.
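The split between the two cohorts can be sketched as follows. This is a hedged illustration, not the authors' implementation: the criterion names, the `example_rule` placeholder, and the "unclassified" group for patients who have some diabetes-related evidence but do not meet the case definition are assumptions; the abstract does not describe how such patients were handled, and the placeholder rule should not be read as the SUPREME-DM definition.

```python
def assign_cohorts(patient_criteria, meets_definition):
    """Partition patients by the diabetes-related criteria observed for them.

    patient_criteria: dict mapping patient_id -> set of observed criterion names
    meets_definition: callable deciding whether a set of criteria satisfies the
                      case definition (hypothetical; supplied by the caller)
    """
    cohorts = {"diabetes": set(), "no_diabetes": set(), "unclassified": set()}
    for patient_id, criteria in patient_criteria.items():
        if meets_definition(criteria):
            cohorts["diabetes"].add(patient_id)
        elif not criteria:
            # Absence of every diabetes-related criterion.
            cohorts["no_diabetes"].add(patient_id)
        else:
            # Some evidence, but the case definition is not met (assumed group).
            cohorts["unclassified"].add(patient_id)
    return cohorts


def example_rule(criteria):
    """Placeholder rule for illustration only, not the SUPREME-DM definition."""
    return "inpatient_diagnosis" in criteria or len(criteria) >= 2


# Hypothetical patients and criterion names.
patients = {
    "a": {"inpatient_diagnosis"},
    "b": set(),
    "c": {"elevated_hba1c"},
    "d": {"elevated_hba1c", "diabetes_medication"},
}

cohorts = assign_cohorts(patients, example_rule)
```

Passing the case definition in as a function keeps the partitioning logic separate from the rule itself, so the same code could be reused if the criteria are revised.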

Conclusion

Researchers working with EMR datasets can use criteria developed through this work to extract comparable cohorts of individuals with diabetes. Using consistent, reproducible methods for cohort selection across studies will enhance scientific rigor and improve comparability of findings across patient populations. Next steps for this work will involve evaluating this algorithm’s performance against naïve algorithms that use only diabetes diagnosis codes.

Acknowledgement

EMT is supported by the NIH / National Library of Medicine under award number T32LM012410.

 

References

  1. Nichols G, Desai J, Elston J, et al. Construction of a multisite DataLink using electronic health records for the identification, surveillance, prevention, and management of diabetes mellitus: The SUPREME-DM project. Prev Chronic Dis. 2012;9:110311.

  2. Canelón SP, Burris HH, Levine LD, et al. Development and evaluation of MADDIE: Method to Acquire Delivery Date Information from Electronic health records. Int J Med Inform. 2021;145:104339.

 

Link: https://knowledge.amia.org/76677-amia-1.4637602