The Center for Population Health Information Technology and the ACG System Team – part of the Department of Health Policy and Management at the Johns Hopkins Bloomberg School of Public Health – have been “putting lots of effort into mining the electronic medical record for predictive modeling,” reports Jonathan P. Weiner DrPH, Professor of Health Policy & Management and of Health Informatics; Director, Center for Population Health Information Technology (CPHIT); ACG System co-developer and Executive Director of Research. The latest result: “We recently published a breakthrough article in Medical Care presenting and evaluating the ACG System’s new expanded Geriatric Risk/Frailty Risk metrics for predictive modeling derived from both ‘structured’ and ‘free text’ EHRs.” The risk metric, he notes, has considerable relevance for Medicare and disabled populations.
Here are additional details (as featured in Predictive Modeling News April 2018) from “Defining and Assessing Geriatric Risk Factors and Associated Health Care Utilization Among Older Adults Using Claims and Electronic Health Records”.
The researchers set out to “define and compare geriatric risk factors derivable from claims, structured EHRs and unstructured EHRs, and estimate the relationship between geriatric risk factors and health care utilization.” JHU “collaborated with the well-known Atrius medical group,” Weiner adds, which boasts more than 1 million patients in Massachusetts and constitutes the main physician group of the Harvard Pilgrim Health Plan. Atrius provided EMR data, both text and structured, he also notes, and claims for a cohort of about 25,000 Medicare Advantage patients for a multi-year period.
The team, the paper reports, “defined 10 individual geriatric risk factors and a summary geriatric risk index based on diagnosed conditions and pattern matching techniques applied to EHR free text.” Prevalence was estimated using claims, structured EHRs and structured and unstructured EHRs combined; “the association of geriatric risk index with any occurrence of hospitalizations, emergency department visits and nursing home visits was estimated using logistic regression adjusted for demographic and comorbidity covariates.” Here are excerpts from the results described in the paper:
Predictive Modeling News talked to study authors Weiner, Hong J. Kan PhD MPP MA and Hadi Kharrazi MD PhD MHI, all faculty at the Johns Hopkins Bloomberg School of Public Health Center for Population Health IT and members of the Johns Hopkins ACG System Research and Development Team.
Hong J. Kan PhD MPP MA, Hadi Kharrazi MD PhD MHI and Jonathan Weiner DrPH: Thanks for the kind words and for featuring our work! The main novelty of this study lies in its comparison of the incremental predictive value of insurance claims, structured electronic health records and EHR free text when used to measure a wide range of risk factors relevant to geriatric populations. We hypothesized that not all risk information is equally recorded across the three data sources. Certain concepts, such as dementia, may be well-recorded using diagnosis codes in claims and structured EHRs, but other important factors, such as lack of social support, will more likely be captured in EHR free text based on the clinicians’ notes. Indeed, we found that when information derived from unstructured data using text mining techniques was combined with claims or EHR structured data, the prevalence of risk factors increased significantly.
For example, when EHR free text notes were mined, the prevalence of “walking difficulty” in our elderly cohort was 19.9% (a 2.5-fold increase over EHR structured data only); the prevalence of falls was 13.2% (a 2.5-fold increase); weight loss, 6.9% (a 1.3-fold increase); and dementia, 4.5% (a 1.4-fold increase). The largest change was seen in lack of social support, which was negligibly recorded (less than 0.1%) in both structured EHRs and claims, but its prevalence increased to over 11% when the free text was searched. The results demonstrate the value of information from free text in EHRs, especially when information is not necessarily recorded using structured data, possibly because of a lack of financial incentives and/or existing ICD codes. In addition, the study shows that geriatric risk factors extracted from EHRs and claims independently predict future health care utilization, measured as hospitalizations, emergency department visits and nursing home use.
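To make the comparison above concrete, here is a minimal Python sketch of how prevalence and fold increase might be computed when free-text pattern matching is layered on top of structured data. Everything in it is an assumption for illustration: the regular expressions, the factor names and the four toy patient records are hypothetical and are not the study's lexicon, algorithm or data.

```python
import re

# Hypothetical patterns for illustration only; the study's lexicon is not given here.
FREE_TEXT_PATTERNS = {
    "walking_difficulty": re.compile(r"\b(difficulty walking|unsteady gait|uses a walker)\b", re.I),
    "falls": re.compile(r"\b(fell|fall|falls)\b", re.I),
    "lack_of_social_support": re.compile(r"\b(lives alone|no family support|socially isolated)\b", re.I),
}

# Made-up toy records: factors already coded in structured data, plus a free-text note.
PATIENTS = [
    {"id": 1, "structured": {"falls"}, "note": "Patient fell at home last week."},
    {"id": 2, "structured": set(), "note": "Unsteady gait noted; lives alone."},
    {"id": 3, "structured": set(), "note": "No acute complaints."},
    {"id": 4, "structured": {"walking_difficulty"}, "note": "Uses a walker. Daughter visits daily."},
]


def factors_from_text(note):
    """Return the set of factor names whose pattern matches the note."""
    return {name for name, pattern in FREE_TEXT_PATTERNS.items() if pattern.search(note)}


def prevalence(patients, factor, use_text):
    """Share of patients with the factor, optionally adding free-text matches."""
    hits = 0
    for patient in patients:
        found = set(patient["structured"])
        if use_text:
            found |= factors_from_text(patient["note"])
        hits += factor in found
    return hits / len(patients)


def fold_increase(patients, factor):
    """Combined prevalence divided by structured-only prevalence (None if the base is zero)."""
    base = prevalence(patients, factor, use_text=False)
    combined = prevalence(patients, factor, use_text=True)
    return combined / base if base > 0 else None


for name in FREE_TEXT_PATTERNS:
    print(name, prevalence(PATIENTS, name, False), prevalence(PATIENTS, name, True),
          fold_increase(PATIENTS, name))
```

A factor with no structured recording at all, like lack of social support in this toy data, has no defined fold increase, which is why the function returns None in that case. Simple keyword matching of this kind also ignores negation (a note saying "no falls" would still match), so a real pipeline would need more careful text handling.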
HJK, HK & JW: We know that frailty and other geriatric syndrome-related factors are clinical conditions prevalent in older adults and that the presence of these factors matters in many ways. Historically, “frailty” has been measured with surveys and in-person assessments, often requiring clinician administration, which limits the feasibility of these methods for large populations. We developed and refined a set of geriatric risk factors that can be extracted from existing electronically captured data, such as claims and EHRs. These risk factors can potentially be used to identify patients with high geriatric risk, including aspects of frailty, based on existing data for improved care management and population health management. This geriatric risk does add explanatory power above and beyond standard disease-oriented diagnostic risk adjustment/predictive modeling. Updated geriatric risk algorithms based on structured EHR and claims data are available in the next release of the Johns Hopkins ACG System. We expect NLP/text mining versions in the near future.
HJK, HK & JW: We believe EHRs hold great promise for improving care at both the individual and population levels, especially as they near universal adoption and real-time data availability. Further standardization of data capture and improved interoperability of EHR systems will eventually allow EHRs to become a powerful source of data for clinical care management and population/public health management. We envision that there will be many breakthroughs along the way as EHRs become easier to access and analyze. Our research team at the Johns Hopkins School of Public Health Center for Population Health IT has a large research portfolio focusing on integrating EHRs and other novel data sources, including free-text/NLP, new types of clinical information, such as lab data and vital signs, and the all-important non-medical social determinants of health.
HJK, HK & JW: Geriatric risk factors are health status characteristics common in older adults that have the potential to be intervened upon and ameliorated if identified in a timely manner. Since our case identification algorithm is based on existing diagnosis data (ICD-9 or ICD-10 codes) from EHRs and claims, it can be readily operationalized by most provider and payer organizations. Our study shows that patients with multiple geriatric risk factors are particularly susceptible to future hospitalizations, ED visits and nursing home use. We recognize that there will be challenges in applying EHRs’ free text data, depending on the specific EHR setup and the tools used to extract frailty markers in a health care organization.
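As a rough illustration of what operationalizing diagnosis-code-based case identification could look like, the sketch below flags risk factors from ICD-10-CM code prefixes and counts them per patient. It rests on several assumptions: the code-to-factor mapping is a small illustrative sample and not the ACG System's code sets, the summary index is taken to be a simple count of distinct factors (the article does not say how the study's index is constructed), and the function names and the threshold of 2 are hypothetical.

```python
# Illustrative ICD-10-CM prefixes only; NOT the ACG System's actual code sets.
FACTOR_CODE_PREFIXES = {
    "dementia": ("F03",),
    "walking_difficulty": ("R26.2",),
    "weight_loss": ("R63.4",),
    "falls": ("W19",),
}


def risk_factors(diagnosis_codes):
    """Return the set of factors whose code prefixes match any of the patient's codes."""
    codes = [code.strip().upper() for code in diagnosis_codes]
    return {
        factor
        for factor, prefixes in FACTOR_CODE_PREFIXES.items()
        if any(code.startswith(prefixes) for code in codes)
    }


def risk_index(diagnosis_codes):
    """Assumed summary index: the number of distinct risk factors present."""
    return len(risk_factors(diagnosis_codes))


def flag_high_risk(patients, threshold=2):
    """Return sorted IDs of patients whose index meets a hypothetical threshold."""
    return sorted(pid for pid, codes in patients.items() if risk_index(codes) >= threshold)


# Made-up example patients keyed by ID, each with a list of diagnosis codes.
SAMPLE = {
    "A": ["F03.90", "W19.XXXA"],
    "B": ["I10"],
    "C": ["R26.2", "R63.4", "f03.90"],
}
print(flag_high_risk(SAMPLE))
```

Matching on prefixes keeps the mapping short, but an organization would substitute its own validated code lists and choose a threshold suited to its care management capacity.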
HJK, HK & JW: As their interoperability and standardization increase, EHRs, with their broad array of clinical data, will add important information to the mix. After careful assessment, such as that described in the study we published in the journal Medical Care, EHR-derived data should be fully integrated into predictive modeling and analytics activities. Moreover, in the future, we believe that claims and other administrative datasets, which are the mainstay of most activities today, will disappear and give way to these more clinically based health IT systems.
HJK, HK & JW: At CPHIT at Johns Hopkins and within the Johns Hopkins ACG System R&D team, we have a very large portfolio of related work, including two articles featured in past issues of this newsletter, as well as another recently published paper that describes additional details of our approach to using EHR free text: Anzaldi LJ, Davison A, Boyd C, Leff B, Kharrazi H. “Comparing clinician descriptions of frailty and geriatric syndromes using electronic health records: A retrospective cohort study.” BMC Geriatrics. 2017; 17 (247): 1-7 (https://www.ncbi.nlm.nih.gov/pubmed/29070036). As part of our research, we also are comparing the results of our digitally derived ACG Frailty Risk/Geriatric Risk measures to standardized frailty assessments completed in person. Also, two other papers are currently under review that will describe the technical details of our NLP approach as well as the case identification rates for geriatric syndromes using claims versus EHRs (both structured and free text).