Physicians' Academy for Cardiovascular Education

Risk score based on electronic health record data improves prediction of CAD risk

Coronary Risk Estimation Based on Clinical Data in Electronic Health Records

Literature - Petrazzini BO, Chaudhary K, Márquez-Luna C, et al. - J Am Coll Cardiol. 2022 Mar 29;79(12):1155-1166. doi: 10.1016/j.jacc.2022.01.021

Introduction and Methods


Several risk assessment tools have been designed to identify individuals at risk of developing coronary artery disease (CAD), such as the pooled cohort equations (PCE) and polygenic risk scores (PRS). The applicability of the PCE score, the conventional clinical risk score, is limited because it can both underestimate and overestimate CAD risk [1-3] and is biased in certain populations [4-6]. The clinical utility of a high PRS for CAD is still under investigation [7]. Because electronic health records (EHRs) capture core characteristics of many diseases, they could be a valuable resource for more precise risk prediction and stratification.

Aim of the study

The authors investigated whether a prediction score based on EHR clinical features (EHR score) can improve short-term CAD risk prediction and reclassification beyond that of the PCE score and PRS.



To develop the EHR score, a machine learning–based approach was applied to clinical data extracted from EHRs.

The predictive power of the EHR score, PCE score, and PRS was first assessed in a hospital-based, multiethnic, EHR-linked cohort (BioMe Biobank cohort) and then externally validated in a population-based cohort with available EHR and genotype data (UK Biobank cohort).

Both study populations were selected according to the clinical guidelines used to guide initiation of statin treatment. Individuals aged 40–79 years were included; those taking statins and those with second-degree or closer relatedness were excluded. The BioMe Biobank cohort comprised 555 CAD cases and 6349 control subjects, and the UK Biobank cohort comprised 3130 CAD cases and 378,344 control subjects.

The predictive performance of the three scores was assessed using the area under the receiver-operating characteristic curve, and reclassification was quantified with the net reclassification improvement.
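As a rough illustration of the first metric, the area under the receiver-operating characteristic curve can be read as the probability that a randomly chosen case receives a higher score than a randomly chosen control. The sketch below computes it that way in plain Python. The function name `auroc` and all scores and labels are made up for illustration; they are not data or code from the study.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (case, control) pairs in which the case has
    the higher score; tied pairs count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 1 = CAD case, 0 = control.
labels = [1, 1, 1, 0, 0, 0, 0]
score_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]  # made-up scores from model A
score_b = [0.6, 0.3, 0.4, 0.5, 0.3, 0.2, 0.1]  # made-up scores from model B

print(round(auroc(score_a, labels), 3))  # 0.917
print(round(auroc(score_b, labels), 3))  # 0.792
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls. The pairwise loop is quadratic in cohort size, so a rank-based implementation would be preferable for cohorts as large as those described here.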



The primary analysis assessed prediction of CAD 1 year prior to diagnosis in cases. Secondary analyses assessed the risk of stroke alone and the risk of atherosclerotic cardiovascular disease (ASCVD; CAD, angina, and stroke). In addition, the ability of the EHR score and PRS to reclassify individuals relative to their PCE-based 1-year CAD risk category was assessed.
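To make the idea of reclassification concrete, the following is a minimal sketch of a two-category net reclassification improvement. It assumes a single risk cutoff separating "low" from "high" risk; the cutoff of 0.05, the function name `categorical_nri`, and all risk values are hypothetical and are not taken from the study.

```python
def categorical_nri(old_risk, new_risk, labels, threshold):
    """Two-category net reclassification improvement (NRI).

    Events (cases) should move up to the high-risk category under the new
    score, and non-events (controls) should move down. Returns a tuple
    (event NRI, non-event NRI, total NRI).
    """
    up_e = down_e = n_e = 0  # events: moved up, moved down, total
    up_n = down_n = n_n = 0  # non-events: moved up, moved down, total
    for old, new, y in zip(old_risk, new_risk, labels):
        old_high = old >= threshold
        new_high = new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if y == 1:
            n_e += 1
            up_e += moved_up
            down_e += moved_down
        else:
            n_n += 1
            up_n += moved_up
            down_n += moved_down
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    nri_events = (up_e - down_e) / n_e
    nri_nonevents = (down_n - up_n) / n_n
    return nri_events, nri_nonevents, nri_events + nri_nonevents


# Hypothetical toy data: 1 = CAD case, 0 = control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
baseline = [0.10, 0.03, 0.04, 0.09, 0.08, 0.02, 0.03, 0.06]  # made-up baseline risks
updated = [0.12, 0.09, 0.02, 0.10, 0.04, 0.01, 0.04, 0.03]   # made-up new-score risks

print(categorical_nri(baseline, updated, labels, threshold=0.05))
# (0.25, 0.5, 0.75)
```

In this toy example one of four cases is correctly moved from low to high risk and two of four controls are correctly moved from high to low risk, giving a total of 0.75. A positive total indicates that the new score shifts individuals in the right direction on balance.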

Main results



Positive and negative predictive values, sensitivity, specificity, and false positives


The EHR score improved prediction of the 1-year CAD risk and improved reclassification of individuals for that risk, compared with the PCE score and PRS, particularly in low-risk individuals. According to the authors, the clinical use case of this score “would be to identify high-risk individuals (who are flagged as low risk by traditional scores), optimizing prevention and care using embedded pathways.”


Find this article online at J Am Coll Cardiol.
