Predictive performance of risk scores for ischemic stroke in AF

Comprehensive comparison of stroke risk score performance: a systematic review and meta-analysis among 6 267 728 patients with atrial fibrillation

Literature - Van der Endt VHW, Milders J, Penning de Vries BBL, et al. - Europace. 2022 Jul 27;euac096. doi: 10.1093/europace/euac096

Introduction and methods


In patients with AF, the risk of ischemic stroke (IS) can be predicted by several different risk scores. However, current risk scores appear to have a limited overall ability to predict IS in AF patients [1-3]. At the same time, newer risk scores and updates of earlier scores seem to be understudied.

Aim of the study

The objectives of this study were (i) to identify and systematically review all available risk scores for predicting IS in AF patients, together with their external validations and updates; (ii) to assess the methodological quality of the studies on these risk scores; and (iii) to calculate pooled estimates of their predictive performance.

Methods

The authors conducted a literature search in PubMed and Web of Science for studies in which a multivariable prognostic IS risk score for AF patients was developed, validated, or updated. The Prediction model Risk Of Bias ASsessment Tool (PROBAST) was used to examine the methodological quality of the studies. To assess discrimination (i.e., a score’s ability to distinguish between patients with and without an event), pooled c-statistics were calculated using a random-effects meta-analysis, based on 6,267,728 patients and 359,373 IS events. Calibration (i.e., agreement between predicted and observed risk) was also assessed, based on several statistics. For each of the validation studies included in the meta-analysis, 8 sensitivity analyses were performed.
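To make the discrimination measure concrete, the following is a minimal Python sketch of a c-statistic for a binary outcome. It is an illustration only and not the authors' code: the function name `c_statistic` and the example data are hypothetical. Studies with time-to-event data often use a variant such as Harrell's C that accounts for censoring, which this sketch does not.

```python
from itertools import product


def c_statistic(predicted_risk, had_event):
    """Share of (event, non-event) patient pairs in which the patient
    with the event received the higher score. Ties count as 0.5."""
    events = [r for r, e in zip(predicted_risk, had_event) if e]
    nonevents = [r for r, e in zip(predicted_risk, had_event) if not e]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for ev, non in product(events, nonevents):
        if ev > non:
            concordant += 1.0   # correctly ranked pair
        elif ev == non:
            concordant += 0.5   # tied scores carry no ranking information
    return concordant / (len(events) * len(nonevents))


# Hypothetical data: points on an integer risk score and stroke outcome (1 = IS)
scores = [1, 2, 2, 3, 4, 5]
stroke = [0, 0, 1, 0, 1, 1]
print(round(c_statistic(scores, stroke), 3))
```

A value of 0.5 corresponds to ranking patients no better than chance, and 1.0 to perfect separation of patients with and without an event.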

Main results

  • In total, 19 scores were identified, which were validated in 107 studies (329 external validations in total) and updated at least once in 40 studies (76 updates in total).

Methodological quality of studies

  • PROBAST indicated a risk of methodological bias in all studies, mostly due to a high or unclear risk of bias in the outcome domain and a high risk of bias in the analysis domain of the tool.
  • For all studies, there were also concerns regarding the applicability of the risk score.

Development studies

  • Discrimination was reported in 16 of the 19 development studies (c-statistic ranged from 0.61 to 0.86).
  • Information on calibration (presented as observed vs. expected risks and/or Hosmer–Lemeshow statistic) was available for 8 risk scores and indicated “adequate calibration” or “no evidence of poor calibration.”
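The two calibration measures mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any included study: the function names, the equal-sized grouping, and the example data are hypothetical, and the sketch treats the outcome as binary with no censoring.

```python
def observed_expected_ratio(predicted_risk, had_event):
    """Observed number of events divided by the sum of predicted risks.
    A ratio near 1 suggests good calibration-in-the-large."""
    return sum(had_event) / sum(predicted_risk)


def hosmer_lemeshow(predicted_risk, had_event, groups=10):
    """Hosmer-Lemeshow statistic: patients are sorted by predicted risk,
    split into roughly equal groups, and observed and expected event
    counts are compared per group."""
    pairs = sorted(zip(predicted_risk, had_event))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(e for _, e in chunk)
        mean_risk = expected / size
        variance = size * mean_risk * (1 - mean_risk)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
    return stat


# Hypothetical predicted risks and outcomes (1 = IS)
risks = [0.1, 0.1, 0.2, 0.2, 0.4, 0.4, 0.6, 0.6]
events = [0, 0, 0, 1, 0, 1, 1, 1]
print(observed_expected_ratio(risks, events))
print(hosmer_lemeshow(risks, events, groups=2))
```

The Hosmer-Lemeshow statistic is conventionally compared with a chi-square distribution with (groups − 2) degrees of freedom; a large value points to poor calibration.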

Validation studies

  • Of the 329 validations, most were performed on the CHA2DS2-VASc score (n=147) and CHADS2 score (n=75). Several of the newer risk scores had not been validated.
  • All 107 validation studies presented information on discrimination, with c-statistics indicating poor (<0.60) to reasonable (0.60–0.80) and, in exceptional cases, good (>0.80) discrimination. For risk scores developed after the publication of the CHA2DS2-VASc score in 2010, no c-statistic below 0.60 was reported.
  • Calibration was reported in 14 of the 107 validation studies (13%). These showed improved calibration measures for the modified CHADS2, ABC, and GARFIELD-AF scores compared with the CHA2DS2-VASc score.

Update studies

  • The 40 update studies only reported on the CHA2DS2-VASc score (n=25), CHADS2 score (n=7), or both scores (n=8).
  • For the CHA2DS2-VASc score, 46/49 updates showed improved discrimination compared with the original CHA2DS2-VASc score, whereas 13/17 updates for the CHADS2 score indicated improved discrimination compared with the original CHADS2 score.

Pooled c-statistic

  • For 10 scores (CHA2DS2-VASc, CHADS2, AFI, Framingham, ATRIA, SPAF, ABC, modified CHADS2, GARFIELD-AF, and GARFIELD-AF II), pooled c-statistics could be calculated, which ranged from 0.598 (95% CI: 0.558–0.636; 12 validations) for the AFI to 0.715 (95% CI: 0.674–0.754; 10 validations) for the modified CHADS2 score.
  • The pooled c-statistic was 0.644 (95% CI: 0.635–0.653) for the CHA2DS2-VASc score and 0.658 (95% CI: 0.644–0.672) for the CHADS2 score.
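As an illustration of how c-statistics from several validations can be pooled, here is a minimal random-effects sketch in Python. The summary above does not state which estimator or scale the authors used, so two assumptions are made here: the c-statistics are pooled on the logit scale, and the between-study variance is estimated with the DerSimonian-Laird method. The function name `pool_c_statistics` and the input values are hypothetical and are not data from the review.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


def pool_c_statistics(estimates):
    """estimates: list of (c, lower95, upper95) tuples, one per validation.
    Returns (pooled c, lower 95% limit, upper 95% limit)."""
    # Work on the logit scale; derive each variance from the reported 95% CI
    y = [logit(c) for c, _, _ in estimates]
    v = [((logit(hi) - logit(lo)) / (2 * 1.96)) ** 2 for _, lo, hi in estimates]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird estimate of the between-study variance tau^2
    k = len(y)
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / scale) if scale > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return expit(pooled), expit(pooled - 1.96 * se), expit(pooled + 1.96 * se)


# Hypothetical validation results: (c-statistic, lower 95% CI, upper 95% CI)
validations = [(0.62, 0.58, 0.66), (0.68, 0.64, 0.72), (0.65, 0.60, 0.70)]
pooled, lower, upper = pool_c_statistics(validations)
print(f"pooled c = {pooled:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

Pooling on the logit scale keeps the confidence limits between 0 and 1 and yields intervals that are slightly asymmetric around the point estimate.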

Sensitivity analyses

  • Most sensitivity analyses showed no or only marginal differences, indicating that factors such as age, anticoagulation therapy, and the PROBAST score had little or no effect on the discriminative ability of most risk scores.

Conclusion

This systematic review and meta-analysis identified 19 risk scores (with 329 validations and 76 updates in total) for predicting IS risk in patients with AF. Although all risk scores showed poor to reasonable performance, newer scores tended to have slightly better discriminative ability than the conventional, guideline-endorsed CHA2DS2-VASc score. The modified CHADS2 score showed the best overall discriminative ability. Methodological bias was observed in all studies. The authors stress that “additional external validations and data on calibration are required before considering the newer scores in clinical practice.”

References

1. Van Staa TP, Setakis E, Di Tanna GL, Lane DA, Lip GY. A comparison of risk stratification schemes for stroke in 79,884 atrial fibrillation patients in general practice. J Thromb Haemost. 2011;9:39-48.

2. Borre ED, Goode A, Raitz G, Shah B, Lowenstern A, Chatterjee R, et al. Predicting thromboembolic and bleeding event risk in patients with non-valvular atrial fibrillation: a systematic review. Thromb Haemost. 2018;118:2171-87.

3. Fang MC, Go AS, Chang Y, Borowsky L, Pomernacki NK, Singer DE. Comparison of risk stratification schemes to predict thromboembolism in people with nonvalvular atrial fibrillation. J Am Coll Cardiol. 2008;51:810-5.

