Original Papers

The diagnostic yield of three clinical prediction rules for pulmonary embolism

Alirio Rodrigo Bastidas-Goyes, Nazhda Ivette Faizal-Gómez, Santiago Ortiz-Ramírez, Giuly Aguirre-Contreras • Chía (Colombia)

Dr. Alirio Rodrigo Bastidas-Goyes: Internist, Pulmonologist, Epidemiologist; Drs. Nazhda Ivette Faizal-Gómez and Santiago Ortiz-Ramírez: Internal Medicine Residents; Dr. Giuly Aguirre-Contreras: General Practitioner, Graduate of the Medical Program. Universidad de La Sabana. Chía (Colombia).

Correspondence: Dr. Alirio Rodrigo Bastidas-Goyes. Chía (Colombia). E-mail: alirio.bastidas@unisabana.edu.co Received: 16/VI/2019 Accepted: 10/II/2020

DOI: https://doi.org/10.36104/amc.2020.1384


Objective: pulmonary embolism (PE) is the third cause of cardiovascular death worldwide. The evaluation of pre-test probability using the Wells, Geneva and Pisa clinical prediction rules has been amply validated in prior studies. However, there are insufficient data for evaluating their diagnostic yield in a Colombian population. The goal of this article is to evaluate the yield of these scales in our population.

Methods: this was a retrospective cohort study with diagnostic test analysis in a tertiary level hospital from 2009 to 2017, which included all subjects over the age of 18 who had undergone a chest computed tomography angiography (CTA) due to a clinical suspicion of PE. All the necessary variables for constructing the Wells, Geneva and Pisa rules were recorded. Each score was calculated numerically and then classified according to probability. Pulmonary embolism was diagnosed through a CTA read by a radiologist. The data were entered on an Excel spreadsheet and analyzed using a licensed SPSS statistical program.

Results: a total of 507 subjects were included for Wells and Geneva scores and 339 for the Pisa score. The average age was 56 years (SD: 19.8) and 56.6% were males. A statistically significant relationship was found between the different calculated scores and the diagnosis of pulmonary embolism: low, intermediate and high Wells probability p<0.001; less likely and likely Wells p<0.001; low, intermediate and high Geneva p=0.006; and low, intermediate, moderate and high Pisa p=0.001. The area under the ROC curve (AUC) for Wells was 0.715 (95% CI: 0.663-0.767) (p<0.001), for Geneva was 0.611 (95% CI: 0.553-0.668) (p<0.001), and for Pisa was 0.643 (95% CI: 0.574-0.713) (p<0.001).

Conclusions: the study showed a greater PE diagnostic yield using the Wells score in our setting. There are limitations to the application and development of the Pisa score associated with a lower yield in our patients. (Acta Med Colomb 2020; 45. DOI: https://doi.org/10.36104/amc.2020.1384).

Key words: embolism and thrombosis, reproducibility and validity, diagnosis, chest computerized tomography angiography, probability.


Introduction

Pulmonary embolism (PE) is defined as the obstruction of blood flow through pulmonary vessels causing a pulmonary ventilation/perfusion mismatch with potentially fatal consequences (1, 2). This condition is associated with high morbidity and is recognized as the third cause of cardiovascular death after acute myocardial infarction and acute neurovascular syndrome (2, 3). It is estimated that approximately one million venous thromboembolic events occur every year in European countries, with 75% of them due to inpatient PE (3). In Colombia, there are few epidemiological data on PE; a study by Dennis et al. in various hospitals in 1996 found a prevalence close to 7% (4).

Since the advent of computed tomography pulmonary angiography (CTPA) as the gold standard, a PE "overdiagnosis" phenomenon has grown, which, according to some authors, has led to finding insignificant thrombi and treating patients who perhaps do not need it (5). This has led to the need to strengthen pre-test scales in order to provide direction on which patients will benefit from in-depth studies (1-3). The CTPA is an invasive, costly and potentially risky test, and the need for it should therefore be carefully examined (6). Clinical decision-making rules may be comparable to the diagnostic yield of a physician experienced in the diagnosis of pulmonary embolism, and they are useful in avoiding unnecessary and potentially harmful tests without increasing the risk of underdiagnosis (7-9).

There are various clinical prediction scales or scores for diagnosing PE, with the most validated being the Wells, Geneva and Pisa scores. If a high probability is obtained using these clinical rules, a chest computed tomography scan or pulmonary ventilation/perfusion scan is recommended, and if the probability is low, D-dimer (DD) is useful. The Wells scale has been extensively validated and is the most frequently used in our clinical practice (9, 10). The calculation is based on a correlation between the history and physical exam which allows the calculation of a low, intermediate or high probability for PE (traditional Wells) or a likely or unlikely PE result (modified Wells). However, it has been reported that in up to 50% of cases it is not used or is used incorrectly (11, 12). There also seems to be a reduction in the scale's precision in elderly patients. Although the modified scale performs similarly to the original scale, it is easier to correlate with the ventilation/perfusion scan results (1, 13). The Geneva scale classifies patients in three probability categories (low, intermediate and high), and the Pisa score can estimate the risk of PE through clinical and radiological findings, classifying patients as low (<10%), intermediate (>10% and <50%), moderately high (>50% and <90%) or high (>90%) risk (14). The Wells and Geneva scores have a similar yield for diagnosing acute pulmonary embolism (15-17), and the Pisa model appears to be more precise (1, 18). However, the yield of these different scores in our context is not fully known, due in part to the fact that the use and application of these clinical prediction rules is variable, familiarity with them is not uniform, and the variables needed for their construction are different.

The difficulty in making diagnostic decisions and the eligibility of patients compared to costs motivated this diagnostic test study, whose objective is to describe the diagnostic yield of the Wells, revised Geneva and Pisa scores using chest x-rays in subjects with a diagnostic suspicion of PE, both on admission to the emergency room as well as in the inpatient environment.

Materials and methods

This was a retrospective cohort study with diagnostic test analysis at a tertiary care hospital from 2009-2017. All subjects over the age of 18 who had undergone a CTPA due to clinical suspicion of PE were included. Subjects who had undergone the test due to a suspicion of other pathologies such as aortic aneurysm, suspected vascular trauma, suspected non-traumatic aortic disease, or acute aortic syndrome; subjects with no data and those whose charts could not be located due to data system problems were excluded.

All the clinical and paraclinical variables necessary for constructing Wells, revised Geneva and Pisa scores were recorded, along with radiological findings, following the authors' recommendations for the construction of each of these scores when a PE diagnosis is suspected (19, 20). The following variables were included independently: age and sex, site where the disease was suspected (emergency room or inpatient ward), history of cardiovascular disease, history of acute myocardial infarction (AMI), heart failure, arterial hypertension (HTN), atrial fibrillation, valvular disease, pulmonary disease (COPD, asthma, pulmonary fibrosis), surgery within the previous four weeks, use of general anesthesia, a history of deep vein thrombosis, pulmonary embolism, trauma within the previous four weeks, leg fractures, malignancies, active cancer within the previous year, treatment for malignancy, or use of oral contraceptives; clinical findings of immobility for more than three days, dyspnea, length of dyspnea, chest pain, hemoptysis, signs and symptoms of deep vein thrombosis, pulmonary embolism as the most probable diagnosis, mental status, temperature, systolic arterial pressure, diastolic arterial pressure, heart rate, respiratory rate, arterial oxygen saturation (SaO2), ECG findings of right ventricular (RV) overload, radiological findings of pulmonary oligohemia, hilar artery amputation, consolidation with or without pulmonary infarct, DD, and PE diagnosis by CTPA (21).

Each score was calculated numerically and then classified according to probability: the Wells score in three levels (low: <2, intermediate: 2-6 and high >6) (10, 22) and two levels (less likely ≤4 and likely >4) (23); the revised Geneva score in three levels (low: 0-3, intermediate 4-10, high 11-22) (15); and the Pisa score in four levels (low: 0-10, intermediate: 11-50, moderately high: 51-80 and high 81-100) (1). The PE diagnosis was made with the CTPA results read by a radiologist as positive for pulmonary embolism (8, 24).
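The cut-off points listed above can be written as a small set of lookup functions. The following Python sketch is only illustrative: the function names are hypothetical, and it assumes that the Pisa score is expressed as a percentage and that a value moves to the next band only once it exceeds the stated upper bound.

```python
def classify_wells_three_level(score: float) -> str:
    """Three-level Wells classification: low <2, intermediate 2-6, high >6."""
    if score < 2:
        return "low"
    if score <= 6:
        return "intermediate"
    return "high"


def classify_wells_two_level(score: float) -> str:
    """Two-level Wells classification: less likely <=4, likely >4."""
    return "less likely" if score <= 4 else "likely"


def classify_revised_geneva(score: int) -> str:
    """Revised Geneva classification: low 0-3, intermediate 4-10, high 11-22."""
    if score <= 3:
        return "low"
    if score <= 10:
        return "intermediate"
    return "high"


def classify_pisa(probability_percent: float) -> str:
    """Pisa classification: low 0-10, intermediate 11-50,
    moderately high 51-80, high 81-100.

    Assumption: fractional values such as 10.5 are assigned to the
    higher band, since the text only gives integer bounds.
    """
    if probability_percent <= 10:
        return "low"
    if probability_percent <= 50:
        return "intermediate"
    if probability_percent <= 80:
        return "moderately high"
    return "high"


# Hypothetical patient scores, not study data.
print(classify_wells_three_level(4.5))  # intermediate
print(classify_wells_two_level(4.5))    # likely
print(classify_revised_geneva(11))      # high
print(classify_pisa(35))                # intermediate
```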

The sample size was calculated using the results of the meta-analysis by Lucassen W (19) in which the Wells scale was found to have a sensitivity of 0.85 and a specificity of 0.51, and the revised Geneva score had a sensitivity of 0.91 and a specificity of 0.37. In order to calculate a confidence interval with a 30% prevalence of the disease, 5% precision and 95% confidence level, a minimum of 503 subjects were needed; the subjects were included using consecutive convenience sampling until the required number was completed.

Subsequently, data were entered on an Excel spreadsheet and then analyzed using a licensed SPSS statistical program. The qualitative variables were summarized as frequencies and percentages and the quantitative variables according to their distribution: for normally distributed variables, mean and standard deviation, and for non-normal variables, median and interquartile range. A bivariate analysis was performed with each of the study variables. The quantitative variables were compared according to their distribution using the Student's t-test or Mann-Whitney U, and the qualitative variables were compared using Chi square. Once the scores were constructed, they were compared with the presence or absence of a PE diagnosis on tomography, and the area under the receiver operating characteristic curve (AUC) was then calculated for each of the quantitative scores of the clinical prediction rules, with its 95% confidence interval; a p <0.05 was considered statistically significant. The Helsinki ethical guidelines and Resolution 8430 of 1993 for human research were followed, as well as data protection and confidentiality.
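As an illustration of the AUC analysis described above, the following Python sketch computes the AUC as the proportion of (PE, no PE) subject pairs in which the subject with PE has the higher score, counting ties as one half. The confidence interval uses the Hanley-McNeil approximation, which is an assumption made here for illustration; the statistical package used in the study may apply a different method. The function names and the example scores are hypothetical.

```python
import math
from typing import Sequence, Tuple


def auc_mann_whitney(pos_scores: Sequence[float],
                     neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a subject with PE scores higher
    than a subject without PE (ties count as one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def hanley_mcneil_ci(auc: float, n_pos: int, n_neg: int,
                     z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% CI for an AUC (Hanley-McNeil standard error).

    Assumption: this approximation is used only for illustration and
    is not necessarily the method applied in the study.
    """
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (auc * (1 - auc)
                + (n_pos - 1) * (q1 - auc ** 2)
                + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    se = math.sqrt(variance)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical scores, not study data.
with_pe = [3.0, 5.0, 7.0]
without_pe = [1.0, 3.0, 4.0]

auc = auc_mann_whitney(with_pe, without_pe)
low, high = hanley_mcneil_ci(auc, len(with_pe), len(without_pe))
print(f"AUC = {auc:.3f}, approximate 95% CI {low:.3f} to {high:.3f}")
```

The pairwise definition is quadratic in the number of subjects, which is acceptable for a cohort of a few hundred; a rank-based implementation would be preferable for much larger data sets.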


Results

A total of 507 subjects were included in the final analysis for the Wells and Geneva scores and 339 for the Pisa score. Figure 1 shows the inclusion of study subjects. Pulmonary embolism was found in 24.8% of all evaluated subjects. Table 1 shows the sociodemographic characteristics and medical history of the population and the relationship to PE. A statistically significant relationship was found between a history of HTN and PE (p=0.023). Table 2 shows the clinical characteristics and physical exam findings of the population. It includes clinical symptoms such as chest pain, hemoptysis, clinical signs of deep vein thrombosis and vital signs (temperature, heart rate and respiratory rate), whose positivity showed a statistically significant relationship compatible with the finding of PE (p<0.05).

An electrocardiogram (ECG) was performed on 404 subjects (79.5%), with no statistically significant relationship between a report of RV overload and the presence of PE (p=0.567). On the other hand, a chest x-ray was performed on 463 subjects (91.3%), finding that oligohemia and consolidation are related to the tomographic diagnosis. D-dimer was performed on 122 subjects (43.5%), finding higher levels in patients with pulmonary embolism. Table 3 shows the characteristics of ECG, radiological and DD findings in the study population.

A statistically significant relationship was found between the various clinical prediction rules and the diagnosis of PE; the higher the scores, the greater the proportion of subjects with PE findings on CTPA. The Pisa score, constructed with a smaller number of subjects, also showed significant differences in discriminating between subjects with and without PE. Table 4 summarizes these findings in relation to the evaluated scores and diagnosis of PE.

The Wells score showed a greater AUC than the revised Geneva and Pisa scores in the study population. The detailed values of the various AUCs are shown in Table 5. These values were analyzed using the same number of subjects with which the three clinical prediction rules were able to be constructed completely.


Discussion

According to the literature found, this is the first study in our setting to evaluate the diagnostic yield of three clinical prediction rules for diagnosing pulmonary embolism. It took subjects from the emergency room and inpatient wards, finding that the Wells score (25) had the best diagnostic yield. However, our findings differ from those of Miniati et al. (14) who, in a cohort of 1,100 patients with similar characteristics and a 40% prevalence of PE, found a high diagnostic yield for the Pisa score (18). Nevertheless, the Pisa score may be more precise in diagnosing PE in certain clinical contexts, as it uses a greater number of both clinical and paraclinical variables which modify the probability of PE (1). The low yield in our study could have been influenced by the chest x-ray and ECG readings, which may be interpreted according to the evaluators' expertise. However, if this assertion is correct, the Pisa score would have a drawback in our setting, compared to Wells or Geneva, since the latter scores could be constructed more easily (13).

In our findings, the Wells score is superior to the Geneva score, even using subjective criteria, which favors and facilitates the diagnostic approach to this condition in our patients. Lucassen et al., in a meta-analysis limited by heterogeneity, found a sensitivity between 60 and 85% and a specificity between 51 and 80% at the various Wells cut-off points, with the Geneva score having a greater sensitivity at 84 to 91%, but lower specificity between 37 and 50% in this study (19, 26). The Wells scale also showed greater validity than the Geneva scale in a study of a cohort of 203 patients hospitalized for dyspnea or chest pain, in which both scales were compared and a sensitivity of 79% and a specificity of 90% were reported for the Wells scale, compared with 66% and 51%, respectively, for the revised Geneva scale (22, 25).
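For readers less familiar with these measures, the sketch below shows how sensitivity and specificity are obtained from the four cells of a 2x2 table comparing a dichotomized score with the CTPA result. The function name is hypothetical and the counts are invented for illustration; they are not taken from this study or from the cited meta-analysis.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from 2x2 table counts.

    tp: score positive, CTPA positive    fp: score positive, CTPA negative
    fn: score negative, CTPA positive    tn: score negative, CTPA negative
    """
    sensitivity = tp / (tp + fn)  # proportion of PE cases flagged by the score
    specificity = tn / (tn + fp)  # proportion of non-PE cases cleared by the score
    return sensitivity, specificity


# Hypothetical counts: 100 subjects with PE and 100 without.
sens, spec = sensitivity_specificity(tp=85, fp=49, fn=15, tn=51)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
# sensitivity = 0.85, specificity = 0.51
```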

Among other findings in our study, we found that variables such as a prior history of PE (5.5% vs 14.2% p<0.001), clinical findings compatible with DVT (6.6% vs 17.5%, p<0.001), considering PE to be the most probable diagnosis (39.6% vs 78.6%, p<0.001), consolidation read as pulmonary infarction on chest x-ray, elevated DD and tachypnea (5.4% vs 6.5%, p<0.001), concur with statistically significant differences found in the original studies of the study scores. However, a history of malignancy, immobility for more than three days, oligohemia on chest x-ray, chest pain and hemoptysis had a trend towards statistical significance (27, 28), which could be related to population differences. Nonetheless, the general analysis of the scores is satisfactory (12).

Recent studies indicate that CTPA yield is enhanced if clinical scales are applied prior to its performance (2, 6). This suggests that the probability of obtaining a positive result for PE after the diagnostic test depends not only on the sensitivity and specificity of the test per se, but also on the clinical probability before performing the confirmatory test (24). In addition, using CTPA as a reference standard also helps rule out alternative diagnoses, such as pneumonia, whose findings may be present in 8 to 22% of patients undergoing CTPA. Lower DD levels are useful in ruling out PE when the clinical probability is low; in our findings, DD was lower in subjects with a negative CTPA (8, 29, 30).
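The dependence of the post-test probability on the pre-test clinical probability can be illustrated with Bayes' theorem in its likelihood-ratio form. The Python sketch below is a generic illustration; the function name is hypothetical, and the sensitivity and specificity values are assumed for the example and are not estimates from this study.

```python
def post_test_probability(pre_test: float, sensitivity: float,
                          specificity: float,
                          positive_result: bool = True) -> float:
    """Post-test probability of disease from the pre-test probability.

    Uses the likelihood ratio of the observed result:
      LR+ = sensitivity / (1 - specificity)
      LR- = (1 - sensitivity) / specificity
    All inputs must lie strictly between 0 and 1.
    """
    for value in (pre_test, sensitivity, specificity):
        if not 0 < value < 1:
            raise ValueError("inputs must be strictly between 0 and 1")
    if positive_result:
        likelihood_ratio = sensitivity / (1 - specificity)
    else:
        likelihood_ratio = (1 - sensitivity) / specificity
    pre_odds = pre_test / (1 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Assumed test characteristics, for illustration only.
SENSITIVITY = 0.90
SPECIFICITY = 0.95

for pre_test in (0.10, 0.30, 0.60):
    after_positive = post_test_probability(pre_test, SENSITIVITY, SPECIFICITY, True)
    after_negative = post_test_probability(pre_test, SENSITIVITY, SPECIFICITY, False)
    print(f"pre-test {pre_test:.0%}: "
          f"positive result -> {after_positive:.1%}, "
          f"negative result -> {after_negative:.1%}")
```

With the same test characteristics, a positive result carries a much lower post-test probability in a low-probability subject than in a high-probability one, which is why the clinical score is applied before the confirmatory test.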

The weaknesses of this study include its retrospective nature, in which there may be underreporting of the clinical variables needed to calculate all the scores, especially the Geneva and Pisa scores. The evaluation of patients at a tertiary care level could introduce a disease spectrum bias, since the most gravely ill and symptomatic patients may be included. While CTPA was used to evaluate the final diagnosis, it should be understood that up to 2% of patients with a negative CTPA may have a pulmonary embolism in the following 90 days and in low risk populations there may be positive results in 6 to 10% (8). This study opens the door to new research, and, in the future, additional studies with larger populations may be used to evaluate D-dimer and the PERC scale in patients with a low probability of PE in our setting, as well as to evaluate the economic impact on our healthcare centers (8, 29, 31). Finally, special populations such as pregnant women, individuals with high-risk PE, patients with limited life expectancy and the pediatric population were not studied; these are populations in which these clinical prediction rules may have a different yield (7, 32).


Conclusions

In our setting, the Wells score was found to have a greater yield for diagnosing PE. There are limitations in the application and development of the Pisa score associated with a lower yield in our patients. The results could be corroborated in larger populations and may not be extrapolated to special populations such as pregnant women, populations with high-risk PE and patients with limited life expectancy.


References

1. Cronin P, Dwamena BA. A Clinically Meaningful Interpretation of the Prospective Investigation of Pulmonary Embolism Diagnosis (PIOPED) Scintigraphic Data. Acad Radiol. 2017;24(5):550-62.

2. Cronin P, Dwamena BA. A Clinically Meaningful Interpretation of the Prospective Investigation of Pulmonary Embolism Diagnosis (PIOPED) II and III Data. Acad Radiol. 2018;25(5):561-72.

3. Righini M, Gal GL, Bounameaux H. Approach to Suspected Acute Pulmonary Embolism: Should We Use Scoring Systems? Semin Respir Crit Care Med. 2017;38(1):3-10.

4. Dennis R, Rodríguez MN. Estudio nacional sobre tromboembolismo venoso en población hospitalaria en Colombia. Acta Med Colomb. 1996;21:55-63.

5. Wiener RS, Schwartz LM, Woloshin S. Time trends in pulmonary embolism in the United States: evidence of overdiagnosis. Arch Intern Med. 2011;171(9):831-7.

6. Stein PD, Hull RD. Multidetector computed tomography for the diagnosis of acute pulmonary embolism. Curr Opin Pulm Med. 2007;13(5):384-8.

7. Biss TT, Brandao LR, Kahr WH, Chan AK, Williams S. Clinical probability score and D-dimer estimation lack utility in the diagnosis of childhood pulmonary embolism. J Thromb Haemost. 2009;7(10):1633-8.

8. Freund Y, Cachanado M, Aubry A, Orsini C, Raynal PA, Feral-Pierssens AL, et al. Effect of the Pulmonary Embolism Rule-Out Criteria on Subsequent Thromboembolic Events Among Low-Risk Emergency Department Patients: The PROPER Randomized Clinical Trial. JAMA. 2018;319(6):559-66.

9. van Belle A, Buller HR, Huisman MV, Huisman PM, Kaasjager K, Kamphuisen PW, et al. Effectiveness of managing suspected pulmonary embolism using an algorithm combining clinical probability, D-dimer testing, and computed tomography. JAMA. 2006;295(2):172-9.

10. Yap KS, Kalff V, Turlakow A, Kelly MJ. A prospective reassessment of the utility of the Wells score in identifying pulmonary embolism. Med J Aust. 2007;187(6):333-6.

11. Newnham M, Stone H, Summerfield R, Mustfa N. Performance of algorithms and pre-test probability scores is often overlooked in the diagnosis of pulmonary embolism. BMJ. 2013;346:f1557.

12. Geersing GJ, Erkens PM, Lucassen WA, Buller HR, Cate HT, Hoes AW, et al. Safe exclusion of pulmonary embolism using the Wells rule and qualitative D-dimer testing in primary care: prospective cohort study. BMJ. 2012;345:e6564.

13. Bass AR, Fields KG, Goto R, Turissini G, Dey S, Russell LA. Clinical Decision Rules for Pulmonary Embolism in Hospitalized Patients: A Systematic Literature Review and Meta-analysis. Thromb Haemost. 2017;117(11):2176-85.

14. Miniati M, Monti S, Bottai M. A structured clinical model for predicting the probability of pulmonary embolism. Am J Med. 2003;114(3):173-9.

15. Klok FA, Mos IC, Nijkeuter M, Righini M, Perrier A, Le Gal G, et al. Simplification of the revised Geneva score for assessing clinical probability of pulmonary embolism. Arch Intern Med. 2008;168(19):2131-6.

16. Le Gal G, Righini M, Roy PM, Sanchez O, Aujesky D, Bounameaux H, et al. Prediction of pulmonary embolism in the emergency department: the revised Geneva score. Ann Intern Med. 2006;144(3):165-71.

17. Robert-Ebadi H, Mostaguir K, Hovens MM, Kare M, Verschuren F, Girard P, et al. Assessing clinical probability of pulmonary embolism: prospective validation of the simplified Geneva score. J Thromb Haemost. 2017;15(9):1764-9.

18. Miniati M, Bottai M, Monti S. Comparison of 3 clinical models for predicting the probability of pulmonary embolism. Medicine (Baltimore). 2005;84(2):107-14.

19. Lucassen W, Geersing GJ, Erkens PM, Reitsma JB, Moons KG, Buller H, et al. Clinical decision rules for excluding pulmonary embolism: a meta-analysis. Ann Intern Med. 2011;155(7):448-60.

20. Ceriani E, Combescure C, Le Gal G, Nendaz M, Perneger T, Bounameaux H, et al. Clinical prediction rules for pulmonary embolism: a systematic review and meta-analysis. J Thromb Haemost. 2010;8(5):957-70.

21. Douma RA, Mos IC, Erkens PM, Nizet TA, Durian MF, Hovens MM, et al. Performance of 4 clinical decision rules in the diagnostic management of acute pulmonary embolism: a prospective cohort study. Ann Intern Med. 2011;154(11):709-18.

22. Wells PS, Ginsberg JS, Anderson DR, Kearon C, Gent M, Turpie AG, et al. Use of a clinical model for safe management of patients with suspected pulmonary embolism. Ann Intern Med. 1998;129(12):997-1005.

23. Schouten HJ, Geersing GJ, Oudega R, van Delden JJ, Moons KG, Koek HL. Accuracy of the Wells clinical prediction rule for pulmonary embolism in older ambulatory adults. J Am Geriatr Soc. 2014;62(11):2136-41.

24. Fabia Valls MJ, van der Hulle T, den Exter PL, Mos IC, Huisman MV, Klok FA. Performance of a diagnostic algorithm based on a prediction rule, D-dimer and CT-scan for pulmonary embolism in patients with previous venous thromboembolism. A systematic review and meta-analysis. Thromb Haemost. 2015;113(2):406-13.

25. Di Marca S, Cilia C, Campagna A, D'Arrigo G, Abd ElHafeez S, Tripepi G, et al. Comparison of Wells and Revised Geneva Rule to Assess Pretest Probability of Pulmonary Embolism in High-Risk Hospitalized Elderly Adults. J Am Geriatr Soc. 2015;63(6):1091-7.

26. Lucassen WA, Erkens PM, Geersing GJ, Buller HR, Moons KG, Stoffers HE, et al. Qualitative point-of-care D-dimer testing compared with quantitative D-dimer testing in excluding pulmonary embolism in primary care. J Thromb Haemost. 2015;13(6):1004-9.

27. Chunilal SD, Eikelboom JW, Attia J, Miniati M, Panju AA, Simel DL, et al. Does this patient have pulmonary embolism? JAMA. 2003;290(21):2849-58.

28. Wang RC, Bent S, Weber E, Neilson J, Smith-Bindman R, Fahimi J. The Impact of Clinical Decision Rules on Computed Tomography Use and Yield for Pulmonary Embolism: A Systematic Review and Meta-analysis. Ann Emerg Med. 2016;67(6):693-701.e3.

29. Kearon C, de Wit K, Parpia S, Schulman S, Afilalo M, Hirsch A, et al. Diagnosis of Pulmonary Embolism with d-Dimer Adjusted to Clinical Probability. N Engl J Med. 2019;381(22):2125-34.

30. van der Hulle T, Cheung WY, Kooij S, Beenen LFM, van Bemmel T, van Es J, et al. Simplified diagnostic management of suspected pulmonary embolism (the YEARS study): a prospective, multicentre, cohort study. Lancet. 2017;390(10091):289-97.

31. Singh B, Parsaik AK, Agarwal D, Surana A, Mascarenhas SS, Chandra S. Diagnostic accuracy of pulmonary embolism rule-out criteria: a systematic review and meta-analysis. Ann Emerg Med. 2012;59(6):517-20.e1-4.

32. Agha BS, Sturm JJ, Simon HK, Hirsh DA. Pulmonary embolism in the pediatric emergency department. Pediatrics. 2013;132(4):663-7.