Diagnostic performance of artificial intelligence algorithms for detection of pulmonary involvement by COVID-19 based on portable radiography

Cobeñas, Ricardo Luis; de Vedia, María; Florez, Juan; Jaramillo, Daniela; Ferrari, Luciana; Re, Ricardo

doi:10.1016/j.medcle.2022.04.020

Abstract

Introduction and objectives

To evaluate the diagnostic performance of different artificial intelligence (AI) algorithms for the identification of pulmonary involvement by SARS-CoV-2 based on portable chest radiography (RX).

Material and methods

This prospective observational study included patients admitted with suspected COVID-19 infection to a university hospital between July and November 2020. The reference standard for pulmonary involvement by SARS-CoV-2 was a positive PCR test together with lower respiratory tract symptoms.

Results

A total of 493 patients were included, 140 (28%) with a positive PCR test and 32 (7%) with SARS-CoV-2 pneumonia. The AI-B algorithm had the best diagnostic performance (areas under the ROC curve: AI-B 0.73 vs. AI-A 0.51 vs. AI-C 0.57). Using a detection threshold greater than 55%, AI-B had greater diagnostic performance than the specialist (area under the curve 0.68 [95% CI 0.64–0.72] vs. 0.54 [95% CI 0.49–0.59]).

Conclusion

AI algorithms based on portable RX achieved a diagnostic performance comparable to that of human assessment for the detection of SARS-CoV-2 lung involvement.

Keywords:

Artificial intelligence

COVID-19

Thoracic RX

Pneumonia

Machine learning

Lung

Introduction

Portable radiography (X-ray) was an essential resource during the COVID-19 pandemic because it reduced the risk of contamination associated with transporting patients. For this reason, portable examinations were established for both in-patients and out-patients.

Computer technology made a positive contribution in the early stage of the pandemic, with the emergence of patient monitoring applications, contact tracing tools, thermal scanners and remote care cameras.1 In this context, multiple artificial intelligence (AI) platforms emerged with the aim of facilitating the detection of radiological findings related to COVID-19 infection.2

In SARS-CoV-2 pneumonia, AI algorithms detect bilateral patchy opacities, which may vary in location and intensify over time. These findings are similar to those of other viral pneumonias, so the analysis is challenging for both the radiologist and the algorithm.3

In developing economies, the vast majority of imaging specialists are located in large urban centres, and access to PCR testing is limited or results are delivered with significant delays. In these settings, it is important to establish whether AI algorithms are a reliable tool for providing diagnostic support to on-call physicians and to peripheral health centres where specialist review is not available.

Therefore, the aim of this work was to evaluate the potential of different AI algorithms to detect lung involvement by COVID-19 on portable frontal chest X-ray.

Materials and methods

This was a prospective observational study of consecutive patients who were admitted to the emergency department or hospitalised for suspected COVID-19 infection at a university hospital. At this stage of the pandemic, the presence of symptoms such as fever, cough, dyspnoea, anosmia and/or ageusia constituted the criteria for swabbing and performing a chest X-ray.

The chest X-ray images were extracted from the picture archiving and communication system (PACS) in DICOM format, while clinical and laboratory data were obtained from the electronic medical record.

The reference standard for COVID-19 lung involvement was defined as the combined presence of a positive PCR test and symptoms of lung infection. Chest X-rays were analysed independently and off-line, without knowledge of symptoms or history, to determine which of them showed typical findings of COVID-19 pneumonia.

The X-rays were analysed independently by a medical imaging specialist and by three open-access AI platforms, each with a different training algorithm and each specifically designed to detect lung involvement by COVID-19 on chest X-rays. The platforms used were Pneuma Deep Health COVID (http://pneuma.deephealth.thingtrack.com/), DAC-4 (https://www.delft.care/how-to-access/) and ENTELAI (https://covid.entelai.com/).

The findings were classified by an experienced radiologist and by the AI algorithms to establish the probability of COVID-19 pneumonia. Each AI analysis reported degrees of probability for various types of pulmonary involvement, including the probability of COVID-19 pneumonia (Figs. 1–3). Independently, the specialist categorised each study as normal, COVID-19 pneumonia or other findings.

Fig. 1.

Portable chest X-ray study in which the AI algorithm identifies some areas of patchy density (arrows in panel B) not suggestive of a viral lung process.

Fig. 2.

(A) Portable chest X-ray analysis by AI algorithm in a 77-year-old patient, with a confirmed diagnosis of SARS-CoV-2 infection by PCR test. The algorithm yielded a 100% probability of COVID-19 infection, shown in green (arrows). (B) Opacities identified in both lung fields.

Fig. 3.

Patient with a confirmed diagnosis of SARS-CoV-2 infection by PCR test. (A) Result provided by the AI algorithm from the portable chest X-ray analysis: 87% probability of COVID-19 pneumonia. (B) AI algorithm analysis showing opacities in both lung fields in green (arrows).

The classification principles used by each algorithm can be viewed on its access page, where the images are entered.

Statistical analysis

Continuous variables were reported as means ± standard deviation, while categorical variables were reported as frequencies and percentages. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) for the detection of lung involvement by SARS-CoV-2 were evaluated. We also compared the diagnostic accuracy of the different algorithms using the DeLong method (areas under the ROC curve, taking an area of 0.50 as chance and an area of 1.0 as perfect accuracy). Analyses were performed using SPSS® version 22.0 software (Armonk, NY, USA) and MedCalc® Statistical software version 13.3.3 (MedCalc software bvba, Ostend, Belgium).
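As an illustration of the measures listed above, the following minimal Python sketch computes sensitivity, specificity, PPV and NPV from a 2×2 table, and estimates the area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The names (Confusion, diagnostic_metrics, auc_rank) and the toy data are hypothetical and are not taken from the study; the study itself used SPSS and MedCalc, and the DeLong comparison is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of a 2x2 table against the reference standard."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent


def diagnostic_metrics(c):
    """Return the four standard accuracy measures as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


def auc_rank(scores_pos, scores_neg):
    """Rank-based AUC: share of positive/negative pairs in which the
    positive case has the higher score (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy data (hypothetical, for illustration only).
toy = Confusion(tp=8, fp=2, fn=2, tn=8)
metrics = diagnostic_metrics(toy)
toy_auc = auc_rank([0.9, 0.6, 0.4], [0.7, 0.5, 0.3, 0.2])

for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
print(f"AUC: {toy_auc:.2f}")
```

An area of 0.50 corresponds to positives and negatives being ranked at random, which is why it is taken as chance.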

Results

A total of 493 patients who were evaluated at our institution between July and November 2020 for suspected COVID-19 infection and who underwent frontal chest X-ray were included. The median age was 47 years (interquartile range 34–71 years), and 55% were women. Patients were studied because they had had COVID-19 symptoms in the previous days, and included both outpatients and patients hospitalised for another reason.

The symptoms reported were mostly dyspnoea, dry or productive cough, odynophagia, ageusia, anosmia and fever. Per protocol, patients were studied with a nasopharyngeal swab and PCR test, a complete laboratory work-up and an initial portable X-ray.

The PCR test was positive in 140 (28%) patients, and 32 patients (7%) had SARS-CoV-2 pneumonia. The SARS-CoV-2 pneumonia detection rates were 115 (23%) for the specialist; 132 (27%) and 76 (15%) for the AI-A algorithm at the 50% and 70% probability thresholds; 334 (70%) and 157 (33%) for AI-B at the 50% and 70% probability thresholds; and 143 (44%) and 45 (14%) for AI-C at the 50% and 70% probability thresholds.

The specialist’s diagnostic performance for the detection of SARS-CoV-2 pneumonia showed low sensitivity (16%, 95% CI 5%–34%) and moderate specificity (76%, 95% CI 72%–80%); with a PPV of 4% (95% CI 1%–10%) and NPV of 93% (95% CI 90%–95%).
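The specialist's figures can be cross-checked against the counts reported in this section. The sketch below reconstructs a 2×2 table from the 493 patients, the 32 pneumonia cases and the 115 studies the specialist called positive. The number of true positives (5) is not stated in the text; it is an assumption inferred here from the reported 16% sensitivity, and the variable names are illustrative.

```python
# Counts stated in the Results section.
N_PATIENTS = 493
N_PNEUMONIA = 32            # reference-standard positives
N_SPECIALIST_POSITIVE = 115 # studies called COVID-19 pneumonia by the specialist

# Assumption: 5 true positives, inferred from the reported 16% sensitivity
# (5 / 32 = 15.6%); this count is not given in the article.
tp = 5

fn = N_PNEUMONIA - tp                    # pneumonia cases missed
fp = N_SPECIALIST_POSITIVE - tp          # positive calls without pneumonia
tn = N_PATIENTS - N_PNEUMONIA - fp       # correctly negative studies

computed = {
    "sensitivity": tp / (tp + fn),
    "specificity": tn / (tn + fp),
    "ppv": tp / (tp + fp),
    "npv": tn / (tn + fn),
}

# Point estimates reported for the specialist, in percent.
reported = {"sensitivity": 16, "specificity": 76, "ppv": 4, "npv": 93}

for name, value in computed.items():
    print(f"{name}: computed {value:.1%}, reported {reported[name]}%")
```

Under this assumption the reconstructed table (5, 110, 27 and 351 for true positives, false positives, false negatives and true negatives) reproduces all four reported point estimates after rounding.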

Using a probability threshold greater than 50%, the AI-A algorithm had a sensitivity of 25% (95% CI: 11%–43%), specificity of 73% (95% CI: 69%–77%), PPV of 6% (95% CI: 3%–12%) and NPV of 93% (95% CI: 90%–96%); AI-B had a sensitivity of 97% (95% CI: 84%–100%), specificity of 32% (95% CI: 28%–36%), PPV of 9% (95% CI: 6%–13%), and NPV of 99% (95% CI: 96%–100%); and AI-C a sensitivity of 63% (95% CI: 35%–85%), specificity of 57% (95% CI: 52%–63%), PPV of 7% (95% CI: 3%–12%) and NPV of 97% (95% CI: 93%–99%).

Using a probability threshold greater than 70%, the AI-A algorithm had a sensitivity of 15% (95% CI: 5%–33%), specificity of 85% (95% CI: 81%–88%), PPV of 7% (95% CI: 2%–15%) and NPV of 94% (95% CI: 91%–96%); AI-B had a sensitivity of 50% (95% CI: 32%–68%), specificity of 68% (95% CI: 64%–73%), PPV of 10% (95% CI: 6%–16%), and NPV of 95% (95% CI: 92%–97%); and AI-C a sensitivity of 13% (95% CI: 2%–38%), specificity of 86% (95% CI: 82%–90%), PPV of 4% (95% CI: 1%–15%) and NPV of 95% (95% CI: 92%–97%).

Analysing the algorithm outputs on a continuous scale, the AI-B algorithm had the best diagnostic performance for the identification of SARS-CoV-2 pneumonia (area under the ROC curve: AI-B 0.73 [95% CI: 0.68–0.78] vs. AI-A 0.51 [95% CI: 0.45–0.57] vs. AI-C 0.57 [95% CI: 0.51–0.62]). The best probability threshold for the identification of SARS-CoV-2 pneumonia using the AI-B algorithm was 55%. Using this threshold, the AI-B algorithm had a sensitivity of 94% (95% CI 79%–99%), specificity of 42% (95% CI 38%–47%), PPV of 10% (95% CI 7%–15%) and NPV of 99% (95% CI 96%–100%), with an area under the ROC curve higher than that of the specialist (0.68 [95% CI 0.64–0.72] vs. 0.54 [95% CI 0.49–0.59]).
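The text does not state how the 55% threshold was selected. One common criterion is Youden's index (J = sensitivity + specificity − 1). The sketch below, which is an assumption about method and not the authors' documented procedure, applies it to the three AI-B operating points reported in this section. It also shows that, for a single binary cut-off, the area under the ROC curve reduces to (sensitivity + specificity) / 2, because the curve consists of two straight segments through that one point.

```python
# AI-B operating points reported in the Results:
# threshold (%) -> (sensitivity, specificity)
AI_B_POINTS = {
    50: (0.97, 0.32),
    55: (0.94, 0.42),
    70: (0.50, 0.68),
}


def youden_j(sensitivity, specificity):
    """Youden's index: 0 for an uninformative test, 1 for a perfect one."""
    return sensitivity + specificity - 1.0


def single_point_auc(sensitivity, specificity):
    """Area under a ROC curve defined by one binary cut-off."""
    return (sensitivity + specificity) / 2.0


# Threshold with the highest Youden index among the reported points.
best_threshold = max(AI_B_POINTS, key=lambda t: youden_j(*AI_B_POINTS[t]))
auc_at_55 = single_point_auc(*AI_B_POINTS[55])

for threshold, (sens, spec) in AI_B_POINTS.items():
    print(f"threshold {threshold}%: J = {youden_j(sens, spec):.2f}")
print(f"best threshold by Youden's index: {best_threshold}%")
print(f"single-point AUC at 55%: {auc_at_55:.2f}")
```

Among the three reported points, the 55% threshold has the highest Youden index, and its single-point area matches the 0.68 reported for AI-B at that threshold.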

Discussion

In the current study, portable X-ray-based AI algorithms enabled diagnostic accuracy comparable to that of human assessment for the identification of COVID-19 pneumonia. Although they had a low sensitivity and moderate specificity, they were associated with a high NPV that would allow COVID-19 pneumonia to be ruled out in most cases. To our knowledge, our study is the first to evaluate the diagnostic performance of portable X-ray in this type of population. These findings are relevant in the context of severe infrastructure and human resource constraints, particularly in emerging economies. They also highlight the importance of reducing in-hospital transfers in order to reduce contacts.4 For these reasons, chest X-ray is considered the first-line imaging method to assess abnormalities in patients with pulmonary symptoms according to the ACR.5
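Predictive values depend on how common the condition is in the population tested, which helps when interpreting a high NPV alongside a low PPV. The sketch below applies Bayes' theorem to the AI-B sensitivity and specificity at the 55% threshold, using the study prevalence of 32/493. The function name predictive_values is illustrative, and the second prevalence (30%) is a purely hypothetical value chosen to show the direction of the effect; it is not a figure from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test accuracy and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# AI-B at the 55% threshold, as reported in the Results.
SENSITIVITY = 0.94
SPECIFICITY = 0.42

study_prevalence = 32 / 493      # pneumonia cases / patients included
hypothetical_prevalence = 0.30   # hypothetical value, not from the study

ppv_study, npv_study = predictive_values(SENSITIVITY, SPECIFICITY, study_prevalence)
ppv_hyp, npv_hyp = predictive_values(SENSITIVITY, SPECIFICITY, hypothetical_prevalence)

print(f"study prevalence {study_prevalence:.1%}: PPV {ppv_study:.1%}, NPV {npv_study:.1%}")
print(f"hypothetical prevalence {hypothetical_prevalence:.0%}: PPV {ppv_hyp:.1%}, NPV {npv_hyp:.1%}")
```

At the study prevalence the calculation reproduces the reported PPV of 10% and NPV of 99% after rounding. With the same sensitivity and specificity, a higher prevalence raises the PPV and lowers the NPV, so the predictive values reported here should not be assumed to carry over unchanged to settings with a different proportion of affected patients.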

Previous studies involving different populations reported variable findings regarding the diagnostic accuracy of chest X-ray for the detection of SARS-CoV-2 pneumonia.6 Murphy et al. reported an area under the ROC curve of 0.81 for COVID-19 pneumonia detection, better than the human reader in almost all segments. We identified a higher diagnostic yield in one of the algorithms (using a probability threshold greater than 55%) compared to the specialist, although this was not conclusive due to the modest results of the X-ray. It should be noted that, although the 3 algorithms were developed on conventional acquisitions, we applied them to portable X-rays with the original software settings.

The relatively similar detection performance of the human observer and the algorithm in our study leads us to expect that, in the future, this process will become more efficient and simpler to implement.

The most common confounders that can hinder the diagnosis of COVID-19 are atelectasis, haemorrhages, oedema and neoplasms, among others. In our experience, these findings hinder both the diagnosis and the ruling out of COVID-19 disease, as well as the training of AI algorithms.

The diagnosis of COVID-19 pneumonia must be supported by symptoms, PCR analysis, and at least one X-ray examination. Our study adopts the simultaneous presence of a positive PCR test and lower respiratory symptoms as the reference standard, in line with Albahri et al., who state that association with clinical data optimises the detection of true positives.6

The limitations of the reference standard used should be highlighted, as they may have affected the results. However, AI performance was comparable to human performance, and both strategies were similarly affected by this constraint. In line with this, few patients in the sample underwent a CT to confirm or rule out the findings.

Conclusion

In our study, portable X-ray-based AI algorithms enabled diagnostic accuracy comparable to that of human assessment for the detection of SARS-CoV-2 pneumonia. These findings are relevant in the context of significant human resource constraints, particularly in emerging economies.

Conflicts of interest

The authors declare that they have no conflict of interest.

References

[1] A.S. Adly, A.S. Adly, M.S. Adly. Approaches based on artificial intelligence and the internet of intelligent things to prevent the spread of COVID-19: scoping review. J Med Internet Res, 22 (2020), pp. e19104. http://dx.doi.org/10.2196/19104

[2] F. Shi, J. Wang, J. Shi, Z. Wu, Q. Wang, Z. Tang, et al. Review of artificial intelligence techniques in imaging data acquisition, segmentation, and diagnosis for COVID-19. IEEE Rev Biomed Eng, 14 (2021), pp. 4-15. http://dx.doi.org/10.1109/RBME.2020.2987975

[3] N.K. Chowdhury, M.A. Kabir, M.M. Rahman, N. Rezoana. ECOVNet: a highly effective ensemble based deep learning model for detecting COVID-19. PeerJ Comput Sci, 7 (2021), pp. e551. http://dx.doi.org/10.7717/peerj-cs.551

[4] C.G. Monaco, F. Zaottini, S. Schiaffino, A. Villa, G. Della Pepa, L.A. Carbonaro, et al. Chest X-ray severity score in COVID-19 patients on emergency department admission: a two-centre study. Eur Radiol Exp, 4 (2020), pp. 68. http://dx.doi.org/10.1186/s41747-020-00195-w

[5] K. Murphy, H. Smits, A.J.G. Knoops, M.B.J.M. Korst, T. Samson, E.T. Scholten, et al. COVID-19 on chest radiographs: a multireader evaluation of an artificial intelligence system. Radiology, 296 (2020), pp. E166-E172. http://dx.doi.org/10.1148/radiol.2020201874

[6] O.S. Albahri, A.A. Zaidan, A.S. Albahri, B.B. Zaidan, K.H. Abdulkareem, Z.T. Al-Qaysi, et al. Systematic review of artificial intelligence techniques in the detection and classification of COVID-19 medical images in terms of evaluation and benchmarking: taxonomy analysis, challenges, future solutions and methodological aspects. J Infect Public Health, 13 (2020), pp. 1381-1396. http://dx.doi.org/10.1016/j.jiph.2020.06.028