To evaluate the diagnostic performance of different artificial intelligence (AI) algorithms for the identification of pulmonary involvement by SARS-CoV-2 based on portable chest radiography (RX).
Material and methods
Prospective observational study that included patients admitted for suspected COVID-19 infection to a university hospital between July and November 2020. The reference standard for pulmonary involvement by SARS-CoV-2 comprised a positive PCR test and lower respiratory tract symptoms.
Results
A total of 493 patients were included, 140 (28%) with a positive PCR test and 32 (7%) with SARS-CoV-2 pneumonia. The AI-B algorithm had the best diagnostic performance (areas under the ROC curve: AI-B 0.73 vs. AI-A 0.51 vs. AI-C 0.57). Using a detection threshold greater than 55%, AI-B had greater diagnostic performance than the specialist (area under the curve of 0.68 [95% CI 0.64–0.72] vs. 0.54 [95% CI 0.49–0.59]).
Conclusion
AI algorithms based on portable RX achieved a diagnostic performance comparable to human assessment for the detection of SARS-CoV-2 lung involvement.
To evaluate the diagnostic performance of different artificial intelligence (AI) algorithms for the identification of pulmonary involvement by SARS-CoV-2 based on portable chest radiography (X-ray).
Material and method
Prospective observational study that included patients admitted for suspected COVID-19 infection to a university hospital between July and November 2020. The reference standard for pulmonary involvement by SARS-CoV-2 comprised a positive PCR test and lower respiratory symptoms.
Results
A total of 493 patients were included, 140 (28%) with a positive PCR test and 32 (7%) with SARS-CoV-2 pneumonia. The AI-B algorithm had the best diagnostic performance (areas under the ROC curve: AI-B 0.73 vs. AI-A 0.51 vs. AI-C 0.57). Using a detection threshold above 55%, AI-B showed greater accuracy than the specialist (area under the curve of 0.68 [95% CI: 0.64–0.72] vs. 0.54 [95% CI: 0.49–0.59]).
Conclusion
AI algorithms based on portable X-ray allow a diagnostic accuracy comparable to that of humans for the detection of pulmonary involvement by SARS-CoV-2.
During the COVID-19 pandemic, portable radiography (X-ray) was an essential resource because it reduced the risk of transport-related contamination. For this reason, portable examinations were established for both in-patients and out-patients.
Computer technology made a positive contribution to this early stage of the pandemic, with the emergence of patient monitoring applications, contact tracing, thermal scanners and remote care cameras.1 In this context, multiple artificial intelligence (AI) platforms emerged with the aim of facilitating the detection of radiological findings related to COVID-19 infection.2
In SARS-CoV-2 pneumonia, AI algorithms detect bilateral patchy opacities, which may vary in location and intensify over time. These findings are similar to those of other viral pneumonias, so the analysis is challenging for both the radiologist and the algorithm in question.3
In developing economies, where the vast majority of imaging specialists are located in large urban centres, and where accessibility to PCR testing is limited or has significant delays in the delivery of results, it is important to be able to define whether AI algorithms are a reliable tool to provide diagnostic support to on-call physicians and peripheral health centres where specialist review is not available.
Therefore, the aim of this work was to evaluate the potential of different AI algorithms to detect lung involvement by COVID-19 on portable frontal chest X-ray.
Materials and methods
Prospective observational study in consecutive patients admitted to the emergency department or hospitalised for suspected COVID-19 infection at a university hospital. The presence of symptoms such as fever, cough, dyspnoea, anosmia and/or ageusia constituted the criteria for swabbing and performing a chest X-ray at this stage of the pandemic.
The chest X-ray images were extracted from the picture archiving and communication system (PACS) in DICOM format, while clinical and laboratory data were obtained from the electronic medical record.
The reference standard for COVID-19 lung involvement was defined as the combined presence of a positive PCR test and symptoms of lung infection. Chest X-rays were independently analysed off-line without knowledge of symptoms or history to determine which of them showed typical findings of COVID-19 pneumonia.
The X-rays were analysed independently by a medical imaging specialist and by 3 open-access AI platforms with different training algorithms, each specifically designed to detect lung involvement by COVID-19 on chest X-rays. The platforms used were Pneuma Deep Health COVID (http://pneuma.deephealth.thingtrack.com/), DAC-4 (https://www.delft.care/how-to-access/) and ENTELAI (https://covid.entelai.com/).
The findings were classified by an experienced radiologist and by the AI algorithms to establish the probability of COVID-19 pneumonia. Each algorithm reported degrees of probability for various types of pulmonary involvement, including the probability of COVID-19 pneumonia (Figs. 1–3). Independently, the specialist physician categorised the findings into normal studies, COVID pneumonia or other findings.
Patient with a diagnosis of SARS-CoV-2 infection confirmed by PCR test. (A) Result provided by the AI algorithm from the portable chest X-ray analysis: 87% probability of COVID-19 pneumonia. (B) Result and analysis of the AI algorithm, showing in green (arrows) opacities in both lung fields.
The classification principles of each algorithm can be viewed on its access page, where the images are uploaded.
Statistical analysis
Continuous variables were reported as means ± standard deviation, while categorical variables were reported as frequencies and percentages. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) for the detection of lung involvement by SARS-CoV-2 were evaluated. We also compared the diagnostic accuracy of the different algorithms (areas under the ROC curve, understanding an area of 0.50 as chance, and an area of 1.0 as perfect accuracy) using the DeLong method. Analyses were performed using SPSS® version 22.0 software (Armonk, NY, USA) and MedCalc® Statistical software version 13.3.3 (MedCalc software bvba, Ostend, Belgium).
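The interpretation of the area under the ROC curve given above (0.50 as chance, 1.0 as perfect accuracy) can be illustrated with a minimal sketch. It computes the area as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which is the quantity the DeLong method builds on. This is not the SPSS or MedCalc procedure and it does not implement the DeLong comparison itself; the function name and the score lists are hypothetical and are not study data.

```python
from itertools import product


def auc_mann_whitney(pos_scores, neg_scores):
    """Area under the ROC curve as P(positive score > negative score).

    Ties between a positive and a negative case count as half a win.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical algorithm probabilities, for illustration only.
perfect = auc_mann_whitney([0.9, 0.8, 0.7], [0.3, 0.2, 0.1])      # every positive outranks every negative
chance = auc_mann_whitney([0.5, 0.5], [0.5, 0.5])                 # scores carry no information
partial = auc_mann_whitney([0.9, 0.6, 0.4], [0.7, 0.5, 0.2, 0.1])  # 9 of 12 pairs ranked correctly

print(perfect, chance, partial)
```

In this formulation, an area close to 0.50 means that the algorithm's probability output ranks cases with and without pneumonia almost at random.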
Results
A total of 493 patients evaluated at our institution between July and November 2020 for suspected COVID-19 infection who underwent frontal chest X-ray were included. The median age of the included patients was 47 years (interquartile range 34; 71 years), and 55% were women. Patients were studied because they had presented COVID-19 symptoms in the previous days, and included both outpatients and patients hospitalised for another reason.
The symptoms reported were mostly dyspnoea, dry or productive cough, odynophagia, ageusia, anosmia and fever. Per protocol, patients were studied with nasopharyngeal swab and PCR, complete laboratory work-up and initial portable X-ray.
The PCR test was positive in 140 (28%) patients, and 32 patients (7%) had SARS-CoV-2 pneumonia. The SARS-CoV-2 pneumonia detection rates were 115 (23%) for the specialist; 132 (27%) and 76 (15%) for the AI-A algorithm with 50 and 70% probability thresholds; 334 (70%) and 157 (33%) for AI-B with 50 and 70% probability thresholds; and 143 (44%) and 45 (14%) for AI-C with 50 and 70% probability thresholds.
The specialist’s diagnostic performance for the detection of SARS-CoV-2 pneumonia showed low sensitivity (16%, 95% CI 5%–34%) and moderate specificity (76%, 95% CI 72%–80%); with a PPV of 4% (95% CI 1%–10%) and NPV of 93% (95% CI 90%–95%).
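As a worked check of how these four measures relate, the sketch below rebuilds a 2×2 table for the specialist from the figures reported in the text (493 patients, 32 with SARS-CoV-2 pneumonia, 115 studies classified as positive). The number of true positives (5) is an assumption inferred from the reported 16% sensitivity, not a count stated by the authors; the function and variable names are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Figures reported in the text.
n_total, n_pneumonia, n_flagged = 493, 32, 115

# Assumption: 5 true positives, inferred from the reported 16% sensitivity.
tp = 5
fn = n_pneumonia - tp            # pneumonia cases the reader missed
fp = n_flagged - tp              # positive readings without pneumonia
tn = n_total - n_pneumonia - fp  # negative readings without pneumonia

metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Under this assumption the rounded values agree with the reported sensitivity, specificity, PPV and NPV, which suggests that all four were derived from a single 2×2 table.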
Using a probability threshold greater than 50%, the AI-A algorithm had a sensitivity of 25% (95% CI: 11%–43%), specificity of 73% (95% CI: 69%–77%), PPV of 6% (95% CI: 3%–12%) and NPV of 93% (95% CI: 90%–96%); AI-B had a sensitivity of 97% (95% CI: 84%–100%), specificity of 32% (95% CI: 28%–36%), PPV of 9% (95% CI: 6%–13%), and NPV of 99% (95% CI: 96%–100%); and AI-C a sensitivity of 63% (95% CI: 35%–85%), specificity of 57% (95% CI: 52%–63%), PPV of 7% (95% CI: 3%–12%) and NPV of 97% (95% CI: 93%–99%).
Using a probability threshold greater than 70%, the AI-A algorithm had a sensitivity of 15% (95% CI: 5%–33%), specificity of 85% (95% CI: 81%–88%), PPV of 7% (95% CI: 2%–15%) and NPV of 94% (95% CI: 91%–96%); AI-B had a sensitivity of 50% (95% CI: 32%–68%), specificity of 68% (95% CI: 64%–73%), PPV of 10% (95% CI: 6%–16%), and NPV of 95% (95% CI: 92%–97%); and AI-C a sensitivity of 13% (95% CI: 2%–38%), specificity of 86% (95% CI: 82%–90%), PPV of 4% (95% CI: 1%–15%) and NPV of 95% (95% CI: 92%–97%).
When the algorithm outputs were analysed as continuous variables, the AI-B algorithm had the best diagnostic performance for the identification of SARS-CoV-2 pneumonia (AI-B area under the ROC curve 0.73 [95% CI: 0.68–0.78] vs. AI-A 0.51 [95% CI: 0.45–0.57] vs. AI-C 0.57 [95% CI: 0.51–0.62]). The best probability threshold for the identification of SARS-CoV-2 pneumonia using the AI-B algorithm was 55%. Using this threshold, the AI-B algorithm had a sensitivity of 94% (95% CI 79%–99%), specificity of 42% (95% CI 38%–47%), PPV of 10% (95% CI 7%–15%) and NPV of 99% (95% CI 96%–100%), with an area under the ROC curve higher than that of the specialist (0.68 [95% CI 0.64–0.72] vs. 0.54 [95% CI 0.49–0.59]).
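The AI-B figures at the 55% threshold can be related to one another with a short sketch. It applies Bayes' theorem to obtain PPV and NPV from sensitivity, specificity and prevalence, and uses the fact that the ROC area of a test dichotomised at a single cut-off equals the mean of sensitivity and specificity. The prevalence of 32/493 is taken from the text; assuming it applies unchanged to the radiographs analysed by AI-B is a simplification, and the function names are hypothetical. Whether the authors computed their values in this way is not stated.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV from sensitivity, specificity and prevalence (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def single_cutoff_auc(sensitivity, specificity):
    """ROC area of a test dichotomised at one cut-off (trapezoidal rule)."""
    return (sensitivity + specificity) / 2


# Values reported in the text for AI-B at the 55% threshold.
sens, spec = 0.94, 0.42
prevalence = 32 / 493  # assumption: same prevalence among images read by AI-B

ppv, npv = predictive_values(sens, spec, prevalence)
auc = single_cutoff_auc(sens, spec)
print(f"PPV {ppv:.0%}, NPV {npv:.0%}, single cut-off AUC {auc:.2f}")
```

With these inputs the sketch gives values that round to the reported PPV of 10%, NPV of 99% and area of 0.68. It also shows that, at a prevalence of about 7%, a high NPV is compatible with low specificity.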
Discussion
In the current study, portable X-ray-based AI algorithms achieved diagnostic accuracy comparable to that of human assessment for the identification of COVID-19 pneumonia. Although they had low sensitivity and moderate specificity, they were associated with a high NPV that would allow COVID-19 pneumonia to be ruled out in most cases. To our knowledge, our study is the first to evaluate the diagnostic performance of portable X-ray in this type of population. These findings are relevant in the context of severe infrastructure and human resource constraints, particularly in emerging economies. They also highlight the importance of reducing in-hospital transfers in order to reduce contacts.4 For these reasons, chest X-ray is considered the first-line imaging method to assess abnormalities in patients with pulmonary symptoms according to the ACR.5
Previous studies involving different populations reported variable findings regarding the diagnostic accuracy of chest X-ray for the detection of SARS-CoV-2 pneumonia.6 Murphy et al. reported an area under the ROC curve of 0.81 for COVID-19 pneumonia detection, better than the human reader in almost all segments. We identified a higher diagnostic yield in one of the algorithms (using a higher probability threshold of 55%) compared to the specialist, although this was not conclusive due to the modest results of the X-ray. It should be noted that, although the 3 algorithms were developed on conventional acquisitions, we applied them to portable X-rays with the original software settings.
The relatively similar detection between the human observer and the algorithm in our study encourages us to assume that in the future, this process will become more efficient and simpler to implement.
The most common confounders that can hinder the diagnosis of COVID-19 are atelectasis, haemorrhages, oedema or neoplasms, among others. In our experience, these are elements that hinder both the diagnosis and ruling out of COVID-19 disease, as well as the training of AI algorithms.
The diagnosis of COVID-19 pneumonia must be supported by symptoms, PCR analysis, and at least one X-ray imaging work-up. Our study adopts the simultaneous presence of a positive PCR test and lower respiratory symptoms as a reference standard, in line with Albahri et al., who state that association with clinical data optimises the detection of true positives.6
The limitations of the reference standard used should be highlighted, as they may have affected the results. However, AI performance was comparable to human performance, with both strategies being similarly affected by this constraint. In line with this, few patients in the sample underwent a CT to confirm or rule out the findings.
Conclusion
In our study, portable X-ray-based AI algorithms achieved diagnostic accuracy comparable to that of human assessment for the detection of SARS-CoV-2 pneumonia. These findings are relevant in the context of significant human resource constraints, particularly in emerging economies.
Conflict of interests
The authors declare that they have no conflict of interest.