AI detection of knee joint effusion from radiographs: Comparative accuracy of two commercial algorithms

Online publication

Abstract

Background
Knee joint effusion may indicate injury even in the absence of bony changes. Automated detection of effusion on radiographs could therefore improve the sensitivity of AI algorithms for knee injury.

Purpose
To compare two commercially available AI algorithms, BoneView and RBfracture, in detecting knee joint effusion.

Material and Methods
This retrospective study included 123 lateral knee radiographs. Detection of knee joint effusion by each AI algorithm was compared with the readings of two board-certified radiologists, with arbitration. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and interobserver agreement (Cohen’s Kappa) were calculated, each with a 95% confidence interval (CI) to assess robustness. McNemar’s tests were used to compare sensitivity and specificity between the two AI algorithms.
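As a minimal sketch of how the accuracy metrics listed above can be derived from a 2×2 confusion matrix, the following Python snippet computes each proportion with a 95% CI. The abstract does not state which interval method was used; the Wilson score interval is assumed here as one common choice. The function names and the counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


def diagnostic_metrics(tp, fp, fn, tn):
    """Return {metric: (point estimate, (ci_low, ci_high))} from 2x2 counts."""
    counts = {
        "sensitivity": (tp, tp + fn),   # detected among reference-positive
        "specificity": (tn, tn + fp),   # cleared among reference-negative
        "ppv": (tp, tp + fp),           # correct among AI-positive
        "npv": (tn, tn + fn),           # correct among AI-negative
        "accuracy": (tp + tn, tp + fp + fn + tn),
    }
    return {name: (k / n, wilson_ci(k, n)) for name, (k, n) in counts.items()}


# Hypothetical counts for illustration only; not the study's data.
metrics = diagnostic_metrics(tp=40, fp=5, fn=10, tn=45)
for name, (value, (low, high)) in metrics.items():
    print(f"{name}: {value:.2f} (95% CI: {low:.2f}-{high:.2f})")
```

The Wilson interval stays inside [0, 1] and remains informative when a proportion equals 1.00, which is why it is often preferred over the simple normal approximation for small samples.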

Results
Knee joint effusion was present in 56% of radiographs. BoneView demonstrated a sensitivity of 0.42 (95% CI: 0.31–0.54), specificity of 1.00 (95% CI: 0.93–1.00), PPV of 1.00 (95% CI: 0.88–1.00), NPV of 0.57 (95% CI: 0.47–0.67), and accuracy of 0.68 (95% CI: 0.59–0.75). RBfracture demonstrated a sensitivity of 0.75 (95% CI: 0.64–0.84), specificity of 0.91 (95% CI: 0.80–0.96), PPV of 0.91 (95% CI: 0.81–0.96), NPV of 0.74 (95% CI: 0.63–0.83), and accuracy of 0.82 (95% CI: 0.74–0.88). Cohen’s Kappa was 0.49 (95% CI: 0.35–0.63), indicating moderate agreement between the two AI algorithms. Adding knee joint effusion detection to fracture/dislocation predictions improved sensitivity.
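To illustrate how agreement and paired comparisons of this kind can be computed, the sketch below implements Cohen's Kappa for two binary raters and an exact two-sided McNemar test. It is an illustration under assumptions: the abstract does not say whether the exact or the chi-squared form of McNemar's test was used, and the function names and example labels are hypothetical.

```python
from math import comb


def cohens_kappa(a, b):
    """Cohen's Kappa for two equal-length sequences of binary labels (0/1)."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    pa, pb = sum(a) / n, sum(b) / n
    # Agreement expected by chance from each rater's marginal rate.
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def mcnemar_exact(a_hits, b_hits):
    """Exact two-sided McNemar p-value from paired 0/1 outcomes."""
    only_a = sum(1 for x, y in zip(a_hits, b_hits) if x and not y)
    only_b = sum(1 for x, y in zip(a_hits, b_hits) if y and not x)
    n = only_a + only_b  # only discordant pairs carry information
    if n == 0:
        return 1.0
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical per-radiograph outputs (1 = effusion flagged); not study data.
algo_a = [1, 1, 1, 0, 0, 0, 1, 0]
algo_b = [1, 0, 1, 0, 0, 1, 1, 0]
print(f"kappa = {cohens_kappa(algo_a, algo_b):.2f}")

# To compare sensitivity, restrict both lists to reference-positive cases;
# to compare specificity, restrict to reference-negative cases and pass
# 1 where the algorithm correctly reported no effusion.
a_pos = [1, 0, 0, 0, 0, 0, 1, 1]
b_pos = [0, 1, 1, 1, 1, 1, 1, 1]
print(f"McNemar p = {mcnemar_exact(a_pos, b_pos):.3f}")
```

McNemar's test is used rather than an unpaired test because both algorithms read the same radiographs, so only the cases where they disagree are informative.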

Conclusions
Two commercially available AI algorithms demonstrated different operating points for knee joint effusion detection: BoneView achieved high specificity, while RBfracture achieved higher sensitivity. Combining injury and effusion predictions increased sensitivity at the cost of specificity.
