Diagnostic accuracy of artificial intelligence in endoscopic detection of early esophageal neoplasia: systematic review and meta-analysis

H. Eldeeb

A. Bakr

A. Elaraby

A. Mohimen

F. Abdurrahman

B. Noha

Poster Abstract

Aims

Esophageal neoplasia (EN) remains a significant contributor to cancer-related mortality, primarily due to the difficulty in visualizing and characterizing early-stage lesions during endoscopy. Inter-observer variability in interpretation is substantial, even with enhanced imaging modalities (e.g., NBI, BLI, i-scan). Artificial intelligence (AI) systems, including computer-aided detection (CADe) and diagnosis (CADx), have emerged as potential tools to enhance and standardize diagnostic accuracy. We aimed to synthesize the diagnostic performance of AI-assisted endoscopy for EN, conducting subgroup analyses based on histology (Esophageal Squamous Cell Carcinoma [ESCC] vs. Barrett’s Esophagus-Related Neoplasia [BERN]) and comparing accuracy by the unit of analysis (per-patient vs. per-image).

Methods

A systematic review and meta-analysis were performed. Study-level 2×2 data (True Positives, False Positives, False Negatives, True Negatives) were extracted from eligible studies. Data were pooled using a bivariate random-effects hierarchical summary receiver operating characteristic (HSROC) model (MIDAS, Stata 18). Primary outcome measures were pooled sensitivity, specificity, and the Area Under the Curve (AUC). Secondary metrics included positive and negative likelihood ratios (LR+, LR–) and the Diagnostic Odds Ratio (DOR). Separate meta-analyses were conducted for per-patient data (overall and by histology) and per-image data. Statistical heterogeneity was quantified using the I² statistic, and threshold effects were evaluated.
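As an illustration of how the listed metrics relate to a single study's 2×2 table, the following is a minimal Python sketch. It is not the bivariate/HSROC pooling performed in Stata; it only shows the per-study arithmetic. The class name, function name, and example counts are hypothetical and are not taken from any included study. The sketch assumes no zero cells (no continuity correction is applied).

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one diagnostic accuracy study (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def diagnostic_metrics(t: TwoByTwo) -> dict:
    """Compute per-study accuracy metrics from a 2x2 table.

    Assumes every cell is non-zero; a real analysis would need a
    continuity correction or a model that handles zero cells.
    """
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                      # equals (tp * tn) / (fp * fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Hypothetical counts, for illustration only.
example = TwoByTwo(tp=90, fp=20, fn=10, tn=80)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Pooled estimates from a hierarchical model generally differ from values obtained by applying these formulas to pooled sensitivity and specificity, so small discrepancies between reported and hand-calculated LR+, LR–, and DOR are expected.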

Results

Thirteen studies met inclusion criteria for the per-patient analysis (752 reference-positive; 876 reference-negative patients). The pooled sensitivity was 0.91 (95% CI 0.85–0.94) and specificity was 0.78 (95% CI 0.63–0.88). The summary AUC was 0.93 (95% CI 0.90–0.95), with an LR+ of 4.1, LR– of 0.12, and DOR of 34. Substantial heterogeneity was observed (I²≈97%) with no significant threshold effect.

In histological subgroup analyses, ESCC (8 studies) yielded a sensitivity of 0.92 (95% CI 0.85–0.96) and specificity of 0.73 (95% CI 0.51–0.87) (AUC 0.93). For BERN (5 studies), sensitivity was 0.86 (95% CI 0.77–0.92) and specificity was 0.85 (95% CI 0.69–0.93) (AUC 0.91).

The per-image analysis (6 studies; 2,574 positive; 7,089 negative images) demonstrated higher pooled accuracy: sensitivity 0.96 (95% CI 0.92–0.98), specificity 0.93 (95% CI 0.85–0.97), and AUC 0.98 (LR+ 13.3, LR– 0.04). Heterogeneity remained substantial (I²≈93%).

Conclusions

AI-assisted endoscopy demonstrates high overall diagnostic accuracy for esophageal neoplasia. On a per-patient basis, the unit of analysis more relevant to clinical decision-making, AI provides excellent sensitivity with moderate specificity, with comparable overall accuracy for ESCC and BERN (AUC 0.93 vs. 0.91). The outstanding per-image accuracy (AUC 0.98) and high LR+ (13.3) suggest strong "rule-in" capability, while the consistently low LR– (0.12 per-patient, 0.04 per-image) indicates robust "rule-out" potential, implying a reduced risk of missed lesions. Despite substantial heterogeneity across studies, these findings support the continued development of AI in endoscopy. Prospective, real-time, multicenter validation studies using standardized workflows are warranted to establish optimal operating thresholds and confirm the generalizability of these systems in routine clinical practice.
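To make the "rule-in" and "rule-out" interpretation concrete, the sketch below converts a pre-test probability into a post-test probability using a likelihood ratio (the odds form of Bayes' theorem). The likelihood ratios are the per-patient values reported above; the 10% pre-test probability and the function name are assumptions chosen purely for illustration and do not come from the included studies.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical pre-test probability of neoplasia (assumption, not from the abstract).
PRE_TEST = 0.10

# Per-patient likelihood ratios as reported in the Results.
LR_POS = 4.1
LR_NEG = 0.12

after_positive = post_test_probability(PRE_TEST, LR_POS)
after_negative = post_test_probability(PRE_TEST, LR_NEG)

print(f"Post-test probability after a positive AI result: {after_positive:.3f}")
print(f"Post-test probability after a negative AI result: {after_negative:.3f}")
```

Under this assumed 10% pre-test probability, a positive result raises the probability to roughly 31%, and a negative result lowers it to roughly 1.3%. The absolute figures depend entirely on the chosen pre-test probability.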
