Automated Quality Assessment in Endoscopy: A Multimodal Deep Learning System for Bowel Preparation Evaluation

P. Marílio Cardoso

M. Mascarenhas

J. Afonso

A. Martins Pinto da Costa

F. Mendes

M. Martins

J. Mota

M. Almeida

A. Zatarain Valles

P. Laura

C. Marta

C. Constanza

C. Miriam

E. Gustavo

M. Carvalho

M. Lera dos Santos

F. Maluf Filho

E. Horneaux de Moura

G. Macedo

Poster Abstract

Aims

Artificial intelligence has accelerated major advances in gastroenterology and endoscopy, yet adequate bowel preparation remains fundamental to high-quality examinations. Existing preparation scales for colonoscopy and capsule endoscopy are limited by subjective interpretation and by their confinement to individual gastrointestinal segments. Computer-aided quality-assessment (CADq) systems offer a consistent and objective framework for evaluating preparation quality and optimizing procedural performance. This study aimed to develop CADq algorithms that automatically classify preparation quality across multiple gastrointestinal regions (small bowel and colon) and across multiple endoscopic modalities (colonoscopy, enteroscopy, and capsule endoscopy).

Methods

We retrospectively analyzed 144 colonoscopies from three devices, 28 enteroscopy procedures from three devices, and 5,793 capsule endoscopy exams from two devices. Bowel preparation was graded with the Boston Bowel Preparation Scale for colonoscopy. For enteroscopy and capsule endoscopy it was graded as excellent (at least 90 percent of the mucosa visible), satisfactory (50 to 90 percent), or unsatisfactory (less than 50 percent). The dataset included 181,910 colonoscopy images, 88,623 enteroscopy images, and 12,950 capsule endoscopy images, divided into training (90 percent) and testing (10 percent) subsets, with no overlap between procedures from the same patient. Convolutional neural network performance was compared with expert consensus.
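The grading thresholds and the 90/10 split described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `(patient_id, image)` record layout, and the fixed seed are hypothetical, and the split assumes that each patient's images should stay together in a single subset.

```python
import random
from collections import defaultdict


def grade_preparation(visible_mucosa_pct: float) -> str:
    """Map the percentage of visible mucosa to the three-level grade
    used for enteroscopy and capsule endoscopy in the Methods."""
    if not 0 <= visible_mucosa_pct <= 100:
        raise ValueError("visible_mucosa_pct must be between 0 and 100")
    if visible_mucosa_pct >= 90:
        return "excellent"
    if visible_mucosa_pct >= 50:
        return "satisfactory"
    return "unsatisfactory"


def split_by_patient(records, test_fraction=0.1, seed=0):
    """Split (patient_id, image) pairs so that every patient lands in
    exactly one subset. Hypothetical helper: the fraction is applied to
    patients, so the image-level ratio is only approximately 90/10."""
    by_patient = defaultdict(list)
    for patient_id, image in records:
        by_patient[patient_id].append(image)

    # Sort before shuffling so the result depends only on the seed.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train, test = [], []
    for patient_id in patients:
        target = test if patient_id in test_patients else train
        target.extend((patient_id, img) for img in by_patient[patient_id])
    return train, test
```

Calling `split_by_patient(records)` on a list of `(patient_id, image)` pairs returns two lists whose patient sets are disjoint, which is the property that prevents images from one patient from appearing in both training and testing.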

Results

For colonoscopy, because poor-quality cases were limited, a binary model distinguishing good from excellent preparation was implemented. It achieved sensitivity and specificity of 87 percent, a negative predictive value of 95 percent, and an overall accuracy of 96 percent. In enteroscopy, classification across preparation categories achieved sensitivity of 69 to 98 percent, specificity of 80 to 100 percent, and accuracy of 91 to 97 percent. In capsule endoscopy, sensitivity reached 92 percent, specificity 94 percent, and accuracy 92 percent.
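For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, negative predictive value, and accuracy follow from the four cells of a binary confusion matrix. The function name is hypothetical and the abstract does not report confusion-matrix counts, so any counts passed in are purely illustrative.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from
    true/false positive and negative counts."""

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a class is absent.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }
```

For a multi-class grading such as the enteroscopy model, the same function can be applied once per category in a one-versus-rest fashion, which is one common way to obtain a range of values across categories.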

Conclusions

This study introduces CADq algorithms able to classify bowel preparation in a multi-region, multi-technique, and multi-device framework. CADq systems are critical for supporting procedural quality and operational efficiency in endoscopic practice. To our knowledge, this is the first deep learning CADq strategy developed to evaluate preparation quality across distinct gastrointestinal regions. Future developments are expected to evolve toward real-time detection of blind spots and assessment of mucosal exposure. Combined with computer-aided detection (CADe) and diagnosis (CADx), such systems are central to achieving full-spectrum integration of artificial intelligence into routine clinical care.
