Implementation of a FAIR-Compliant Video Databank for Artificial Intelligence Model Development and Clinical Outcomes Research in Gastrointestinal Endoscopy
Poster Abstract

Aims

Artificial intelligence (AI) has the potential to improve diagnosis and outcome assessment in gastrointestinal (GI) endoscopy. However, robust AI models require large, labeled datasets of full-length, high-resolution endoscopic videos. Many existing endoscopic databases rely on retrospective sampling or isolated images, or lack clinical context, which limits their utility for AI system development. We therefore aimed to develop an endoscopy video databank that couples full-length recording with capture of structured clinical, procedural, and pathology data.

Methods

A prospective video databank was established at the Montreal University Hospital Center (NCT06822616), enrolling patients undergoing endoscopic procedures since 2022. All screening, surveillance, diagnostic, or therapeutic upper (EGD) or lower (colonoscopy) endoscopies performed in patients ≥ 18 years old were eligible for inclusion. All patients provided informed consent for data collection and subsequent use in AI system development. The databank follows the FAIR principles: Findable, Accessible, Interoperable, and Reusable. Video recordings were de-identified from the point of acquisition, using study identifiers without protected health information, and technical specifications were optimized for AI training (full-length, full-frame videos at 1920×1080-pixel resolution, 60 fps, and 10-bit color depth). During each procedure, trained research staff used structured case report forms (CRFs) to systematically document and timestamp detected anatomical landmarks, all identified lesions (including endoscopist optical diagnoses), interventions, disease-specific scores (e.g., ulcerative colitis: Mayo Endoscopic Subscore; Crohn’s disease: SES-CD; Barrett’s esophagus: Prague classification), AI system outputs, and endoscopy quality metrics. All data, including patient, procedure, and pathology outcomes, were then entered in a REDCap database and linked to the videos through the unique study identifiers.
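The linkage between structured records and videos described above can be sketched as follows. This is a minimal illustration, not the databank's actual pipeline: the field name `study_id`, the `VideoFile` class, the identifier format, and the assumption of one identifier per procedure are all hypothetical. Only the target video specifications (1920×1080, 60 fps, 10-bit) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoFile:
    """Hypothetical metadata for one de-identified recording."""
    study_id: str   # assumed: one study identifier per procedure
    width: int
    height: int
    fps: float
    bit_depth: int


def meets_spec(video: VideoFile) -> bool:
    """Check the recording against the specifications stated in the Methods."""
    return (
        (video.width, video.height) == (1920, 1080)
        and video.fps == 60
        and video.bit_depth == 10
    )


def link_records(records, videos):
    """Pair each structured record with its video using the study identifier.

    Returns (linked, unmatched): a list of (record, video) pairs and a list
    of study identifiers that have a record but no video.
    """
    by_id = {video.study_id: video for video in videos}
    linked, unmatched = [], []
    for record in records:
        video = by_id.get(record["study_id"])
        if video is None:
            unmatched.append(record["study_id"])
        else:
            linked.append((record, video))
    return linked, unmatched


# Illustrative data only; identifiers and fields are invented for the sketch.
records = [
    {"study_id": "P0001", "procedure": "colonoscopy"},
    {"study_id": "P0002", "procedure": "EGD"},
]
videos = [VideoFile("P0001", 1920, 1080, 60, 10)]

linked, unmatched = link_records(records, videos)
```

Keeping the identifier as the only join key means that no protected health information is needed to associate a video with its clinical record.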

Results

As of July 2025, 6,272 patients (mean age 59.0 years, 52.1% female) have been prospectively enrolled for a total of 7,591 endoscopic procedures (5,605 [73.8%] colonoscopies and 1,986 [26.2%] EGDs) performed by 48 different endoscopists, with complete video capture and documentation included in the databank. Up to 747 and 455 structured datapoints were documented for each enrolled colonoscopy and EGD, respectively. Each identified polyp and each biopsy were documented with up to 48 and 46 variables, respectively. Disease-specific documentation included up to 169 datapoints for patients with inflammatory bowel disease. The prevalence of the main procedural indications in the cohort is summarized in Table 1, demonstrating substantial diversity in pathologic findings across inflammatory, neoplastic, and structural disease categories. Pathologic findings of any type (including inflammation, polyps, ulcers, dysplasia, and malignancy) were found in 6,355 (83.7%) procedures, including 6,987 colorectal polyps in 2,981 patients and 48 cancers. The data enabled calculation of multiple endoscopic quality metrics: mean adenoma detection rate (ADR) was 38.9% (CI 37.2-40.7), sessile serrated lesion detection rate (SSLDR) was 4.5% (CI 3.8-5.3), advanced adenoma detection rate (AADR) was 9.0% (CI 8.0-10.1), mean withdrawal time was 12.1 ± 11.9 minutes for colonoscopies, and mean procedure time was 5.1 ± 7.1 minutes for EGDs.
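As an illustration of how a detection rate with a confidence interval can be derived from structured counts, the sketch below computes a proportion and its Wilson score interval. The abstract does not state which interval method, confidence level, or denominators were used, so the Wilson method, the default `z = 1.96` (a 95% level), and the example counts are all assumptions made for illustration only.

```python
import math


def wilson_interval(events: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to a 95% confidence level (an assumption here;
    the abstract does not state the level used).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= events <= total:
        raise ValueError("events must be between 0 and total")
    p = events / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


def detection_rate(events: int, total: int, z: float = 1.96):
    """Return (rate, lower, upper) as percentages."""
    low, high = wilson_interval(events, total, z)
    return 100 * events / total, 100 * low, 100 * high


# Hypothetical counts, NOT the study's denominators: 389 procedures with at
# least one adenoma out of 1,000 eligible colonoscopies.
rate, low, high = detection_rate(389, 1000)
print(f"ADR {rate:.1f}% (CI {low:.1f}-{high:.1f})")
```

The same function applies to SSLDR and AADR by changing the event definition; the width of the interval depends on the number of eligible procedures, which is why the denominator should be reported alongside each rate.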

Table 1. Cohort characteristics: prevalence of procedural indications

| Colonoscopies (N=5,605) | n (%) | EGD (N=1,986) | n (%) |
| --- | --- | --- | --- |
| Surveillance | 1,931 (34.4%) | Dyspepsia | 306 (15.6%) |
| Diagnostic | 760 (13.6%) | Barrett’s esophagus | 229 (11.5%) |
| Screening | 642 (11.4%) | Anemia | 198 (10.0%) |
| Crohn’s disease | 543 (9.7%) | Dysphagia | 151 (7.6%) |
| Ulcerative colitis | 459 (8.2%) | GERD | 93 (4.7%) |
| EMR polypectomy | 427 (7.6%) | EMR polypectomy | 70 (3.5%) |
| FIT+ | 173 (3.1%) | Eosinophilic esophagitis | 67 (3.4%) |

Conclusions

This large, prospective, FAIR-compliant video databank of >7,500 procedures enrolled at our tertiary care center provides a scalable foundation for AI development, validation, and clinical outcomes research in gastrointestinal endoscopy. Ongoing multi-center expansion and the integration of digital pathology slides, microbiome analyses, and multi-omics analyses will enable future projects that combine endoscopic video data with precision medicine approaches.