Aims
Accurate endoscopic assessment of mucosal inflammation is critical in managing ulcerative colitis (UC). The Mayo Endoscopic Score (MES) and the Ulcerative Colitis Endoscopic Index of Severity (UCEIS) are both subject to significant interobserver variability. The role of Texture and Colour Enhancement Imaging (TXI) in assessing inflammation is uncertain. We aimed to assess the reliability of TXI for the endoscopic assessment of inflammation in UC.
Methods
A multicentre, international, survey-based study was designed using 20 sets of high-definition white light endoscopy (HDWLE) images or videos, each matched with TXI. The sets covered a range of inflammation severities, as determined by the Nancy and PICaSSO histopathology scores. Participants rated the MES and UCEIS for each set and reported their confidence level (high vs low). To assess interobserver variability, we used percentage agreement and the Kappa score with 95% confidence intervals, stratified by imaging modality and scoring system. A mixed logistic regression model was constructed to assess accuracy.
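As an illustration of the two agreement measures named above, the following is a minimal sketch only. It assumes a multi-rater (Fleiss) Kappa and a pairwise definition of percentage agreement; the abstract does not state which Kappa variant or agreement definition was used. The function names and the toy ratings are hypothetical and are not study data.

```python
from collections import Counter


def pairwise_percent_agreement(ratings):
    """Mean fraction of rater pairs that give the same score to an item.

    ratings: list of items, each a list with one score per rater
    (assumption: every item is scored by the same number of raters).
    """
    per_item = []
    for item in ratings:
        n = len(item)
        counts = Counter(item)
        # Number of ordered rater pairs that agree on this item.
        agreeing = sum(c * (c - 1) for c in counts.values())
        per_item.append(agreeing / (n * (n - 1)))
    return sum(per_item) / len(per_item)


def fleiss_kappa(ratings):
    """Fleiss' Kappa: observed agreement corrected for chance agreement."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    observed = pairwise_percent_agreement(ratings)
    # Chance agreement from the overall proportion of each score.
    totals = Counter()
    for item in ratings:
        totals.update(item)
    expected = sum((c / (n_items * n_raters)) ** 2 for c in totals.values())
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: 4 image sets, 3 raters, MES-like scores.
toy = [[0, 0, 0], [1, 1, 1], [0, 1, 1], [2, 2, 2]]
print(pairwise_percent_agreement(toy))
print(fleiss_kappa(toy))
```

In a stratified analysis, such functions would be applied separately to the ratings for each imaging modality and scoring system. A confidence interval could then be estimated by, for example, resampling image sets, although that is one possible approach rather than the method reported here.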
Results
Thirty gastroenterologists from four centres were included; 19 (63.3%) were experts and 10 (33.3%) had prior TXI experience. Interobserver agreement was moderate overall, with percentage agreement ranging from 66% to 70% and Kappa scores from 0.51 to 0.58. There was no difference in interobserver variability between HDWLE and TXI. With TXI, agreement was slightly higher for UCEIS than for MES, whereas with HDWLE it was slightly higher for MES (Table 1). Overall, 75% reported high-confidence predictions, and confidence was consistent across histological severity. Participants rated severity accurately 75% of the time using MES, with accuracy increasing as histological severity increased. Results for UCEIS were similar, with an accuracy of 74%. There was no evidence of a difference in accuracy between the two imaging modalities, irrespective of the index used for scoring.
Table 1: Interobserver variability results
| Imaging modality | Mayo (MES) | UCEIS |
|---|---|---|
| HDWLE | Percent agreement = 68%; Kappa = 0.56 (0.45, 0.67)* | Percent agreement = 66%; Kappa = 0.51 (0.40, 0.63)* |
| TXI | Percent agreement = 67%; Kappa = 0.55 (0.44, 0.66)* | Percent agreement = 70%; Kappa = 0.58 (0.45, 0.70)* |

*95% confidence interval
Conclusions
Interobserver variability remains a challenge in accurately assessing inflammation in UC, despite advances in technology and validated scoring systems. Our study suggests that TXI achieves moderate interobserver agreement, with an accuracy of 75%, and is not superior to HDWLE.