Aims
Early identification of malignant potential and selection of an appropriate endoscopic resection technique are essential to optimize curative resection and minimize complications in colorectal neoplasia. Although artificial intelligence (AI) systems have demonstrated high accuracy in real-time optical diagnosis, large language models (LLMs) have not been systematically assessed for their ability to integrate morphology and optical classifications in order to predict histology and recommend a resection strategy. This study evaluates the diagnostic performance of a widely used, commercially available LLM for predicting polyp histology and recommending an optimal resection strategy, aiming to determine whether it can function as a decision-support tool in advanced endoscopic resections such as endoscopic submucosal dissection (ESD).
Methods
Results
The study included 62 colorectal lesions from 58 patients; 45% of patients were female, and ages ranged from 18 to 89 years (mean 66 years). Lesions were most commonly located in the sigmoid colon (29.03%), rectum (22.5%) and ascending colon (20.96%). Clinical information provided to the LLM in the informed condition included age, sex, relevant medical history (prior CRC, adjuvant therapy, surgery), lesion size and location, and optical classifications (Paris/JNET/Kudo). For resection strategy selection, the LLM showed 70.96% agreement with the expert’s decisions in the photos-only condition (κ=0.59), which increased to 90.32% in the informed condition (κ=0.751). Direct comparison between outputs in the two conditions showed that 12 cases changed to match the expert’s recommendation when clinical and optical classification data were added (κ=0.415, p<0.001). For histology prediction, accuracy compared with final pathology was 72.72% in the photos-only condition (κ=0.408) and increased to 87.03% in the informed condition (κ=0.758). Comparison of the two conditions identified 7 lesions for which histology categorization improved with additional clinical and optical data (κ=0.624, p<0.001). Response time for each case remained under 45 seconds, supporting feasibility for real-time use.
| Resection method (photos only) | Resection method (informed): No | Resection method (informed): Yes | Total | p-value¹ |
|---|---|---|---|---|
| No | 6 | 12 | 18 | <0.001 |
| Yes | 0 | 44 | 44 | |
| Total | 6 | 56 | 62 | |

¹ Fisher's exact test
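As a rough check on the cross-tabulation above, the following minimal sketch recomputes Cohen's κ and a two-sided Fisher's exact p-value from the four cell counts. It assumes that "No"/"Yes" denote disagreement/agreement with the expert's recommendation in each condition; the function names `cohen_kappa` and `fisher_exact_two_sided` are illustrative and are not taken from the study's analysis software.

```python
from math import comb

# 2x2 counts from the table: rows = photos-only condition, columns = informed condition.
# Assumption: "No"/"Yes" mean the LLM did not / did match the expert's recommendation.
table = [[6, 12],
         [0, 44]]

def cohen_kappa(t):
    """Cohen's kappa for a square contingency table given as a list of rows."""
    n = sum(sum(row) for row in t)
    observed = sum(t[i][i] for i in range(len(t))) / n
    row_totals = [sum(row) for row in t]
    col_totals = [sum(col) for col in zip(*t)]
    # Agreement expected by chance from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)

def fisher_exact_two_sided(t):
    """Two-sided Fisher's exact test for a 2x2 table (hypergeometric probabilities)."""
    (a, b), (c, d) = t
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum over all tables that are no more likely than the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

kappa = cohen_kappa(table)
p_value = fisher_exact_two_sided(table)
print(f"kappa = {kappa:.3f}")      # approximately 0.415, as reported
print(f"p = {p_value:.6f}")        # below 0.001, as reported
print(f"photos-only agreement = {44 / 62:.4f}, informed agreement = {56 / 62:.4f}")
```

Under these assumptions the marginal totals also reproduce the reported agreement rates (44/62 in the photos-only condition and 56/62 in the informed condition).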
Conclusions
This study provides the first evaluation of a widely used LLM as a decision-support tool for colorectal lesion characterization, histology prediction and resection strategy selection, especially in lesions requiring advanced resection techniques such as ESD. Our data show that performance improved markedly when clinical and endoscopic information supplemented the photos, highlighting the importance of multi-modal input. These findings suggest that general-purpose LLMs could complement existing AI systems and offer practical decision support in settings where advanced resection expertise is limited.