Yao Lan, Yin Heliang, Yang Chengyuan, Han Shuyan, Ma Jiamin, Graff J Carolyn, Wang Cong-Yi, Jiao Yan, Ji Jiafu, Gu Weikuan, Wang Gang
College of Health Management, Harbin Medical University, 157 Baojian Road, Harbin, Heilongjiang, 150081, China; Department of Orthopedic Surgery and BME-Campbell Clinic, University of Tennessee Health Science Center, Memphis, TN, 38163, USA.
Department of Orthopedic Surgery and BME-Campbell Clinic, University of Tennessee Health Science Center, Memphis, TN, 38163, USA; Centre of Integrative Research, The First Hospital of Qiqihar City, Qiqihar, Heilongjiang, 161005, China.
Cancer Lett. 2025 Jun 28;620:217632. doi: 10.1016/j.canlet.2025.217632. Epub 2025 Mar 15.
We aimed to explore the capability of ChatGPT 4.0 to generate innovative research hypotheses addressing key challenges in the early diagnosis of colorectal cancer (CRC). We asked ChatGPT to generate hypotheses focused on three main challenges: improving screening accuracy, overcoming technological limitations, and identifying reliable biomarkers. The hypotheses were evaluated for novelty, and the experimental plans ChatGPT provided for selected hypotheses were assessed for completeness and feasibility. ChatGPT generated a total of 65 hypotheses and rated all of them itself, assigning 25 hypotheses the highest rating (5) and 40 hypotheses a rating of 4 or lower. The research team evaluated the same 65 hypotheses, grading five as excellent (Grade 5), 16 as suitable (Grade 4), 31 as satisfactory (Grade 3), 12 as needing improvement (Grade 2), and one as poor (Grade 1). In addition, 17 of the generated hypotheses were found to have corresponding publications. Of the three experimental plans assessed, one was rated excellent (5) for feasibility, while the others received good (4) and moderate (3) ratings. Predicted outcomes and alternative approaches were rated as good, with some areas requiring further improvement. Our data demonstrate that AI has the potential to revolutionize hypothesis generation in medical research, though further validation through experimental and clinical studies is needed. This study suggests that while AI can generate novel hypotheses, human expertise remains essential for evaluating their practicality and relevance in scientific research.