Exploring Generative Pre-Trained Transformer-4-Vision for Nystagmus Classification: Development and Validation of a Pupil-Tracking Process.

Authors

Noda Masao, Koshu Ryota, Tsunoda Reiko, Ogihara Hirofumi, Kamo Tomohiko, Ito Makoto, Fushiki Hiroaki

Affiliations

Department of Otolaryngology, Mejiro University Ear Institute Clinic, 320 Ukiya, Iwatsuki-ku, Saitama-shi, Saitama, 339-8501, Japan, 81 48 797 3341.

Department of Otolaryngology, Jichi Medical University, Shimotsuke, Japan.

Publication information

JMIR Form Res. 2025 Jun 6;9:e70070. doi: 10.2196/70070.

DOI:10.2196/70070
PMID:40478723
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12164947/
Abstract

BACKGROUND

Conventional nystagmus classification methods often rely on subjective observation by specialists, which is time-consuming and varies among clinicians. Recently, deep learning techniques based on convolutional and recurrent neural networks have been used to automate nystagmus classification, and these networks can accurately classify nystagmus patterns from video data. However, associated challenges include the need for large datasets when creating models, limited applicability to specific image conditions, and the complexity of using these models.

OBJECTIVE

This study aimed to evaluate a novel approach for nystagmus classification that used the Generative Pre-trained Transformer 4 Vision (GPT-4V) model, which is a state-of-the-art large-scale language model with powerful image recognition capabilities.

METHODS

We developed a pupil-tracking process using nystagmus-recording videos and verified the accuracy of the optimized model using GPT-4V classification and the nystagmus recordings. We tested whether the optimized model could classify nystagmus into six categories: right horizontal, left horizontal, upward, downward, right torsional, and left torsional. The traced trajectory was input either as two-dimensional coordinate data or as an image, and multiple in-context learning methods were evaluated.
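The coordinate-input variant with in-context learning can be pictured with a short sketch. The abstract does not give the actual prompt wording, coordinate format, or number of examples, so everything below is an assumption made for illustration: the function names `format_trajectory` and `build_few_shot_prompt`, the `x,y;x,y` serialization, and the instruction text are all hypothetical. The sketch only builds the prompt text and makes no model call.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# The six categories named in the Methods section.
CATEGORIES: List[str] = [
    "right horizontal", "left horizontal", "upward",
    "downward", "right torsional", "left torsional",
]


def format_trajectory(points: Sequence[Point], precision: int = 1) -> str:
    """Serialize (x, y) pupil-center coordinates as 'x,y' pairs joined by ';'.

    The serialization format is an assumption, not the authors' format.
    """
    return ";".join(f"{x:.{precision}f},{y:.{precision}f}" for x, y in points)


def build_few_shot_prompt(
    examples: Sequence[Tuple[Sequence[Point], str]],
    query: Sequence[Point],
) -> str:
    """Build a few-shot (in-context learning) prompt from labeled trajectories.

    Each labeled example is shown as a trajectory followed by its label, and
    the unlabeled query trajectory comes last with an empty label slot.
    """
    lines = [
        "Classify the nystagmus from the pupil trajectory.",
        "Answer with exactly one of: " + ", ".join(CATEGORIES) + ".",
        "",
    ]
    for points, label in examples:
        if label not in CATEGORIES:
            raise ValueError(f"unknown category: {label!r}")
        lines.append("Trajectory: " + format_trajectory(points))
        lines.append("Label: " + label)
        lines.append("")
    lines.append("Trajectory: " + format_trajectory(query))
    lines.append("Label:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy coordinates invented for illustration; not data from the study.
    shots = [([(0.0, 0.0), (1.5, 0.0)], "right horizontal")]
    print(build_few_shot_prompt(shots, [(0.0, 0.0), (0.0, 2.0)]))
```

Passing an empty `examples` list yields a zero-shot prompt, so the same helper could cover several in-context settings. The image-input variant would instead attach a rendered trajectory plot to the request.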

RESULTS

The developed model showed an overall classification accuracy of 37% when using pupil-traced images and a maximum accuracy of 24.6% when pupil coordinates were used as input. Regarding orientation, we achieved a maximum accuracy of 69% for the classification of horizontal nystagmus patterns but a lower accuracy for the vertical and torsional components.
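To put these figures in context, a uniform random guess over six balanced categories would be correct 1/6 (about 16.7%) of the time. The sketch below shows one way overall accuracy and accuracy per orientation could be computed. The abstract does not state how the orientation figure was calculated, so grouping samples by the orientation of their true label is an assumption, and the names `ORIENTATION`, `overall_accuracy`, and `accuracy_by_orientation` are hypothetical. The labels in the usage example are toy values, not study data.

```python
from collections import defaultdict
from typing import Dict, Sequence

# Assumed mapping from the six categories to three orientation groups.
ORIENTATION: Dict[str, str] = {
    "right horizontal": "horizontal",
    "left horizontal": "horizontal",
    "upward": "vertical",
    "downward": "vertical",
    "right torsional": "torsional",
    "left torsional": "torsional",
}

# Chance level for a uniform guess over balanced categories.
CHANCE_LEVEL = 1 / len(ORIENTATION)


def overall_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted category equals the true one."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_by_orientation(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Dict[str, float]:
    """Exact-match accuracy within each orientation group of the true label."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        group = ORIENTATION[t]
        totals[group] += 1
        hits[group] += int(t == p)
    return {group: hits[group] / totals[group] for group in totals}


if __name__ == "__main__":
    # Toy labels invented for illustration only.
    truth = ["right horizontal", "left horizontal", "upward",
             "downward", "right torsional", "left torsional"]
    preds = ["right horizontal", "left horizontal", "upward",
             "upward", "left torsional", "left torsional"]
    print(overall_accuracy(truth, preds))
    print(accuracy_by_orientation(truth, preds))
    print(CHANCE_LEVEL)
```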

CONCLUSIONS

We demonstrated the potential of a generative artificial intelligence model for versatile vertigo management by improving the accuracy and efficiency of nystagmus classification. We also highlighted areas for further improvement, such as expanding the dataset size and enhancing input modalities, to improve classification performance across all nystagmus types. GPT-4V, which has been validated only for still-image recognition, can be linked to video classification, and we propose this as a novel method.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d82d/12164947/15954f14a9b2/formative-v9-e70070-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d82d/12164947/a3e85d993a8c/formative-v9-e70070-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d82d/12164947/8d2c165970c6/formative-v9-e70070-g002.jpg
