Combining Supervised and Unsupervised Machine Learning Methods for Phenotypic Functional Genomics Screening.

Affiliations

Department of Cell Biology, Centre for Molecular Medicine, UMC Utrecht, Utrecht, The Netherlands.

Department of Information and Computing Sciences, Utrecht University, Utrecht, The Netherlands.

Publication information

SLAS Discov. 2020 Jul;25(6):655-664. doi: 10.1177/2472555220919345. Epub 2020 May 13.

DOI: 10.1177/2472555220919345
PMID: 32400262

Abstract

There has been an increase in the use of machine learning and artificial intelligence (AI) for the analysis of image-based cellular screens. The accuracy of these analyses, however, is greatly dependent on the quality of the training sets used for building the machine learning models. We propose that unsupervised exploratory methods should first be applied to the data set to gain a better insight into the quality of the data. This improves the selection and labeling of data for creating training sets before the application of machine learning. We demonstrate this using a high-content genome-wide small interfering RNA screen. We perform an unsupervised exploratory data analysis to facilitate the identification of four robust phenotypes, which we subsequently use as a training set for building a high-quality random forest machine learning model to differentiate four phenotypes with an accuracy of 91.1% and a kappa of 0.85. Our approach enhanced our ability to extract new knowledge from the screen when compared with the use of unsupervised methods alone.
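
The abstract reports classifier quality as an accuracy and a kappa value. As a minimal sketch of how these two numbers relate, the snippet below computes both from a confusion matrix using the standard Cohen's kappa formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `accuracy_and_kappa` and the four-class matrix are hypothetical illustrations, not the authors' code or data.

```python
from typing import List, Sequence, Tuple


def accuracy_and_kappa(confusion: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement p_o: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement p_e: computed from the row and column marginals.
    row_totals: List[int] = [sum(row) for row in confusion]
    col_totals: List[int] = [
        sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)
    ]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical four-phenotype confusion matrix (illustrative only,
# NOT data from the paper): 50 samples per true class.
example = [
    [45, 2, 2, 1],
    [3, 44, 2, 1],
    [1, 2, 46, 1],
    [2, 1, 2, 45],
]

acc, kappa = accuracy_and_kappa(example)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

In this balanced example the chance agreement is 0.25, so an accuracy of 0.900 gives a kappa of about 0.867. Kappa falls further below accuracy when class sizes are unequal, because chance agreement rises; the abstract does not state the class sizes used in the study.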

Similar articles

1. Combining Supervised and Unsupervised Machine Learning Methods for Phenotypic Functional Genomics Screening.
SLAS Discov. 2020 Jul;25(6):655-664. doi: 10.1177/2472555220919345. Epub 2020 May 13.
2. Digging deep into Golgi phenotypic diversity with unsupervised machine learning.
Mol Biol Cell. 2017 Dec 1;28(25):3686-3698. doi: 10.1091/mbc.E17-06-0379. Epub 2017 Oct 11.
3. Active Learning Strategies for Phenotypic Profiling of High-Content Screens.
J Biomol Screen. 2014 Jun;19(5):685-95. doi: 10.1177/1087057114527313. Epub 2014 Mar 18.
4. Combined unsupervised-supervised machine learning for phenotyping complex diseases with its application to obstructive sleep apnea.
Sci Rep. 2021 Feb 24;11(1):4457. doi: 10.1038/s41598-021-84003-4.
5. Classifying Force Spectroscopy of DNA Pulling Measurements Using Supervised and Unsupervised Machine Learning Methods.
J Chem Inf Model. 2016 Apr 25;56(4):621-9. doi: 10.1021/acs.jcim.5b00722. Epub 2016 Apr 4.
6. Comprehensive study of semi-supervised learning for DNA methylation-based supervised classification of central nervous system tumors.
BMC Bioinformatics. 2022 Jun 8;23(1):223. doi: 10.1186/s12859-022-04764-1.
7. Discrimination of the hierarchical structure of cortical layers in 2-photon microscopy data by combined unsupervised and supervised machine learning.
Sci Rep. 2019 May 15;9(1):7424. doi: 10.1038/s41598-019-43432-y.
8. A novel phenotypic dissimilarity method for image-based high-throughput screens.
BMC Bioinformatics. 2013 Nov 21;14:336. doi: 10.1186/1471-2105-14-336.
9. Convolutional sparse kernel network for unsupervised medical image analysis.
Med Image Anal. 2019 Aug;56:140-151. doi: 10.1016/j.media.2019.06.005. Epub 2019 Jun 12.
10. The Utility of Unsupervised Machine Learning in Anatomic Pathology.
Am J Clin Pathol. 2022 Jan 6;157(1):5-14. doi: 10.1093/ajcp/aqab085.

Cited by

1. Increasing load factor in logistics and evaluating shipment performance with machine learning methods: A case from the automotive industry.
Sci Rep. 2025 Apr 11;15(1):12434. doi: 10.1038/s41598-025-94713-8.
2. Improved detection of low-frequency within-host variants from deep sequencing: A case study with human papillomavirus.
Virus Evol. 2024 Feb 7;10(1):veae013. doi: 10.1093/ve/veae013. eCollection 2024.
3. COVID-19 vaccine design using reverse and structural vaccinology, ontology-based literature mining and machine learning.
Brief Bioinform. 2022 Jul 18;23(4). doi: 10.1093/bib/bbac190.
4. Identification of recurrent genetic patterns from targeted sequencing panels with advanced data science: a case-study on sporadic and genetic neurodegenerative diseases.
BMC Med Genomics. 2022 Feb 10;15(1):26. doi: 10.1186/s12920-022-01173-4.