Multiple similarly effective solutions exist for biomedical feature selection and classification problems.

Author information

Liu Jiamei, Xu Cheng, Yang Weifeng, Shu Yayun, Zheng Weiwei, Zhou Fengfeng

Affiliations

College of Software, Jilin University, Changchun, Jilin, 130012, China.

College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, 130012, China.

Publication information

Sci Rep. 2017 Oct 9;7(1):12830. doi: 10.1038/s41598-017-13184-8.

Abstract

Binary classification is widely used to support decisions on biomedical big-data questions, such as clinical drug trials comparing treated participants with controls, and genome-wide association studies (GWASs) comparing participants with and without a phenotype. A machine learning model is trained for this purpose by optimizing its power to discriminate between samples from the two groups. However, most classification algorithms tend to generate a single locally optimal solution, determined by the input dataset and the mathematical assumptions made about it. Here we demonstrate, from the perspectives of both disease classification and feature selection, that multiple different solutions may achieve similar classification performance. Existing machine learning algorithms may therefore have overlooked a whole school of fish by catching only one good one. Because most existing machine learning algorithms generate a solution by optimizing a mathematical objective, understanding the biological mechanisms behind a classification question may require considering both the generated solution and the ignored ones.
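
The central claim, that different feature subsets can classify about equally well, can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic, with four redundant informative features and two noise features, and the classifier is a simple nearest-centroid rule rather than any method the authors used. The names `make_dataset`, `nearest_centroid_accuracy`, and the subset labels are hypothetical.

```python
import random


def make_dataset(n_samples, seed=0):
    """Synthetic two-class data (an assumption, not the paper's data).

    Features 0-3 each carry the class signal; features 4-5 are pure noise.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate labels so both classes are balanced
        shift = 1.5 if label == 1 else -1.5
        informative = [rng.gauss(shift, 1.0) for _ in range(4)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(2)]
        X.append(informative + noise)
        y.append(label)
    return X, y


def centroid(rows, features):
    """Mean of the selected features over a list of samples."""
    return [sum(r[f] for r in rows) / len(rows) for f in features]


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, features):
    """Fit class centroids on the chosen features, then score the test set."""
    c0 = centroid([x for x, t in zip(X_train, y_train) if t == 0], features)
    c1 = centroid([x for x, t in zip(X_train, y_train) if t == 1], features)
    correct = 0
    for x, t in zip(X_test, y_test):
        d0 = sum((x[f] - c) ** 2 for f, c in zip(features, c0))
        d1 = sum((x[f] - c) ** 2 for f, c in zip(features, c1))
        pred = 1 if d1 < d0 else 0
        correct += int(pred == t)
    return correct / len(y_test)


X, y = make_dataset(400)
X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

# Two disjoint informative subsets and one noise-only subset.
subsets = {
    "subset_a": [0, 1],
    "subset_b": [2, 3],
    "noise_only": [4, 5],
}
results = {
    name: nearest_centroid_accuracy(X_train, y_train, X_test, y_test, feats)
    for name, feats in subsets.items()
}

for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

In this toy setting the two disjoint informative subsets should score close to each other and well above the noise-only subset, so a selector that returns only one of them would hide an equally good alternative. Real biomedical datasets are far noisier, so the sketch shows only the shape of the argument, not its strength.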

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c61/5634418/3d5e5c5fce8c/41598_2017_13184_Fig1_HTML.jpg