

AIGen: an artificial intelligence software for complex genetic data analysis.

Affiliations

Department of Experimental Statistics, Louisiana State University, 45 Martin D. Woodin Hall, Baton Rouge, LA 70802, United States.

Department of Mathematics, Texas State University, 601 University Drive San Marcos, TX 78666, United States.

Publication information

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae566.

Abstract

Recent developments in artificial intelligence (AI), especially advances in deep neural network (DNN) technology, have revolutionized many fields. Although DNNs play a central role in modern AI, they have rarely been used in genetic data analysis because of the analytical and computational challenges posed by high-dimensional genetic data and growing sample sizes. To facilitate the use of AI in genetic data analysis, we developed a C++ package, AIGen, based on two newly developed neural networks (i.e. kernel neural networks and functional neural networks) that can model complex genotype-phenotype relationships (e.g. interactions) while remaining robust to high-dimensional genetic data. Moreover, computationally efficient algorithms (e.g. a minimum norm quadratic unbiased estimation approach and batch training) are implemented in the package to accelerate computation, making it practical for analyzing large-scale datasets with thousands or even millions of samples. By applying AIGen to the UK Biobank dataset, we demonstrate that it can efficiently analyze large-scale genetic data, attain improved accuracy, and maintain robust performance. Availability: AIGen is developed in C++ and its source code, along with reference libraries, is publicly accessible on GitHub at https://github.com/TingtHou/AIGen.
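The abstract names batch training as one way the package keeps computation tractable on large samples. As a rough illustration of the general idea only, the sketch below runs mini-batch gradient descent on a one-predictor linear model, so each update touches a small subset of samples rather than the full dataset. It is not AIGen's code: the linear model stands in for a neural network, and `LinearModel`, `fit_minibatch`, and all parameter names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical toy model (not part of AIGen): y ~ w * x + b.
struct LinearModel {
    double w = 0.0;
    double b = 0.0;
};

// Fit the toy model by mini-batch gradient descent on mean squared error.
// Each update uses at most `batch_size` samples.
LinearModel fit_minibatch(const std::vector<double>& x,
                          const std::vector<double>& y,
                          std::size_t batch_size,
                          std::size_t epochs,
                          double learning_rate) {
    LinearModel m;
    const std::size_t n = x.size();
    if (n == 0 || y.size() != n || batch_size == 0) return m;

    // Sample indices, reshuffled every epoch so batches differ between passes.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::mt19937 rng(42);  // fixed seed, an arbitrary choice for this sketch

    for (std::size_t e = 0; e < epochs; ++e) {
        std::shuffle(order.begin(), order.end(), rng);
        for (std::size_t start = 0; start < n; start += batch_size) {
            const std::size_t end = std::min(start + batch_size, n);

            // Accumulate the squared-error gradient over this batch only.
            double grad_w = 0.0;
            double grad_b = 0.0;
            for (std::size_t k = start; k < end; ++k) {
                const std::size_t i = order[k];
                const double err = m.w * x[i] + m.b - y[i];
                grad_w += err * x[i];
                grad_b += err;
            }

            // Average over the batch and take one gradient step.
            const double scale = 2.0 / static_cast<double>(end - start);
            m.w -= learning_rate * scale * grad_w;
            m.b -= learning_rate * scale * grad_b;
        }
    }
    return m;
}
```

Calling `fit_minibatch(x, y, 4, 2000, 0.1)` on noise-free data generated from y = 2x + 1 should recover a slope near 2 and an intercept near 1. The per-step cost depends on the batch size rather than the total sample count, which is the property that makes batch training attractive for very large cohorts.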


