
Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data.

Affiliation

Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA.

Publication information

Mol Biol Evol. 2023 Oct 4;40(10). doi: 10.1093/molbev/msad216.

DOI:10.1093/molbev/msad216
PMID:37772983
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10581699/
Abstract

Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under nonconvex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
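
The general pipeline described in the abstract (stack haplotype images into a tensor, decompose it, then classify on the extracted features) can be illustrated with a minimal NumPy sketch. This is not the T-REx implementation. The abstract does not say which decomposition or classifier T-REx uses, so the sketch substitutes a truncated higher-order SVD (a Tucker-type decomposition) and a least-squares linear classifier. The simulator, the function names, and all parameter values are hypothetical and chosen only to make the example runnable.

```python
import numpy as np


def simulate_images(n, n_hap=20, n_snp=30, sweep=False, rng=None):
    """Toy 0/1 haplotype images (assumption: not a population-genetic simulator).

    'Sweep' images have far fewer minor alleles in the central third of the SNPs,
    mimicking the loss of diversity around a selected site.
    """
    p = np.full(n_snp, 0.5)
    if sweep:
        p[n_snp // 3: 2 * n_snp // 3] = 0.05
    return (rng.random((n, n_hap, n_snp)) < p).astype(float)


def mode_basis(tensor, mode, rank):
    """Leading left singular vectors of the mode-`mode` unfolding of a 3-way tensor."""
    unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :rank]


def extract_features(images, u_hap, u_snp):
    """Project every image onto both bases and flatten the small core slice."""
    core = np.einsum("nhs,hi,sj->nij", images, u_hap, u_snp)
    return core.reshape(len(images), -1)


def fit_linear(features, labels):
    """Least-squares linear classifier on +/-1 targets (stand-in for any classical model)."""
    design = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(design, 2.0 * labels - 1.0, rcond=None)
    return weights


def predict(weights, features):
    design = np.column_stack([np.ones(len(features)), features])
    return (design @ weights > 0).astype(int)


rng = np.random.default_rng(0)

# Tensor layout: (images, haplotypes, SNPs); label 0 = neutral, 1 = sweep.
train = np.concatenate([simulate_images(100, rng=rng),
                        simulate_images(100, sweep=True, rng=rng)])
y_train = np.repeat([0, 1], 100)
test = np.concatenate([simulate_images(50, rng=rng),
                       simulate_images(50, sweep=True, rng=rng)])
y_test = np.repeat([0, 1], 50)

# Learn the haplotype-mode and SNP-mode bases from the training tensor only.
u_hap = mode_basis(train, mode=1, rank=2)
u_snp = mode_basis(train, mode=2, rank=2)

weights = fit_linear(extract_features(train, u_hap, u_snp), y_train)
test_features = extract_features(test, u_hap, u_snp)
accuracy = np.mean(predict(weights, test_features) == y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the bases are estimated once from the training tensor and then reused to project held-out images, each image is reduced to a handful of features (four here) rather than one parameter per pixel. Since every feature corresponds to a pair of haplotype-mode and SNP-mode vectors, the classifier weights can in principle be mapped back onto image coordinates to inspect which regions drive a prediction.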


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0592/10581699/6d9106909c48/msad216f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0592/10581699/9adef1be3696/msad216f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0592/10581699/40f0334e1562/msad216f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0592/10581699/912e81ed2cbe/msad216f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0592/10581699/176df4290824/msad216f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0592/10581699/ebdfe8b2916f/msad216f6.jpg


