Uncovering Footprints of Natural Selection Through Spectral Analysis of Genomic Summary Statistics.

Affiliation

Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA.

Publication information

Mol Biol Evol. 2023 Jul 5;40(7). doi: 10.1093/molbev/msad157.

DOI:10.1093/molbev/msad157
PMID:37433019
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10365025/
Abstract

Natural selection leaves a spatial pattern along the genome, with a haplotype distribution distortion near the selected locus that fades with distance. Evaluating the spatial signal of a population-genetic summary statistic across the genome allows for patterns of natural selection to be distinguished from neutrality. Considering the genomic spatial distribution of multiple summary statistics is expected to aid in uncovering subtle signatures of selection. In recent years, numerous methods have been devised that consider genomic spatial distributions across summary statistics, utilizing both classical machine learning and deep learning architectures. However, better predictions may be attainable by improving the way in which features are extracted from these summary statistics. We apply wavelet transform, multitaper spectral analysis, and S-transform to summary statistic arrays to achieve this goal. Each analysis method converts one-dimensional summary statistic arrays to two-dimensional images of spectral analysis, allowing simultaneous temporal and spectral assessment. We feed these images into convolutional neural networks and consider combining models using ensemble stacking. Our modeling framework achieves high accuracy and power across a diverse set of evolutionary settings, including population size changes and test sets of varying sweep strength, softness, and timing. A scan of central European whole-genome sequences recapitulated well-established sweep candidates and predicted novel cancer-associated genes as sweeps with high support. Given that this modeling framework is also robust to missing genomic segments, we believe that it will represent a welcome addition to the population-genomic toolkit for learning about adaptive processes from genomic data.
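
The step in which a one-dimensional summary statistic array becomes a two-dimensional spectral image can be illustrated with a continuous wavelet transform. The sketch below is not the authors' pipeline: the complex Morlet wavelet, the `omega0 = 6.0` setting, the chosen scales, the function names `morlet` and `scalogram`, and the synthetic input are all assumptions made for illustration, and the direct summation is only practical for short arrays. Each row of the resulting image corresponds to one scale and each column to one genomic window, which is the kind of grid a convolutional neural network could take as input.

```python
import cmath
import math


def morlet(t, scale, omega0=6.0):
    """Complex Morlet wavelet evaluated at offset t for a given scale."""
    x = t / scale
    norm = math.pi ** -0.25 / math.sqrt(scale)
    return norm * cmath.exp(1j * omega0 * x) * math.exp(-0.5 * x * x)


def scalogram(signal, scales, omega0=6.0):
    """Return a len(scales) x len(signal) grid of wavelet magnitudes."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]  # remove the constant offset
    image = []
    for s in scales:
        row = []
        for b in range(n):  # b: window position along the genome
            coeff = sum(
                centered[k] * morlet(k - b, s, omega0).conjugate()
                for k in range(n)
            )
            row.append(abs(coeff))
        image.append(row)
    return image


# Synthetic, illustrative input: a flat statistic with a dip centred on
# window 32, loosely mimicking reduced diversity around a selected locus.
stat = [1.0 - 0.8 * math.exp(-((i - 32) ** 2) / 50.0) for i in range(64)]
scales = [2.0, 4.0, 8.0, 16.0]  # hypothetical choice of scales
image = scalogram(stat, scales)
print(len(image), len(image[0]))  # 4 64
```

With several summary statistics, one such image could be computed per statistic and the images stacked as channels; the multitaper and S-transform variants named in the abstract would replace only the transform step.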


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a447/10365025/938ab894b11d/msad157f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a447/10365025/868ae69be54d/msad157f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a447/10365025/d96a030e47f7/msad157f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a447/10365025/2a05cc35f39a/msad157f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a447/10365025/f99dd6563e20/msad157f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a447/10365025/9bca74ec2770/msad157f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a447/10365025/a6e5816b23f1/msad157f7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a447/10365025/0a6ca76b0855/msad157f8.jpg
