Dory: Computation of persistence diagrams up to dimension two for Vietoris-Rips filtrations of large data sets.

Author information

Aggarwal Manu, Periwal Vipul

Affiliations

Laboratory of Biological Modeling, NIDDK, National Institutes of Health, 31 Center Dr, Bethesda, 20892, MD, United States.

Publication information

J Comput Sci. 2024 Jul;79. doi: 10.1016/j.jocs.2024.102290. Epub 2024 Apr 20.

Abstract

Persistent homology (PH) is an approach to topological data analysis (TDA) that computes multi-scale topologically invariant properties of high-dimensional data that are robust to noise. While PH has revealed useful patterns across various applications, computational requirements have limited applications to small data sets of a few thousand points. We present Dory, an efficient and scalable algorithm that can compute the persistent homology of sparse Vietoris-Rips complexes on larger data sets, up to and including dimension two and over the field ℤ₂. As an application, we compute the PH of the human genome at high resolution as revealed by a genome-wide Hi-C data set containing approximately three million points. Extant algorithms were unable to process it, whereas Dory processed it within five minutes, using less than five GB of memory. Results show that the topology of the human genome changes significantly upon treatment with auxin, a molecule that degrades cohesin, corroborating the hypothesis that cohesin plays a crucial role in loop formation in DNA.
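
To make the idea of a persistence diagram for a sparse Vietoris-Rips filtration concrete, the following is a minimal sketch of the simplest case only: dimension zero, computed with a union-find structure over edges no longer than a threshold. It is not Dory's algorithm, which the abstract does not describe; the function name `h0_persistence` and the parameter `max_edge_length` are hypothetical names chosen for illustration, and the brute-force pairwise distance loop would not scale to millions of points.

```python
from itertools import combinations
from math import dist, inf


def h0_persistence(points, max_edge_length):
    """Dimension-0 persistence pairs of a thresholded Vietoris-Rips filtration.

    Illustrative only: hypothetical helper, not the Dory implementation.
    """
    n = len(points)
    # Sparse edge list: keep only pairs within the threshold, sorted by length.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) <= max_edge_length
    )
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge: one of them (born at 0) dies at this length.
            parent[ri] = rj
            pairs.append((0.0, length))
    # Components that never merge below the threshold persist indefinitely.
    roots = {find(i) for i in range(n)}
    pairs.extend((0.0, inf) for _ in roots)
    return pairs


if __name__ == "__main__":
    print(h0_persistence([(0, 0), (1, 0), (5, 0)], max_edge_length=2.0))
```

Each returned pair is a (birth, death) point of the dimension-zero diagram. Dimensions one and two, which the abstract also covers, require reducing boundary matrices of triangles and tetrahedra and are not shown here.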
