Inferring couplings in networks across order-disorder phase transitions.

Author information

Ngampruetikorn Vudtiwat, Sachdeva Vedant, Torrence Johanna, Humplik Jan, Schwab David J, Palmer Stephanie E

Affiliations

Initiative for the Theoretical Sciences, The Graduate Center, CUNY, New York, New York 10016, USA.

Department of Organismal Biology and Anatomy and Department of Physics, University of Chicago, Chicago, Illinois 60637, USA.

Publication information

Phys Rev Res. 2022 Jun-Aug;4(2). doi: 10.1103/physrevresearch.4.023240. Epub 2022 Jun 24.

Abstract

Statistical inference is central to many scientific endeavors, yet how it works remains unresolved. Answering this requires a quantitative understanding of the intrinsic interplay between statistical models, inference methods, and the structure in the data. To this end, we characterize the efficacy of direct coupling analysis (DCA), a highly successful method for analyzing amino acid sequence data, in inferring pairwise interactions from samples of ferromagnetic Ising models on random graphs. Our approach allows for physically motivated exploration of qualitatively distinct data regimes separated by phase transitions. We show that inference quality depends strongly on the nature of data-generating distributions: optimal accuracy occurs at an intermediate temperature where the detrimental effects from macroscopic order and thermal noise are minimal. Importantly, our results indicate that DCA does not always outperform its local-statistics-based predecessors; while DCA excels at low temperatures, it becomes inferior to simple correlation thresholding at virtually all temperatures when data are limited. Our findings offer insights into the regime in which DCA operates so successfully, and more broadly, how inference interacts with the structure in the data.
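The comparison described in the abstract can be illustrated with a small numerical experiment. The following is a minimal sketch, not the authors' implementation: it assumes an Erdős–Rényi random graph, unit ferromagnetic couplings on every edge, single-spin-flip Metropolis sampling, and the naive mean-field form of DCA (couplings estimated as the negative inverse of the covariance matrix), since the abstract does not say which DCA variant was used. All function names, the `ridge` regularizer, and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def random_graph(n, p, rng):
    """Symmetric 0/1 adjacency matrix of an Erdos-Renyi graph, no self-loops."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

def sample_ising(adj, beta, n_samples, rng, burn_in=200, thin=5):
    """Metropolis samples of E(s) = -sum_{i<j} adj[i, j] * s_i * s_j at inverse temperature beta."""
    n = adj.shape[0]
    s = rng.choice([-1.0, 1.0], size=n)
    samples = []
    for sweep in range(burn_in + n_samples * thin):
        for i in rng.integers(0, n, size=n):
            delta_e = 2.0 * s[i] * (adj[i] @ s)  # energy change if spin i flips
            if delta_e <= 0 or rng.random() < np.exp(-beta * delta_e):
                s[i] = -s[i]
        if sweep >= burn_in and (sweep - burn_in) % thin == 0:
            samples.append(s.copy())
    return np.array(samples)

def correlation_scores(samples):
    """Local-statistics baseline: score each pair by its absolute covariance."""
    return np.abs(np.cov(samples, rowvar=False))

def mean_field_dca_scores(samples, ridge=0.1):
    """Naive mean-field DCA (an assumed variant): J = -inverse(covariance).

    The ridge term is an assumption added here to keep the matrix invertible
    when the samples are strongly ordered or few in number.
    """
    c = np.cov(samples, rowvar=False)
    c = c + ridge * np.eye(c.shape[0])
    return -np.linalg.inv(c)

def edge_recovery(scores, adj):
    """Fraction of the k highest-scoring pairs that are true edges (k = number of true edges)."""
    iu = np.triu_indices_from(adj, k=1)
    truth = adj[iu] > 0
    k = int(truth.sum())
    if k == 0:
        return float("nan")
    top = np.argsort(scores[iu])[::-1][:k]
    return float(truth[top].mean())

def demo(n=20, p=0.15, n_samples=300, seed=0):
    """Print edge recovery for both methods at a few inverse temperatures."""
    rng = np.random.default_rng(seed)
    adj = random_graph(n, p, rng)
    for beta in (0.1, 0.3, 0.6, 1.0):
        samples = sample_ising(adj, beta, n_samples, rng)
        corr = edge_recovery(correlation_scores(samples), adj)
        dca = edge_recovery(mean_field_dca_scores(samples), adj)
        print(f"beta={beta:.1f}  correlation={corr:.2f}  mean-field DCA={dca:.2f}")

if __name__ == "__main__":
    demo()
```

Varying `beta` and `n_samples` in `demo` is one way to probe the regimes the abstract describes (ordered, disordered, and data-limited). The numbers from such a toy run depend on the graph, the sampler settings, and the regularizer, so they should not be read as a reproduction of the paper's results.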

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c2e/10421637/8609aa2305fd/nihms-1918669-f0001.jpg
