Using Dendritic Heat Maps to Simultaneously Display Genotype Divergence with Phenotype Divergence.

Author information

Kellom Matthew, Raymond Jason

Affiliation

School of Earth and Space Exploration, Arizona State University, Tempe, Arizona, United States of America.

Publication information

PLoS One. 2016 Aug 18;11(8):e0161292. doi: 10.1371/journal.pone.0161292. eCollection 2016.

Abstract

The advancement of techniques to visualize and analyze large-scale sequencing datasets is an area of active research and is rooted in traditional techniques such as heat maps and dendrograms. We introduce dendritic heat maps that display heat map results over aligned DNA sequence clusters for a range of clustering cutoffs. Dendritic heat maps aid in visualizing the effects of group differences on clustering hierarchy and relative abundance of sampled sequences. Here, we artificially generate two separate datasets with simplified mutation and population growth procedures, using GC content group separation as an example phenotype. In this work, we use the term phenotype to represent any feature by which groups can be separated. These sequences were clustered in a fractional identity range of 0.75 to 1.0 using agglomerative minimum-, maximum-, and average-linkage algorithms, as well as a divisive centroid-based algorithm. We demonstrate that dendritic heat maps give freedom to scrutinize specific clustering levels across a range of cutoffs, track changes in phenotype inequity across multiple levels of sequence clustering specificity, and easily visualize how deeply rooted changes in phenotype inequity are in a dataset. As genotypes diverge in sample populations, clusters are shown to break apart into smaller clusters at higher identity cutoff levels, similar to a dendrogram. Phenotype divergence, which is shown as a heat map of relative abundance bin response, may or may not follow genotype divergences. This joined view highlights the relationship between genotype and phenotype divergence for treatment groups. We discuss the minimum-, maximum-, average-, and centroid-linkage algorithm approaches to building dendritic heat maps and make a case for the divisive "top-down" centroid-based clustering methodology as being the best option to visualize the effects of changing factors on clustering hierarchy and relative abundance.
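To make the idea of clustering across a range of identity cutoffs concrete, the following is a minimal sketch and not the authors' implementation. It assumes pre-aligned, equal-length sequences, uses a simple single-linkage rule (assumed here to correspond to the abstract's minimum-linkage option), and treats "relative abundance" as the fraction of a cluster's members that belong to each group. The function names, the toy sequences, and the group labels are all hypothetical.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def single_linkage_clusters(seqs, cutoff):
    """Join any two sequences whose identity is >= cutoff (single linkage)."""
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if identity(seqs[i], seqs[j]) >= cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def group_fractions(cluster, groups):
    """Share of a cluster's members in each group (assumed abundance measure)."""
    counts = Counter(groups[i] for i in cluster)
    return {g: counts[g] / len(cluster) for g in sorted(set(groups))}


# Hypothetical toy data: five aligned sequences from two groups.
seqs = ["AAAAAAAA", "AAAAAAAT", "AAAAATTT", "GGGGGGGG", "GGGGGGGC"]
groups = ["g1", "g1", "g2", "g2", "g2"]

for cutoff in (0.75, 0.875, 1.0):
    for cluster in single_linkage_clusters(seqs, cutoff):
        print(cutoff, cluster, group_fractions(cluster, groups))
```

Reading the rows from the lowest cutoff to the highest shows clusters breaking apart, much like descending a dendrogram, while the per-group fractions play the role of the heat map cells. A real analysis would bin these fractions and would likely use an established clustering library rather than this quadratic loop.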

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2476/4990276/c09dc84e695f/pone.0161292.g007.jpg
