Ferles Christos, Beaufort William-Scott, Ferle Vanessa
Scientific Computing Research Unit and Department of Chemistry, University of Cape Town, Rondebosch, Cape Town, South Africa.
Institute of Marine Biology and Genetics, Center for Marine Research, East Attica, Greece.
Methods Mol Biol. 2017;1552:83-101. doi: 10.1007/978-1-4939-6753-7_6.
This study devises mapping and projection techniques for visualizing the results of biological sequence data clustering. The Sequence Data Density Display (SDDD) and Sequence Likelihood Projection (SLP) visualizations represent the input symbolic sequences in a lower-dimensional space so that the clusters and the relations among data elements are depicted graphically. Both operate in combination with the Self-Organizing Hidden Markov Model Map (SOHMMM). The resulting unified framework can analyze raw sequence data automatically and directly, with little or no prior information or domain knowledge.
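The general idea of placing symbolic sequences in a low-dimensional space through model likelihoods can be illustrated with a minimal sketch. It is not the SDDD or SLP procedure of the chapter, whose details the abstract does not give, and it does not implement the SOHMMM. It assumes two small, hand-written hidden Markov models (the hypothetical `AT_RICH` and `GC_RICH` parameter sets below) and uses each sequence's per-symbol log-likelihood under each model, computed with the standard scaled forward algorithm, as its coordinates.

```python
import math

ALPHABET = "ACGT"

# Hypothetical two-state HMMs, invented for illustration only.
# Each model is (start probabilities, transition matrix, emission matrix),
# with emission columns ordered as in ALPHABET.
AT_RICH = (
    [0.5, 0.5],
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.4, 0.1, 0.1, 0.4], [0.3, 0.2, 0.2, 0.3]],
)
GC_RICH = (
    [0.5, 0.5],
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.1, 0.4, 0.4, 0.1], [0.2, 0.3, 0.3, 0.2]],
)


def log_likelihood(seq, start, trans, emit):
    """Return log P(seq | HMM) using the scaled forward algorithm."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    n_states = len(start)
    idx = ALPHABET.index(seq[0])
    # Initialisation: start probability times emission of the first symbol.
    alpha = [start[s] * emit[s][idx] for s in range(n_states)]
    total = sum(alpha)
    log_p = math.log(total)
    alpha = [a / total for a in alpha]  # rescale to avoid underflow
    for symbol in seq[1:]:
        idx = ALPHABET.index(symbol)
        # Recursion: sum over predecessor states, then emit the symbol.
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][idx]
            for s in range(n_states)
        ]
        total = sum(alpha)
        log_p += math.log(total)
        alpha = [a / total for a in alpha]
    return log_p


def project(seq, models=(AT_RICH, GC_RICH)):
    """Map a sequence to one coordinate per model.

    Each coordinate is the log-likelihood divided by the sequence length,
    so sequences of different lengths remain comparable.
    """
    return tuple(log_likelihood(seq, *m) / len(seq) for m in models)


if __name__ == "__main__":
    for s in ["ATATATTA", "GCGCGGCC", "ACGTACGT"]:
        print(s, project(s))
```

With these assumed models, AT-rich sequences receive a larger first coordinate and GC-rich sequences a larger second one, so plotting the coordinate pairs separates the two groups. In a setting like the one the abstract describes, the models would come from the trained map rather than being fixed by hand.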