LASIGE, Faculdade de Ciências, Universidade de Lisboa, Lisbon, Portugal.
BMC Bioinformatics. 2022 May 23;23(1):192. doi: 10.1186/s12859-022-04733-8.
The effectiveness of biclustering, the simultaneous clustering of rows and columns in a data matrix, has been shown in gene expression data analysis. Several researchers recognize its potential in other research areas. Nevertheless, the last two decades have witnessed the development of a significant number of biclustering algorithms targeting gene expression data analysis and a lack of consistent studies exploring the capacities of biclustering outside this traditional application domain.
This work evaluates the potential use of biclustering in fMRI time series data, targeting the Region × Time dimensions, by comparing seven state-of-the-art biclustering algorithms and three traditional clustering algorithms on artificial and real data. It further proposes a methodology for biclustering evaluation beyond gene expression data analysis. The results, which cover different search strategies on both artificial and real fMRI time series, showed the superiority of exhaustive biclustering approaches, which obtained the most homogeneous biclusters. However, their high computational cost is a challenge, and further work is needed for the efficient use of biclustering in fMRI data analysis.
This work pinpoints avenues for the use of biclustering in spatio-temporal data analysis, in particular in neuroscience applications. The proposed evaluation methodology provided evidence of the effectiveness of biclustering in finding local patterns in fMRI time series data. Further work on scalability is needed to promote its application in real scenarios.
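To make the notion of a "homogeneous" bicluster concrete, the following minimal sketch scores a subset of rows (regions) and columns (time points) with the mean squared residue of Cheng and Church, a homogeneity measure commonly used in the biclustering literature. This is an illustrative assumption: the abstract does not state which homogeneity metric the study used, and the function name `mean_squared_residue` and the toy matrix values are hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster given by row and column indices.

    Lower values mean a more homogeneous (additively coherent) bicluster.
    """
    # Extract the submatrix selected by the bicluster.
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)

    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Residue: deviation of a cell from the additive row + column model.
            residue = sub[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy Region x Time matrix (hypothetical values, not from the study):
# rows are brain regions, columns are time points.
signal = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [3.0, 4.0, 5.0, 7.0],
    [8.0, 0.0, 6.0, 2.0],
]

# Regions 0-2 follow the same shifted trend over time points 0-2.
coherent = mean_squared_residue(signal, rows=[0, 1, 2], cols=[0, 1, 2])
# The full matrix mixes that local pattern with unrelated values.
whole = mean_squared_residue(signal, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])

print(f"local bicluster MSR: {coherent:.3f}")
print(f"whole matrix MSR:    {whole:.3f}")
```

The local bicluster scores lower than the whole matrix, which is the kind of local pattern that clustering along a single dimension can miss and that biclustering is designed to recover.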