Wang Yue, Chen Kunqi, Wei Zhen, Coenen Frans, Su Jionglong, Meng Jia
Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Department of Computer Science, University of Liverpool, Liverpool L69 7ZB, UK.
Bioinformatics. 2021 Jun 9;37(9):1285-1291. doi: 10.1093/bioinformatics/btaa938.
The distribution of biological features strongly indicates their functional relevance. Compared with DNA-related features, deciphering the distribution of mRNA-related features is non-trivial because of isoform ambiguity and the compositional diversity of mRNAs.
We propose a rigorous statistical framework, MetaTX, for deciphering the distribution of mRNA-related features. Using a standardized mRNA model, MetaTX first unifies mRNA transcripts of diverse compositions, and then corrects for isoform ambiguity by incorporating the overall distribution pattern of the features through an expectation-maximization (EM) algorithm. MetaTX was tested on both simulated and real data. The results suggest that MetaTX substantially outperformed existing direct methods on simulated datasets, and that it produced a more informative distribution pattern for all three datasets tested, which contain N6-methyladenosine sites generated by different technologies. MetaTX should be a useful tool for studying the distribution and functions of mRNA-related biological features, especially mRNA modifications such as N6-methyladenosine.
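As an illustration of the general idea only, the following minimal Python sketch shows how an EM loop can resolve isoform ambiguity using a shared positional distribution. It is not the MetaTX implementation (MetaTX is an R package), and everything in it is an assumption made for the example: the function name `em_bin_distribution`, the representation of the standardized mRNA model as a fixed number of positional bins, the input format (for each feature, the bin it would occupy on each candidate isoform), and the small smoothing constant `eps`.

```python
def em_bin_distribution(candidates, n_bins, n_iter=50, eps=1e-9):
    """Estimate a positional distribution over bins with a simple EM loop.

    candidates[i] lists the bin index that feature i would occupy on each
    of its candidate isoforms (hypothetical input format for this sketch).
    """
    n = len(candidates)
    # Start from a uniform distribution over the bins.
    pi = [1.0 / n_bins] * n_bins
    for _ in range(n_iter):
        counts = [0.0] * n_bins
        # E-step: split each feature across its candidate isoforms in
        # proportion to the current probability of the implied bin.
        for bins in candidates:
            weights = [pi[b] for b in bins]
            total = sum(weights)
            for b, w in zip(bins, weights):
                counts[b] += w / total
        # M-step: re-estimate bin probabilities from the fractional counts.
        # eps is an illustrative smoothing term that keeps every bin positive.
        pi = [(c + eps) / (n + eps * n_bins) for c in counts]
    return pi


# Toy input: three bins standing in for 5'UTR, CDS and 3'UTR.
toy = [[0], [0], [0], [2], [0, 2], [0, 1]]
pi = em_bin_distribution(toy, n_bins=3)
```

In this toy input, the features with a single candidate isoform anchor the estimate, so the two ambiguous features are assigned mostly to the bin that the unambiguous features already favor.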
The MetaTX R package is freely available at GitHub: https://github.com/yue-wang-biomath/MetaTX.1.0.
Supplementary data are available at Bioinformatics online.