Computational Systems Biology Lab, Department of Bioinformatics, Shantou University Medical College (SUMC), No.22, Rd. Xinling, Shantou, China.
Guangdong Provincial Key Laboratory for Breast Cancer Diagnosis and Treatment, Cancer Hospital, Shantou University Medical College (SUMC), Shantou, 515041, China.
BMC Bioinformatics. 2019 Nov 8;20(1):554. doi: 10.1186/s12859-019-3171-0.
BACKGROUND: Advances in high-throughput technologies have produced large numbers of multi-omics datasets. Initial analyses of these data have revealed many concurrent gene alterations within single datasets and/or across multiple omics datasets. Although powerful bioinformatics pipelines have been developed to store, manipulate and analyze these data, few explicitly identify and assess recurrent co-occurring aberrations across multiple regulation levels.
RESULTS: Here, we introduce a novel R package, OmicsARules, to identify concerted changes among genes within an association rule mining framework. OmicsARules embeds a new rule-interestingness measure, Lamda3, to evaluate associated patterns and prioritize the most biologically meaningful gene associations. As demonstrated with DNA methylation and RNA-seq datasets from breast invasive carcinoma (BRCA), esophageal carcinoma (ESCA) and lung adenocarcinoma (LUAD), Lamda3 achieved better biological significance than other rule-ranking measures. Furthermore, OmicsARules can illustrate the mechanistic connections between methylation and transcription based on a combined omics dataset. OmicsARules is available as a free and open-source R package.
CONCLUSIONS: OmicsARules searches for concurrent patterns among frequently altered genes and thus provides a new dimension for exploring single or multiple omics datasets across sequencing platforms.
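The association rule mining idea described in the abstract can be illustrated with a minimal Python sketch. It is not the OmicsARules implementation or API: the package itself is written in R, the toy data and the function names `support` and `pairwise_rules` are hypothetical, and the thresholds are arbitrary. The abstract does not define Lamda3, so the sketch ranks rules by lift, a standard interestingness measure, purely as a stand-in.

```python
from itertools import permutations

# Hypothetical toy data: each set lists the genes flagged as altered
# (e.g. after binarizing methylation or expression values) in one sample.
samples = [
    {"GENE_A", "GENE_B", "GENE_C"},
    {"GENE_A", "GENE_B"},
    {"GENE_A", "GENE_C"},
    {"GENE_B", "GENE_C"},
    {"GENE_A", "GENE_B"},
    {"GENE_D"},
]


def support(itemset, transactions):
    """Fraction of samples in which every gene of `itemset` is altered."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pairwise_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Enumerate single-gene rules lhs -> rhs that pass both thresholds.

    Rules are ranked by lift (a standard measure used here as a stand-in;
    Lamda3 is not defined in the abstract and is not reproduced).
    """
    genes = sorted(set().union(*transactions))
    rules = []
    for lhs, rhs in permutations(genes, 2):
        s_both = support({lhs, rhs}, transactions)
        if s_both < min_support:
            continue  # the co-alteration is too rare to be of interest
        confidence = s_both / support({lhs}, transactions)
        if confidence < min_confidence:
            continue  # rhs is not altered often enough when lhs is
        lift = confidence / support({rhs}, transactions)
        rules.append({
            "lhs": lhs,
            "rhs": rhs,
            "support": s_both,
            "confidence": confidence,
            "lift": lift,
        })
    # Python's sort is stable, so ties keep their enumeration order.
    rules.sort(key=lambda r: r["lift"], reverse=True)
    return rules


rules = pairwise_rules(samples)
for r in rules:
    print(f'{r["lhs"]} -> {r["rhs"]}: support={r["support"]:.2f}, '
          f'confidence={r["confidence"]:.2f}, lift={r["lift"]:.2f}')
```

A lift above 1 indicates that two genes are altered together more often than would be expected if their alterations were independent. For a combined dataset, one could prefix each item with its omics layer (for example a methylation item and an expression item for the same gene) so that rules can span regulation levels; that encoding is likewise an assumption, not a description of the package.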