A group LASSO-based method for robustly inferring gene regulatory networks from multiple time-course datasets.

Author information

Liu Li-Zhi, Wu Fang-Xiang, Zhang Wen-Jun

Publication information

BMC Syst Biol. 2014;8 Suppl 3(Suppl 3):S1. doi: 10.1186/1752-0509-8-S3-S1. Epub 2014 Oct 22.

Abstract

BACKGROUND

As an abstract mapping of gene regulation in the cell, a gene regulatory network is important to both biological research and practical applications. Reverse engineering gene regulatory networks from microarray gene expression data is a challenging research problem in systems biology. With the development of biological technologies, multiple time-course gene expression datasets may be collected for a specific gene network under different circumstances. Integrating these datasets can improve the inference of the network. Gene expression data may also be contaminated with large errors or outliers, which can affect the inference results.

RESULTS

A novel method, Huber group LASSO, is proposed to infer a single underlying network topology from multiple time-course gene expression datasets while remaining robust to large errors and outliers. To solve the resulting optimization problem, an efficient algorithm that combines auxiliary function minimization and block descent is developed. A stability selection method is adapted to the proposed method to find a network topology whose edges carry scores. The proposed method is applied to both simulated and real experimental datasets. The results show that Huber group LASSO outperforms group LASSO in terms of both the area under the receiver operating characteristic curve and the area under the precision-recall curve.
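
The abstract does not give the objective function, so the following is only a minimal sketch of what a Huber-loss objective with a group LASSO penalty can look like. It assumes a linear regression model per target gene, with one coefficient vector per dataset, and groups the coefficients of the same candidate edge across datasets. The function names (`huber`, `group_penalty`, `objective`) and the parameters `lam` and `delta` are illustrative assumptions, not the authors' notation or implementation.

```python
import numpy as np

def huber(r, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def group_penalty(W):
    """Sum, over candidate edges, of the L2 norm of that edge's
    coefficients across datasets.

    W has shape (n_datasets, n_regulators): row k holds the coefficients
    fitted on dataset k, column j is the group for regulator j.
    """
    return np.sum(np.linalg.norm(W, axis=0))

def objective(W, X_list, y_list, lam=0.1, delta=1.0):
    """Huber data-fit term summed over datasets plus the group penalty.

    Assumed (hypothetical) model: y_k is approximately X_k @ W[k]
    for each dataset k.
    """
    loss = sum(huber(y - X @ w, delta).sum()
               for w, X, y in zip(W, X_list, y_list))
    return loss + lam * group_penalty(W)
```

Because the penalty acts on whole columns of `W`, a group is driven to zero in all datasets at once, which is how this kind of penalty encourages a shared topology. The linear tail of the Huber loss limits the influence of a single large residual compared with a squared-error loss.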

CONCLUSIONS

A convergence analysis shows theoretically that the sequence generated by the algorithm converges to the optimal solution of the optimization problem. The simulation and real data examples demonstrate the effectiveness of Huber group LASSO in integrating multiple time-course gene expression datasets and in improving resistance to large errors or outliers.
