Investigating the predictability of essential genes across distantly related organisms using an integrative approach.

Affiliation

Division of Biomedical Informatics, Cincinnati Children's Hospital Research Foundation, Cincinnati, OH 45229, USA.

Publication information

Nucleic Acids Res. 2011 Feb;39(3):795-807. doi: 10.1093/nar/gkq784. Epub 2010 Sep 24.

Abstract

Rapid and accurate identification of new essential genes in under-studied microorganisms will significantly improve our understanding of how a cell works and the ability to re-engineer microorganisms. However, predicting essential genes across distantly related organisms remains a challenge. Here, we present a machine learning-based integrative approach that reliably transfers essential gene annotations between distantly related bacteria. We focused on four bacterial species that have well-characterized essential genes, and tested the transferability between three pairs among them. For each pair, we trained our classifier to learn traits associated with essential genes in one organism, and applied it to make predictions in the other. The predictions were then evaluated by examining the agreements with the known essential genes in the target organism. Ten-fold cross-validation in the same organism yielded AUC scores between 0.86 and 0.93. Cross-organism predictions yielded AUC scores between 0.69 and 0.89. The transferability is likely affected by growth conditions, quality of the training data set and the evolutionary distance. We are thus the first to report that gene essentiality can be reliably predicted using features trained and tested in a distantly related organism. Our approach proves more robust and portable than existing approaches, significantly extending our ability to predict essential genes beyond orthologs.
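The evaluation protocol described in the abstract (train on one organism, score genes in another, and measure agreement with known essential genes by AUC) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the feature vectors are made up, and `train_centroid_scorer` is a hypothetical toy stand-in, not the classifier used in the paper. Only the AUC calculation, done via the Mann-Whitney statistic, is a standard technique.

```python
from typing import Callable, List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (essential, non-essential) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def train_centroid_scorer(
    X: List[List[float]], y: List[int]
) -> Callable[[List[float]], float]:
    """Hypothetical toy model (NOT the paper's classifier): score a gene by
    projecting its features onto (mean essential - mean non-essential)."""
    dim = len(X[0])

    def mean(rows: List[List[float]]) -> List[float]:
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    mu_pos = mean([x for x, lab in zip(X, y) if lab == 1])
    mu_neg = mean([x for x, lab in zip(X, y) if lab == 0])
    w = [a - b for a, b in zip(mu_pos, mu_neg)]
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))


# Made-up data: each row is one gene with two illustrative features;
# label 1 = essential, 0 = non-essential.
source_X = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
source_y = [1, 1, 0, 0]
target_X = [[0.7, 0.9], [0.3, 0.2], [0.6, 0.4], [0.1, 0.1]]
target_y = [1, 0, 1, 0]

# Train on the source organism, then score genes in the target organism.
scorer = train_centroid_scorer(source_X, source_y)
target_scores = [scorer(x) for x in target_X]

# Compare predictions with the known essential genes of the target.
cross_auc = auc(target_y, target_scores)
print(f"cross-organism AUC: {cross_auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the reported ranges of 0.86 to 0.93 within an organism and 0.69 to 0.89 across organisms.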

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b41/3035443/d943cbd1caae/gkq784f1.jpg
