Sequence comparison and essential gene identification with new inter-nucleotide distance sequences.

Authors

Li Yushuang, Lv Yanfen, Li Xiaonan, Xiao Wenli, Li Chun

Affiliation

School of Science, Yanshan University, Qinhuangdao 066004, PR China.

Publication information

J Theor Biol. 2017 Apr 7;418:84-93. doi: 10.1016/j.jtbi.2017.01.031. Epub 2017 Jan 27.

DOI:10.1016/j.jtbi.2017.01.031
PMID:28137599
Abstract

Four new inter-nucleotide distance sequences for a DNA sequence are defined. They differ from those presented by Afreixo et al., and they overcome the irreversibility defect of the global inter-nucleotide distance sequence proposed by Nair and Mahalakshmi. Five basic statistical quantities are extracted from the (ordered) precise inter-nucleotide distance sequences to construct a 20-dimensional feature vector. This simple mathematical descriptor of a DNA sequence plays a crucial role in sequence comparison and essential gene identification. The Euclidean distance between feature vectors is used to compare similarities among the whole mitochondrial genomes of 18 eutherian mammals and among 23 16S ribosomal RNA sequences, respectively. The derived phylogenetic trees are in good agreement with several popular studies. Furthermore, using the feature vector as input, a support vector machine (SVM)-based method is developed to distinguish essential from non-essential genes in 5 bacteria. The AUC values (minimum 0.7971, maximum 0.8751, average 0.8174) are higher than some well-known results, which supports the performance of the method.
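
The pipeline described in the abstract (distance sequences, then summary statistics, then a 20-dimensional vector compared by Euclidean distance) can be illustrated with a minimal Python sketch. The abstract does not give the definitions of the four new distance sequences or name the five statistics, so this sketch makes two loudly labeled assumptions: it uses the classic per-nucleotide distance (the gap between consecutive occurrences of the same base), and it picks count, minimum, maximum, mean, and population standard deviation as the five quantities. The function names are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pstdev


def inter_nucleotide_distances(seq, base):
    """Gaps between consecutive occurrences of `base` in `seq`.

    ASSUMPTION: this is the classic per-nucleotide distance, used here
    only as a stand-in for the paper's four new distance sequences.
    """
    positions = [i for i, ch in enumerate(seq.upper()) if ch == base]
    return [b - a for a, b in zip(positions, positions[1:])]


def five_stats(values):
    """Five summary statistics of a distance sequence.

    ASSUMPTION: the abstract does not name the five quantities; count,
    min, max, mean and population standard deviation are illustrative.
    """
    if not values:
        return [0.0] * 5
    return [
        float(len(values)),
        float(min(values)),
        float(max(values)),
        float(mean(values)),
        float(pstdev(values)),
    ]


def feature_vector(seq):
    """Concatenate five statistics for each of A, C, G, T (4 x 5 = 20)."""
    vec = []
    for base in "ACGT":
        vec.extend(five_stats(inter_nucleotide_distances(seq, base)))
    return vec


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.dist(u, v)


if __name__ == "__main__":
    a = feature_vector("ACGTACGTTGCA")
    b = feature_vector("ACGTTTGACCGA")
    print(len(a), euclidean(a, b))
```

A pairwise matrix of such distances could then be passed to a tree-building method, and the same vectors could serve as classifier input, in the spirit of the comparison and SVM steps the abstract describes.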

Similar articles

1. Sequence comparison and essential gene identification with new inter-nucleotide distance sequences.
J Theor Biol. 2017 Apr 7;418:84-93. doi: 10.1016/j.jtbi.2017.01.031. Epub 2017 Jan 27.
2. RibAlign: a software tool and database for eubacterial phylogeny based on concatenated ribosomal protein subunits.
BMC Bioinformatics. 2006 Feb 13;7:66. doi: 10.1186/1471-2105-7-66.
3. Evaluation of different partial 16S rRNA gene sequence regions for phylogenetic analysis of microbiomes.
J Microbiol Methods. 2011 Jan;84(1):81-7. doi: 10.1016/j.mimet.2010.10.020. Epub 2010 Oct 31.
4. 16S rRNA gene sequencing for bacterial pathogen identification in the clinical laboratory.
Mol Diagn. 2001 Dec;6(4):313-21. doi: 10.1054/modi.2001.29158.
5. [Determination of the taxonomic position of bacteria from Lake Baikal using sequence analysis of 16S rRNA fragments].
Mikrobiologiia. 1996 Nov-Dec;65(6):855-64.
6. repRNA: a web server for generating various feature vectors of RNA sequences.
Mol Genet Genomics. 2016 Feb;291(1):473-81. doi: 10.1007/s00438-015-1078-7. Epub 2015 Jun 18.
7. 16S rDNA-based identification of bacteria from conjunctival swabs by PCR and DGGE fingerprinting.
Invest Ophthalmol Vis Sci. 2001 May;42(6):1164-71.
8. Sequence heterogeneities among 16S ribosomal RNA sequences, and their effect on phylogenetic analyses at the species level.
Mol Biol Evol. 1996 Mar;13(3):451-61. doi: 10.1093/oxfordjournals.molbev.a025606.
9. Widespread occurrence of a novel division of bacteria identified by 16S rRNA gene sequences originally found in deep marine sediments.
Appl Environ Microbiol. 2004 Sep;70(9):5708-13. doi: 10.1128/AEM.70.9.5708-5713.2004.
10. Translation initiation modeling and mutational analysis based on the 3'-end of the Escherichia coli 16S rRNA sequence.
Biosystems. 2009 Apr;96(1):58-64. doi: 10.1016/j.biosystems.2008.11.008. Epub 2008 Nov 25.

Cited by

1. Evaluation of machine learning classifiers for predicting essential genes in strains.
Bioinformation. 2022 Dec 31;18(12):1126-1130. doi: 10.6026/973206300181126. eCollection 2022.
2. Sequence-based information-theoretic features for gene essentiality prediction.
BMC Bioinformatics. 2017 Nov 9;18(1):473. doi: 10.1186/s12859-017-1884-5.