Effective statistical features for coding and non-coding DNA sequence classification for yeast, C. elegans and human.

Authors

Liew Alan Wee-Chung, Wu Yonghui, Yan Hong, Yang Mengsu

Publication information

Int J Bioinform Res Appl. 2005;1(2):181-201. doi: 10.1504/IJBRA.2005.007577.

DOI: 10.1504/IJBRA.2005.007577
PMID: 18048129

Abstract

This study performs a quantitative evaluation of the different coding features in terms of their information content for the classification of coding and non-coding regions for three species. Our study indicated that coding features that are effective for yeast or C. elegans are generally not very effective for human, which has a short average exon length. By performing a correlation analysis, we identified a subset of human coding features with high discriminative power, but complementary in their information content. For this subset, a classification accuracy of up to 90% was obtained using a simple kNN classifier.
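
The abstract does not name the individual features or say exactly how the correlation analysis was used, so the following is only a minimal sketch of the general idea under stated assumptions: features are ranked by some discriminative score, a feature is kept only if its absolute Pearson correlation with every already-kept feature stays below a threshold, and sequences are then labelled by a majority-vote kNN using Euclidean distance. The function names, the 0.8 threshold, `k=3`, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_complementary(columns, scores, max_abs_corr=0.8):
    """Greedy filter (an assumed procedure, not the paper's).

    Visit features from highest to lowest discriminative score and keep a
    feature only if it is weakly correlated with every feature kept so far.
    """
    order = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(abs(pearson(columns[i], columns[j])) < max_abs_corr for j in kept):
            kept.append(i)
    return kept


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


# Toy feature columns (one list per feature, one value per sequence).
# Feature 1 is a rescaled copy of feature 0, so it adds no new information.
columns = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 0, 1, 0]]
scores = [0.9, 0.8, 0.5]  # hypothetical discriminative scores
kept = select_complementary(columns, scores)

# Toy training set already restricted to two selected features.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
train_y = ["noncoding", "noncoding", "coding", "coding"]
predicted = knn_predict(train_X, train_y, (0.85, 0.85), k=3)

print(kept)       # [0, 2]
print(predicted)  # coding
```

In this toy run the redundant feature 1 is dropped even though its score is higher than that of feature 2, which is the sense in which a selected subset can be discriminative and still complementary.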

Similar articles

1. Effective statistical features for coding and non-coding DNA sequence classification for yeast, C. elegans and human.
   Int J Bioinform Res Appl. 2005;1(2):181-201. doi: 10.1504/IJBRA.2005.007577.
2. Discrete Ramanujan transform for distinguishing the protein coding regions from other regions.
   Mol Cell Probes. 2014 Oct-Dec;28(5-6):228-36. doi: 10.1016/j.mcp.2014.04.002. Epub 2014 Apr 29.
3. Conservation of function and expression of unc-119 from two Caenorhabditis species despite divergence of non-coding DNA.
   Gene. 1996 Dec 12;183(1-2):77-85. doi: 10.1016/s0378-1119(96)00491-x.
4. Determination of eukaryotic protein coding regions using neural networks and information theory.
   J Mol Biol. 1992 Jul 20;226(2):471-9. doi: 10.1016/0022-2836(92)90961-i.
5. Segmentation of short human exons based on spectral features of double curves.
   Int J Data Min Bioinform. 2008;2(1):15-35. doi: 10.1504/ijdmb.2008.016754.
6. Conservation of sequence and function of the pag-3 genes from C. elegans and C. briggsae.
   Gene. 2000 Feb 8;243(1-2):67-74. doi: 10.1016/s0378-1119(99)00560-0.
7. Cloning and characterization of the C. elegans histidyl-tRNA synthetase gene.
   Nucleic Acids Res. 1993 Sep 11;21(18):4344-7. doi: 10.1093/nar/21.18.4344.
8. Heterogeneity of mRNA coding for Caenorhabditis elegans coronin-like protein.
   Gene. 2001 Jun 27;271(2):255-9. doi: 10.1016/s0378-1119(01)00509-1.
9. Prediction of protein coding regions by the 3-base periodicity analysis of a DNA sequence.
   J Theor Biol. 2007 Aug 21;247(4):687-94. doi: 10.1016/j.jtbi.2007.03.038. Epub 2007 Apr 10.
10. Transduplication resulted in the incorporation of two protein-coding sequences into the turmoil-1 transposable element of C. elegans.
    Biol Direct. 2008 Oct 8;3:41. doi: 10.1186/1745-6150-3-41.

Cited by

1. STR-based feature extraction and selection for genetic feature discovery in neurological disease genes.
   Sci Rep. 2023 Feb 11;13(1):2480. doi: 10.1038/s41598-023-29376-4.
2. LPS-induced galectin-3 oligomerization results in enhancement of neutrophil activation.
   PLoS One. 2011;6(10):e26004. doi: 10.1371/journal.pone.0026004. Epub 2011 Oct 21.
3. Multi-scale parametric spectral analysis for exon detection in DNA sequences based on forward-backward linear prediction and singular value decomposition of the double-base curves.
   Bioinformation. 2008 Feb 12;2(7):273-8. doi: 10.6026/97320630002273.
4. On relationship of Z-curve and Fourier approaches for DNA coding sequence classification.
   Bioinformation. 2006 Nov 14;1(7):242-6. doi: 10.6026/97320630001242.