Identification of polymorphic motifs using probabilistic search algorithms.

Authors

Basu Analabha, Chaudhuri Probal, Majumder Partha P

Affiliation

Human Genetics Unit, Indian Statistical Institute, Kolkata, 700108 India.

Publication

Genome Res. 2005 Jan;15(1):67-77. doi: 10.1101/gr.2358005.

DOI: 10.1101/gr.2358005
PMID: 15632091
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC540278/

Abstract

The problem of identifying motifs comprising nucleotides at a set of polymorphic DNA sites, not necessarily contiguous, arises in many human genetic problems. However, when the sites are not contiguous, no efficient algorithm exists for polymorphic motif identification. A search based on complete enumeration is computationally inefficient. We have developed probabilistic search algorithms to discover motifs of known or unknown lengths. We have developed statistical tests of significance for assessing a motif discovery, and a statistical criterion for simultaneously estimating motif length and discovering it. We have tested these algorithms on various synthetic data sets and have shown that they are very efficient, in the sense that the "true" motifs can be detected in the vast majority of replications and in a small number of iterations. Additionally, we have applied them to some real data sets and have shown that they are able to identify known motifs. In certain applications, it is pertinent to find motifs that contain contrasting nucleotides at the sites included in the motif (e.g., motifs identified in case-control association studies). For this, we have suggested appropriate modifications. Using simulations, we have discovered that the success rate of identification of the correct motif is high in case-control studies except when relative risks are small. Our analyses of evolutionary data sets resulted in the identification of some motifs that appear to have important implications on human evolutionary inference. These algorithms can easily be implemented to discover motifs from multilocus genotype data by simple numerical recoding of genotypes.
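
The abstract does not spell out the search procedure, so the following is only a minimal sketch of what a probabilistic search over non-contiguous sites could look like, not the authors' algorithm. Everything in it is an assumption made for illustration: the score (the count of the most common nucleotide pattern at the chosen sites), the random single-site swap with an accept-if-not-worse rule, the random restarts, and the hypothetical names `score`, `stochastic_search`, `make_synthetic` and `PLANTED`. It assumes the motif length `k` is known and smaller than the number of sites, and it omits the significance tests and length estimation the abstract mentions.

```python
import random
from collections import Counter

# Hypothetical planted motif for the synthetic data: site index -> nucleotide.
PLANTED = {2: "A", 7: "C", 11: "G"}


def score(seqs, sites):
    """Assumed score: how often the most common pattern occurs at `sites`.

    Returns (count, pattern). This is an illustrative choice, not the
    statistic used in the paper.
    """
    patterns = Counter("".join(s[i] for i in sites) for s in seqs)
    pattern, count = patterns.most_common(1)[0]
    return count, pattern


def stochastic_search(seqs, k, restarts=10, steps=300, seed=0):
    """Random-restart stochastic search over subsets of k sites (k < n_sites)."""
    rng = random.Random(seed)
    n_sites = len(seqs[0])
    best = (-1, "", ())
    for _ in range(restarts):
        # Start from a random subset of k sites.
        sites = sorted(rng.sample(range(n_sites), k))
        cur, motif = score(seqs, sites)
        for _ in range(steps):
            # Propose swapping one chosen site for one site outside the subset.
            outside = [i for i in range(n_sites) if i not in sites]
            cand = list(sites)
            cand[rng.randrange(k)] = rng.choice(outside)
            cand.sort()
            new, new_motif = score(seqs, cand)
            # Accept the proposal if the score does not get worse.
            if new >= cur:
                sites, cur, motif = cand, new, new_motif
        if cur > best[0]:
            best = (cur, motif, tuple(sites))
    return best


def make_synthetic(n_seqs=200, n_sites=12, carrier_rate=0.6, seed=1):
    """Uniform random sequences; a fraction of them carry the planted motif."""
    rng = random.Random(seed)
    seqs = []
    for _ in range(n_seqs):
        s = [rng.choice("ACGT") for _ in range(n_sites)]
        if rng.random() < carrier_rate:
            for site, nuc in PLANTED.items():
                s[site] = nuc
        seqs.append("".join(s))
    return seqs


seqs = make_synthetic()
count, motif, sites = stochastic_search(seqs, k=3)
print(sites, motif, count)
```

Each step scores a single candidate subset, so the cost per restart is fixed by `steps` rather than by the number of possible site subsets, which is what makes complete enumeration impractical as the number of sites grows. On real data a raw pattern count would favor sites with little variation, so a score that corrects for allele frequencies would likely be needed.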
