Local alignment of two-base encoded DNA sequence.

Authors

Homer Nils, Merriman Barry, Nelson Stanley F

Affiliation

Department of Computer Science, University of California Los Angeles, Los Angeles, California 90095, USA.

Publication details

BMC Bioinformatics. 2009 Jun 9;10:175. doi: 10.1186/1471-2105-10-175.

DOI: 10.1186/1471-2105-10-175
PMID: 19508732
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2709925/
Abstract

BACKGROUND

DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity.
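To make the decoding step concrete, here is a minimal sketch of a two-base encoding. It assumes the widely used four-color scheme in which each base gets a 2-bit code (A=0, C=1, G=2, T=3) and each color is the XOR of two adjacent base codes; the abstract does not spell out the encoding, so treat that mapping and the names `encode` and `decode` as illustrative assumptions. The sketch shows that decoding is deterministic, and that a single mismeasured color corrupts every base decoded after it, which is why decoding first and aligning second is fragile.

```python
BASES = "ACGT"
CODE = {b: i for i, b in enumerate(BASES)}  # assumed 2-bit base codes

def encode(seq):
    """Return (first_base, colors); each color is the XOR of adjacent base codes."""
    colors = [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors

def decode(first_base, colors):
    """Deterministically rebuild the base sequence from a known first base."""
    out = [first_base]
    for c in colors:
        out.append(BASES[CODE[out[-1]] ^ c])
    return "".join(out)

seq = "ACGTTGCA"
first, colors = encode(seq)
print(colors)                 # [1, 3, 1, 0, 1, 3, 1]
print(decode(first, colors))  # ACGTTGCA (round trip is exact)

# Simulate one measurement error in the third color.
noisy = list(colors)
noisy[2] ^= 1
corrupted = decode(first, noisy)
print(corrupted)              # ACGGGTAC: every base after the error is wrong
```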

RESULTS

We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two-base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions.
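The idea of decoding and aligning in one pass can be sketched with a small dynamic program whose state also records the last decoded base. This is a simplified illustration, not the authors' recurrence: it assumes the XOR color scheme (A=0, C=1, G=2, T=3), uses a linear rather than affine gap penalty, and aligns the whole read against any substring of the reference instead of performing a fully local alignment. The function name `align_colors`, the `primer` argument (a known base preceding the read), and all score values are hypothetical choices made for the example.

```python
NEG = float("-inf")
BASES = "ACGT"
CODE = {b: i for i, b in enumerate(BASES)}  # assumed 2-bit base codes

def align_colors(ref, primer, colors, match=1, mismatch=3, color_error=2, gap=4):
    """Return (best score, decoded read) for a color read against ref.

    Cell [j][b] holds the best (score, decoded bases) after consuming i colors
    and ref[:j], with b as the last decoded base. Calling a base that disagrees
    with the observed color costs color_error instead of being forbidden.
    """
    n, m = len(colors), len(ref)
    # Row 0: nothing decoded yet; the alignment may start anywhere in ref.
    prev = [[(0, "") if b == CODE[primer] else (NEG, "") for b in range(4)]
            for _ in range(m + 1)]
    for i in range(1, n + 1):
        cur = [[(NEG, "")] * 4 for _ in range(m + 1)]
        for j in range(m + 1):
            for b in range(4):
                best = (NEG, "")
                for pb in range(4):
                    pen = 0 if (pb ^ b) == colors[i - 1] else color_error
                    if j >= 1:  # decoded base b aligned to ref[j-1]
                        s, d = prev[j - 1][pb]
                        sub = match if BASES[b] == ref[j - 1] else -mismatch
                        best = max(best, (s - pen + sub, d + BASES[b]))
                    s, d = prev[j][pb]  # base b inserted relative to ref
                    best = max(best, (s - pen - gap, d + BASES[b]))
                if j >= 1:  # reference base skipped; decoded base unchanged
                    s, d = cur[j - 1][b]
                    best = max(best, (s - gap, d))
                cur[j][b] = best
        prev = cur
    return max(cell for row in prev for cell in row)

ref = "GGCGTTGCATT"
clean = [1, 3, 1, 0, 1, 3, 1]  # colors for primer A followed by CGTTGCA
noisy = [1, 3, 0, 0, 1, 3, 1]  # same read with the third color mismeasured

print(align_colors(ref, "A", clean))  # (7, 'CGTTGCA')
print(align_colors(ref, "A", noisy))  # (5, 'CGTTGCA'): one color error absorbed
# A prohibitive color_error forces the plain deterministic decode, which
# mimics decoding first and aligning afterwards.
print(align_colors(ref, "A", noisy, color_error=10**6))
```

With these example scores, the joint method recovers the true bases from the noisy read at the cost of one color-error penalty, while the decode-then-align variant is stuck with the corrupted sequence CGGGTAC and a lower score.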

CONCLUSION

The new local alignment algorithm for two-base encoded data has substantial power to detect and correct measurement errors while identifying underlying sequence variants, which facilitates genome re-sequencing efforts based on this form of sequence data.
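One reason errors and variants can be told apart is that they leave different signatures in the encoded data. The sketch below again assumes the XOR color scheme (A=0, C=1, G=2, T=3), which the abstract does not specify. Under that assumption, a single-base substitution changes exactly two adjacent colors and leaves their XOR unchanged, whereas an isolated color change cannot be explained by any single substitution. The helper names are hypothetical, and the check ignores read ends, indels, and adjacent variants.

```python
BASES = "ACGT"
CODE = {b: i for i, b in enumerate(BASES)}  # assumed 2-bit base codes

def to_colors(seq):
    """Encode a base sequence as XOR colors of adjacent bases."""
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]

def color_diffs(ref_colors, read_colors):
    """Indices at which two equal-length color strings disagree."""
    return [i for i, (r, s) in enumerate(zip(ref_colors, read_colors)) if r != s]

def looks_like_snp(ref_colors, read_colors):
    """True if the differences match the signature of one base substitution."""
    diffs = color_diffs(ref_colors, read_colors)
    if len(diffs) != 2 or diffs[1] != diffs[0] + 1:
        return False
    i, k = diffs
    return (ref_colors[i] ^ ref_colors[k]) == (read_colors[i] ^ read_colors[k])

ref_colors = to_colors("ACGTTGCA")
snp_colors = to_colors("ACGATGCA")   # T -> A substitution at position 3
err_colors = list(ref_colors)
err_colors[2] ^= 1                   # one mismeasured color, no real variant

print(color_diffs(ref_colors, snp_colors))     # [2, 3]
print(looks_like_snp(ref_colors, snp_colors))  # True
print(looks_like_snp(ref_colors, err_colors))  # False
```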
