Structural variation analysis with strobe reads.

Affiliation

Department of Computer Science, Brown University, Providence, RI 02912, USA.

Publication information

Bioinformatics. 2010 May 15;26(10):1291-8. doi: 10.1093/bioinformatics/btq153. Epub 2010 Apr 8.

DOI:10.1093/bioinformatics/btq153
PMID:20378554
Abstract

MOTIVATION

Structural variation, including deletions, duplications and rearrangements of DNA sequence, is an important contributor to genome variation in many organisms. In humans, many structural variants are found in complex and highly repetitive regions of the genome, making their identification difficult. A new sequencing technology called strobe sequencing generates strobe reads containing multiple subreads from a single contiguous fragment of DNA. Strobe reads thus generalize the concept of paired reads, or mate pairs, that have been routinely used for structural variant detection. Strobe sequencing holds promise for unraveling complex variants that have been difficult to characterize with current sequencing technologies.

RESULTS

We introduce an algorithm for identification of structural variants using strobe sequencing data. We consider strobe reads from a test genome that have multiple possible alignments to a reference genome due to sequencing errors and/or repetitive sequences in the reference. We formulate the combinatorial optimization problem of finding the minimum number of structural variants in the test genome that are consistent with these alignments. We solve this problem using an integer linear program. Using simulated strobe sequencing data, we show that our algorithm has better sensitivity and specificity than paired read approaches for structural variation identification.
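
The optimization idea in the paragraph above can be illustrated with a toy sketch. This is not the authors' implementation or formulation: it assumes each candidate alignment of a read can be summarized as the set of structural-variant labels it would imply, and it replaces the integer linear program with exhaustive search, which only works on very small inputs. The function name `min_variant_assignment`, the input layout, and the variant labels are all hypothetical.

```python
from itertools import product


def min_variant_assignment(candidates):
    """Pick one alignment per read so that the fewest distinct variants are implied.

    candidates: dict mapping a read id to a list of candidate alignments, where
    each alignment is a frozenset of variant labels it would imply (an empty
    set stands for an alignment that implies no variant).
    Assumption: this encoding is illustrative only, not the paper's model.
    """
    reads = sorted(candidates)
    best_choice, best_variants = None, None
    # Exhaustive search over every combination of one alignment per read.
    # An ILP would express the same objective with binary selection variables.
    for choice in product(*(range(len(candidates[r])) for r in reads)):
        variants = set()
        for read, idx in zip(reads, choice):
            variants |= candidates[read][idx]
        if best_variants is None or len(variants) < len(best_variants):
            best_choice, best_variants = dict(zip(reads, choice)), variants
    return best_choice, best_variants


# Made-up toy input: three reads, each with two ambiguous alignments.
candidates = {
    "read1": [frozenset({"del_A"}), frozenset({"inv_B"})],
    "read2": [frozenset({"del_A"}), frozenset({"dup_C"})],
    "read3": [frozenset({"inv_B", "dup_C"}), frozenset({"del_A", "inv_B"})],
}
choice, variants = min_variant_assignment(candidates)
print(choice, sorted(variants))
```

In this toy input, every alignment of `read3` implies two variants, so no assignment can explain the reads with fewer than two. The search cost grows exponentially with the number of ambiguous reads, which is one reason a dedicated solver is preferable on realistic data.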

CONTACT

braphael@brown.edu

Similar articles

1. Structural variation analysis with strobe reads.
   Bioinformatics. 2010 May 15;26(10):1291-8. doi: 10.1093/bioinformatics/btq153. Epub 2010 Apr 8.
2. Whole genome sequencing.
   Methods Mol Biol. 2010;628:215-26. doi: 10.1007/978-1-60327-367-1_12.
3. Characterizing and interpreting genetic variation from personal genome sequencing.
   Methods Mol Biol. 2012;838:343-67. doi: 10.1007/978-1-61779-507-7_17.
4. Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences technology.
   BMC Genomics. 2006 Oct 24;7:272. doi: 10.1186/1471-2164-7-272.
5. Correction of sequencing errors in a mixed set of reads.
   Bioinformatics. 2010 May 15;26(10):1284-90. doi: 10.1093/bioinformatics/btq151. Epub 2010 Apr 8.
6. Detection and interpretation of genomic structural variation in mammals.
   Methods Mol Biol. 2012;838:225-48. doi: 10.1007/978-1-61779-507-7_11.
7. [Analysis, identification and correction of some errors of model refseqs appeared in NCBI Human Gene Database by in silico cloning and experimental verification of novel human genes].
   Yi Chuan Xue Bao. 2004 May;31(5):431-43.
8. Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays.
   Science. 2010 Jan 1;327(5961):78-81. doi: 10.1126/science.1181498. Epub 2009 Nov 5.
9. Consensus generation and variant detection by Celera Assembler.
   Bioinformatics. 2008 Apr 15;24(8):1035-40. doi: 10.1093/bioinformatics/btn074. Epub 2008 Mar 4.
10. Whole genome assembly from 454 sequencing output via modified DNA graph concept.
    Comput Biol Chem. 2009 Jun;33(3):224-30. doi: 10.1016/j.compbiolchem.2009.04.005. Epub 2009 May 3.

Cited by

1. Multidisciplinary approaches for elucidating genetics and molecular pathogenesis of urinary tract malformations.
   Kidney Int. 2022 Mar;101(3):473-484. doi: 10.1016/j.kint.2021.09.034. Epub 2021 Nov 12.
2. Effective sequence similarity detection with strobemers.
   Genome Res. 2021 Nov;31(11):2080-2094. doi: 10.1101/gr.275648.121. Epub 2021 Oct 19.
3. Biophysics and the Genomic Sciences.
   Biophys J. 2019 Dec 3;117(11):2047-2053. doi: 10.1016/j.bpj.2019.07.038. Epub 2019 Jul 30.
4. Identifying structural variants using linked-read sequencing data.
   Bioinformatics. 2018 Jan 15;34(2):353-360. doi: 10.1093/bioinformatics/btx712.
5. Characterization of structural variants with single molecule and hybrid sequencing approaches.
   Bioinformatics. 2014 Dec 15;30(24):3458-66. doi: 10.1093/bioinformatics/btu714. Epub 2014 Oct 28.
6. PBHoney: identifying genomic variants via long-read discordance and interrupted mapping.
   BMC Bioinformatics. 2014 Jun 10;15:180. doi: 10.1186/1471-2105-15-180.
7. Chapter 15: disease gene prioritization.
   PLoS Comput Biol. 2013 Apr;9(4):e1002902. doi: 10.1371/journal.pcbi.1002902. Epub 2013 Apr 25.
8. CGAP-align: a high performance DNA short read alignment tool.
   PLoS One. 2013 Apr 11;8(4):e61033. doi: 10.1371/journal.pone.0061033. Print 2013.
9. Chapter 6: Structural variation and medical genomics.
   PLoS Comput Biol. 2012;8(12):e1002821. doi: 10.1371/journal.pcbi.1002821. Epub 2012 Dec 27.
10. QColors: an algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads.
    In Silico Biol. 2011;11(5-6):193-201. doi: 10.3233/ISB-2012-0454.