A quality control algorithm for DNA sequencing projects.

Authors

White O, Dunning T, Sutton G, Adams M, Venter JC, Fields C

Affiliation

Institute for Genomic Research, Gaithersburg, MD 20878.

Publication

Nucleic Acids Res. 1993 Aug 11;21(16):3829-38. doi: 10.1093/nar/21.16.3829.

DOI: 10.1093/nar/21.16.3829
PMID: 8367301
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC309901/

Abstract

Heterologous DNA sequences from rearrangements with the genomes of host cells, genomic fragments from hybrid cells, or impure tissue sources can threaten the purity of libraries that are derived from RNA or DNA. Hybridization methods can only detect contaminants from known or suspected heterologous sources, and whole library screening is technically very difficult. Detection of contaminating heterologous clones by sequence alignment is only possible when related sequences are present in a known database. We have developed a statistical test to identify heterologous sequences that is based on the differences in hexamer composition of DNA from different organisms. This test does not require that sequences similar to potential heterologous contaminants are present in the database, and can in principle detect contamination by previously unknown organisms. We have applied this test to the major public expressed sequence tag (EST) data sets to evaluate its utility as a quality control measure and a peer evaluation tool. There is detectable heterogeneity in most human and C. elegans EST data sets but it is not apparently associated with cross-species contamination. However, there is direct evidence for both yeast and bacterial sequence contamination in some public database sequences annotated as human. Results obtained with the hexamer test have been confirmed with similarity searches using sequences from the relevant data sets.
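
The abstract does not give the exact form of the hexamer statistic, so the following is only a minimal sketch of the general idea: comparing a query sequence against hexamer frequency profiles built from two reference sets. It assumes a simple mean log-odds score with pseudocounts, which may differ from the test used in the paper. The names `hexamer_counts`, `hexamer_model`, and `log_odds` are hypothetical, chosen for illustration.

```python
from collections import Counter
from itertools import product
from math import log

K = 6  # word length: hexamers


def hexamer_counts(seq):
    """Count overlapping hexamers, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - K + 1):
        word = seq[i:i + K]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def hexamer_model(seqs, pseudocount=1.0):
    """Estimate hexamer frequencies from reference sequences.

    The pseudocount (an assumption of this sketch) keeps unseen hexamers
    from getting zero probability.
    """
    counts = Counter()
    for s in seqs:
        counts.update(hexamer_counts(s))
    total = sum(counts.values()) + pseudocount * 4 ** K
    words = ("".join(w) for w in product("ACGT", repeat=K))
    return {w: (counts[w] + pseudocount) / total for w in words}


def log_odds(seq, model_a, model_b):
    """Mean per-hexamer log-likelihood ratio.

    Positive values mean the sequence looks more like model_a,
    negative values more like model_b. Returns 0.0 if the sequence
    contains no scorable hexamer.
    """
    counts = hexamer_counts(seq)
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return sum(c * log(model_a[w] / model_b[w]) for w, c in counts.items()) / n


if __name__ == "__main__":
    # Toy reference sets, not real organism data.
    host = hexamer_model(["ATATATATATATATATATAT", "AATTAATTAATTAATT"])
    other = hexamer_model(["GCGCGCGCGCGCGCGCGCGC", "GGCCGGCCGGCCGGCC"])
    for query in ("ATATATATATAT", "GCGCGCGCGCGC"):
        print(query, round(log_odds(query, host, other), 3))
```

In practice the reference profiles would be built from large sequence sets for each organism, and a score threshold would have to be calibrated on sequences of known origin. A sketch like this can only compare a query against profiles that have already been built; it does not by itself flag sequences from organisms with no profile.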
