DisVar: an R library for identifying variants associated with diseases using large-scale personal genetic information.

Authors

Chanasongkhram Khunanon, Damkliang Kasikrit, Sangket Unitsa

Affiliations

Division of Biological Science, Faculty of Science, Prince of Songkla University, Hat Yai, Songkhla, Thailand.

Division of Computational Science, Faculty of Science, Prince of Songkla University, Hat Yai, Songkhla, Thailand.

Publication information

PeerJ. 2023 Sep 28;11:e16086. doi: 10.7717/peerj.16086. eCollection 2023.

DOI:10.7717/peerj.16086
PMID:37790633
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10542659/
Abstract

BACKGROUND

Genetic variants may contribute to the development of diseases. Several genetic disease databases are used in medical research and diagnosis, but the web applications used to search these databases for disease-associated variants have limitations: they may not be able to search large-scale sets of genetic variants, their search results may be difficult to interpret, and variants mapped to the latest reference genome (GRCh38/hg38) may not be supported.

METHODS

In this study, we developed a novel R library called "DisVar" to identify disease-associated genetic variants in large-scale individual genomic data. This R library is compatible with variants from the latest reference genome version. DisVar uses five databases of disease-associated variants. Over 100 million variants can be simultaneously searched for specific associated diseases.
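The abstract does not describe DisVar's internals, so the following is only a minimal Python sketch of the general idea of matching VCF records against a table of disease-associated variants. It is not DisVar's implementation or API (DisVar is an R library); the lookup table `DISEASE_DB`, the helper names, and the example records are all hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# A variant is keyed by (chromosome, position, reference allele, alternate allele).
VariantKey = Tuple[str, int, str, str]

# Hypothetical lookup table standing in for a disease-variant database.
DISEASE_DB: Dict[VariantKey, List[str]] = {
    ("1", 1000, "A", "G"): ["example disease A"],
    ("2", 2000, "C", "T"): ["example disease B", "example disease C"],
}


def normalize_chrom(chrom: str) -> str:
    """Strip a leading 'chr' so that 'chr1' and '1' compare equal."""
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def iter_vcf_variants(lines: Iterable[str]) -> Iterator[VariantKey]:
    """Yield one key per alternate allele from tab-delimited VCF data lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the header row
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):  # multi-allelic sites list ALT as "A,T"
            yield (normalize_chrom(chrom), int(pos), ref, allele)


def find_disease_hits(
    lines: Iterable[str], db: Dict[VariantKey, List[str]]
) -> List[Tuple[VariantKey, List[str]]]:
    """Return every variant found in the lookup table, with its diseases."""
    hits = []
    for key in iter_vcf_variants(lines):
        diseases = db.get(key)
        if diseases:
            hits.append((key, diseases))
    return hits


# Hypothetical VCF content used only for illustration.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t.\tPASS\t.",
    "1\t1500\t.\tT\tC\t.\tPASS\t.",
    "2\t2000\t.\tC\tA,T\t.\tPASS\t.",
]

if __name__ == "__main__":
    for variant, diseases in find_disease_hits(EXAMPLE_VCF, DISEASE_DB):
        print(variant, diseases)
```

A hash-table lookup keeps the cost per variant roughly constant, which is one plausible way to scale to very large inputs; whether DisVar works this way is not stated in the abstract.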

RESULTS

The package was evaluated using 24 Variant Call Format (VCF) files (215,054 to 11,346,899 sites) from the 1000 Genomes Project. A total of 298,227 disease-associated hits were detected across all the VCF files, taking 63.58 min in total. The package was also tested on ClinVar's VCF file (2,120,558 variants), where 20,657 disease-associated hits were identified with an estimated elapsed time of 45.98 s.
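For orientation, the reported totals can be turned into rough averages. The short Python sketch below uses only the figures quoted in the results; the variable names are illustrative, and the per-file values are plain averages that ignore the large spread in file size (215,054 to 11,346,899 sites), so they should not be read as the time for any particular file.

```python
# Figures quoted in the RESULTS paragraph.
FILES = 24
TOTAL_MINUTES = 63.58
TOTAL_HITS = 298_227
CLINVAR_VARIANTS = 2_120_558
CLINVAR_HITS = 20_657
CLINVAR_SECONDS = 45.98

# Simple averages over the 24 files from the 1000 Genomes Project.
minutes_per_file = TOTAL_MINUTES / FILES
hits_per_file = TOTAL_HITS / FILES

# Throughput and hit fraction for the ClinVar test.
clinvar_variants_per_second = CLINVAR_VARIANTS / CLINVAR_SECONDS
clinvar_hit_fraction = CLINVAR_HITS / CLINVAR_VARIANTS

if __name__ == "__main__":
    print(f"average time per file: {minutes_per_file:.2f} min")
    print(f"average hits per file: {hits_per_file:.0f}")
    print(f"ClinVar throughput: {clinvar_variants_per_second:.0f} variants/s")
    print(f"ClinVar hit fraction: {clinvar_hit_fraction:.2%}")
```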

CONCLUSIONS

DisVar can overcome the limitations of existing tools and is a fast and effective diagnostic and preventive tool for identifying disease-associated variants in large-scale genetic data mapped to the latest reference genome.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/766c/10542659/ac7cda5f54f9/peerj-11-16086-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/766c/10542659/84454fde5250/peerj-11-16086-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/766c/10542659/11ad276a7c2a/peerj-11-16086-g002.jpg
