Supervariants identification for breast cancer.

Affiliation

Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut.

Publication information

Genet Epidemiol. 2020 Nov;44(8):934-947. doi: 10.1002/gepi.22350. Epub 2020 Aug 17.

DOI:10.1002/gepi.22350

PMID:32808324

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7924970/

Abstract

In genome-wide association studies, signals associated with rare variants and with interactions between genes are hard to detect even when the sample size is in the tens of thousands. To overcome these problems, we examine the concept of the supervariant. Like the classic concept of the gene, a supervariant is a combination of alleles at multiple loci, but the contributing loci can be anywhere in the genome. We hypothesize that supervariants are easy to detect and that their aggregated signals are more stable in their associations with disease than the signal from a single-nucleotide polymorphism. Using the UK Biobank databases, we develop a ranking and aggregation method for identifying supervariants. Specifically, we examine 9,377 breast cancer cases and 46,861 controls matched by sex and age. In our simulations, the use of supervariants outperforms the single-nucleotide polymorphism-based association method in detecting rare variants and signals with an interactive structure. In real data analysis, we identify supervariants on Chromosomes 1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 16, and 22, which cover previously reported loci associated with breast or other cancers, and several novel loci on Chromosomes 2, 5, 9, and 12. These findings demonstrate the validity of supervariants and their potential for discovering replicable and novel results for complex diseases.
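To make the idea of ranking and aggregation concrete, the following is a minimal toy sketch and not the authors' procedure. It assumes a deliberately simple ranking score (the absolute difference in carrier rate between cases and controls) and a simple aggregation rule (a sample carries the supervariant if it carries a minor allele at any selected locus). The function names `carrier_rate`, `rank_loci`, and `aggregate`, as well as the toy genotype matrix, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def carrier_rate(column: Sequence[int], mask: Sequence[bool]) -> float:
    """Fraction of the masked samples that carry at least one minor allele."""
    selected = [g for g, keep in zip(column, mask) if keep]
    if not selected:
        return 0.0
    return sum(1 for g in selected if g > 0) / len(selected)


def rank_loci(genotypes: List[List[int]], is_case: List[int]) -> List[int]:
    """Rank locus indices by |carrier rate in cases - carrier rate in controls|.

    This score is an assumption made for the sketch, not the paper's statistic.
    """
    cases = [c == 1 for c in is_case]
    controls = [c == 0 for c in is_case]
    n_loci = len(genotypes[0])
    scores = []
    for j in range(n_loci):
        column = [row[j] for row in genotypes]
        scores.append(abs(carrier_rate(column, cases) - carrier_rate(column, controls)))
    return sorted(range(n_loci), key=lambda j: scores[j], reverse=True)


def aggregate(genotypes: List[List[int]], loci: Sequence[int]) -> List[int]:
    """Binary supervariant: 1 if a sample carries a minor allele at any chosen locus."""
    return [int(any(row[j] > 0 for j in loci)) for row in genotypes]


# Toy data: rows are samples, columns are loci (minor-allele counts).
# Loci 0 and 1 are rare and appear only in cases; locus 2 is unrelated noise.
genotypes = [
    [1, 0, 1],  # case
    [0, 1, 0],  # case
    [1, 0, 0],  # case
    [0, 1, 1],  # case
    [0, 0, 1],  # control
    [0, 0, 0],  # control
    [0, 0, 1],  # control
    [0, 0, 0],  # control
]
is_case = [1, 1, 1, 1, 0, 0, 0, 0]

ranked = rank_loci(genotypes, is_case)
supervariant = aggregate(genotypes, ranked[:2])  # aggregate the two top-ranked loci
print("ranked loci:", ranked)
print("supervariant:", supervariant)
```

In this toy data, each of the two rare loci alone separates cases from controls by a carrier-rate difference of 0.5, while the aggregated supervariant separates them by 1.0, which is the kind of stabilized, combined signal the abstract describes.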

Similar articles

Identification and validation of supervariants reveal novel loci associated with human white matter microstructure.

Genome Res. 2024 Feb 7;34(1):20-33. doi: 10.1101/gr.277905.123.

Super-variants identification for brain connectivity.

Hum Brain Mapp. 2021 Apr 1;42(5):1304-1312. doi: 10.1002/hbm.25294. Epub 2020 Nov 24.

Identification of independent association signals and putative functional variants for breast cancer risk through fine-scale mapping of the 12p11 locus.

Breast Cancer Res. 2016 Jun 21;18(1):64. doi: 10.1186/s13058-016-0718-0.

A method combining a random forest-based technique with the modeling of linkage disequilibrium through latent variables, to run multilocus genome-wide association studies.

BMC Bioinformatics. 2018 Mar 27;19(1):106. doi: 10.1186/s12859-018-2054-0.

Localization of breast cancer susceptibility loci by genome-wide SNP linkage disequilibrium mapping.

Genet Epidemiol. 2006 Jan;30(1):48-61. doi: 10.1002/gepi.20101.

Identification of candidate causal variants and target genes at 41 breast cancer risk loci through differential allelic expression analysis.

Sci Rep. 2024 Sep 28;14(1):22526. doi: 10.1038/s41598-024-72163-y.

Fine mapping of breast cancer genome-wide association studies loci in women of African ancestry identifies novel susceptibility markers.

Carcinogenesis. 2013 Jul;34(7):1520-8. doi: 10.1093/carcin/bgt090. Epub 2013 Mar 8.

Predicting signatures of "synthetic associations" and "natural associations" from empirical patterns of human genetic variation.

PLoS Comput Biol. 2012;8(7):e1002600. doi: 10.1371/journal.pcbi.1002600. Epub 2012 Jul 5.

A joint transcriptome-wide association study across multiple tissues identifies candidate breast cancer susceptibility genes.

Am J Hum Genet. 2023 Jun 1;110(6):950-962. doi: 10.1016/j.ajhg.2023.04.005. Epub 2023 May 9.

Cited by

Identifying genetic variants for brain connectivity using Ball Covariance Ranking and Aggregation.

J Am Stat Assoc. 2025 Feb 27. doi: 10.1080/01621459.2025.2450837.

A multi-tissue, splicing-based joint transcriptome-wide association study identifies susceptibility genes for breast cancer.

Am J Hum Genet. 2024 Jun 6;111(6):1100-1113. doi: 10.1016/j.ajhg.2024.04.010. Epub 2024 May 10.

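Identification and validation of supervariants reveal novel loci associated with human white matter microstructure.

Genome Res. 2024 Feb 7;34(1):20-33. doi: 10.1101/gr.277905.123.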

Deep learning identified genetic variants for COVID-19-related mortality among 28,097 affected cases in UK Biobank.

Genet Epidemiol. 2023 Apr;47(3):215-230. doi: 10.1002/gepi.22515. Epub 2023 Jan 24.

Super-taxon in human microbiome are identified to be associated with colorectal cancer.

BMC Bioinformatics. 2022 Jun 21;23(1):243. doi: 10.1186/s12859-022-04786-9.

Genetic variants are identified to increase risk of COVID-19 related mortality from UK Biobank data.

Hum Genomics. 2021 Feb 3;15(1):10. doi: 10.1186/s40246-021-00306-7.

Genetic variants are identified to increase risk of COVID-19 related mortality from UK Biobank data.

medRxiv. 2020 Nov 9:2020.11.05.20226761. doi: 10.1101/2020.11.05.20226761.
