Outgroup Machine Learning Approach Identifies Single Nucleotide Variants in Noncoding DNA Associated with Autism Spectrum Disorder.

Authors

Varma Maya, Paskov Kelley Marie, Jung Jae-Yoon, Sierra Chrisman Brianna, Stockham Nate Tyler, Washington Peter Yigitcan, Wall Dennis Paul

Affiliation

Departments of Computer Science, Stanford University, Stanford, CA 94305, USA.

Publication

Pac Symp Biocomput. 2019;24:260-271.

PMID:30864328
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6417813/
Abstract

Autism spectrum disorder (ASD) is a heritable neurodevelopmental disorder affecting 1 in 59 children. While noncoding genetic variation has been shown to play a major role in many complex disorders, the contribution of these regions to ASD susceptibility remains unclear. Genetic analyses of ASD typically use unaffected family members as controls; however, we hypothesize that this method does not effectively elevate variant signal in the noncoding region due to family members having subclinical phenotypes arising from common genetic mechanisms. In this study, we use a separate, unrelated outgroup of individuals with progressive supranuclear palsy (PSP), a neurodegenerative condition with no known etiological overlap with ASD, as a control population. We use whole genome sequencing data from a large cohort of 2182 children with ASD and 379 controls with PSP, sequenced at the same facility with the same machines and variant calling pipeline, in order to investigate the role of noncoding variation in the ASD phenotype. We analyze seven major types of noncoding variants: microRNAs, human accelerated regions, hypersensitive sites, transcription factor binding sites, DNA repeat sequences, simple repeat sequences, and CpG islands. After identifying and removing batch effects between the two groups, we trained an ℓ1-regularized logistic regression classifier to predict ASD status from each set of variants. The classifier trained on simple repeat sequences performed well on a held-out test set (AUC-ROC = 0.960); this classifier was also able to differentiate ASD cases from controls when applied to a completely independent dataset (AUC-ROC = 0.960). This suggests that variation in simple repeat regions is predictive of the ASD phenotype and may contribute to ASD risk. Our results show the importance of the noncoding region and the utility of independent control groups in effectively linking genetic variation to disease phenotype for complex disorders.
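
The classification step described in the abstract (an ℓ1-regularized logistic regression scored by AUC-ROC on a held-out set) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' pipeline: the data are synthetic binary "variant present" indicators, and the helper names, the proximal gradient solver, and all hyperparameters (`lam`, `lr`, `epochs`) are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal operator of t * |w|: shrinks w toward zero, exactly zero if |w| <= t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def predict(w, b, x):
    # Predicted probability of the positive class for one sample.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_l1_logreg(X, y, lam=0.05, lr=0.2, epochs=200):
    """Proximal gradient descent for l1-penalized mean logistic loss.

    The intercept b is left unpenalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        # Gradient step on the smooth loss, then shrink by lr * lam.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def auc_roc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n, d, informative, rng):
    # Hypothetical toy data: only the first `informative` features depend on the label.
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = []
        for j in range(d):
            p = 0.7 if (j < informative and label == 1) else (0.2 if j < informative else 0.3)
            row.append(1.0 if rng.random() < p else 0.0)
        X.append(row)
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_synthetic(n=400, d=20, informative=5, rng=rng)
X_train, y_train = X[:300], y[:300]   # training split
X_test, y_test = X[300:], y[300:]     # held-out test split

w, b = fit_l1_logreg(X_train, y_train)
test_auc = auc_roc([predict(w, b, xi) for xi in X_test], y_test)
n_nonzero = sum(1 for wj in w if wj != 0.0)
print(f"held-out AUC-ROC: {test_auc:.3f}, nonzero weights: {n_nonzero}/{len(w)}")
```

The ℓ1 penalty drives the weights of uninformative features to exactly zero, which is why this family of models is often used when the number of candidate variants is large relative to the number of samples. The AUC printed here reflects only the toy data and says nothing about the values reported in the study.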

Similar articles

1
Outgroup Machine Learning Approach Identifies Single Nucleotide Variants in Noncoding DNA Associated with Autism Spectrum Disorder.
Pac Symp Biocomput. 2019;24:260-271.
2
Whole genome sequencing and variant discovery in the ASPIRE autism spectrum disorder cohort.
Clin Genet. 2019 Sep;96(3):199-206. doi: 10.1111/cge.13556. Epub 2019 May 30.
3
Identification of rare noncoding sequence variants in gamma-aminobutyric acid A receptor, alpha 4 subunit in autism spectrum disorder.
Neurogenetics. 2018 Jan;19(1):17-26. doi: 10.1007/s10048-017-0529-1. Epub 2017 Nov 18.
4
Whole Exome Sequencing Identifies Novel De Novo Variants Interacting with Six Gene Networks in Autism Spectrum Disorder.
Genes (Basel). 2020 Dec 22;12(1):1. doi: 10.3390/genes12010001.
5
Analysis of common genetic variation and rare CNVs in the Australian Autism Biobank.
Mol Autism. 2021 Feb 10;12(1):12. doi: 10.1186/s13229-020-00407-5.
6
Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk.
Nat Genet. 2019 Jun;51(6):973-980. doi: 10.1038/s41588-019-0420-0. Epub 2019 May 27.
7
Short tandem repeat expansions in cortical layer-specific genes implicate in phenotypic severity and adaptability of autism spectrum disorder.
Psychiatry Clin Neurosci. 2024 Jul;78(7):405-415. doi: 10.1111/pcn.13676. Epub 2024 May 15.
8
An integrative analysis of non-coding regulatory DNA variations associated with autism spectrum disorder.
Mol Psychiatry. 2019 Nov;24(11):1707-1719. doi: 10.1038/s41380-018-0049-x. Epub 2018 Apr 27.
9
Functional DNA methylation signatures for autism spectrum disorder genomic risk loci: 16p11.2 deletions and CHD8 variants.
Clin Epigenetics. 2019 Jul 16;11(1):103. doi: 10.1186/s13148-019-0684-3.
10
Genetic determinants of survival in progressive supranuclear palsy: a genome-wide association study.
Lancet Neurol. 2021 Feb;20(2):107-116. doi: 10.1016/S1474-4422(20)30394-X. Epub 2020 Dec 17.

Cited by

1
Putting the "mi" in omics: discovering miRNA biomarkers for pediatric precision care.
Pediatr Res. 2023 Jan;93(2):316-323. doi: 10.1038/s41390-022-02206-5. Epub 2022 Jul 29.
2
A maximum flow-based network approach for identification of stable noncoding biomarkers associated with the multigenic neurological condition, autism.
BioData Min. 2021 May 3;14(1):28. doi: 10.1186/s13040-021-00262-x.
3
Impact of variant-level batch effects on identification of genetic risk factors in large sequencing studies.
PLoS One. 2021 Apr 16;16(4):e0249305. doi: 10.1371/journal.pone.0249305. eCollection 2021.
4
Spatial constrains and information content of sub-genomic regions of the human genome.
iScience. 2021 Jan 10;24(2):102048. doi: 10.1016/j.isci.2021.102048. eCollection 2021 Feb 19.
5
Coalitional Game Theory Facilitates Identification of Non-Coding Variants Associated With Autism.
Biomed Inform Insights. 2019 Mar 8;11:1178222619832859. doi: 10.1177/1178222619832859. eCollection 2019.
6
Precision Medicine: Improving health through high-resolution analysis of personal data.
Pac Symp Biocomput. 2019;24:220-223.

References

1
Editorial: The rising prevalence of autism.
J Child Psychol Psychiatry. 2018 Jul;59(7):717-720. doi: 10.1111/jcpp.12941.
2
An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder.
Nat Genet. 2018 Apr 26;50(5):727-736. doi: 10.1038/s41588-018-0107-y.
3
Genetics of autism spectrum disorder.
Handb Clin Neurol. 2018;147:321-329. doi: 10.1016/B978-0-444-63233-3.00021-X.
4
The Psychiatric Risk Gene Transcription Factor 4 (TCF4) Regulates Neurodevelopmental Pathways Associated With Schizophrenia, Autism, and Intellectual Disability.
Schizophr Bull. 2018 Aug 20;44(5):1100-1110. doi: 10.1093/schbul/sbx164.
5
Genomic Patterns of De Novo Mutation in Simplex Autism.
Cell. 2017 Oct 19;171(3):710-722.e12. doi: 10.1016/j.cell.2017.08.047. Epub 2017 Sep 28.
6
Identifying and mitigating batch effects in whole genome sequencing data.
BMC Bioinformatics. 2017 Jul 24;18(1):351. doi: 10.1186/s12859-017-1756-z.
7
Truncating de novo mutations in the Krüppel-type zinc-finger gene ZNF148 in patients with corpus callosum defects, developmental delay, short stature, and dysmorphisms.
Genome Med. 2016 Dec 13;8(1):131. doi: 10.1186/s13073-016-0386-9.
8
A Comparative Review of microRNA Expression Patterns in Autism Spectrum Disorder.
Front Psychiatry. 2016 Nov 4;7:176. doi: 10.3389/fpsyt.2016.00176. eCollection 2016.
9
Microsatellite markers: what they mean and why they are so useful.
Genet Mol Biol. 2016 Jul-Sep;39(3):312-28. doi: 10.1590/1678-4685-GMB-2016-0027. Epub 2016 Aug 4.
10
The genetic architecture of type 2 diabetes.
Nature. 2016 Aug 4;536(7614):41-47. doi: 10.1038/nature18642. Epub 2016 Jul 11.