Deep learning identifies erroneous microarray-based, gene-level conclusions in literature.

Authors

Qin Yanan, Yi Daiyao, Chen Xianghao, Guan Yuanfang

Affiliation

Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA.

Publication information

NAR Genom Bioinform. 2021 Oct 4;3(4):lqab089. doi: 10.1093/nargab/lqab089. eCollection 2021 Dec.

Abstract

More than 110 000 publications have used microarrays to decipher phenotype-associated genes, clinical biomarkers and gene functions. Microarrays rely on digitally assaying the fluorescence signals of arrays. In this study, we retrospectively constructed raw images for 37 724 published microarray data, and developed deep learning algorithms to automatically detect systematic defects. We report that an alarming 26.73% of the microarray-based studies are affected by serious imaging defects. By literature mining, we found that publications associated with these affected microarrays have reported disproportionately more biological discoveries on genes in the contaminated areas than on other genes. Of the gene-level conclusions reported in these publications, 28.82% were based on measurements falling into the contaminated area, indicating severe, systematic problems caused by such contamination. We provide the identified problematic published datasets, the affected genes and the imputed arrays, as well as software tools for scanning for such contamination, which will be essential for future studies that scrutinize and critically analyze microarray data.
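
To make the idea of "constructing a raw image" from array data and scanning it for spatial defects concrete, here is a toy sketch. It is not the authors' deep learning method: it assumes each probe comes with a (row, column) position and an intensity, and it swaps in a simple block-median comparison as the detector. The function names, the 8 x 8 array, the block size and the fold-change threshold are all hypothetical choices made for illustration.

```python
from statistics import median


def build_image(probes, n_rows, n_cols, fill=0.0):
    """Place each probe intensity at its (row, col) position on the array."""
    image = [[fill] * n_cols for _ in range(n_rows)]
    for row, col, intensity in probes:
        image[row][col] = intensity
    return image


def flag_blocks(image, block=4, ratio=2.0):
    """Return (block_row, block_col) indices of blocks whose median intensity
    differs from the global median by more than `ratio`-fold.

    This is a simple stand-in heuristic, not the published detector.
    """
    n_rows, n_cols = len(image), len(image[0])
    global_med = median(v for row in image for v in row)
    flagged = []
    for r0 in range(0, n_rows, block):
        for c0 in range(0, n_cols, block):
            values = [
                image[r][c]
                for r in range(r0, min(r0 + block, n_rows))
                for c in range(c0, min(c0 + block, n_cols))
            ]
            block_med = median(values)
            too_bright = block_med > ratio * global_med
            too_dim = block_med * ratio < global_med
            if too_bright or too_dim:
                flagged.append((r0 // block, c0 // block))
    return flagged


# Hypothetical 8 x 8 array: uniform background with a bright smear
# covering the top-left 4 x 4 corner.
probes = [
    (r, c, 1000.0 if (r < 4 and c < 4) else 100.0)
    for r in range(8)
    for c in range(8)
]
image = build_image(probes, n_rows=8, n_cols=8)
flagged = flag_blocks(image)
print(flagged)
```

Probes (and hence genes) that map into a flagged block are the ones whose measurements would warrant scrutiny in this simplified picture; a real pipeline would need the platform's probe layout and a far more robust detector.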

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5c3/8489595/3bc68ba0ff64/lqab089fig1.jpg
