Monitoring of technical variation in quantitative high-throughput datasets.

Authors

Lauss Martin, Visne Ilhami, Kriegner Albert, Ringnér Markus, Jönsson Göran, Höglund Mattias

Affiliation

Department of Oncology, Clinical Sciences, Lund University, Sweden.

Publication information

Cancer Inform. 2013 Sep 23;12:193-201. doi: 10.4137/CIN.S12862. eCollection 2013.

DOI:10.4137/CIN.S12862
PMID:24092958
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3785384/

Abstract

High-dimensional datasets can be confounded by variation from technical sources, such as batches. Undetected batch effects can have severe consequences for the validity of a study's conclusion(s). We evaluate high-throughput RNAseq and miRNAseq as well as DNA methylation and gene expression microarray datasets, mainly from the Cancer Genome Atlas (TCGA) project, in respect to technical and biological annotations. We observe technical bias in these datasets and discuss corrective interventions. We then suggest a general procedure to control study design, detect technical bias using linear regression of principal components, correct for batch effects, and re-evaluate principal components. This procedure is implemented in the R package swamp, and as graphical user interface software. In conclusion, high-throughput platforms that generate continuous measurements are sensitive to various forms of technical bias. For such data, monitoring of technical variation is an important analysis step.
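
The detect, correct, and re-evaluate steps named in the abstract can be illustrated with a minimal sketch. This is not the swamp package or its API. The function names, the synthetic data, and the choice of per-batch mean-centering as the correction are assumptions made for illustration only. Each principal component is regressed on one-hot batch indicators, and the resulting R^2 indicates how much of that component the batch annotation explains.

```python
import numpy as np


def principal_components(data, n_components=3):
    """Sample scores on the top principal components.

    data is a features x samples matrix (hypothetical layout for this sketch).
    """
    centered = data - data.mean(axis=1, keepdims=True)  # center each feature
    # SVD of the samples x features matrix; U * S gives the sample scores.
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def r_squared_vs_batch(scores, batch):
    """R^2 from a linear regression of each PC on one-hot batch indicators."""
    batch = np.asarray(batch)
    levels = np.unique(batch)
    # One-hot design matrix; its columns sum to 1, so it spans the intercept.
    design = (batch[:, None] == levels[None, :]).astype(float)
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    residual = scores - design @ coef
    ss_res = (residual ** 2).sum(axis=0)
    ss_tot = ((scores - scores.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


def center_by_batch(data, batch):
    """Subtract each feature's per-batch mean (one simple correction)."""
    batch = np.asarray(batch)
    corrected = data.astype(float).copy()
    for level in np.unique(batch):
        cols = batch == level
        corrected[:, cols] -= corrected[:, cols].mean(axis=1, keepdims=True)
    return corrected


# Synthetic example (not data from the paper): 200 features, 40 samples,
# two batches, with a constant shift added to half the features in batch 1.
rng = np.random.default_rng(0)
batch = np.repeat([0, 1], 20)
data = rng.normal(size=(200, 40))
data[:100, batch == 1] += 3.0

# Detect: how much of each leading PC does the batch annotation explain?
r2_before = r_squared_vs_batch(principal_components(data), batch)

# Correct, then re-evaluate the principal components.
corrected = center_by_batch(data, batch)
r2_after = r_squared_vs_batch(principal_components(corrected), batch)

print("R^2 of PCs vs batch before correction:", np.round(r2_before, 3))
print("R^2 of PCs vs batch after correction: ", np.round(r2_after, 3))
```

Per-batch centering is only reasonable when the biological groups are balanced across batches. If batch and biology are confounded, this correction would also remove biological signal, which is why the procedure begins with a check of the study design.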

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b0/3785384/04a49b4e79f7/cin-12-2013-193f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b0/3785384/0da0879a1a44/cin-12-2013-193f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b0/3785384/fa8a1a404e71/cin-12-2013-193f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b0/3785384/1eb7ddf85cb9/cin-12-2013-193f3.jpg
