Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
Division of Human Genetics, Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
Int J Mol Sci. 2021 Mar 25;22(7):3364. doi: 10.3390/ijms22073364.
RNA-seq is a powerful method for detecting differentially expressed genes and long non-coding RNAs (lncRNAs) in schizophrenia (SCZ) patients; however, because of overfitting, differentially expressed targets (DETs) cannot be used reliably as biomarkers. This study used machine learning to reduce the number of gene and non-coding RNA features. Dorsolateral prefrontal cortex (DLPFC) RNA-seq data from 254 individuals were obtained from the CommonMind Consortium. The average predictive accuracy for SCZ patients was 67% based on coding genes and 96% based on lncRNAs. Machine learning is a powerful approach for narrowing down functional biomarkers in SCZ patients. The lncRNAs capture the characteristics of SCZ tissue more accurately than mRNAs, as the former regulate every level of gene expression and are not limited to mRNA levels.
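The abstract does not name the feature-reduction algorithm or the classifier, so the following is only a minimal sketch of the general idea: reduce a large feature set, then estimate predictive accuracy by cross-validation. Everything in it is an assumption made for illustration: the data are synthetic, the t-statistic filter and nearest-centroid classifier are stand-ins rather than the study's method, and all function names are hypothetical. Feature selection is repeated inside each fold, which is one common way to limit the overfitting the abstract mentions.

```python
import random
import statistics


def make_synthetic(n_samples=60, n_features=200, n_informative=5, seed=0):
    """Toy expression matrix; the first n_informative features are shifted in cases."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # 0 = control, 1 = case (balanced by construction)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        if label == 1:
            for j in range(n_informative):
                row[j] += 2.0
        X.append(row)
        y.append(label)
    return X, y


def select_features(X, y, k):
    """Rank features by a Welch-style t statistic and keep the top k indices."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
        t = abs(statistics.mean(a) - statistics.mean(b)) / (se + 1e-12)
        scores.append((t, j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def nearest_centroid_predict(X_train, y_train, X_test, features):
    """Assign each test sample to the class with the closest centroid."""
    centroids = {}
    for lab in (0, 1):
        rows = [row for row, l in zip(X_train, y_train) if l == lab]
        centroids[lab] = [statistics.mean([r[j] for r in rows]) for j in features]
    preds = []
    for row in X_test:
        dist = {
            lab: sum((row[j] - c) ** 2 for j, c in zip(features, cen))
            for lab, cen in centroids.items()
        }
        preds.append(min(dist, key=dist.get))
    return preds


def cross_validated_accuracy(X, y, k_features=5, n_folds=5, seed=0):
    """Mean accuracy over folds, with feature selection redone inside each fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        X_te = [X[i] for i in test_idx]
        y_te = [y[i] for i in test_idx]
        # Select features on training samples only, so the test fold stays unseen.
        features = select_features(X_tr, y_tr, k_features)
        preds = nearest_centroid_predict(X_tr, y_tr, X_te, features)
        correct = sum(p == t for p, t in zip(preds, y_te))
        accuracies.append(correct / len(y_te))
    return statistics.mean(accuracies)


if __name__ == "__main__":
    X, y = make_synthetic()
    print("cross-validated accuracy:", cross_validated_accuracy(X, y))
```

Selecting features on the full data set before splitting would let information from the test samples leak into the model, which tends to inflate accuracy; keeping the selection step inside the loop avoids that in this toy setting.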