Machine learning application identifies novel gene signatures from transcriptomic data of spontaneous canine hemangiosarcoma.

Affiliations

School of Mathematics, College of Science and Engineering at the University of Minnesota, Minneapolis, MN, USA.

Animal Cancer Care and Research Program, Department of Veterinary Clinical Sciences, College of Veterinary Medicine at the University of Minnesota, St Paul, MN, USA.

Publication information

Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa252.

Abstract

Angiosarcomas are soft-tissue sarcomas that form malignant vascular tissues. Angiosarcomas are very rare, and due to their aggressive behavior and high metastatic propensity, they have poor clinical outcomes. Hemangiosarcomas commonly occur in domestic dogs, and share pathological and clinical features with human angiosarcomas. Typical pathognomonic features of this tumor are irregular vascular channels that are filled with blood and are lined by a mixture of malignant and nonmalignant endothelial cells. The current gold standard is the histological diagnosis of angiosarcoma; however, microscopic evaluation may be complicated, particularly when tumor cells are undetectable due to the presence of excessive amounts of nontumor cells or when tissue specimens have insufficient tumor content. In this study, we implemented machine learning applications from next-generation transcriptomic data of canine hemangiosarcoma tumor samples (n = 76) and nonmalignant tissues (n = 10) to evaluate their training performance for diagnostic utility. The 10-fold cross-validation test and multiple feature selection methods were applied. We found that extra trees and random forest learning models were the best classifiers for hemangiosarcoma in our testing datasets. We also identified novel gene signatures using the mutual information and Monte Carlo feature selection method. The extra trees model revealed high classification accuracy for hemangiosarcoma in validation sets. We demonstrate that high-throughput sequencing data of canine hemangiosarcoma are trainable for machine learning applications. Furthermore, our approach enables us to identify novel gene signatures as reliable determinants of hemangiosarcoma, providing significant insights into the development of potential applications for this vascular malignancy.
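
The workflow the abstract describes (10-fold cross-validation combined with mutual-information feature selection) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' pipeline: the two genes, their expression values, the threshold of 5.0 and every function name below are hypothetical, and only the sample counts (76 tumor, 10 nonmalignant) come from the abstract. Expression is binarized here purely to keep the mutual-information calculation simple. Feature scores are computed on the training folds only, so the held-out fold never influences selection.

```python
import math
import random
from collections import Counter


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Deal the samples of this class round-robin across the folds.
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds


def mutual_information(feature, labels):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    count_x = Counter(feature)
    count_y = Counter(labels)
    total = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y).
        total += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return total


def binarize(values, threshold):
    """Map expression values to 1 (above threshold) or 0 (otherwise)."""
    return [int(v > threshold) for v in values]


# Hypothetical toy data: only the class sizes are taken from the abstract.
labels = [1] * 76 + [0] * 10  # 1 = hemangiosarcoma, 0 = nonmalignant
genes = {
    # Made-up gene that is high in every tumor and low in every control.
    "gene_high_in_tumor": [8.0] * 76 + [2.0] * 10,
    # Made-up gene whose level alternates regardless of class.
    "gene_uninformative": [8.0 if i % 2 == 0 else 2.0 for i in range(86)],
}

folds = stratified_folds(labels, k=10)
top_gene_per_fold = []
for test_indices in folds:
    held_out = set(test_indices)
    train = [i for i in range(len(labels)) if i not in held_out]
    train_labels = [labels[i] for i in train]
    # Score each gene on the training samples only.
    scores = {
        name: mutual_information(
            binarize([values[i] for i in train], 5.0), train_labels
        )
        for name, values in genes.items()
    }
    top_gene_per_fold.append(max(scores, key=scores.get))
    # A classifier (for example extra trees or random forest) would be
    # fit on the training samples here and evaluated on test_indices.

print(top_gene_per_fold)
```

Stratifying the folds matters with a class imbalance like this one: with only 10 nonmalignant samples, an unstratified 10-fold split could leave some test folds with no nonmalignant sample at all.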
