NHLBI Integrated Cardiovascular Data Science Training Program at University of California, Los Angeles (UCLA), Suite 1-609, MRL Building, 675 Charles E. Young Dr. South, Los Angeles, CA 90095-1760, USA.
Department of Physiology, UCLA School of Medicine, Suite 1-609, MRL Building, 675 Charles E. Young Dr. South, Los Angeles, CA 90095-1760, USA.
Cardiovasc Res. 2022 Feb 21;118(3):732-745. doi: 10.1093/cvr/cvab067.
The search for new strategies to better understand cardiovascular (CV) disease is a constant one, spanning many types of observations and studies. A comprehensive characterization of each disease state and its biomolecular underpinnings relies on insights gleaned from the extensive collection of many types of data. Researchers and clinicians in CV biomedicine repeatedly face questions about which types of data best answer their questions, how to integrate information from multiple datasets of various types, and how to adapt emerging advances in machine learning and/or artificial intelligence to their data processing needs. Frequently lauded as a field with great practical and translational potential, the interface between biomedical informatics and CV medicine is challenged by staggeringly massive datasets. Successfully applying computational approaches to decode this complex and voluminous information is an essential step toward realizing the desired benefits. In this review, we examine recent efforts to adapt informatics strategies to CV biomedical research: automated information extraction and unification of multifaceted -omics data. We discuss how and why this interdisciplinary space of CV Informatics is particularly relevant to and supportive of current experimental and clinical research. We describe in detail how open data sources and methods can drive discovery while demanding few initial resources, an advantage afforded by the widespread availability of cloud computing-driven platforms. We then provide examples of how interoperable computational systems facilitate exploration of data from multiple sources, including both consistently formatted structured data and unstructured data. Taken together, these approaches for achieving data harmony enable molecular phenotyping of CV diseases and unification of CV knowledge.