Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, Italy.
Division of Radiology, IEO, European Institute of Oncology IRCCS, Milan, Italy.
J Digit Imaging. 2022 Aug;35(4):970-982. doi: 10.1007/s10278-022-00615-w. Epub 2022 Mar 16.
Integrating the information coming from biological samples with digital data, such as medical images, has gained prominence with the advent of precision medicine. Research in this field faces an ever-increasing amount of data to manage and, as a consequence, the need to structure these data in a functional and standardized fashion to promote and facilitate cooperation among institutions. Inspired by the Minimum Information About BIobank data Sharing (MIABIS), we propose an extended data model that aims to standardize data collections in which both biological and digital samples are involved. The proposed model places strong emphasis on the cause-effect relationships among factors, as these are frequently encountered in clinical workflows. To test the data model in a realistic context, we consider as a case study the Continuous Observation of SMOking Subjects (COSMOS) dataset, which consists of 10 consecutive years of lung cancer screening and follow-up on more than 5000 subjects. The structure of the COSMOS database, implemented to facilitate data retrieval, is presented along with a description of the data that we hope to share in a public repository for lung cancer screening research.