Streich Guillermo, Villalba Marcelo Blanco, Cid Christian, Bramuglia Guillermo F
Department of Oncology, Hospital Militar, Av Luis Maria Campos 726, Ciudad Autónoma de Buenos Aires (CABA), Argentina.
Department of Oncology, Centro Médico Austral, Montevideo 955 CABA, Argentina.
Ecancermedicalscience. 2022 Aug 4;16:1435. doi: 10.3332/ecancer.2022.1435. eCollection 2022.
Real-World Data (RWD) are data obtained outside systematised, randomised clinical trials. Registries based on RWD allow information to be collected from a large number of patients and enable a significant number of professionals to participate. PrecisaXperta is a web platform developed for this purpose, parameterised for oncology and in operation for more than 2 years. Its design allows an epidemiological database to be built in real time and exported for processing.
To describe the characteristics and operation of this online data recording tool, explain how it was developed and analyse the quality of the information recorded, taking as an example the data obtained for breast cancer.
Physicians, computer scientists and data science analysts participated in the development. The fields include patient data, history, educational level, diagnosis, staging, molecular markers, quality of life, types of treatment, progression and response, imaging, complications and adverse events, among others. Data handling in terms of encryption, anonymisation, protection and validation is also explained. The breast cancer data selected for description were processed with mid-level statistical software, since the volume of data required to apply Big Data engines is not yet available.
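As an illustration of the anonymisation step mentioned above, the following is a minimal sketch of keyed pseudonymisation. It is not the scheme used by PrecisaXperta, which this abstract does not specify; the function name `pseudonymise`, the identifier format and the key are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for a patient identifier.

    Assumption: a keyed hash (HMAC-SHA256) is acceptable for the registry.
    The same identifier and key always give the same pseudonym, so records
    for one patient can still be linked, while the original identifier
    cannot be read back from the stored value without the key.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


if __name__ == "__main__":
    key = b"hypothetical-registry-key"  # placeholder; a real key is kept secret
    print(pseudonymise("patient-0001", key))
```

Using a keyed hash rather than a plain hash means that someone who knows the identifier format cannot rebuild the mapping without also holding the key.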
From a total of 6,892 solid tumours, 1,892 were breast cancers, of which 1,654 met an ad hoc minimum data set and were selected. Cases came from 13 provinces and showed a geographic bias reflecting the place of practice of the professionals in the collaborative network. Missing data were most common for molecular markers (Ki67) and for the sequential ordering of some lines of treatment. Inconsistencies in dates and therapeutic schemes were also detected, and data curation made it possible to exclude them. The age of the patients was 55.3 ± 11.88 years. At diagnosis, stages I (36.48%) and II (30.06%) predominated, and hormone receptors were positive in 1,424 (89.96%) cases. The predominant treatments were hormonal (61.54%) and targeted therapy (30.85% for HER2(+) and 39.14% for HER2(-)), accompanied in most cases (85.9%) by some period of chemotherapy. Immunotherapy was much less represented (0.36%). The data were processed, homogenised, pooled and made accessible in a form suitable for RWD analyses.
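The curation described above (selecting cases that meet a minimum data set and excluding those with inconsistent dates) can be sketched as follows. This is an illustrative example only: the field names, the contents of `REQUIRED_FIELDS` and the rule that treatment cannot start before diagnosis are assumptions, since the abstract does not list the actual minimum data set or validation rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical minimum data set; the real list of fields is not given here.
REQUIRED_FIELDS = ("diagnosis_date", "stage", "hormone_receptor_status")


@dataclass
class Case:
    case_id: str
    diagnosis_date: Optional[date] = None
    treatment_start: Optional[date] = None
    stage: Optional[str] = None
    hormone_receptor_status: Optional[str] = None
    ki67: Optional[float] = None  # not required in this sketch


def missing_fields(case: Case) -> List[str]:
    """List the required fields that are empty for this case."""
    return [name for name in REQUIRED_FIELDS if getattr(case, name) is None]


def has_date_inconsistency(case: Case) -> bool:
    """Assumed rule: treatment cannot start before the diagnosis date."""
    if case.diagnosis_date is None or case.treatment_start is None:
        return False
    return case.treatment_start < case.diagnosis_date


def curate(cases: List[Case]) -> Tuple[List[Case], List[Case]]:
    """Split cases into those kept for analysis and those excluded."""
    kept, excluded = [], []
    for case in cases:
        if missing_fields(case) or has_date_inconsistency(case):
            excluded.append(case)
        else:
            kept.append(case)
    return kept, excluded


if __name__ == "__main__":
    sample = [
        Case("a", date(2020, 1, 10), date(2020, 2, 1), "I", "positive"),
        Case("b", date(2020, 1, 10), date(2019, 12, 1), "II", "positive"),
        Case("c", None, None, "I", "negative"),
    ]
    kept, excluded = curate(sample)
    print([c.case_id for c in kept], [c.case_id for c in excluded])
```

Enforcing the same checks at data entry, for example by making the required fields mandatory in the form, would catch these problems before they reach the database.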
PrecisaXperta fulfils its purpose of systematising information, and its simple, intuitive interface makes data entry easier. The analysis of the breast cancer data shows that some fields should be mandatory in order to improve the quality of the information. The results describing the registered breast cancers give a surface-level view of the affected population and prepare the ground for designing future studies once local Big Data are available. As it is disseminated, this type of development, with continuous improvements and online results, will give participating professionals information about what happens in the real world and open, democratic access to epidemiological data that they can use to study, investigate and publish.