UNLABELLED

Cancer registries are collections of curated data about malignant tumor diseases. The amount of data processed by cancer registries increases every year, making manual registration more and more tedious.

OBJECTIVE

We sought to develop an automatic analysis pipeline that would be able to identify and preprocess registry input for incident prostate adenocarcinomas in a French regional cancer registry.

METHODS

Notifications submitted to the Bas-Rhin cancer registry from several sources were used: pathology data, and ICD-10 diagnosis codes from hospital discharge data and healthcare insurance data. We trained a Support Vector Machine model (machine learning) to predict whether a patient's data should be considered an incident case of prostate adenocarcinoma and therefore registered. The final registration of all identified cases was manually confirmed by a specialized technician. Text mining tools (regular expressions) were used to extract clinical and biological data from unstructured pathology reports.
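The regular-expression step can be illustrated with a minimal sketch. The abstract does not give the registry's actual expressions or the wording of its pathology reports, so the patterns, the function name `extract_indicators`, and the sample report text below are all hypothetical.

```python
import re

# Hypothetical patterns: the registry's real expressions are not given in the abstract.
# PSA value followed by a unit, accepting a decimal comma or a decimal point.
PSA_RE = re.compile(
    r"\bPSA\b\D{0,20}?(\d+(?:[.,]\d+)?)\s*ng\s*/\s*ml", re.IGNORECASE
)
# Gleason grade written as "primary + secondary", optionally preceded by the total.
GLEASON_RE = re.compile(r"\bGleason\b.{0,20}?(\d)\s*\+\s*(\d)", re.IGNORECASE)


def extract_indicators(report: str) -> dict:
    """Return the PSA value and Gleason grades found in free text, or None if absent."""
    result = {"psa_ng_ml": None, "gleason": None}

    psa = PSA_RE.search(report)
    if psa:
        # Normalize a decimal comma before converting to float.
        result["psa_ng_ml"] = float(psa.group(1).replace(",", "."))

    gleason = GLEASON_RE.search(report)
    if gleason:
        primary, secondary = int(gleason.group(1)), int(gleason.group(2))
        result["gleason"] = (primary, secondary, primary + secondary)

    return result


# Invented sample text, for illustration only.
sample = "Adénocarcinome prostatique. Score de Gleason 7 (3+4). PSA : 8,5 ng/mL."
found = extract_indicators(sample)
```

A field left as `None` corresponds to the situation described in the results, where the information is simply not present in the report, which is distinct from an extraction error.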

RESULTS

We performed two successive analyses. First, we used 982 cases manually labeled by registrars from the 2014 dataset to predict the registration of 785 cases submitted in 2015. Then, we repeated the procedure using the 2089 cases labeled by registrars from the 2014 and 2015 datasets to predict the registration of 926 cases submitted in 2016. The algorithm identified 663 cases of prostate adenocarcinoma in 2015, and 610 in 2016. From these findings, 663 and 531 cases, respectively, were added to the registry, and 641 and 512 cases were confirmed by the specialized technician. This registration process achieved a precision level above 96%. The algorithm obtained an overall precision of 99% (99.5% in 2015 and 98.5% in 2016) and a recall of 97% (97.8% in 2015 and 96.9% in 2016). When the information was present in the pathology report, text mining achieved more than 90% accuracy for the major indicators (PSA test, Gleason score, and incidence date). For both PSA and tumor side, information was not detected in the majority of cases.
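As a quick check on the reported figures, the sketch below computes the share of added cases that the technician confirmed, using only the counts stated above. Treating this ratio as the "precision level above 96%" is an assumption about how that figure was derived; the abstract does not give the formula. The helper names are hypothetical, and the standard precision and recall definitions are included for reference only, since the underlying true-positive and false-negative counts are not reported.

```python
def confirmation_rate(confirmed: int, added: int) -> float:
    """Share of cases added to the registry that the technician confirmed."""
    if added <= 0:
        raise ValueError("added must be positive")
    return confirmed / added


def precision(true_pos: int, false_pos: int) -> float:
    """Standard definition: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Standard definition: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


# Counts stated in the abstract: year -> (added to registry, confirmed by technician).
reported = {2015: (663, 641), 2016: (531, 512)}

# Assumed reading of the "above 96%" figure: confirmed / added, per year.
rates = {
    year: confirmation_rate(confirmed, added)
    for year, (added, confirmed) in reported.items()
}

for year, rate in rates.items():
    print(f"{year}: {rate:.1%} of added cases confirmed")
```

Under this reading, both years come out above 96%, which is consistent with the stated precision level.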

CONCLUSION

Machine learning was able to identify new cases of prostate cancer, and text mining was able to prefill the data about incident cases. Machine-learning-based automation of the registration process could reduce delays in data production and allow investigators to devote more time to complex tasks and analysis.

Machine learning application for incident prostate adenocarcinomas automatic registration in a French regional cancer registry.
