
Analytical code sharing practices in biomedical research.

Authors

Sharma Nitesh Kumar, Ayyala Ram, Deshpande Dhrithi, Patel Yesha M, Munteanu Viorel, Ciorba Dumitru, Fiscutean Andrada, Vahed Mohammad, Sarkar Aditya, Guo Ruiwei, Moore Andrew, Darci-Maher Nicholas, Nogoy Nicole A, Abedalthagafi Malak S, Mangul Serghei

Affiliations

Titus Family Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, 1540 Alcazar Street, Los Angeles, CA 90033, USA.

Quantitative and Computational Biology Department, USC Dana and David Dornsife College of Letters, Arts, and Sciences, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA.

Publication information

bioRxiv. 2023 Aug 7:2023.07.31.551384. doi: 10.1101/2023.07.31.551384.

DOI: 10.1101/2023.07.31.551384
PMID: 37609176
Full text:
https://pmc.ncbi.nlm.nih.gov/articles/PMC10441317/

Abstract

Data-driven computational analysis is becoming increasingly important in biomedical research as the amount of data being generated continues to grow. However, research outputs such as data, source code, and methods are often not shared, which undermines the transparency and reproducibility of studies, both of which are critical to the advancement of science. Many published studies are not reproducible because insufficient documentation, code, and data are shared. We conducted a comprehensive analysis of 453 manuscripts published between 2016 and 2021 and found that 50.1% of them failed to share the analytical code. Even among those that did disclose their code, a vast majority failed to offer additional research outputs, such as data. Furthermore, only one in ten papers organized their code in a structured and reproducible manner. We discovered a significant association between the presence of code availability statements and increased code availability (p=2.71×10). Additionally, a greater proportion of studies conducting secondary analyses were inclined to share their code compared to those conducting primary analyses (p=1.15×10). In light of our findings, we propose raising awareness of code sharing practices and taking immediate steps to enhance code availability to improve reproducibility in biomedical research. By increasing transparency and reproducibility, we can promote scientific rigor, encourage collaboration, and accelerate scientific discoveries. We must prioritize open science practices, including sharing code, data, and other research products, to ensure that biomedical research can be replicated and built upon by others in the scientific community.
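
The association reported above, between having a code availability statement and actually sharing code, is the kind of question a 2x2 contingency table can answer. The abstract does not say which statistical test produced its p-values, so the sketch below uses a two-sided Fisher's exact test purely as one common choice for such tables. The function name `fisher_exact_two_sided` and the counts in the usage example are hypothetical placeholders, not data from the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be "statement present" / "statement absent" and columns
    "code shared" / "code not shared" (an assumed layout, for illustration).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)

    # Sum the probabilities of every table that is no more likely than the
    # observed one; the small tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical placeholder counts, NOT taken from the paper:
# statement present: 40 shared, 10 not shared
# statement absent:  15 shared, 35 not shared
p = fisher_exact_two_sided(40, 10, 15, 35)
print(f"two-sided p-value: {p:.3g}")
```

For real analyses, an established implementation such as `scipy.stats.fisher_exact` is usually preferable; the pure-Python version here only keeps the example dependency-free.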


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7dba/10441317/198ccb840819/nihpp-2023.07.31.551384v3-f0001.jpg

