Data hazards in synthetic biology.

Authors

Zelenka Natalie R, Di Cara Nina, Sharma Kieren, Sarvaharman Seeralan, Ghataora Jasdeep S, Parmeggiani Fabio, Nivala Jeff, Abdallah Zahraa S, Marucci Lucia, Gorochowski Thomas E

Affiliations

Jean Golding Institute, University of Bristol, Bristol, UK.

BrisEngBio, University of Bristol, Bristol, UK.

Publication information

Synth Biol (Oxf). 2024 Jun 21;9(1):ysae010. doi: 10.1093/synbio/ysae010. eCollection 2024.

DOI:10.1093/synbio/ysae010

PMID:38973982

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11227101/

Abstract

Data science is playing an increasingly important role in the design and analysis of engineered biology. This has been fueled by the development of high-throughput methods like massively parallel reporter assays, data-rich microscopy techniques, computational protein structure prediction and design, and the development of whole-cell models able to generate huge volumes of data. Although the ability to apply data-centric analyses in these contexts is appealing and increasingly simple to do, it comes with potential risks. For example, how might biases in the underlying data affect the validity of a result and what might the environmental impact of large-scale data analyses be? Here, we present a community-developed framework for assessing data hazards to help address these concerns and demonstrate its application to two synthetic biology case studies. We show the diversity of considerations that arise in common types of bioengineering projects and provide some guidelines and mitigating steps. Understanding potential issues and dangers when working with data and proactively addressing them will be essential for ensuring the appropriate use of emerging data-intensive AI methods and help increase the trustworthiness of their applications in synthetic biology.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8156/11227101/8b98dca5deeb/ysae010f1.jpg
