Shah Imran, Tate Tia, Patlewicz Grace
Center for Computational Toxicology and Exposure, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, NC 27709, USA.
Bioinformatics. 2021 Oct 11;37(19):3380-3381. doi: 10.1093/bioinformatics/btab210.
Generalized Read-Across (GenRA) is a data-driven approach to estimate physico-chemical, biological or eco-toxicological properties of chemicals by inference from analogues. GenRA attempts to mimic the manual read-across reasoning a human expert uses to fill data gaps for new chemicals from known chemicals, replacing it with an interpretable, automated approach based on nearest neighbors. A key objective of GenRA is to systematically explore different choices of input data and neighborhood definition in order to objectively evaluate the predictive performance of automated read-across estimates of chemical properties.
We have implemented genra-py as a Python package that can be freely used for chemical safety analysis and risk assessment applications. Automated read-across prediction in genra-py conforms to the estimator design pattern of the scikit-learn machine learning library, making it easy to use and to integrate into computational pipelines. We demonstrate the data-driven application of genra-py to two key human health risk assessment problems: hazard identification and point-of-departure estimation.
The package is available from github.com/i-shah/genra-py.
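
To illustrate what a nearest-neighbor read-across estimator following the scikit-learn design pattern might look like, the following is a minimal sketch. It is not the genra-py API: the class name `ReadAcrossSketch`, the use of Jaccard similarity on binary fingerprints, the similarity-weighted average over the k most similar analogues, and the toy data are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin


class ReadAcrossSketch(RegressorMixin, BaseEstimator):
    """Hypothetical similarity-weighted nearest-neighbor read-across.

    Illustrative only; this is NOT the genra-py implementation.
    """

    def __init__(self, n_neighbors=2):
        self.n_neighbors = n_neighbors

    def fit(self, X, y):
        # Store the source chemicals (binary fingerprints) and their known values.
        self.X_ = np.asarray(X, dtype=bool)
        self.y_ = np.asarray(y, dtype=float)
        return self

    def _jaccard(self, a):
        # Jaccard similarity between one query fingerprint and every source chemical.
        inter = np.logical_and(self.X_, a).sum(axis=1)
        union = np.logical_or(self.X_, a).sum(axis=1)
        return np.divide(inter, union, out=np.zeros(len(union)), where=union > 0)

    def predict(self, X):
        preds = []
        for a in np.asarray(X, dtype=bool):
            sim = self._jaccard(a)
            # Indices of the k most similar analogues, most similar first.
            idx = np.argsort(sim)[::-1][: self.n_neighbors]
            w = sim[idx]
            if w.sum() > 0:
                # Similarity-weighted average of the analogues' values.
                preds.append(np.average(self.y_[idx], weights=w))
            else:
                # No structural overlap at all: fall back to a plain mean.
                preds.append(self.y_[idx].mean())
        return np.array(preds)


# Toy data (made up): rows are chemicals, columns are fingerprint bits.
X_train = [[1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 1, 1]]
y_train = [1.0, 0.0, 0.0]  # e.g. a binary hazard outcome for each source chemical

model = ReadAcrossSketch(n_neighbors=2).fit(X_train, y_train)
pred = model.predict([[1, 1, 0, 0]])
print(pred)
```

Because the sketch exposes `fit` and `predict` and inherits `get_params`/`set_params` from `BaseEstimator`, an estimator of this shape can in principle be combined with scikit-learn utilities such as pipelines and cross-validation, which is the kind of integration the estimator pattern is meant to enable.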