Lee Kang-Woo, Lee Dong-Ho, Na In-Su, Kim Jin-Woo, Lee Na-Yeon, Park Jin-Soo, Park Keunwan, Nguyen Chau Hoang Bao, Kang Kyungsu, Shim Soon-Mi
Department of Food Science and Biotechnology, Sejong University, Seoul, South Korea.
Center for Natural Product Systems Biology, Gangneung Institute of Natural Products, Korea Institute of Science and Technology, Gangneung, South Korea.
J Sci Food Agric. 2025 Sep;105(12):6850-6861. doi: 10.1002/jsfa.14400. Epub 2025 May 30.
The present study aimed to measure bioavailability (BA) indicators, including epithelial barrier function, apparent permeability (Papp) and efflux ratio, for 84 phytochemicals using Caco-2 cells, and to develop predictive models using machine learning with a quantitative structure-property relationship (QSPR) approach based on these BA indicators and the Isomeric Simplified Molecular Input Line Entry System (SMILES) representation of each compound. The phytochemicals were analyzed with a validated HPLC method.
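As a rough illustration of the two permeability indicators named above, the sketch below uses the conventional Caco-2 definitions: Papp = (dQ/dt) / (A × C0), and efflux ratio = Papp(basolateral to apical) / Papp(apical to basolateral). The abstract does not state the exact formulas or units the authors used, so the function names, units and input values here are assumptions for illustration only.

```python
def apparent_permeability(dq_dt, area_cm2, c0):
    """Conventional Papp in cm/s (assumed definition, not taken from the abstract).

    dq_dt    -- transport rate into the receiver chamber, e.g. ug/s
    area_cm2 -- surface area of the cell monolayer, cm^2
    c0       -- initial donor concentration, e.g. ug/mL (= ug/cm^3)
    """
    if area_cm2 <= 0 or c0 <= 0:
        raise ValueError("area and initial concentration must be positive")
    return dq_dt / (area_cm2 * c0)


def efflux_ratio(papp_b_to_a, papp_a_to_b):
    """Ratio of basolateral-to-apical over apical-to-basolateral Papp."""
    if papp_a_to_b <= 0:
        raise ValueError("apical-to-basolateral Papp must be positive")
    return papp_b_to_a / papp_a_to_b


# Hypothetical numbers, not data from the study.
papp_ab = apparent_permeability(dq_dt=2e-4, area_cm2=2.0, c0=100.0)
papp_ba = apparent_permeability(dq_dt=6e-4, area_cm2=2.0, c0=100.0)
ratio = efflux_ratio(papp_ba, papp_ab)
```

An efflux ratio well above 1 is commonly read as a sign of active efflux, although the threshold applied in this study is not given in the abstract.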
Using these BA indicators, the isomeric SMILES of each phytochemical, which carries information such as stereochemistry, chemical structure and properties, was encoded into molecular descriptors using PaDEL-Descriptor and alvaDesc. The validity of the dataset was verified using principal component analysis, a leverage plot and a Williams plot. For transepithelial electrical resistance (TEER), the training set gave R² = 0.86 and root mean square error (RMSE) = 55.25, and the test set gave R² = 0.63 and RMSE = 74.77. For Papp, the model performed strongly on the training set (RMSE = 4.54 × 10, R² = 0.95) and on the test set (RMSE = 6.23 × 10, R² = 0.91). For the efflux ratio, the model explains 92% of the variance in the training set (RMSE = 0.39, R² = 0.92), with R² = 0.85 and RMSE = 0.71 on the test set.
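The two reported metrics can be computed as in the minimal sketch below. It assumes the common definitions RMSE = sqrt(mean((y − ŷ)²)) and R² = 1 − SS_res / SS_tot; the abstract does not say which R² variant the authors used, and the sample values are hypothetical rather than taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted values for a held-out test set.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
test_rmse = rmse(measured, predicted)
test_r2 = r_squared(measured, predicted)
```

Computing both metrics separately on the training and test sets, as the abstract reports, shows how much performance drops on unseen compounds; the TEER model's fall from R² = 0.86 to 0.63 is the largest such gap among the three indicators.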
The present study suggests that a prediction system for bioavailability indicators, including TEER, Papp and efflux ratio, can be developed using a QSPR model, which could contribute to advances in the discovery of functional ingredients and drugs. © 2025 The Author(s). Journal of the Science of Food and Agriculture published by John Wiley & Sons Ltd on behalf of Society of Chemical Industry.