Exploring food contents in scientific literature with FoodMine.

Affiliations

Network Science Institute, Northeastern University, Boston, MA, USA.

Division of Network Medicine, Department of Medicine, Harvard Medical School, Boston, MA, USA.

Publication information

Sci Rep. 2020 Oct 1;10(1):16191. doi: 10.1038/s41598-020-73105-0.

DOI:10.1038/s41598-020-73105-0
PMID:33004889
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7529743/

Abstract

Thanks to the many chemical and nutritional components it carries, diet critically affects human health. However, the currently available comprehensive databases on food composition cover only a tiny fraction of the total number of chemicals present in our food, focusing on the nutritional components essential for our health. Indeed, thousands of other molecules, many of which have well documented health implications, remain untracked. To explore the body of knowledge available on food composition, we built FoodMine, an algorithm that uses natural language processing to identify papers from PubMed that potentially report on the chemical composition of garlic and cocoa. After extracting from each paper information on the reported quantities of chemicals, we find that the scientific literature carries extensive information on the detailed chemical components of food that is currently not integrated in databases. Finally, we use unsupervised machine learning to create chemical embeddings, finding that the chemicals identified by FoodMine tend to have direct health relevance, reflecting the scientific community's focus on health-related chemicals in our food.
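
The abstract describes the retrieval step only at a high level. As a rough illustration, and not the authors' implementation, the sketch below builds a PubMed search URL for a food name through the public NCBI E-utilities `esearch` endpoint and flags abstracts that mention a quantity with a concentration-style unit. The search terms, the regular expression, and the function names `build_search_url` and `reports_quantities` are all assumptions made for this example.

```python
import re
from urllib.parse import urlencode

# NCBI E-utilities search endpoint for PubMed queries.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Hypothetical heuristic: a number followed by a concentration-style unit,
# e.g. "3.2 mg/g" or "12 mg/100 g". This is not FoodMine's actual filter.
QUANTITY_RE = re.compile(
    r"\d+(?:\.\d+)?\s*(?:mg|µg|ug|g|ng)\s*/\s*(?:100\s*)?(?:g|kg|ml|l)\b",
    re.IGNORECASE,
)


def build_search_url(food: str, retmax: int = 100) -> str:
    """Build a PubMed search URL for papers mentioning a food in title/abstract."""
    # The composition-related keywords are illustrative assumptions.
    term = f"{food}[Title/Abstract] AND (composition OR content OR concentration)"
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def reports_quantities(abstract: str) -> bool:
    """Return True if the abstract contains at least one quantity-with-unit mention."""
    return QUANTITY_RE.search(abstract) is not None
```

Calling `build_search_url("garlic")` yields a URL that can be fetched with any HTTP client, and `reports_quantities` can then be applied to each retrieved abstract as a coarse first-pass filter before manual review.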

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7303/7529743/af1e04d83dca/41598_2020_73105_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7303/7529743/349868a6a639/41598_2020_73105_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7303/7529743/7da632f425ad/41598_2020_73105_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7303/7529743/4d00572cf061/41598_2020_73105_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7303/7529743/137d30178ad8/41598_2020_73105_Fig5_HTML.jpg
