A taxonomy-free approach based on machine learning to assess the quality of rivers with diatoms.

Author information

MARE - Marine and Environmental Sciences Centre, Department of Life Sciences, University of Coimbra, Portugal.

Publication information

Sci Total Environ. 2020 Jun 20;722:137900. doi: 10.1016/j.scitotenv.2020.137900. Epub 2020 Mar 12.

Abstract

Diatoms are a compulsory biological quality element in the ecological assessment of rivers under the Water Framework Directive. Applying the current official indices requires identifying individuals to species or lower rank under a microscope, based on valve morphology. This is a highly time-consuming task and is often prone to disagreement among analysts. As an alternative, the use of DNA metabarcoding combined with High-Throughput Sequencing (HTS) has been proposed. The sequences obtained from environmental DNA are clustered into Operational Taxonomic Units (OTUs), which can be assigned to a taxon using reference databases and then used to calculate biotic indices. However, a high percentage of OTUs still cannot be assigned to species because reference libraries are incomplete. We therefore tested a new taxonomy-free approach based on diatom community samples to assess rivers. A combination of three machine learning techniques is used to build models that predict, from environmental data, the diatom OTUs expected at test sites under reference conditions. The Observed/Expected OTUs ratio indicates the deviation from reference condition and is converted into a quality class. This approach had never been used with diatoms or with OTU data. To evaluate its efficiency, we built one model based on OTU lists (HYDGEN) and another based on taxa lists from morphological identification (HYDMORPH), and also calculated a biotic index (IPS). The models were trained and tested with data from 81 sites (44 reference sites) in central Portugal. Both models were considered accurate (linear regression between Observed and Expected richness: R ≈ 0.7, intercept ≈ 0.8) and sensitive to global anthropogenic disturbance (Rs > 0.30, p < 0.006). However, the HYDGEN model, based on molecular data, was sensitive to more types of pressure (such as changes in land use and habitat quality), which makes it promising for the bioassessment of rivers.
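The Observed/Expected step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 0.5 probability cut-off, and the class boundaries are hypothetical values chosen for illustration only. The sketch assumes a model has already produced, for one test site, a probability of occurrence under reference conditions for each OTU.

```python
from typing import Dict, Sequence, Set, Tuple


def observed_expected_ratio(
    expected_prob: Dict[str, float],
    observed: Set[str],
    min_prob: float = 0.5,  # assumed cut-off, not taken from the paper
) -> float:
    """Return O/E for one site.

    expected_prob maps each OTU to its predicted probability of
    occurrence under reference conditions; observed is the set of
    OTUs actually found at the site.
    """
    # Keep only OTUs the model expects with at least min_prob.
    expected = {otu: p for otu, p in expected_prob.items() if p >= min_prob}
    e = sum(expected.values())  # expected richness
    if e == 0:
        raise ValueError("no OTU reaches the probability cut-off")
    # Observed richness counts only OTUs that were also expected.
    o = sum(1 for otu in expected if otu in observed)
    return o / e


def quality_class(
    oe: float,
    # Hypothetical lower boundaries for illustration only.
    boundaries: Sequence[Tuple[float, str]] = (
        (0.8, "High"),
        (0.6, "Good"),
        (0.4, "Moderate"),
        (0.2, "Poor"),
    ),
) -> str:
    """Map an O/E value onto one of five quality classes."""
    for lower, label in boundaries:
        if oe >= lower:
            return label
    return "Bad"


if __name__ == "__main__":
    probs = {"otu1": 0.9, "otu2": 0.6, "otu3": 0.5, "otu4": 0.2}
    ratio = observed_expected_ratio(probs, {"otu1"})
    print(round(ratio, 2), quality_class(ratio))
```

A ratio close to 1 means the site holds roughly the OTUs expected under reference conditions, and lower values indicate a larger deviation. In practice the class boundaries would be calibrated on the reference sites rather than fixed in advance.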
