Recognition of Conus species using a combined approach of supervised learning and deep learning-based feature extraction.

Author information

Qasmi Noshaba, Bibi Rimsha, Rashid Sajid

Affiliation

National Center for Bioinformatics, Quaid-i-Azam University, Islamabad, Pakistan.

Publication information

PLoS One. 2024 Dec 9;19(12):e0313329. doi: 10.1371/journal.pone.0313329. eCollection 2024.

Abstract

Cone snails are venomous marine gastropods comprising more than 950 species widely distributed across different habitats. Their conical shells are remarkably similar to those of other invertebrates in color, pattern, and size, which makes assigning taxonomic signatures to cone snail shells a challenging task. In this report, we propose an ensemble learning strategy that combines the Random Forest (RF) and XGBoost (XGB) methods. We used 47,600 cone shell images of uniform size (224 × 224 pixels), split into training and test sets at an 80:20 ratio. The images were pre-processed and transformed before any subsequent operations. Features were extracted with a deep learning model (VGG16, the 16-layer Visual Geometry Group architecture), and model specificity was further assessed by including multiple related and unrelated seashell images. Both classifiers demonstrated comparable recognition ability on random test samples. The evaluation results suggested that RF outperformed XGB in recognizing Conus species, with an average precision of 95.78%. The area under the receiver operating characteristic curve was 0.99, indicating near-optimal model performance. The learning and validation curves also demonstrated a robust fit, with the training score reaching 1 and the validation score gradually increasing to 95 as more data were provided. These values indicate a well-trained model that generalizes effectively to validation data without significant overfitting. The gradual improvement in the validation score curve is important for ensuring model reliability and minimizing the risk of overfitting. Our findings revealed an interactive visualization. The performance of our proposed model suggests its potential for use with datasets of other mollusks, for which it may achieve good results in categorization and taxonomic characterization.
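
To make the numbers in the abstract concrete, the following is a minimal Python sketch, using only the standard library, of an 80:20 split and of precision averaged over classes. The abstract does not state how the split was drawn or how the average precision was computed, so the shuffled split, the macro average, the function names, and the toy labels are all assumptions made for illustration. The actual pipeline (VGG16 features passed to RF and XGB classifiers) is not reproduced here.

```python
import random
from collections import Counter


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into (train, test) lists.

    Hypothetical helper: a plain shuffled split, not necessarily the
    procedure used in the paper.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def macro_precision(y_true, y_pred):
    """Average per-class precision TP / (TP + FP).

    Classes that are never predicted have undefined precision and are
    skipped here (an assumption; other conventions count them as 0).
    """
    predicted = Counter(y_pred)  # TP + FP for each predicted class
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)  # TP
    per_class = [correct[c] / predicted[c] for c in predicted]
    return sum(per_class) / len(per_class)


# Split sizes for the 47,600 images mentioned in the abstract.
train_idx, test_idx = train_test_split_indices(47_600)
print(len(train_idx), len(test_idx))

# Toy labels (invented for illustration, not data from the paper).
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
print(macro_precision(y_true, y_pred))
```

With 47,600 images, an 80:20 split yields 38,080 training images and 9,520 test images. In the toy example, class "a" has precision 1/1 and class "b" has precision 2/3, so the macro average is 5/6.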

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c9a7/11627371/76cea4996a9c/pone.0313329.g001.jpg
