Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of AI Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China.
School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China.
Genome Med. 2022 Apr 26;14(1):43. doi: 10.1186/s13073-022-01047-5.
The taxonomic structure of microbial community samples is highly habitat-specific, which makes source tracking possible: the niches from which samples originate can be identified. However, current methods face challenges when source tracking is scaled up. Here, we introduce ONN4MST, a deep learning method based on an ontology-aware neural network, for large-scale source tracking. ONN4MST outperformed other methods, with near-optimal accuracy, when source tracking among 125,823 samples from 114 niches. ONN4MST also has a broad spectrum of applications. Overall, this study presents the first model-based method for source tracking among sub-million microbial community samples from hundreds of niches, with superior speed, accuracy, and interpretability. ONN4MST is available at https://github.com/HUST-NingKang-Lab/ONN4MST.