Liu Renming, Krishnan Arjun
Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA.
Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA.
Bioinformatics. 2021 Oct 11;37(19):3377-3379. doi: 10.1093/bioinformatics/btab202.
Learning low-dimensional representations (embeddings) of nodes in large graphs is key to applying machine learning on massive biological networks. Node2vec is the most widely used method for node embedding. However, its original Python and C++ implementations scale poorly with network density and fail on dense biological networks with hundreds of millions of edges. We have developed PecanPy, a new Python implementation of node2vec that combines cache-optimized compact graph data structures with precomputation and parallelization to produce high-quality node embeddings quickly for biological networks of all sizes and densities.
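To make the idea of a compact graph data structure concrete, the following is a minimal sketch, not PecanPy's actual code. It assumes a compressed sparse row (CSR) layout as one common compact representation (the abstract does not name the structure) and computes the node2vec second-order transition weights on the fly for an unweighted, undirected graph. The function names `build_csr`, `has_edge` and `node2vec_walk` are hypothetical and chosen only for illustration; `p` and `q` are the usual node2vec return and in-out parameters.

```python
import random
from bisect import bisect_left


def build_csr(num_nodes, edges):
    """Build CSR arrays (indptr, indices) for an undirected, unweighted graph."""
    neighbors = [set() for _ in range(num_nodes)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    indptr, indices = [0], []
    for nbrs in neighbors:
        indices.extend(sorted(nbrs))  # sorted rows allow binary search
        indptr.append(len(indices))
    return indptr, indices


def has_edge(indptr, indices, u, v):
    """Check whether v is a neighbor of u by binary search within u's row."""
    lo, hi = indptr[u], indptr[u + 1]
    i = bisect_left(indices, v, lo, hi)
    return i < hi and indices[i] == v


def node2vec_walk(indptr, indices, start, length, p=1.0, q=1.0, rng=None):
    """Generate one biased (second-order) random walk starting at `start`."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = indices[indptr[cur]:indptr[cur + 1]]
        if not nbrs:  # dead end: isolated node
            break
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for x in nbrs:
            if x == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif has_edge(indptr, indices, prev, x):
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk
```

In this sketch, each walk depends only on the read-only `indptr` and `indices` arrays, so walks from different start nodes could be generated independently in parallel. Precomputing the transition weights instead of recomputing them at every step would trade memory for speed; which trade-off suits a given network is not something this sketch settles.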
PecanPy software is freely available at https://github.com/krishnanlab/PecanPy.
Supplementary data are available at Bioinformatics online.