Mathur Sachin, Beauvais Mathieu, Giribet Arnau, Barrero Nicolas Aragon, Zhang Chaorui-Tom, Rahman Towsif, Wang Seqian, Huang Jeremy, Nouri Nima, Kurlovs Andre, Bar-Joseph Ziv, Passban Peyman
R&D Data and Computational Sciences, Sanofi, Cambridge, MA 02141, United States.
R&D Data and Computational Sciences, Sanofi, Gentilly 94255, France.
Bioinformatics. 2025 Mar 29;41(4). doi: 10.1093/bioinformatics/btaf158.
Several methods have been developed for trajectory inference in single-cell studies. However, identifying relevant lineages among several cell types and interpreting the results of downstream analyses remain challenging tasks that require a deep understanding of cell type transitions and progression patterns. There is therefore a need for methods that help researchers analyze and interpret such trajectories.
We developed PyEvoCell, a dashboard for trajectory interpretation and analysis that is augmented by large language model (LLM) capabilities. PyEvoCell applies the LLM to the outputs of trajectory inference methods such as Monocle3 to suggest biologically relevant lineages. Once a lineage is defined, users can conduct differential expression and functional analyses, the results of which are also interpreted by the LLM. Finally, any hypothesis or claim derived from the analysis can be checked with the veracity filter, an LLM-enabled feature that confirms or rejects claims by providing relevant PubMed citations.
The software is available at https://github.com/Sanofi-Public/PyEvoCell and https://doi.org/10.5281/zenodo.15114803. It includes installation instructions, a user manual, demo datasets, and license conditions.