Gupta Shruti, Verma Ajay Kumar, Ahmad Shandar
School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Mehrauli Road, New Delhi 110067, India.
Genes (Basel). 2020 Dec 28;12(1):28. doi: 10.3390/genes12010028.
Single-cell transcriptomics data, when combined with in situ hybridization patterns of specific genes, can help recover the spatial information lost during cell isolation. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) consortium conducted a crowd-sourced competition, the DREAM Single Cell Transcriptomics Challenge (SCTC), to predict the masked locations of single cells using sets of 60, 40 and 20 genes chosen from the 84 genes with known in situ patterns in the embryo. We applied a genetic algorithm (GA), in combination with the base distance-mapping algorithm DistMap, to select the genes that carry the most positional and proximity information about the single-cell origins. The resulting gene selection performed well and was ranked among the top 10 in two of the three sub-challenges. However, the details of the method did not make it into the main challenge publication, owing to an intricate aggregate ranking. In this work, we discuss the detailed implementation of the GA and its post-challenge parameterization, with a view to identifying areas where GA-based gene-set selection for topological association prediction may be made more effective. We believe this work provides additional insight into feature-selection strategies and their relevance to single-cell similarity prediction, and that it will form a strong addendum to the recently published work from the consortium.
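To make the idea of GA-based gene-set selection concrete, the following is a minimal sketch of a genetic algorithm that searches for a fixed-size subset of genes. It is not the authors' implementation: the abstract does not specify the GA operators or parameters, so the truncation selection, the union-based crossover, the single-swap mutation, and all parameter defaults are assumptions made for illustration. The function `toy_fitness` and the set `INFORMATIVE` are hypothetical stand-ins; in the setting described above, the fitness would instead score how well DistMap recovers cell locations from the candidate genes.

```python
import random


def run_ga(n_genes, subset_size, fitness, pop_size=30, generations=40,
           mutation_rate=0.1, seed=0):
    """Search for a gene subset of fixed size that maximizes `fitness`.

    Genes are represented by integer indices 0 .. n_genes - 1.
    All operators and defaults here are illustrative assumptions.
    """
    if not 0 < subset_size < n_genes:
        raise ValueError("subset_size must be between 1 and n_genes - 1")
    if pop_size < 4:
        raise ValueError("pop_size must be at least 4")

    rng = random.Random(seed)
    all_genes = list(range(n_genes))

    def random_subset():
        return frozenset(rng.sample(all_genes, subset_size))

    def crossover(a, b):
        # Keep genes shared by both parents, then fill up from their union.
        child = set(a & b)
        pool = sorted((a | b) - child)
        rng.shuffle(pool)
        child.update(pool[: subset_size - len(child)])
        return child

    def mutate(subset):
        # With some probability, swap one selected gene for an unselected one.
        subset = set(subset)
        if rng.random() < mutation_rate:
            out_gene = rng.choice(sorted(subset))
            in_gene = rng.choice([g for g in all_genes if g not in subset])
            subset.remove(out_gene)
            subset.add(in_gene)
        return frozenset(subset)

    population = [random_subset() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[: pop_size // 2]  # truncation selection with elitism
        children = [mutate(crossover(*rng.sample(elite, 2)))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return max(population, key=fitness)


# Hypothetical stand-in fitness: counts how many chosen genes fall in an
# arbitrary "informative" set. A real fitness would evaluate location
# predictions made with the candidate genes.
INFORMATIVE = frozenset(range(20))


def toy_fitness(subset):
    return len(subset & INFORMATIVE)


if __name__ == "__main__":
    best = run_ga(n_genes=84, subset_size=20, fitness=toy_fitness)
    print(sorted(best), toy_fitness(best))
```

Representing each candidate as a set of exactly `subset_size` gene indices keeps every individual valid for a given sub-challenge (60, 40 or 20 genes), so no repair step is needed after crossover or mutation.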