A multistage mathematical approach to automated clustering of high-dimensional noisy data.

Author information

Friedman Alexander, Keselman Michael D, Gibb Leif G, Graybiel Ann M

Affiliations

McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139.

Publication information

Proc Natl Acad Sci U S A. 2015 Apr 7;112(14):4477-82. doi: 10.1073/pnas.1503940112. Epub 2015 Mar 23.

DOI:10.1073/pnas.1503940112

PMID:25831512

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4394271/

Abstract

A critical problem faced in many scientific fields is the adequate separation of data derived from individual sources. Often, such datasets require analysis of multiple features in a highly multidimensional space, with overlap of features and sources. The datasets generated by simultaneous recording from hundreds of neurons emitting phasic action potentials have produced the challenge of separating the recorded signals into independent data subsets (clusters) corresponding to individual signal-generating neurons. Mathematical methods have been developed over the past three decades to achieve such spike clustering, but a complete solution with fully automated cluster identification has not been achieved. We propose here a fully automated mathematical approach that identifies clusters in multidimensional space through recursion, which combats the multidimensionality of the data. Recursion is paired with an approach to dimensional evaluation, in which each dimension of a dataset is examined for its informational importance for clustering. The dimensions offering greater informational importance are given added weight during recursive clustering. To combat strong background activity, our algorithm takes an iterative approach of data filtering according to a signal-to-noise ratio metric. The algorithm finds cluster cores, which are thereafter expanded to include complete clusters. This mathematical approach can be extended from its prototype context of spike sorting to other datasets that suffer from high dimensionality and background activity.
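The abstract names two ideas that are easy to illustrate in isolation: weighting dimensions by how informative they are for clustering, and finding dense cluster cores that are then expanded into complete clusters. The following is a minimal sketch of those two ideas only, not the authors' algorithm. It omits the recursion and the signal-to-noise filtering, and every function name, the Otsu-style separability score, the median density threshold, and the `radius` parameter are assumptions made for illustration.

```python
import numpy as np


def separability(x):
    """Otsu-style score in [0, 1]: best two-group between-class variance
    divided by total variance (an assumed stand-in for 'informational importance')."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    total_var = xs.var()
    if n < 2 or total_var == 0.0:
        return 0.0
    csum = np.cumsum(xs)
    k = np.arange(1, n)                      # size of the left group
    mean_left = csum[:-1] / k
    mean_right = (csum[-1] - csum[:-1]) / (n - k)
    between = (k / n) * ((n - k) / n) * (mean_left - mean_right) ** 2
    return float(between.max() / total_var)


def dimension_weights(X):
    """Normalize the per-dimension scores so the weights sum to 1."""
    scores = np.array([separability(X[:, j]) for j in range(X.shape[1])])
    total = scores.sum()
    if total == 0.0:
        return np.full(X.shape[1], 1.0 / X.shape[1])
    return scores / total


def weighted_distances(X, w):
    """Pairwise Euclidean distances after scaling each dimension by sqrt(w)."""
    Z = X * np.sqrt(w)
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def find_cores(D, radius):
    """Label dense points: connected components among points whose
    neighbor count within `radius` is at or above the median count."""
    density = (D <= radius).sum(axis=1)
    is_core = density >= np.median(density)
    labels = np.full(D.shape[0], -1)         # -1 marks non-core points
    next_label = 0
    for seed in np.flatnonzero(is_core):
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:
            i = stack.pop()
            linked = np.flatnonzero(is_core & (D[i] <= radius) & (labels == -1))
            labels[linked] = next_label
            stack.extend(linked.tolist())
        next_label += 1
    return labels


def expand_cores(D, core_labels):
    """Give every point the label of its nearest core point."""
    core_idx = np.flatnonzero(core_labels != -1)
    nearest = core_idx[np.argmin(D[:, core_idx], axis=1)]
    return core_labels[nearest]


# Synthetic data: two sources separated along dimension 0; dimension 1 is noise.
rng = np.random.default_rng(0)
a = rng.normal([0.0, 0.0], 1.0, size=(200, 2))
b = rng.normal([10.0, 0.0], 1.0, size=(200, 2))
X = np.vstack([a, b])

w = dimension_weights(X)                     # dimension 0 should dominate
D = weighted_distances(X, w)
labels = expand_cores(D, find_cores(D, radius=1.5))
```

In this toy setup the separating dimension receives the larger weight, and each synthetic source ends up with its own label. The full pairwise distance matrix keeps the sketch short but scales quadratically with the number of points, so it is suitable only for small demonstrations.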

Similar articles

A multistage mathematical approach to automated clustering of high-dimensional noisy data.

Proc Natl Acad Sci U S A. 2015 Apr 7;112(14):4477-82. doi: 10.1073/pnas.1503940112. Epub 2015 Mar 23.

Noise-robust unsupervised spike sorting based on discriminative subspace learning with outlier handling.

J Neural Eng. 2017 Jun;14(3):036003. doi: 10.1088/1741-2552/aa6089. Epub 2017 Feb 15.

A Fully Automated Approach to Spike Sorting.

Neuron. 2017 Sep 13;95(6):1381-1394.e6. doi: 10.1016/j.neuron.2017.08.030.

Cluster tendency assessment in neuronal spike data.

PLoS One. 2019 Nov 12;14(11):e0224547. doi: 10.1371/journal.pone.0224547. eCollection 2019.

In quest of the missing neuron: spike sorting based on dominant-sets clustering.

Comput Methods Programs Biomed. 2012 Jul;107(1):28-35. doi: 10.1016/j.cmpb.2011.10.015. Epub 2011 Dec 2.

Automatic online spike sorting with singular value decomposition and fuzzy C-mean clustering.

BMC Neurosci. 2012 Aug 8;13:96. doi: 10.1186/1471-2202-13-96.

High-dimensional cluster analysis with the masked EM algorithm.

Neural Comput. 2014 Nov;26(11):2379-94. doi: 10.1162/NECO_a_00661. Epub 2014 Aug 22.

SpikeDeep-classifier: a deep-learning based fully automatic offline spike sorting algorithm.

J Neural Eng. 2021 Feb 5;18(1). doi: 10.1088/1741-2552/abc8d4.

A review on cluster estimation methods and their application to neural spike data.

J Neural Eng. 2018 Jun;15(3):031003. doi: 10.1088/1741-2552/aab385. Epub 2018 Mar 2.

t-SNE Visualization of Large-Scale Neural Recordings.

Neural Comput. 2018 Jul;30(7):1750-1774. doi: 10.1162/neco_a_01097. Epub 2018 Jun 12.

Cited by

Striosomes Mediate Value-Based Learning Vulnerable in Age and a Huntington's Disease Model.

Cell. 2020 Nov 12;183(4):918-934.e49. doi: 10.1016/j.cell.2020.09.060. Epub 2020 Oct 27.

Remembered reward locations restructure entorhinal spatial maps.

Science. 2019 Mar 29;363(6434):1447-1452. doi: 10.1126/science.aav5297.

HOPE: Hybrid-Drive Combining Optogenetics, Pharmacology and Electrophysiology.

Front Neural Circuits. 2018 May 16;12:41. doi: 10.3389/fncir.2018.00041. eCollection 2018.

Inversely Active Striatal Projection Neurons and Interneurons Selectively Delimit Useful Behavioral Sequences.

Curr Biol. 2018 Feb 19;28(4):560-573.e5. doi: 10.1016/j.cub.2018.01.031. Epub 2018 Feb 8.

Bio-inspired benchmark generator for extracellular multi-unit recordings.

Sci Rep. 2017 Feb 24;7:43253. doi: 10.1038/srep43253.

Reliable Analysis of Single-Unit Recordings from the Human Brain under Noisy Conditions: Tracking Neurons over Hours.

PLoS One. 2016 Dec 8;11(12):e0166598. doi: 10.1371/journal.pone.0166598. eCollection 2016.

Analysis of complex neural circuits with nonlinear multidimensional hidden state models.

Proc Natl Acad Sci U S A. 2016 Jun 7;113(23):6538-43. doi: 10.1073/pnas.1606280113. Epub 2016 May 24.

References

High-dimensional cluster analysis with the masked EM algorithm.

Neural Comput. 2014 Nov;26(11):2379-94. doi: 10.1162/NECO_a_00661. Epub 2014 Aug 22.

Estimation of templates and timings of spikes in extracellular voltage signals containing overlaps of the arbitrary number of spikes.

Annu Int Conf IEEE Eng Med Biol Soc. 2013;2013:1992-5. doi: 10.1109/EMBC.2013.6609920.

Applicability of independent component analysis on high-density microelectrode array recordings.

J Neurophysiol. 2012 Jul;108(1):334-48. doi: 10.1152/jn.01106.2011. Epub 2012 Apr 4.

Spike sorting of heterogeneous neuron types by multimodality-weighted PCA and explicit robust variational Bayes.

Front Neuroinform. 2012 Mar 19;6:5. doi: 10.3389/fninf.2012.00005. eCollection 2012.

Performance comparison of extracellular spike sorting algorithms for single-channel recordings.

J Neurosci Methods. 2012 Jan 30;203(2):369-76. doi: 10.1016/j.jneumeth.2011.10.013. Epub 2011 Oct 21.

Comprehensive cluster analysis with Transitivity Clustering.

Nat Protoc. 2011 Mar;6(3):285-95. doi: 10.1038/nprot.2010.197. Epub 2011 Feb 10.

Automation of an algorithm based on fuzzy clustering for analyzing tumoral heterogeneity in human skin carcinoma tissue sections.

Lab Invest. 2011 May;91(5):799-811. doi: 10.1038/labinvest.2011.13. Epub 2011 Feb 28.

Quantifying the isolation quality of extracellularly recorded action potentials.

J Neurosci Methods. 2007 Jul 30;163(2):267-82. doi: 10.1016/j.jneumeth.2007.03.012. Epub 2007 Mar 24.

Quantitative measures of cluster quality for use in extracellular recordings.

Neuroscience. 2005;131(1):1-11. doi: 10.1016/j.neuroscience.2004.09.066.

Multiple neural spike train data analysis: state-of-the-art and future challenges.

Nat Neurosci. 2004 May;7(5):456-61. doi: 10.1038/nn1228.
