Buswinka Christopher J, Rosenberg David B, Simikyan Rubina G, Osgood Richard T, Fernandez Katharine, Nitta Hidetomi, Hayashi Yushi, Liberman Leslie W, Nguyen Emily, Yildiz Erdem, Kim Jinkyung, Jarysta Amandine, Renauld Justine, Wesson Ella, Wang Haobing, Thapa Punam, Bordiga Pierrick, McMurtry Noah, Llamas Juan, Kitcher Siân R, López-Porras Ana I, Cui Runjia, Behnammanesh Ghazaleh, Bird Jonathan E, Ballesteros Angela, Vélez-Ortega A Catalina, Edge Albert S B, Deans Michael R, Gnedeva Ksenia, Shrestha Brikha R, Manor Uri, Zhao Bo, Ricci Anthony J, Tarchini Basile, Basch Martín L, Stepanyan Ruben, Landegger Lukas D, Rutherford Mark A, Liberman M Charles, Walters Bradley J, Kros Corné J, Richardson Guy P, Cunningham Lisa L, Indzhykulian Artur A
Eaton Peabody Laboratories, Mass Eye and Ear, Boston, MA, 02114, USA.
Department of Otolaryngology, Head and Neck Surgery, Harvard Medical School, Boston, MA, 02114, USA.
Sci Data. 2024 Apr 23;11(1):416. doi: 10.1038/s41597-024-03218-y.
Our sense of hearing is mediated by cochlear hair cells, of which there are two types, organized in one row of inner hair cells and three rows of outer hair cells. Each cochlea contains 5,000-15,000 terminally differentiated hair cells, and their survival is essential for hearing because they do not regenerate after insult. In hearing research, it is often desirable to quantify the number of hair cells within cochlear samples, both in pathological conditions and in response to treatment. Machine learning can be used to automate the quantification process but requires a vast and diverse dataset for effective training. In this study, we present a large collection of annotated cochlear hair-cell datasets, labeled with commonly used hair-cell markers and imaged using various fluorescence microscopy techniques. The collection includes samples from mouse, rat, guinea pig, pig, primate, and human cochlear tissue, from normal conditions and following in vivo and in vitro ototoxic drug application. The dataset includes over 107,000 hair cells, each identified and annotated as either an inner or an outer hair cell. This dataset is the result of a collaborative effort among multiple laboratories and has been carefully curated to represent a variety of imaging techniques. With suggested usage parameters and a well-described annotation procedure, this collection can facilitate the development of generalizable cochlear hair-cell detection models or serve as a starting point for fine-tuning models for other analysis tasks. By providing this dataset, we aim to give other hearing research groups the opportunity to develop their own tools with which to analyze cochlear imaging data more fully, accurately, and with greater ease.