Herpertz Julian, Dwyer Bridget, Taylor Jacob, Opel Nils, Torous John
Department of Psychiatry and Psychotherapy, Jena University Hospital, Jena, Germany.
Division of Digital Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA.
Sci Rep. 2025 Apr 6;15(1):11775. doi: 10.1038/s41598-025-96369-w.
Despite regulatory efforts, many smartphone health applications remain unregulated, raising concerns about privacy, security, and evidence-based effectiveness. The lack of standardized regulation has led to the proliferation of over 130 frameworks, each introducing new criteria and methodologies for app evaluation. The sheer number of frameworks, coupled with their varying approaches to app evaluation, creates challenges for comparison. Our study aims to synthesize existing knowledge and propose a standardized app evaluation framework. We conducted a synthesis of reviews on health app evaluation frameworks. Using natural language processing (NLP), we analyzed evaluation domains and grouped them into clusters based on semantic similarities. Standardized definitions for these clusters were developed. We identified eight review articles that met the inclusion criteria, each proposing between six and 17 app evaluation domains. Using NLP, we identified five clusters of app evaluation: Effectiveness & Development, Technology & Functionality, Validity & Legal, Safety & Privacy, and Implementation & Ethics, each of which was assigned a standardized definition. The clusters align with but expand on the American Psychiatric Association's evaluation domains, incorporating critical aspects such as inclusivity, safety, engagement, and ethical principles. Temporal analysis revealed an increasing focus on Effectiveness & Development, while attention to Safety & Privacy stagnated over time.
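The grouping step described above can be illustrated with a minimal sketch. The abstract does not state which NLP model or clustering algorithm was used, so everything here is an assumption: token overlap (Jaccard similarity) stands in for semantic similarity, a simple single-linkage merge stands in for the clustering method, and the domain labels, the `STOPWORDS` set, and the 0.3 threshold are hypothetical values chosen for illustration, not taken from the reviewed frameworks.

```python
from itertools import combinations

# Hypothetical stopword list; a real pipeline would likely use embeddings instead.
STOPWORDS = {"and", "of", "the", "&"}

def tokens(label):
    """Lowercase a domain label and return its non-stopword tokens as a set."""
    return {t for t in label.lower().split() if t not in STOPWORDS}

def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a crude stand-in for semantic similarity."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def cluster(labels, threshold=0.3):
    """Single-linkage grouping: any pair at or above the threshold is merged."""
    parent = {label: label for label in labels}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(labels, 2):
        if jaccard(a, b) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for label in labels:
        groups.setdefault(find(label), []).append(label)
    return sorted(groups.values())

# Hypothetical domain labels, invented for illustration only.
domains = [
    "data privacy",
    "privacy and security",
    "data security",
    "clinical effectiveness",
    "evidence of effectiveness",
    "usability",
]
clusters = cluster(domains)
for group in clusters:
    print(group)
```

Token overlap only links labels that share words, so it would miss synonyms such as "confidentiality" and "privacy"; a sentence-embedding model with cosine similarity would be the more plausible choice for grouping by meaning.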