Province clustering based on the percentage of communicable disease using the BCBimax biclustering algorithm.

Affiliations

IPB Bogor University, Bogor.

Airlangga University, Surabaya.

Publication information

Geospat Health. 2023 Sep 12;18(2). doi: 10.4081/gh.2023.1202.

Abstract

Indonesia needs to lower its high infectious disease rate. This requires reliable data and following their temporal changes across provinces. We investigated the benefits of surveying the epidemiological situation with the BCBimax biclustering algorithm using secondary data from a recent national-scale survey of the main infectious diseases from the National Basic Health Research (Riskesdas) covering 34 provinces in Indonesia. Hierarchical and k-means clustering can only handle one data source, but BCBimax biclustering can cluster rows and columns in a data matrix. Several experiments determined the best row and column threshold values, which is crucial for a useful result. The percentages of Indonesia's seven most common infectious diseases (ARI, pneumonia, diarrhoea, tuberculosis (TB), hepatitis, malaria, and filariasis) were ordered by province to form groups without considering proximity because clusters are usually far apart. ARI, pneumonia, and diarrhoea were divided into toddler and adult infections, making 10 target diseases instead of seven. The set of biclusters formed based on the presence and level of these diseases included 7 diseases with moderate to high disease levels, 5 diseases (formed by 2 clusters), 3 diseases, 2 diseases, and a final one that included only adult diarrhoea. In 6 of 8 clusters, diarrhoea was the most prevalent infectious disease in Indonesia, making its eradication a priority. Direct person-to-person infections like ARI, pneumonia, TB, and diarrhoea were found in 4-6 of 8 clusters. These diseases are more common and spread faster than vector-borne diseases like malaria and filariasis, making them more important.
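The idea of clustering rows and columns together can be illustrated with a small sketch. It is not the authors' code, and it does not reproduce the divide-and-conquer Bimax procedure. It assumes the percentage matrix is first binarized with one threshold per disease column, then lists every inclusion-maximal all-ones submatrix by brute force, which is feasible for 10 disease columns. The function names, the thresholds, and the 4 x 3 toy matrix are hypothetical and are not Riskesdas values.

```python
from itertools import combinations


def binarize(matrix, thresholds):
    """Set a cell to 1 when its value is at or above its column's threshold."""
    return [[1 if v >= t else 0 for v, t in zip(row, thresholds)] for row in matrix]


def maximal_biclusters(binary, min_rows=2, min_cols=2):
    """Brute-force search for inclusion-maximal all-ones submatrices.

    Returns a sorted list of (row_indices, column_indices) tuples.
    """
    n_cols = len(binary[0])
    found = set()
    for size in range(min_cols, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            # Rows that contain a 1 in every selected column
            rows = tuple(i for i, r in enumerate(binary) if all(r[c] for c in cols))
            if len(rows) < min_rows:
                continue
            # Extend to every column that is 1 across all of those rows,
            # so the (rows, columns) pair cannot be enlarged any further
            closed = tuple(c for c in range(n_cols) if all(binary[i][c] for i in rows))
            found.add((rows, closed))
    return sorted(found)


# Hypothetical toy data: 4 provinces (rows) x 3 diseases (columns), in percent.
data = [
    [12.0, 3.0, 8.0],
    [11.0, 1.0, 9.0],
    [4.0, 5.0, 2.0],
    [10.0, 4.0, 7.0],
]
thresholds = [9.0, 3.0, 6.0]  # assumed per-disease cut-offs, chosen for illustration

binary = binarize(data, thresholds)
biclusters = maximal_biclusters(binary)
for rows, cols in biclusters:
    print("provinces", rows, "share elevated levels of diseases", cols)
```

In this toy case, provinces 0, 1 and 3 form a bicluster on diseases 0 and 2, while provinces 0 and 3 form a second one on all three diseases. Raising or lowering the assumed thresholds changes which cells become 1 and therefore which biclusters appear, which is one reason the choice of threshold values matters for the result.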

