Department of Anesthesiology, Vanderbilt University Medical Center, 1301 Medical Center Dr., Nashville, TN 37232, USA.
J Med Syst. 2015 May;39(5):44. doi: 10.1007/s10916-015-0232-4. Epub 2015 Mar 3.
Increasingly large research databases require high-quality metadata, which is not always available. We describe a method for generating such metadata from the data themselves. Using anesthesia data from Vanderbilt University Medical Center from 2004 to 2014, we applied cluster analysis and expectation-maximization to separate days into holidays/weekends and regular workdays. This classification was then used to describe how the two sets of days differed over time. We evaluated 3802 days and correctly categorized 3797 based on anesthesia case time (5 misclassified days, an error rate of 0.13%). Categorization by other metrics, such as billed anesthesia hours and number of anesthesia cases per day, led to similar results. Analysis of the two categories showed that surgical volume increased more quickly over time for non-holidays than for holidays (p < 0.001). We successfully generated metadata from data by distinguishing holidays on the basis of anesthesia data. This metadata can then be used for economic analysis and scheduling purposes. The method could possibly be extended to similar bimodal and multimodal variables.
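The separation step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the exact model, so the code below assumes a two-component one-dimensional Gaussian mixture fitted by expectation-maximization, and it runs on synthetic data. The function names (`fit_two_gaussians`, `classify_low_volume`, `make_synthetic_days`) and the synthetic daily totals (about 40 hours on weekends, about 200 hours on workdays) are hypothetical and chosen only for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """Posterior probability that x belongs to each of the two components."""
    p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in (0, 1)]
    total = p[0] + p[1]
    return p[0] / total, p[1] / total


def fit_two_gaussians(values, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture by expectation-maximization.

    Component 0 is initialized at the minimum and component 1 at the maximum,
    so component 0 ends up as the low-volume cluster on well-separated data.
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    weights = [0.5, 0.5]
    means = [min(values), max(values)]
    variances = [overall_var, overall_var]
    for _ in range(n_iter):
        # E-step: soft assignment of every day to each component.
        resp = [responsibilities(v, weights, means, variances) for v in values]
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against a collapsing component
    return weights, means, variances


def classify_low_volume(values, weights, means, variances):
    """Return True for days more likely to belong to the low-volume component."""
    return [responsibilities(v, weights, means, variances)[0] > 0.5 for v in values]


def make_synthetic_days(n_days=700, seed=0):
    """Synthetic daily case hours (hypothetical numbers, not study data)."""
    rng = random.Random(seed)
    values, is_off = [], []
    for day in range(n_days):
        off = day % 7 in (5, 6)  # treat two days per week as weekend
        hours = rng.gauss(40.0, 10.0) if off else rng.gauss(200.0, 25.0)
        values.append(max(hours, 0.0))
        is_off.append(off)
    return values, is_off


values, is_off = make_synthetic_days()
weights, means, variances = fit_two_gaussians(values)
predicted = classify_low_volume(values, weights, means, variances)
accuracy = sum(p == t for p, t in zip(predicted, is_off)) / len(values)
print("estimated component means:", means)
print("fraction of days classified correctly:", accuracy)
```

On real data the calendar labels would not be used for fitting, only for checking the result afterward, which is how an error rate such as the 0.13% reported above can be measured. A long-term trend in volume, like the one described in the abstract, would probably need to be handled before or during fitting; this sketch ignores it.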