Aalborg University, Department of the Built Environment, Aalborg, 9220, Denmark.
Aalborg University, Department of Mathematical Sciences, Aalborg, 9220, Denmark.
Sci Data. 2022 Jul 19;9(1):420. doi: 10.1038/s41597-022-01502-3.
The now widespread use of smart heat meters in buildings connected to district heating networks generates data at a previously unknown extent and temporal resolution. These data contain information that enables new data-driven approaches in the building sector. Real-life data of sufficient size and quality are needed to develop such methods, because subsequent analyses typically require a complete, equidistant dataset without missing or erroneous values. This work therefore presents three years (2018-01-03 to 2020-12-31) of screened, interpolated, and imputed data from 3,021 commercial smart heat meters installed in Danish residential buildings. The screening aimed to detect data from unused meters, resolve issues caused by the data storage process, and identify erroneous values. Linear interpolation was used to obtain equidistant data. After screening, 0.3% of the data were missing; these values were imputed using a weighted moving average, a method chosen on the basis of a systematic comparison of nine imputation methods. The original and processed data are published together with the data processing code ( https://doi.org/10.5281/zenodo.6563114 ).
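The two processing steps named above, linear interpolation onto an equidistant grid and weighted moving average imputation, can be illustrated with a minimal sketch. This is not the published processing code (available at the Zenodo DOI above). The function names, the integer time unit, the window size `k`, and the 1/distance weighting are all assumptions made for illustration; the abstract does not specify the weighting scheme.

```python
from bisect import bisect_left


def interpolate_equidistant(times, values, step):
    """Linearly interpolate irregular readings onto a grid with spacing `step`.

    `times` must be strictly increasing integers (e.g. minutes since start);
    the grid starts at times[0] and stops at or before times[-1].
    """
    grid = list(range(times[0], times[-1] + 1, step))
    out = []
    for g in grid:
        i = bisect_left(times, g)  # first index with times[i] >= g
        if times[i] == g:
            out.append(values[i])  # reading already lies on the grid
        else:
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            out.append(v0 + (v1 - v0) * (g - t0) / (t1 - t0))
    return grid, out


def impute_wma(series, k=2):
    """Fill None entries with a weighted mean of observed neighbours.

    Neighbours up to `k` steps away on each side contribute with weight
    1/distance (an assumed weighting, chosen only for illustration).
    Gaps with no observed neighbour inside the window stay None.
    """
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        num = den = 0.0
        for d in range(1, k + 1):
            for j in (i - d, i + d):
                if 0 <= j < len(series) and series[j] is not None:
                    w = 1.0 / d
                    num += w * series[j]
                    den += w
        if den > 0:
            filled[i] = num / den
    return filled
```

For example, `interpolate_equidistant([0, 50, 130, 180], [0.0, 5.0, 13.0, 18.0], 60)` resamples four irregular readings onto a 60-unit grid, and `impute_wma([1.0, None, 3.0, 4.0])` fills the single gap from its observed neighbours. Imputing only from the original observed values, rather than from already imputed ones, keeps each filled value independent of the order in which gaps are processed.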