Towards missing electric power data imputation for energy management systems

Ming Chang Wang, Chih Fong Tsai, Wei Chao Lin*

*Corresponding author for this work

Research output: Contribution to journal, Journal Article, peer-review

36 Scopus citations

Abstract

Demand for electricity is gradually increasing in many countries. Related studies have applied data mining techniques to electric power data in order to develop more effective energy management systems. However, one major challenge is how to compensate for parts of the collected dataset, such as power consumption, voltage, or electric current, that may be missing for a specific period of time. In the literature, several methods have been employed to impute the missing data, especially for single-feature value imputation. However, the performance of the different types of imputation methods, i.e., statistical and machine learning methods, on multiple missing features of electric power data has not been fully explored. Moreover, variations in their imputation performance between the summer and non-summer seasons and across peak, off-peak, and semi-peak times have not been investigated. This paper compares the performance of five well-known imputation methods for processing electric power data: two statistical methods, the autoregressive integrated moving average (ARIMA) and linear interpolation (LI) models, and three machine learning methods, k-nearest neighbor (K-NN), multilayer perceptron (MLP), and support vector regression (SVR). The experimental results, based on electric power data for a two-year period in Taiwan, show that the machine learning methods generally perform better than the statistical ones, with K-NN and SVR performing the best. In particular, all of the imputation methods produced higher error rates during the summer season than during the non-summer seasons. Moreover, the machine learning methods (especially K-NN) are better choices for imputing missing data during peak times, whereas the statistical methods (especially LI) are better for off-peak and semi-peak times.
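To make the distinction between the two families of methods concrete, the following is a minimal sketch of one method from each: linear interpolation (LI) on a single series and k-nearest-neighbor (K-NN) imputation across several features. It is not the authors' implementation. The function names, the use of `None` to mark a missing reading, the unscaled Euclidean distance, and the sample readings are all assumptions made for illustration.

```python
import math


def linear_interpolate(series):
    """Fill None gaps by drawing a straight line between the nearest
    observed values on each side. Gaps at the start or end of the
    series copy the nearest observed value (an assumption)."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def knn_impute(rows, k=2):
    """Fill None entries in a row with the mean of that feature over the
    k nearest complete rows. Distance is Euclidean over the features that
    are observed in the incomplete row. Features are not rescaled here;
    in practice they would usually be normalised first."""
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("k-NN imputation needs at least one complete row")
    result = []
    for row in rows:
        if None not in row:
            result.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def distance(candidate):
            return math.sqrt(sum((row[j] - candidate[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=distance)[:k]
        result.append([
            v if v is not None else sum(n[j] for n in neighbours) / len(neighbours)
            for j, v in enumerate(row)
        ])
    return result


if __name__ == "__main__":
    # Hypothetical hourly power readings (kW) with a two-hour gap.
    print(linear_interpolate([3.0, None, None, 4.5]))
    # Hypothetical (power kW, voltage V, current A) rows; the last row
    # is missing two of its three features.
    readings = [
        [3.0, 110.0, 27.0],
        [3.2, 111.0, 29.0],
        [6.0, 108.0, 55.0],
        [3.1, None, None],
    ]
    print(knn_impute(readings, k=2))
```

LI only uses the neighbouring time steps of the same feature, whereas K-NN borrows values from other records that look similar on the features that were observed, which is why it can fill several missing features in one record at once.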

Original language: English
Article number: 114743
Journal: Expert Systems with Applications
Volume: 174
State: Published - 15 Jul 2021
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2021 Elsevier Ltd

Keywords

  • Data mining
  • Electric power data
  • Energy management system
  • Machine learning
  • Missing value imputation
