Improved practices in machine learning algorithms for NTL detection with imbalanced data

Gerardo Figueroa, Yi Shin Chen, Nelson Avila, Chia Chi Chu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

43 Scopus citations

Abstract

Non-technical losses (NTLs) in electrical power grids, which mainly concern electrical theft, can have a major impact on the economies of energy providers and nations. The use of machine learning algorithms to detect NTLs has been widely studied to attenuate the costs of on-site inspections of electricity consumers showing suspicious consumption behavior. An issue that has not received enough attention in the research is the imbalance between fraudulent and non-fraudulent data, which can have a major negative impact on the performance of supervised learning methods. Furthermore, most methods proposed in the literature have not evaluated the effectiveness of their methodology using meaningful performance measures. We propose a framework that addresses the problem of data imbalance in supervised classification techniques for NTL detection through resampling techniques. Additionally, we present the results of our experimental evaluation using an extensive list of performance metrics, two of which have not been previously reported in the literature - the Matthews Correlation Coefficient and the Fβ-score. Experiments have been carried out using 22 months of electricity consumption data corresponding to over 3,400 industrial and commercial customers in Honduras. Our experimental results show that class imbalance strategies applied on supervised classifiers for NTL detection can significantly improve the quality of predictions.
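The two metrics named in the abstract can be computed directly from confusion-matrix counts. The sketch below is illustrative only: the function names and the customer counts are assumptions made for this example and are not taken from the paper or its Honduran dataset. It shows why plain accuracy can mislead on imbalanced data, while the Matthews Correlation Coefficient and the Fβ-score do not reward a classifier that never flags fraud.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def fbeta_score(tp, fp, fn, beta=1.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denominator if denominator else 0.0


def accuracy(tp, fp, tn, fn):
    """Fraction of all customers classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


# Hypothetical population: 1000 customers, 50 of them fraudulent.
# Classifier A flags nobody as fraudulent.
never_flags = dict(tp=0, fp=0, tn=950, fn=50)
# Classifier B finds 30 of the 50 fraud cases and raises 20 false alarms.
finds_some = dict(tp=30, fp=20, tn=930, fn=20)

for name, c in (("never_flags", never_flags), ("finds_some", finds_some)):
    print(
        name,
        "accuracy=%.3f" % accuracy(**c),
        "mcc=%.3f" % matthews_corrcoef(**c),
        "f2=%.3f" % fbeta_score(c["tp"], c["fp"], c["fn"], beta=2.0),
    )
```

With these made-up counts, the classifier that never flags fraud still reaches 95% accuracy, yet its MCC and F2 are both zero. This is the kind of gap that motivates reporting imbalance-aware metrics.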

Original language: English
Title of host publication: 2017 IEEE Power and Energy Society General Meeting, PESGM 2017
Publisher: IEEE Computer Society
Pages: 1-5
Number of pages: 5
ISBN (Electronic): 9781538622124
State: Published - 29 Jan 2018
Externally published: Yes
Event: 2017 IEEE Power and Energy Society General Meeting, PESGM 2017 - Chicago, United States
Duration: 16 Jul 2017 to 20 Jul 2017

Publication series

Name: IEEE Power and Energy Society General Meeting
Volume: 2018-January
ISSN (Print): 1944-9925
ISSN (Electronic): 1944-9933

Conference

Conference: 2017 IEEE Power and Energy Society General Meeting, PESGM 2017
Country/Territory: United States
City: Chicago
Period: 16/07/17 to 20/07/17

Bibliographical note

Publisher Copyright:
© 2017 IEEE.

Keywords

  • Imbalanced Data
  • Non-Technical Loss Detection
  • Performance Metrics
  • Supervised Machine Learning
