Recommending missing sensor values

Chung Yi Li, Wei Lun Su, Todd G. McKenzie, Fu Chun Hsu, Shou De Lin, Jane Yung Jen Hsu, Phillip B. Gibbons

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

Datasets gathered from sensor networks often suffer from a significant fraction of missing data, due to issues such as communication and sensor interference, power depletion, and hardware failure. Many standard data analysis tools such as classification engines, time-sequence pattern analysis modules, and statistical tools are ill-equipped to deal with missing values; hence, there is a vital need for highly accurate techniques for imputing missing readings prior to analysis. This paper presents novel imputation methods that take a "recommendation systems" view of the problem: the sensors and their readings at each time step are viewed as products and user product ratings, with the goal of estimating the missing ratings. Sensor readings differ from product ratings, however, in that the former exhibit high correlation in both time and space. To incorporate this property, we modify the widely successful matrix factorization approach for recommendation systems to model inter-sensor and intra-sensor correlations and learn latent relationships among these dimensions. We evaluate the approach using two sensor network datasets, one indoor and one outdoor, and two imputation scenarios, corresponding to intermittent readings and failed sensors. Next, we consider sensor networks with multiple sensor types at each node. We present two techniques for extending our model to account for possible correlations among sensor types (e.g., temperature and humidity), and report promising results. Finally, we study how the imputed values affect the result of data analysis. We consider a popular data analysis task, building regression-based prediction models, and show that, compared to prior approaches for imputation, our method leads to a much higher quality prediction model.
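To make the "recommendation systems" view concrete, the following is a minimal sketch of plain matrix factorization applied to a sensors-by-time-steps matrix with missing entries. It is not the paper's model: the inter-sensor and intra-sensor correlation terms described in the abstract are omitted, and the function name `impute_mf`, its parameters, and the default values are all illustrative assumptions.

```python
import numpy as np


def impute_mf(readings, rank=2, lam=0.1, lr=0.01, epochs=2000, seed=0):
    """Fill NaN entries of a (sensors x time steps) matrix.

    Generic regularized matrix factorization fitted by full-batch
    gradient descent on the observed entries only. Hypothetical helper:
    an illustration of the general idea, not the authors' method.
    """
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(readings)              # True where a reading exists
    filled = np.where(observed, readings, 0.0)  # placeholder zeros for gaps
    mean = filled.sum() / observed.sum()        # global mean of observed data

    n_sensors, n_steps = readings.shape
    S = 0.1 * rng.standard_normal((n_sensors, rank))  # sensor factors ("products")
    T = 0.1 * rng.standard_normal((n_steps, rank))    # time-step factors ("users")

    for _ in range(epochs):
        # Residual on observed entries; the mask zeroes out the gaps.
        err = observed * (filled - mean - S @ T.T)
        grad_S = err @ T - lam * S
        grad_T = err.T @ S - lam * T
        S += lr * grad_S
        T += lr * grad_T

    estimate = mean + S @ T.T
    # Keep real readings untouched; use the model only for the gaps.
    return np.where(observed, readings, estimate)


if __name__ == "__main__":
    # Synthetic example: six sensors tracking one shared signal with
    # different gains, with a regular pattern of dropped readings.
    t = np.arange(40)
    truth = 20.0 + np.outer([1.0, 1.2, 0.8, 1.1, 0.9, 1.05], np.sin(t / 5.0))
    rows, cols = np.indices(truth.shape)
    gaps = (rows + cols) % 5 == 0
    noisy = truth.copy()
    noisy[gaps] = np.nan
    completed = impute_mf(noisy)
    print("mean abs error on gaps:", np.abs(completed[gaps] - truth[gaps]).mean())
```

The learning rate and epoch count here are tuned only for small, roughly unit-scale data like the synthetic example; real sensor matrices would likely need rescaling or a different optimizer. Capturing the temporal and spatial correlation that the abstract emphasizes would require additional terms in the model, which this sketch deliberately leaves out.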

Original language: English
Title of host publication: Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015
Editors: Feng Luo, Kemafor Ogan, Mohammed J. Zaki, Laura Haas, Beng Chin Ooi, Vipin Kumar, Sudarsan Rachuri, Saumyadipta Pyne, Howard Ho, Xiaohua Hu, Shipeng Yu, Morris Hui-I Hsiao, Jian Li
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 381-390
Number of pages: 10
ISBN (Electronic): 9781479999255
State: Published - 22 Dec 2015
Externally published: Yes
Event: 3rd IEEE International Conference on Big Data, IEEE Big Data 2015 - Santa Clara, United States
Duration: 29 Oct 2015 to 1 Nov 2015

Publication series

Name: Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015

Conference

Conference: 3rd IEEE International Conference on Big Data, IEEE Big Data 2015
Country/Territory: United States
City: Santa Clara
Period: 29/10/15 to 01/11/15

Bibliographical note

Publisher Copyright:
© 2015 IEEE.

Keywords

  • Matrix Factorization
  • Missing Data Imputation
  • Recommendation Systems
  • Tensor Factorization
  • Wireless Sensor Network
