A Multivariate Model for Data Cleansing in Sensor Networks


Authors

Ping Ji (Dept. of Mathematics & Computer Science, John Jay College of Criminal Justice; Computer Science PhD Program, Graduate Center, City University of New York) and Marcin Szczodrak (Computer Science PhD Program, Graduate Center, City University of New York)

Abstract

A sensor network comprises a collection of sensor nodes that measure characteristics of their local environment, perform certain computations, and transmit the measurement results, typically in a collaborative fashion, to an external data collection point for processing and storage. The collected measurements, however, often contain erroneous data because of inevitable system problems in hardware and software components, ranging from the sensor devices that collect data, to the computation devices that fuse and process it, to the communication devices that transmit it. Such “dirty data” are expected to be sporadic. In this research, our objective is to detect and repair such dirty data. Our approach is to leverage the intrinsic redundancies and correlations among the collected data, since information about a single event of interest in a sensor network is usually reflected in multiple measurement data points. This correlation can appear temporally, spatially, and across different data types. Inconsistency among multiple sensor measurements serves as an indicator of a data quality problem. Furthermore, by carefully constructing a data model, we may be able to correct the dirty data, in that data produced by one source can serve as an error correction code for others. The focus of this paper is therefore to study methods that can effectively identify and correct erroneous data among inconsistent observations based on the correlation structure of various sensor measurement series. We propose a multivariate model to achieve this goal.
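
The idea that one correlated data source can flag and repair errors in another can be illustrated with a minimal two-sensor sketch. This is not the multivariate model proposed in the paper: it assumes a simple linear relationship between two neighboring sensors, treats sensor A as trusted, and uses a residual threshold of `k = 3` standard deviations. The function names, the threshold, and all readings below are hypothetical values chosen for illustration.

```python
from statistics import mean, pstdev


def fit_linear(xs, ys):
    """Least-squares fit ys ~ slope * xs + intercept on clean training data.

    Returns (slope, intercept, sigma), where sigma is the standard
    deviation of the fit residuals.
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return slope, intercept, pstdev(residuals)


def clean(xs, ys, model, k=3.0):
    """Flag ys[i] when it disagrees with the prediction from xs[i] by more
    than k residual standard deviations, and replace it with the prediction.

    Assumption: xs (sensor A) is trusted, so any inconsistency is
    attributed to ys (sensor B).
    """
    slope, intercept, sigma = model
    cleaned, flagged = [], []
    for i, (x, y) in enumerate(zip(xs, ys)):
        predicted = slope * x + intercept
        if abs(y - predicted) > k * sigma:
            flagged.append(i)
            cleaned.append(predicted)
        else:
            cleaned.append(y)
    return cleaned, flagged


# Hypothetical clean temperature readings from two neighboring sensors.
train_a = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5, 23.0, 23.5]
train_b = [20.3, 20.7, 21.4, 21.8, 22.3, 22.9, 23.2, 23.9]
model = fit_linear(train_a, train_b)

# New readings; the third value from sensor B is a sporadic glitch.
new_a = [21.0, 21.2, 21.4, 21.6]
new_b = [21.3, 21.6, 85.0, 21.9]
cleaned_b, flagged = clean(new_a, new_b, model)
print(flagged)    # indices of readings judged inconsistent
print(cleaned_b)  # sensor B series with flagged readings replaced
```

With only two sensors, an inconsistency cannot by itself reveal which sensor is at fault, which is why this sketch has to assume sensor A is trusted. Using more than two correlated series, as a multivariate approach would, gives additional redundancy for deciding which observation to correct.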

Publication Date

September, 2008

Venue

ACITA 2008

Published To

None


Publication Type

ITA Conference paper

ITA Area

Project 7, Technical area 3

Download a copy of the paper here

covclean.pdf