A Framework to Detect Disguised Missing Data

Author: 
Belen, Rahime
Place: 
Hershey, PA
Publisher: 
IGI Global
Date published: 
2010
Record type: 
Responsibility: 
Temizel, Tugba Taskaya, jt. author
Editor: 
Kumar, A.V. Senthil
Source: 
Knowledge Discovery Practices and Emerging Applications of Data Mining
Abstract: 

Many manually populated, very large databases suffer from data quality problems such as missing or inaccurate data and duplicate entries. A recently recognized data quality problem is disguised missing data, which arises when an explicit code for missing data, such as NA (Not Available), is not provided and a legitimate data value is used instead. The presence of these values can severely affect the outcome of data mining tasks: association mining algorithms may produce biased, inaccurate association rules, and clustering techniques may produce invalid clusters. Detecting and eliminating these values is necessary but burdensome to carry out manually. This chapter first explains methods for detecting disguised missing values by visual inspection. The authors then describe methods for detecting these values automatically. Finally, a framework for detecting disguised missing data is proposed and demonstrated on spatial and categorical data sets.

Series: 
Advances in Data Mining and Database Management

CITATION: Belen, Rahime. A Framework to Detect Disguised Missing Data edited by Kumar, A.V. Senthil. Hershey, PA: IGI Global, 2010. Knowledge Discovery Practices and Emerging Applications of Data Mining - Available at: https://library.au.int/framework-detect-disguised-missing-data