Mandeep Singh
www.mandeepvba.blogspot.com
In any organization, data analysis is important for improving business performance and for solving problems associated with the business. The first thing you need in order to analyse a situation is historical data relevant to the problem. Once you have the data, it needs to be in a proper format so that it can be analysed easily. You may face a number of problems while bringing the data into that format. Missing data is one of the most common problems during data analysis, especially when you collect feedback from customers and, for whatever reason, they do not answer every question. During the Data Preparation phase (see Data Mining Phases) you have to sort out these issues. Two approaches are widely used to deal with missing data and to support well-founded decisions at the end of the analysis: one is “Avoid the missing data” and the other is “Data Imputation”. Many data cleansing methods have been developed and incorporated into analysis software such as SPSS to handle these problems.
Avoid Missing Data:
This is the easiest way to handle missing data: delete every record that is not completely filled in. For example, suppose you send a feedback form to 1000 customers and only 800 of the returned forms are complete. The easiest approach is to set aside the 200 incomplete forms and analyse only the remaining 800.
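A minimal sketch of dropping incomplete records, assuming the feedback forms are held as a list of dictionaries in which `None` marks an unanswered question. The data and the helper name `drop_incomplete` are hypothetical and only illustrate the idea.

```python
# Hypothetical feedback records; None marks an unanswered question.
responses = [
    {"id": 1, "q1": 4, "q2": 5, "q3": 3},
    {"id": 2, "q1": 2, "q2": None, "q3": 4},
    {"id": 3, "q1": 5, "q2": 4, "q3": None},
    {"id": 4, "q1": 3, "q2": 3, "q3": 5},
]


def drop_incomplete(records):
    """Keep only the records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


complete = drop_incomplete(responses)
print(len(responses) - len(complete), "records dropped")
print([r["id"] for r in complete])
```

Only records 1 and 4 survive here, which shows how quickly the usable sample can shrink.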
Sometimes a particular attribute (column) receives far fewer responses than the others. For example, 500 out of 1000 respondents may choose not to answer question 6. In that case the attribute corresponding to question 6 can be discarded from the analysis.
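Discarding a poorly answered attribute could be sketched as below. The cut-off `max_missing=0.5` is an assumption chosen to match the 500-out-of-1000 example, not a rule from this article, and the helper names and data are hypothetical. The sketch also assumes every record has the same set of keys.

```python
def missing_share(records, column):
    """Fraction of records in which `column` has no value."""
    missing = sum(1 for r in records if r.get(column) is None)
    return missing / len(records)


def drop_sparse_columns(records, max_missing=0.5):
    """Remove every column whose missing share reaches `max_missing`."""
    columns = list(records[0].keys())
    keep = [c for c in columns if missing_share(records, c) < max_missing]
    return [{c: r[c] for c in keep} for r in records]


# Hypothetical answers to questions 5 to 7; None marks no answer.
responses = [
    {"q5": 4, "q6": None, "q7": 2},
    {"q5": 3, "q6": None, "q7": 5},
    {"q5": 5, "q6": 1, "q7": None},
    {"q5": 2, "q6": 4, "q7": 3},
]

trimmed = drop_sparse_columns(responses)
print(list(trimmed[0].keys()))
```

Question 6 is unanswered in half of the records, so it is removed, while question 7 (one gap in four) is kept.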
The main advantage of this method is that it is quick and very easy to follow. But there are drawbacks. Dropping records or attributes means losing information. If the sample is large, dropping some records or attributes may not affect the results much, but keep in mind that you are still losing something.
Data Imputation:
Data imputation is the other way of handling missing values. With this method we fill in the missing values in the records and attributes instead of discarding them. This is useful because every respondent's record stays in the analysis. There are a number of ways to fill in missing values; a few are given below.
Case Substitution
Mean Substitution
Hot Deck Imputation
Cold Deck Imputation
Nearest Neighbour Imputation
Case Substitution:
With this method we replace the missing value with a historical value from a similar case. The value cannot come from the current sample; it must come from previous observations.
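One way to picture this is the sketch below. It assumes that "similar" means agreeing on a few chosen attributes, which is only one possible definition. The data, the field names and the helper `substitute_from_history` are hypothetical.

```python
# Hypothetical data: `history` holds earlier observations,
# `current` is the sample being analysed now.
history = [
    {"segment": "retail", "region": "north", "satisfaction": 4},
    {"segment": "retail", "region": "south", "satisfaction": 2},
    {"segment": "corporate", "region": "north", "satisfaction": 5},
]
current = [
    {"segment": "retail", "region": "south", "satisfaction": None},
    {"segment": "corporate", "region": "north", "satisfaction": 3},
]


def substitute_from_history(record, history, target, match_on):
    """Fill `target` from the first historical case that agrees on `match_on`."""
    if record[target] is not None:
        return record
    for past in history:
        same = all(past[k] == record[k] for k in match_on)
        if same and past[target] is not None:
            return {**record, target: past[target]}
    return record  # no similar historical case: leave the gap in place


filled = [
    substitute_from_history(r, history, "satisfaction", ("segment", "region"))
    for r in current
]
print([r["satisfaction"] for r in filled])
```

The first record takes its value from the historical retail/south case. The second record already has a value and is left unchanged.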
Mean Substitution:
This is quite a simple method: replace the missing value with the mean of the observed values for that attribute. It cannot be applied to categorical data, because a mean is only defined for numeric values. It works attribute by attribute (column-wise), not record by record (row-wise).
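A minimal sketch of mean substitution for a single numeric attribute, assuming missing entries are stored as `None`. The helper name `impute_mean` and the sample values are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace each None with the mean of the observed values in the column."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing observed, so there is no mean to use
    column_mean = mean(observed)
    return [column_mean if v is None else v for v in values]


# Hypothetical "age" column with two missing entries.
ages = [30, None, 40, 50, None]
print(impute_mean(ages))
```

Both gaps receive the mean of 30, 40 and 50. Because every gap gets the same value, the spread of the attribute shrinks, which is worth keeping in mind when interpreting the results.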
Hot Deck Imputation:
In this method, the missing value is filled with a value taken from a similar case or record in the current sample. That is, if two records are quite similar and one of them is missing the value of some attribute, we can fill it in with the value from the other record.
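The sketch below shows one possible hot deck procedure. It assumes similarity is measured as the sum of absolute differences over a few numeric attributes that are themselves complete. This is an illustrative choice, and the data and the helper `hot_deck` are hypothetical.

```python
def hot_deck(records, target, features):
    """Fill missing `target` values from the most similar complete record."""
    donors = [r for r in records if r[target] is not None]
    filled = []
    for r in records:
        if r[target] is None and donors:
            # Closest donor by sum of absolute differences on `features`.
            donor = min(
                donors,
                key=lambda d: sum(abs(d[f] - r[f]) for f in features),
            )
            r = {**r, target: donor[target]}
        filled.append(r)
    return filled


# Hypothetical current sample; the second record has no "spend" value.
sample = [
    {"age": 25, "visits": 2, "spend": 120},
    {"age": 26, "visits": 3, "spend": None},
    {"age": 52, "visits": 9, "spend": 640},
]

result = hot_deck(sample, "spend", ["age", "visits"])
print(result[1]["spend"])
```

The second record is much closer to the first than to the third, so it borrows the first record's value. Picking the donor by distance in this way is also roughly the idea behind nearest neighbour imputation.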
Cold Deck Imputation:
In this method we replace the missing values with a single fixed value. The value must come from an external source, which means it cannot come from the current sample. Fixed means that if an attribute is missing 5 values, all 5 are filled with the same value.
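Cold deck imputation can be sketched in a few lines, assuming the external value is already known. The constant `EXTERNAL_INCOME` and the sample column are hypothetical placeholders, not real figures.

```python
def cold_deck(values, external_value):
    """Replace every None with the same value taken from an external source."""
    return [external_value if v is None else v for v in values]


# Hypothetical figure standing in for a value from outside the sample,
# for example a published survey.
EXTERNAL_INCOME = 42000

incomes = [38000, None, 51000, None, None]
print(cold_deck(incomes, EXTERNAL_INCOME))
```

All three gaps receive the same external value, which is what separates this method from hot deck imputation, where each gap can take a different value from the current sample.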
Next Article: Population Sampling and Need of Sampling
How to Handle Missing Values
By: Mandeep Singh | 27/08/2008 | Information Technology