Category:Data Preparation - Cleaning, Tidying and Weighting

From Market Research

Cleaning, tidying and weighting are performed before trying to work out what the data from a survey means:

  • Data cleaning refers to checking and correcting anomalies in a data file. The goal is to identify data that is, in some way, clearly incorrect. Unglamorous as it is, data cleaning can be a key determinant of the quality of data analysis, particularly when the data comes from a messy source (e.g., customer records, collected using a cheap data collection program).
  • Data tidying involves changing the way that data is set up to make it easier to interpret. Examples include converting birth dates into age categories and removing 'don't know' categories.
  • Weighting is a technique that adjusts the results of a survey to bring them into line with what is known about the population.
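
The two tidying examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the age bands, the function names and the 'Don't know' label are hypothetical, and replacing 'don't know' answers with a missing value is just one way of removing that category.

```python
from datetime import date

# Hypothetical age bands; no particular cut points are prescribed here.
AGE_BANDS = [(18, 34, "18-34"), (35, 54, "35-54"), (55, 200, "55+")]


def age_on(birth_date, reference_date):
    """Return age in whole years on reference_date."""
    years = reference_date.year - birth_date.year
    # Subtract one if the birthday has not yet occurred in the reference year.
    if (reference_date.month, reference_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def age_category(birth_date, reference_date):
    """Map a birth date to an age band label, or None if outside all bands."""
    age = age_on(birth_date, reference_date)
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    return None


def drop_dont_know(responses, dont_know="Don't know"):
    """Treat 'don't know' answers as missing by replacing them with None."""
    return [None if r == dont_know else r for r in responses]
```

Passing the reference date explicitly (for example, the date the survey was fielded) keeps the age categories stable no matter when the analysis is rerun.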

These steps are also sometimes referred to as data processing.

The data can be cleaned first, then tidied and then, if necessary, weighted. In practice, however, it is much more efficient to clean and tidy the data simultaneously and then weight it.
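
As a rough illustration of the weighting step, the sketch below gives each respondent a weight equal to the population share of their group divided by that group's share of the sample. This is one simple form of weighting, chosen here as an assumption and not a method prescribed by this page; the function names, the groups and the population shares are all hypothetical.

```python
from collections import Counter


def cell_weights(sample_groups, population_shares):
    """Return one weight per group: population share / sample share."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}


def weighted_mean(values, groups, weights):
    """Average the values, counting each respondent by their group's weight."""
    total_weight = sum(weights[g] for g in groups)
    return sum(v * weights[g] for v, g in zip(values, groups)) / total_weight


# Hypothetical sample: 6 women and 4 men, where the population is assumed to be 50/50.
groups = ["F"] * 6 + ["M"] * 4
weights = cell_weights(groups, {"F": 0.5, "M": 0.5})

# Hypothetical answers: 1 = agrees, 0 = disagrees.
values = [1] * 6 + [0] * 4
adjusted = weighted_mean(values, groups, weights)
```

In this made-up sample women are over-represented, so their weight falls below 1 and the men's rises above 1. The unweighted agreement rate of 0.6 becomes 0.5 once the sample is brought into line with the assumed population shares.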

Basic process

  1. Choosing Survey Analysis Software
  2. Getting a Data File
  3. Creating a Summary Report
  4. Interpreting a Summary Report
  5. Correcting Metadata
  6. Missing Values
  7. Merging Categories and Creating NETs
  8. Recoding Variables
  9. Coding Text Variables
  10. Creating New Variables
  11. Deleting Respondents
  12. Checking Representativeness
  13. Weighting

Previous section

Data Collection

Next page

Choosing Survey Analysis Software

Next section

Basic Data Analysis