Category:Data Preparation - Cleaning, Tidying and Weighting
Cleaning, tidying and weighting are activities that are performed before trying to work out what the data in a survey means:
- Data cleaning refers to checking and correcting anomalies in a data file. The goal is to identify data that is, in some way, clearly incorrect. The unglamorous world of data cleaning can be a key determinant of the quality of data analysis, particularly when the data is from a messy source (e.g., customer records, collected using a cheap data collection program).
- Data tidying involves manipulating the way that data is set up to make it easier to interpret. Examples include converting birth dates into age categories and removing 'don't know' categories.
- Weighting is a technique that adjusts the results of a survey to bring them into line with what is known about the population.
These steps are also sometimes referred to as data processing.
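As a rough illustration of what "bringing results into line with the population" can mean, the sketch below computes simple cell weights (population share divided by sample share) for a single variable. The sample, the population shares and all variable names are invented for illustration; real weighting usually involves several variables and dedicated software.

```python
from collections import Counter

# Hypothetical sample: gender recorded for each of 10 respondents.
sample = ["female"] * 7 + ["male"] * 3

# Assumed population shares (for example, from a census); illustrative only.
population_share = {"female": 0.5, "male": 0.5}

counts = Counter(sample)
n = len(sample)

# Weight for each group = population share / sample share.
# Over-represented groups get weights below 1, under-represented groups above 1.
weights = {group: population_share[group] / (counts[group] / n) for group in counts}

# Attach a weight to every respondent.
respondent_weights = [weights[group] for group in sample]

# After weighting, each group's share of the total weight matches the population.
total_weight = sum(respondent_weights)
weighted_share = {
    group: sum(w for w, g in zip(respondent_weights, sample) if g == group) / total_weight
    for group in counts
}
```

In this made-up sample, women make up 70% of respondents but are assumed to be 50% of the population, so each woman's answers count for less than each man's once the weights are applied.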
It is possible to clean the data first, then tidy it and then, if necessary, weight it. In practice, however, it is much more efficient to clean and tidy the data simultaneously and then weight it.
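A minimal sketch of cleaning and tidying in a single pass is shown below. The records, field names, 1 to 5 satisfaction scale, age bands and reference date are all assumptions made for the example, and treating anomalous values as missing is only one of several possible cleaning decisions.

```python
from datetime import date

# Hypothetical raw records; field names and values are illustrative only.
raw = [
    {"id": 1, "birth_date": "1990-05-17", "satisfaction": "4"},
    {"id": 2, "birth_date": "1875-01-01", "satisfaction": "5"},  # implausible birth date
    {"id": 3, "birth_date": "2001-11-30", "satisfaction": "Don't know"},
    {"id": 4, "birth_date": "1968-02-09", "satisfaction": "9"},  # outside the 1-5 scale
]

REFERENCE_DATE = date(2024, 1, 1)  # assumed date of the survey


def age_category(birth_date_text):
    """Clean and tidy a birth date: return an age band, or None if implausible."""
    born = date.fromisoformat(birth_date_text)
    had_birthday = (REFERENCE_DATE.month, REFERENCE_DATE.day) >= (born.month, born.day)
    age = REFERENCE_DATE.year - born.year - (0 if had_birthday else 1)
    if not 0 <= age <= 120:  # cleaning: treat impossible ages as missing
        return None
    if age < 35:             # tidying: collapse ages into bands
        return "Under 35"
    if age < 55:
        return "35 to 54"
    return "55 or older"


def clean_satisfaction(text):
    """Return a 1-5 rating, or None for 'don't know' and out-of-range values."""
    if not text.isdigit():
        return None          # tidying: drop the 'don't know' category
    value = int(text)
    return value if 1 <= value <= 5 else None  # cleaning: reject impossible codes


# One pass over the data performs both the cleaning and the tidying.
tidy = [
    {
        "id": record["id"],
        "age_category": age_category(record["birth_date"]),
        "satisfaction": clean_satisfaction(record["satisfaction"]),
    }
    for record in raw
]
```

Because each value is inspected only once, the same function can both reject an impossible value and recode a valid one, which is where the efficiency of combining the two steps comes from.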
- Choosing Survey Analysis Software
- Getting a Data File
- Creating a Summary Report
- Interpreting a Summary Report
- Correcting Metadata
- Missing Values
- Merging Categories and Creating NETs
- Recoding Variables
- Coding Text Variables
- Creating New Variables
- Deleting Respondents
- Checking Representativeness
Pages in category ‘Data Preparation - Cleaning, Tidying and Weighting’
The following 24 pages are in this category, out of 24 total.
- How to Change the Label of a Category of a Question
- How to Change the Name of a Question or Variable
- How to Change Variable Type and Question Type
- How to Combine Variables into Multiple Response Questions and Grids
- How To Remove a Category and Re-Base a Table
- How To Show Missing Values on a Table
- How to Split Questions Into Separate Variables