Recoding Variables


Recoding involves replacing the values of a variable with values that are more useful. Consider the following summary table of age (computed using Data Cracker). Ordinarily in data analysis, age is treated as a categorical variable, but from time to time it is useful to treat it as a numeric variable and compute its average. The average age is shown at the bottom of the table as 5.9, which is clearly not the true average age.


The reason for this curious average is that categorical variables, such as age, are usually stored in the data file with a 1 for the first category, a 2 for the second category, and so on. This is shown below, where on the left we can see the age categories for 13 respondents, while on the right we can see the values that are stored in the data file. The average was computed from these stored values (e.g., somebody aged 25 to 29 was given a value of 3 when computing the average), so it describes the position of the categories rather than age in years.
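The effect can be reproduced with a minimal Python sketch. The age bands and the five respondents below are hypothetical and chosen only for illustration; the one detail taken from the text is that the band 25 to 29 is stored as 3.

```python
from statistics import mean

# Hypothetical category labels keyed by the code stored in the data file.
# Only the 3 -> "25 to 29" entry comes from the text; the others are assumed.
AGE_LABELS = {
    1: "15 to 19",
    2: "20 to 24",
    3: "25 to 29",
    4: "30 to 34",
}

# Hypothetical stored codes for five respondents.
stored_codes = [3, 1, 4, 2, 3]

# Show what each respondent answered alongside what is stored.
for code in stored_codes:
    print(AGE_LABELS[code], "->", code)

# Averaging the stored codes averages category positions, not ages in years.
code_average = mean(stored_codes)
print("Average of stored codes:", code_average)
```

The result is a number between 1 and 4, which says nothing directly about how old the respondents are.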


The next two images are from the Recode Values feature in Data Cracker (most data analysis programs have a similar feature). The example on the left shows the original values in the data file. The example on the right shows the values after recoding. This is an example of mid-point recoding, where the value assigned is the value in the middle of the range (e.g., 27 is in the middle of 25 to 29).
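Outside Data Cracker, mid-point recoding amounts to a lookup from each stored code to the middle of its range. The following Python sketch illustrates the idea under stated assumptions: the recode table, the `recode` helper, and the respondent data are all hypothetical, and only the 25 to 29 band with its mid-point of 27 is taken from the text.

```python
from statistics import mean

# Hypothetical recode table: stored code -> mid-point of the age band.
# Only 3 -> 27 (25 to 29) comes from the text; the other bands are assumed.
MIDPOINTS = {
    1: 17,  # assumed band 15 to 19
    2: 22,  # assumed band 20 to 24
    3: 27,  # band 25 to 29
    4: 32,  # assumed band 30 to 34
}

def recode(codes, mapping):
    """Replace each stored code with its recoded value.

    Codes that are missing from the mapping become None so that they
    can be excluded from the average instead of distorting it.
    """
    return [mapping.get(code) for code in codes]

# Hypothetical stored codes for five respondents.
stored_codes = [3, 1, 4, 2, 3]

recoded_ages = recode(stored_codes, MIDPOINTS)

# Average only the values that were successfully recoded.
valid_ages = [age for age in recoded_ages if age is not None]
average_age = mean(valid_ages)
print("Recoded values:", recoded_ages)
print("Average age:", average_age)
```

Keeping the recode table separate from the data makes it easy to try other schemes. An open-ended top category such as "65 or more" has no natural mid-point, so the value assigned to it is a judgment call.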


The recoded numeric variable is shown on the right below, and we can see from the table on the left that the average is now a much more sensible 42.5.

