Creating New Variables

Often it is useful to construct a new variable. Sometimes this will be a numeric variable. For example, if one question asks about the number of bottles of Coke consumed in a week and another asks about the number of bottles of Pepsi, it may be useful to work out the total number of bottles of cola consumed. Other times it will be useful to create a categorical variable. For example, questions about age, marital status and children can be combined to create a new categorical variable indicating family life stage (e.g., young singles, young couples, etc.).

Copying, merging and recoding

Often a new variable needs to be derived from an existing variable. For example, there may be an existing variable with categories of Strongly Disagree, Somewhat Disagree, Neither Agree nor Disagree, Somewhat Agree and Strongly Agree, and the goal is to create a new variable with categories of Agree and Not Agree. In such a situation the straightforward approach is to copy the existing variable and then merge the categories in the copied variable. In programs that do not permit the user to merge categories, such as SPSS, the same outcome can be achieved by recoding.
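
As a rough illustration of the copy-and-merge approach, the Python sketch below recodes a five-point agreement scale into two categories. The names (MERGE_MAP, recode_agreement) and the sample responses are hypothetical, and the sketch assumes that Agree combines Somewhat Agree and Strongly Agree, with the remaining three categories treated as Not Agree.

```python
# Hypothetical mapping from the five original categories to the two merged ones.
# Assumption: the neutral midpoint is counted as "Not Agree".
MERGE_MAP = {
    "Strongly Disagree": "Not Agree",
    "Somewhat Disagree": "Not Agree",
    "Neither Agree nor Disagree": "Not Agree",
    "Somewhat Agree": "Agree",
    "Strongly Agree": "Agree",
}


def recode_agreement(responses):
    """Return a new, recoded list; the original variable is left unchanged."""
    # Responses not found in the mapping (e.g. missing data) become None.
    return [MERGE_MAP.get(response) for response in responses]


# Illustrative data only.
original = [
    "Strongly Agree",
    "Somewhat Disagree",
    "Neither Agree nor Disagree",
    "Somewhat Agree",
]
agree_2cat = recode_agreement(original)
print(agree_2cat)
```

Building a new list rather than overwriting the old one mirrors the advice to work on a copy, so the original five-category variable remains available for other analyses.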

Automatic categorization

The more sophisticated programs, such as SPSS, R and Q, have various built-in tools for automatically categorizing numeric variables (e.g., into quartiles).
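
To give a sense of what such a tool does, here is a minimal Python sketch that assigns each respondent to a quartile using only the standard library. The variable weekly_spend and its values are made up for illustration, and the cut points depend on the quantile definition used (this sketch uses the default of statistics.quantiles), so other programs may place borderline values differently.

```python
import bisect
import statistics


def quartile_groups(values):
    """Assign each value to a quartile group numbered 1 (lowest) to 4 (highest)."""
    # Three cut points separating the four groups.
    cuts = statistics.quantiles(values, n=4)
    # Count how many cut points lie strictly below each value, then add 1.
    return [1 + bisect.bisect_left(cuts, value) for value in values]


# Illustrative data only.
weekly_spend = [5, 12, 7, 30, 18, 22, 9, 15]
spend_quartile = quartile_groups(weekly_spend)
print(spend_quartile)
```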

Formulas

The most powerful, but also the most complex, way of creating new variables is to use formulas. For example, to create a new numeric variable by adding two variables called q1 and q2, most programs provide a facility for entering a formula such as:

q1 + q2
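
Outside a survey package, the same calculation can be sketched in Python as a respondent-by-respondent sum. The function name add_variables and the data are hypothetical. Treating the result as missing whenever either input is missing is an assumption made here; programs differ in how they handle missing values in formulas.

```python
def add_variables(q1, q2):
    """Sum two numeric variables respondent by respondent.

    Assumption: if either value is missing (None), the result is missing.
    """
    return [
        None if a is None or b is None else a + b
        for a, b in zip(q1, q2)
    ]


# Illustrative data: bottles consumed per week by four respondents.
coke = [3, 0, None, 5]
pepsi = [1, 2, 4, 0]
cola = add_variables(coke, pepsi)
print(cola)
```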

Or, if creating a new categorical variable, the formula may look something like:

if (age <= 35){
  if (numberChildren == 0) {
     if (married) 2; //young couples
     else 1; //young singles
  }
  else {
     if (married) 3; //young families
     else 4;// young single families
  }
}
else {
  if (numberChildren == 0) {
     if (married) 6; //older couples
     else 5; //older singles
  }
  else {
     if (married) 7; //older families
     else 8; //older single families
  }
}
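
The same nested logic can be sketched as a runnable Python function. The function name life_stage is hypothetical. The age boundary of 35 and the category names follow the formula above, while returning text labels instead of numeric codes is a choice made here for readability.

```python
def life_stage(age, number_children, married):
    """Classify a respondent into one of eight family life stage categories."""
    # Age band: 35 and under counts as "young", matching the formula above.
    prefix = "young" if age <= 35 else "older"
    # Household composition is determined the same way in both age bands.
    if number_children == 0:
        suffix = "couples" if married else "singles"
    else:
        suffix = "families" if married else "single families"
    return f"{prefix} {suffix}"


# Illustrative respondents only.
print(life_stage(28, 0, False))
print(life_stage(50, 1, False))
```

Deriving the age band and the household composition separately avoids repeating the inner branches, which makes it harder for the two halves of the formula to drift out of step.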

Multivariate analysis

Many multivariate analysis methods, such as Factor Analysis and Latent Class Analysis, automatically output new variables that are summaries of the data.
