Selecting a Sample
Once the population has been defined, we need a way of finding the people in it. To do this we need either a list of people in the population or a list of places where we can find them.[note 1]
A survey either selects everybody on a list or at a specific location, which is known as a census, or takes a sample, which involves randomly selecting people from lists or locations.
Common examples of lists include:
- Online panels, which are databases of people willing to participate in market research in return for pay or prizes.
- Telephone directories.
- Delivery Point IDs, which are lists of all the postal addresses in Australia.
- Lists of customers’ email addresses.
- Lists of people willing to participate in focus groups and depth interviews (these lists are held by recruiters).
Such lists are sometimes referred to as the sample. However, it is important not to confuse this usage with the normal meaning of the word sample, which is the respondents who have completed the questionnaire.
Common examples of lists of places where we can find people include:
- Shopping centres.
- Websites’ banner advertisements (e.g., “click here to earn money”).
- Lists of telephone prefixes (which can be used to randomly generate phone numbers).
- Suburbs, where face-to-face interviewers may knock on, say, every fifth door.
- People who phone in to phone-in polls run by TV stations.
Checking to see if the list is good
The coverage of a survey refers to the extent to which people in the population (e.g., consumers in the market) could, potentially, have been included in the study. A challenging aspect of all surveys is that most people are unwilling to participate. This means that all surveys tend to have poor coverage. For example, most surveys are conducted using online panels, and only around 1% of the population is on these panels.
Coverage error is error that occurs when the list is not representative of the population in terms of the things being measured. Consider, for example, using an online panel to understand attitudes to the internet. People that do not use the internet will not be on such a panel and, consequently, any online study seeking to understand attitudes to the internet in the general community will be massively biased.
In an ideal world for research, we would have access to lists containing everybody in the population. This rarely happens in practice. Even companies’ lists of their customers are generally incomplete, due to people having provided incorrect contact details, having changed contact details or having advised the company they do not want to participate in market research. Consequently, when evaluating whether coverage error is a problem for a survey, our focus is not on working out whether the list contains 100% of the population. Rather, our focus is on determining whether the factors that cause somebody to be omitted from a list are correlated with the data they would have provided had they been included. For example, it is problematic to use a telephone directory as a list when conducting a study of high net worth individuals, as wealthier people are more likely to have unlisted numbers.
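The telephone directory example can be made concrete with a small sketch. The population below is entirely invented for illustration (the net worth figures, the listed/unlisted flags and the helper name `coverage_summary` are all assumptions, not data from any real directory). It simply shows how an estimate taken from the list drifts away from the population value when omission from the list is related to the thing being measured.

```python
from statistics import mean

# Hypothetical population: (net worth in $ thousands, listed in the directory?).
# All values are invented; wealthier people are deliberately more often unlisted.
population = [
    (50, True), (80, True), (120, True), (200, True), (300, True),
    (450, True), (900, False), (1500, True), (2500, False), (6000, False),
]

def coverage_summary(people):
    """Compare the whole population with the part that appears on the list."""
    everyone = [worth for worth, _ in people]
    on_list = [worth for worth, listed in people if listed]
    return {
        "coverage": len(on_list) / len(everyone),  # share of the population on the list
        "population_mean": mean(everyone),         # what we would like to estimate
        "list_mean": mean(on_list),                # the best the list can give us
    }

summary = coverage_summary(population)
print(summary)
```

In this made-up case the list covers 70% of the population, yet the mean net worth among listed people is far below the population mean, because the people missing from the list are the wealthiest ones. A list with the same 70% coverage but omissions unrelated to wealth would tend to give a much closer estimate.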
Selecting from the list
Often you will not want to attempt to interview everybody on your list. Typically, if you have a large list, the process is:
- Work out your required sample size.
- Take a guess at your response rate, which is the number of people who complete an interview as a proportion of the number you invite. For example, if you invite 10,000 people and 1,000 complete the interview then you have a 10% response rate. It is best to be pessimistic when guessing your response rate.
- Select the required number of people from your list. For example, if you need 1,000 interviews and your response rate is 10% then you need to select 10,000 people from the list. The process of selecting these people from the list should be random (i.e., it should not favor any particular group of people). A simple way to achieve this outcome is to take, say, every 20th person from your list (where the '20' is adjusted based on the size of the list and the number of people you need to select).
The list from which people are selected is sometimes referred to as the sampling frame.
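The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function names `invitations_needed` and `systematic_sample`, the list size of 200,000 and the use of a random starting position are all assumptions made for the example. Taking every k-th person behaves like a random selection only when the order of the list is unrelated to what is being measured.

```python
import math
import random

def invitations_needed(target_completes, expected_response_rate):
    """Number of people to invite so that the target number of completes is expected."""
    if not 0 < expected_response_rate <= 1:
        raise ValueError("expected_response_rate must be in (0, 1]")
    # Round first to guard against floating-point noise before taking the ceiling.
    return math.ceil(round(target_completes / expected_response_rate, 6))

def systematic_sample(frame, n_select, rng=None):
    """Take every k-th entry of the list, starting at a random position in the first k."""
    if n_select <= 0:
        raise ValueError("n_select must be positive")
    if n_select >= len(frame):
        return list(frame)  # the list is too small to sample from, so take everybody
    rng = rng or random.Random()
    step = len(frame) // n_select      # the '20' in "every 20th person"
    start = rng.randrange(step)        # random start so no position is favored
    return frame[start::step][:n_select]

# Worked example using the figures from the text: 1,000 interviews at a 10% response rate.
n_invites = invitations_needed(1000, 0.10)

# Hypothetical list of 200,000 people, represented here by ID numbers.
frame = list(range(200_000))
selected = systematic_sample(frame, n_invites, rng=random.Random(0))
print(n_invites, len(selected), selected[1] - selected[0])
```

With these figures the sketch selects 10,000 people by taking every 20th entry. Passing a seeded `random.Random` makes the selection repeatable, which can be useful when the draw needs to be audited later.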