The respondents to a survey are only a sample of a total 'population', so we can never be certain that the figures obtained are exactly those that would have been obtained if everybody had been interviewed (the 'true' values). However, the extent to which a sample result is likely to vary from the 'true' value can be predicted from the size of the sample on which the result is based and the number of times that a particular answer is given.
This level of uncertainty is expressed as a 'Confidence Interval' (CI): a range around the sample mean within which the 'true', underlying mean is expected to lie. A 95% CI is usual; it means that if the survey were repeated many times, about 95% of the intervals calculated in this way would contain the 'true' value.
With percentage measures, the width of the CI depends on the sample size (n) and the percentage of respondents with the characteristic in question (p).
As a worked example, imagine that 60% of users are satisfied with a particular product and that the sample size on which this figure is based is 800. Using the standard simple random sample formula, the 95% Confidence Interval is:
1.96 × √(p × (1 − p) / n) = 1.96 × √(0.6 × 0.4 / 800) ≈ 0.034
i.e. +/-3.4 percentage points (or between 56.6% and 63.4%).
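The worked example can be checked with a few lines of Python. This is a minimal sketch that assumes the standard simple random sample (normal approximation) formula, 1.96 × sqrt(p × (1 − p) / n); the function name `margin_of_error` is illustrative and not part of any library.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the CI for a proportion p from a simple random sample of size n.

    z=1.96 is the normal-distribution multiplier for a 95% interval.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Figures from the worked example: 60% satisfied, sample of 800.
p, n = 0.60, 800
moe = margin_of_error(p, n)
lower, upper = p - moe, p + moe

print(f"+/-{moe * 100:.1f} percentage points "
      f"({lower * 100:.1f}% to {upper * 100:.1f}%)")
# prints: +/-3.4 percentage points (56.6% to 63.4%)
```

Because n sits under the square root, halving the margin requires roughly four times the sample size.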
The table below provides some typical scenarios:
However, in practice it is not always as simple as this. If a weighting scheme is imposed after data collection, or if you use a clustered sample (where only certain localities in the survey area are selected), then the precision will not be as great as the straightforward simple random sample formula suggests. Consequently, the confidence interval will be somewhat wider. If, on the other hand, your sample constitutes a large part of a finite population (as in employee surveys or surveys of MPs), or if you stratify the sample, then precision could be somewhat better than the standard formula indicates, and your confidence interval would therefore be narrower.
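These adjustments are commonly expressed as a design effect (a multiplier on the sampling variance) and a finite population correction. The sketch below is one way to apply them, not a prescribed method: the function names, the design effect of 1.5 and the population of 1,000 are hypothetical values chosen for illustration, and the weighting design effect uses Kish's approximation.

```python
import math

def margin_of_error(p, n, z=1.96, deff=1.0, population=None):
    """95% margin of error for a proportion, with optional adjustments.

    deff:       design effect; above 1 widens the interval (clustering,
                weighting), below 1 narrows it (effective stratification).
    population: total population size N; if given, the finite population
                correction sqrt((N - n) / (N - 1)) is applied.
    """
    se = math.sqrt(deff * p * (1 - p) / n)
    if population is not None:
        se *= math.sqrt((population - n) / (population - 1))
    return z * se

def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2

srs = margin_of_error(0.60, 800)                      # simple random sample
clustered = margin_of_error(0.60, 800, deff=1.5)      # hypothetical deff
finite = margin_of_error(0.60, 800, population=1000)  # hypothetical N

for label, moe in [("simple random", srs), ("clustered", clustered),
                   ("finite population", finite)]:
    print(f"{label}: +/-{moe * 100:.1f} percentage points")
```

With equal weights `kish_design_effect` returns 1, so the standard formula is recovered; the more the weights vary, the larger the design effect and the wider the interval.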
Accuracy and Freedom from Biases
Above, we have described how increasing the sample size increases the precision of the results. However, large samples may still be “inaccurate” if the sample is not a true representation of the population. The mix of old:young, or affluent:deprived persons within the sample may be different to that within the population. In such cases, the final result attributed to the population (as a result of the survey) may be skewed towards the sub-groups which are over-represented in the sample. It is therefore important for the researcher to be aware of the possibility of such biases, and for the survey to be designed to minimise or eliminate them. Ways of doing so include:
- A good understanding of the Study Population;
- Appropriateness of the Sampling Frame;
- Care taken in selecting and executing the sampling technique;
- Setting appropriate Quota Controls;
- Running an appropriate weighting scheme, once the results have been collected, to ensure that no population sub-groups are over- or under-represented in the analysis.
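As an illustration of the last point, the sketch below applies a simple weighting scheme in which each sub-group's weight is its population share divided by its sample share. All figures (the group sizes, population shares and satisfaction rates) are hypothetical, and real weighting schemes often involve several variables at once.

```python
def group_weights(sample_counts, population_shares):
    """Weight per sub-group = population share / sample share."""
    n = sum(sample_counts.values())
    return {g: population_shares[g] / (sample_counts[g] / n)
            for g in sample_counts}

# Hypothetical survey: older respondents are over-represented.
sample_counts = {"young": 300, "old": 500}       # respondents per group
population_shares = {"young": 0.5, "old": 0.5}   # assumed known, e.g. from a census
satisfied = {"young": 0.70, "old": 0.54}         # share satisfied within each group

weights = group_weights(sample_counts, population_shares)
n = sum(sample_counts.values())

# Unweighted result: every respondent counts equally.
unweighted = sum(sample_counts[g] * satisfied[g] for g in sample_counts) / n

# Weighted result: respondents in under-represented groups count for more.
weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
weighted = sum(sample_counts[g] * weights[g] * satisfied[g]
               for g in sample_counts) / weighted_total

print(f"unweighted: {unweighted:.0%}, weighted: {weighted:.0%}")
# prints: unweighted: 60%, weighted: 62%
```

Here the unweighted figure understates satisfaction because the less satisfied group makes up more of the sample than of the population; weighting corrects the mix, at the cost of some precision, as noted earlier.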