Market research statistical techniques

Most quantitative market research is delivered in the form of percentages, often pulled out of cross-tabulations to compare differences between groups (on the banner or cross-break) according to the different categories of answers on the stub.

Beyond these basics, a number of statistical techniques provide deeper analysis of the data. Depending on the application these include cluster analysis, factor analysis, regression and display techniques like perceptual maps. Statistical methods need a level of expertise and understanding of the underlying mathematics to avoid drawing fallacious conclusions.

Basic statistics

The basic data from a market research survey is presented in the form of percentages. What percentage of the sample gave this response? How many of each subgroup gave another response? Are there significant differences between subgroups?

To investigate these questions the researcher runs a set of tabs (cross-tabulations) which list the percentage giving each answer to a question (the stub) for the total sample and for other subgroups of interest (the break, or banner).

Cross-tabs are usually shown as the percentage of a base number at the top of a column. The base is the number of people answering the question, or in the column sub-group. The base can be weighted - adjusted to balance against a known population profile - or unweighted - shown raw.
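As a rough illustration of how a cross-tab is built, the sketch below counts answers within each banner subgroup and expresses them as a percentage of the column base. The respondent records and the function name `crosstab_percentages` are made up for the example, and the data is unweighted.

```python
from collections import Counter

# Hypothetical respondent records: (banner subgroup, stub answer).
responses = [
    ("men", "yes"), ("men", "no"), ("men", "yes"), ("men", "yes"),
    ("women", "no"), ("women", "no"), ("women", "yes"), ("women", "no"),
]

def crosstab_percentages(records):
    """Return {subgroup: (base, {answer: percentage of that subgroup's base})}."""
    bases = Counter(group for group, _ in records)      # column bases
    cells = Counter(records)                            # counts per (group, answer)
    answers = sorted({answer for _, answer in records})
    table = {}
    for group, base in bases.items():
        percentages = {a: 100.0 * cells[(group, a)] / base for a in answers}
        table[group] = (base, percentages)
    return table

table = crosstab_percentages(responses)
for group, (base, percentages) in table.items():
    print(group, "base:", base, percentages)
```

Showing the base alongside the percentages makes it easier to spot columns that rest on very few respondents.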

Weighting adjusts for small variations in the sample profile to match known population data - for instance, adjusting a sample of consumers that is 49:51 men to women so that it calculates as 50:50. Unequal weights reduce the effective sample size for statistical tests, and the wider the spread of weights, the bigger the reduction. Weights also bring the problem that one person can be made to represent a large number of people, so it's important to keep an eye on weight sizes and underlying base sizes. Large weights on tables should generally be avoided.
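One common way to gauge the cost of weighting is Kish's approximation for the effective sample size, the squared sum of weights divided by the sum of squared weights. The sketch below applies it to the 49:51 example and to a sample with one very large weight; the weights are illustrative and the function name is an assumption for this example.

```python
def effective_sample_size(weights):
    """Kish approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_squared = sum(w * w for w in weights)
    return total * total / total_squared

# Hypothetical sample of 100: 49 men weighted up and 51 women weighted down to 50:50.
mild_weights = [50 / 49] * 49 + [50 / 51] * 51
mild_n_eff = effective_sample_size(mild_weights)

# Hypothetical sample of 10 where one respondent carries a weight of 10.
heavy_weights = [1.0] * 9 + [10.0]
heavy_n_eff = effective_sample_size(heavy_weights)

print(round(mild_n_eff, 2), round(heavy_n_eff, 2))
```

The mild adjustment barely changes the effective base, while the single large weight cuts a sample of 10 to an effective size of just over 3.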

Differences between groups can be checked to see whether they are likely to be real or just a statistical artefact. Statistical significance is typically calculated using standard statistical tests - the most common of which is the Student t-test - and expresses the probability of seeing a difference at least this large by pure chance if there were no real difference between the groups. Normally results that pass at the 95% confidence level (a chance probability below 5%) are described as 'significant'. Similar calculations can also be used to derive a confidence interval around a particular result - again for a given confidence level.
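For percentages, a normal-approximation z-test is a frequently used stand-in for the t-test mentioned above. The sketch below is a minimal version that assumes unweighted, independent simple random samples; the function names and the counts are invented for illustration.

```python
import math

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for the difference between two independent proportions."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

def proportion_confidence_interval(hits, n, z_crit=1.96):
    """Normal-approximation interval; z_crit=1.96 corresponds to 95% confidence."""
    p = hits / n
    margin = z_crit * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical result: 120 of 200 in group A say yes versus 100 of 200 in group B.
z, p_value = two_proportion_z_test(120, 200, 100, 200)
low, high = proportion_confidence_interval(100, 200)
print(round(z, 2), round(p_value, 3), round(low, 3), round(high, 3))
```

With weighted data, the effective base rather than the raw base would normally be used for `n`, and small bases call for exact methods rather than this approximation.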

However, note that just because something is significant statistically, it's not necessarily significant commercially unless you can act on the difference. And in long-term studies data will naturally fluctuate around the mean because of sampling variation.

For numeric and scale questions answers can be scored and a mean score estimated. As means can hide a lot of useful additional data, it's always worth looking at the distribution of answers. For numeric questions, it is also worth checking for issues like outliers (extreme values) that might throw out a mean value. For long-tailed items like income, medians may be more useful than means.
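A small made-up example shows how a single extreme value can pull the mean away from the typical answer while the median stays put. The income figures below are invented.

```python
import statistics

# Hypothetical annual incomes; the last respondent is an outlier.
incomes = [22_000, 25_000, 27_000, 30_000, 31_000, 35_000, 400_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(round(mean_income), median_income)
```

Here six of the seven respondents earn well below the mean, which is why the distribution is worth checking before reporting an average.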

Regression and correlation

Many surveys have a bank of rating or scale questions to understand attitudes or to assess performance across a range of different areas. One of the basic questions is how these ratings drive opinions. For instance in a customer satisfaction survey you might ask how overall satisfaction is related to satisfaction with specific aspects such as delivery, ordering, appearance, packaging etc.

For this you can take the overall satisfaction and run regression analysis for the sample as a whole to understand the impact of the individual elements in driving overall satisfaction and thereby obtain what is known as a 'derived importance' measure. In practice, the individual items themselves are often related to each other, which can make a regression model difficult to use. So some people just look at correlation variable by variable.
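The variable-by-variable correlation approach can be sketched as follows. The ratings are invented, the names `pearson`, `overall` and `ratings` are assumptions for the example, and ranking by absolute correlation is only one possible convention for a 'derived importance' ordering.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

# Hypothetical 1-5 ratings from six respondents.
overall = [5, 4, 2, 3, 1, 4]
ratings = {
    "delivery": [5, 4, 1, 3, 2, 4],
    "packaging": [3, 3, 3, 4, 3, 2],
}

# Correlate each item with overall satisfaction and rank by strength.
correlations = {item: pearson(scores, overall) for item, scores in ratings.items()}
ranked = sorted(correlations.items(), key=lambda pair: abs(pair[1]), reverse=True)
for item, r in ranked:
    print(item, round(r, 3))
```

Because each item is looked at separately, this sidesteps the problem of related items in a regression, but it also means overlapping items can each appear important for the same underlying reason.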

A second area for regression is in propensity analysis, or modelling how different variables influence the probability of being in a particular group - for instance, a buyer or a non-buyer.

There are several types of regression analysis, and the choice depends on the properties of the underlying variables. The data also needs to be cleaned and then normalised before modelling.

Factor analysis and principal component analysis

Large studies might have a number of different ratings questions and variables that are potentially related to each other. Factor analysis, and the related method of principal component analysis, are ways to simplify the data to find underpinning hidden drivers. They take a group of related variables and identify a reduced set of meta-variables or factors that summarise the full raw set of variables in a smaller group.

Each factor draws on a number of underlying raw ratings or scores - so for instance you might find a meta-factor for service that relates aspects of helpfulness, checkout speed, queue length and delivery time. These are then pulled together into a single factor, and the factors can then be used as inputs either to a regression (eg looking at drivers of purchase or satisfaction), or in some instances to cluster analysis.

For instance, a series of measures on physical activity - enjoy working out, like running, enjoy ball games - might combine statistically into a single "sporty" dimension that implicitly links all the other ratings.

In more complex situations this can be used to find underlying associations that are less obvious - for instance different axes for travellers based on their views of activity, sun, food and service.

Cluster analysis

One of the objectives of segmentation is to see what groups exist in the market. One method is to use cluster analysis. Cluster analysis attempts to group individuals according to the similarity or difference of their answers. There are two main methods - hierarchical clustering and k-means - but there are also other clustering algorithms available.
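To give a feel for how k-means groups respondents, here is a bare-bones sketch working on two ratings per respondent. The data points, the starting centres and the function name `k_means` are all assumptions for the example; real projects would normally rely on a statistics package and try several starting points.

```python
import math

def k_means(points, centres, iterations=10):
    """Plain k-means: assign each point to its nearest centre, then recompute centres."""
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: math.dist(p, centres[i]))
            clusters[nearest].append(p)
        # Move each centre to the mean of its members; keep it if the cluster is empty.
        centres = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centre
            for members, centre in zip(clusters, centres)
        ]
    labels = [
        min(range(len(centres)), key=lambda i: math.dist(p, centres[i])) for p in points
    ]
    return centres, labels

# Hypothetical respondents scored on two 1-10 ratings.
points = [(1, 2), (2, 1), (1, 1), (8, 9), (9, 8), (9, 9)]
centres, labels = k_means(points, centres=[(0, 0), (10, 10)])
print(centres, labels)
```

The algorithm will return two groups whatever the data looks like, which is why the resulting clusters need checking against other methods and prior expectations.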

As clustering always creates groups, a mix of approaches is generally recommended to see if different algorithms discover the same groups - if they do, it increases the probability that the clusters are real groups and not just artefacts of the clustering.

Clustering works best when there is some a priori feeling for what types of segments might come out, as a recurring challenge for cluster-based solutions is replicating the cluster solution in the real world. Having some ideas beforehand makes it easier to check and validate the groups. If clustering is the only method used to uncover the groups, then some form of scoring or marking will be needed to try to find the groups in subsequent follow-up work. It is common to use regression and correlation analysis to identify these groups, but CHAID might also be possible.

Decision trees, CHAID and CART

Decision trees are methods for splitting groups of respondents to identify variables that explain key potential differences and drivers of behaviour. The aim is to produce a prediction model that can be used to forecast behaviours or propensity. CHAID (chi-square automatic interaction detection) and CART (classification and regression trees) are commonly used in market research. The concept of decision trees has also been extended and developed for use in machine learning across large datasets, giving rise to large scale techniques such as random forests.

The intuitive aim is to relate a key variable, for instance buyers versus non-buyers, to underlying drivers. By hand, we would compare buyers with non-buyers, looking for significant differences on other variables. For instance, are younger people more likely to be buyers than older people? Similarly, are people in the north more likely to buy than people in the south? CHAID or CART looks through the variables automatically for significant differences, and picks the variable with the greatest statistical impact - say age. The implication is that age predicts likelihood to buy. The process then runs through the remaining variables within each branch, looking for the next level of significant relationships, if any. The result is a 'tree' of linked variables that contribute to the buyer/non-buyer split - hence decision trees.
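The first step of such a search can be illustrated with a heavily simplified sketch: compute a chi-square statistic for each candidate variable against buyer/non-buyer and pick the largest. This is not full CHAID, which also merges categories and compares adjusted p-values; comparing raw statistics is only reasonable here because both tables have the same shape. The counts and names are invented.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows are predictor categories, columns are (buyers, non-buyers).
candidates = {
    "age (younger/older)": [[60, 40], [30, 70]],
    "region (north/south)": [[48, 52], [42, 58]],
}

scores = {name: chi_square(table) for name, table in candidates.items()}
best_split = max(scores, key=scores.get)
print(best_split, {name: round(value, 2) for name, value in scores.items()})
```

A tree would then repeat the same search separately within the younger and older groups, stopping when no remaining variable shows a significant difference.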

With large numbers of variables and large amounts of data the search through the data for relationships can be extremely large and unwieldy. Consequently, automatic and randomised schemes are used to find potential interactions within the dataset using machine learning tools.

Perceptual mapping and multi-dimensional scaling

Perceptual mapping is a method for investigating product or brand positioning by converting judgements and ratings about the brand into a two-dimensional map that shows which brands are similar, and which are different, in the minds of consumers on core underlying dimensions.

A simplistic approach is simply to map brands directly on ratings. A simple two-variable map might be perceived price against perceived quality. However, where there are multiple ratings and comparisons between the products, the data has to be reduced and transformed before being mapped.

The result shows how brands or products are grouped in the minds of consumers in a visual format, indicating products that are competing together (tightly grouped) versus those that have established differences from the group. With suitable data, the maps can also be centred on an 'ideal point' showing distances and directions for each brand from the overall sweet spot.

Experimental design, ad testing and product testing

Often research is used to evaluate different combinations of product features, or how well advertising works, or what the effect of different combinations of media is on advertising impact. These all fall under the auspices of statistical experimental design. This can involve evaluating a test versus control to see if the test has a statistical advantage, or, with more variations to test, it can involve formal experimental plans that enable multiple factors to be tested in combination.
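The simplest formal plan is a full factorial design, where every level of every factor is combined with every other. The sketch below enumerates the test cells for three made-up advertising factors; the factor names and levels are purely illustrative.

```python
from itertools import product

# Hypothetical factors and levels for an ad test.
factors = {
    "headline": ["A", "B"],
    "image": ["product", "lifestyle"],
    "price": ["9.99", "12.99", "14.99"],
}

# Full factorial plan: one test cell per combination of levels.
cells = [dict(zip(factors, combo)) for combo in product(*factors.values())]

print(len(cells))
for cell in cells[:3]:
    print(cell)
```

The number of cells multiplies quickly as factors are added, which is why reduced (fractional) plans are often used when there are many variations to test.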

Because surveys have an element of statistical and psychological noise, the design of experiments and tests requires planning and structure to ensure comparisons can be properly made. Full testing of advertising impact involves pre- and post-measurement, and the use of test and control areas to measure uplift. Evaluative testing of products or designs, meanwhile, needs to be considered both monadically (ratings taken for the test product before comparison) and comparatively, and can be branded or unbranded, open or blinded.

Blending data from multiple sources

A common feature of many surveys is the ability to blend in real behavioural data, for instance taken from a database, or based on observation of someone completing a customer journey plus survey task. Data from database analysis - for instance purchase rates, periodicity, last purchase information, and variables from websites such as level of website engagement - can be combined with survey data.

Typically survey data will contain information about opinions and perceptions, whereas the database data is narrow around facts relating to purchase or usage (eg visits to a website). Typically database data is multi-level and time-based (eg information about multiple shopping trips, or multiple products purchased), whereas survey data is normally 'flat' - a single record per respondent. Combining database and survey data therefore requires restructuring and summarising the data, including coding, counts and deriving meta data.
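The restructuring step can be sketched as summarising the multi-row database data to one record per respondent and then joining it to the flat survey file. The field names (`respondent_id`, `spend`, `day`, `satisfaction`) and the records are hypothetical.

```python
from collections import defaultdict

# Hypothetical database extract: several rows per customer.
transactions = [
    {"respondent_id": 1, "day": 3, "spend": 20.0},
    {"respondent_id": 1, "day": 40, "spend": 35.0},
    {"respondent_id": 2, "day": 12, "spend": 15.0},
]

# Hypothetical survey file: one row per respondent.
survey = [
    {"respondent_id": 1, "satisfaction": 4},
    {"respondent_id": 2, "satisfaction": 2},
    {"respondent_id": 3, "satisfaction": 5},
]

def summarise(rows):
    """Collapse transaction rows into counts, totals and last purchase day per person."""
    summary = defaultdict(lambda: {"purchases": 0, "total_spend": 0.0, "last_day": None})
    for row in rows:
        person = summary[row["respondent_id"]]
        person["purchases"] += 1
        person["total_spend"] += row["spend"]
        if person["last_day"] is None or row["day"] > person["last_day"]:
            person["last_day"] = row["day"]
    return summary

behaviour = summarise(transactions)
no_purchases = {"purchases": 0, "total_spend": 0.0, "last_day": None}

# Attach the behavioural summary to each survey record; non-buyers get zero counts.
flat = [{**person, **behaviour.get(person["respondent_id"], no_purchases)} for person in survey]
for record in flat:
    print(record)
```

Respondents with no matching database rows are kept with zero counts rather than dropped, so that buyers and non-buyers can still be compared on their survey answers.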

However, the combination of survey data and behavioural insight allows investigation of how attitudes drive purchase, and allows for the development of models and segmentations to identify new prospects, or different groups of customers, that are not immediately obvious from database data alone.
