9 MIN READ

How to Use Synthetic Data in Market Research

Dr Katharine Johnson

    In recent years, the rapid advancement of technology and data analytics has revolutionised working practices in the market research industry. One significant and growing innovation to watch is synthetic data.

    Synthetic data is artificially generated data that mimics real-world data without compromising privacy or sensitive information. Many insight experts are working to better understand its potential, but how much do we currently know about its advantages and applications in market research?

    What is Synthetic Data?

    Synthetic data is generated using algorithms and statistical methods rather than being collected from real-world responses. It is designed to represent the characteristics of actual data realistically, including patterns, distributions, and correlations. Synthetic data can be created using generative adversarial networks (GANs) and simulation models.

    The key advantage of synthetic data is its ability to mimic real data while not needing to collect personally identifiable information (PII) from the participants who would usually complete real-world studies. This removes the ethical and legal concerns associated with using actual data, particularly when it comes to personal information. Therefore, the ability to generate synthetic data is valuable in many businesses and industries where data privacy is extremely important, such as healthcare, finance, and market research.

    Advantages of Synthetic Data in Market Research

    1. Data privacy

    To comply with the GDPR, companies need to carefully manage their data: who has access to it, how long it is stored, where it is stored, and so on. Synthetic data allows researchers to conduct analysis without the risk of storing or exposing personal information, which helps them meet these legal requirements.

    2. Cost-effectiveness

    Synthetic data is generally considered more cost-effective than data from a primary source. Collecting and cleaning real-world data can be expensive and time-consuming, whereas synthetic data can be generated quickly and at a lower cost. This could allow researchers to allocate resources and funds more efficiently, including to other projects.

    3. Enhance or boost data sets

    Synthetic data could be generated to fill in missing data in partially completed datasets or to boost collected data to a required sample size. The same algorithms could also help create a more balanced representation of various demographic groups, potentially leading to more robust analyses and insights.

    4. Flexibility and scalability

    Synthetic data could be generated to the researchers’ requirements, for example to provide varying sample sizes for specific demographic characteristics. This flexibility may enable market researchers to simulate different scenarios and test concepts and hypotheses without being constrained by the availability of real-world data.

    5. Rapid testing

    In market research, the ability to quickly gain feedback and test ideas is crucial. Synthetic data could enable researchers to simulate market conditions and consumer behaviour, allowing for a faster turnaround of insights.

    Potential Uses of Synthetic Data in Market Research

    Boosting sample sizes

    Synthetic data could be used to reach a specific number of responses. For example, if a researcher required a sample of 1000 but only achieved 920 responses, synthetic data could be used to generate the additional 80. This could be particularly useful when audiences are difficult to reach. The synthetic data could generate responses that mimic the behaviour of these audiences, with the additional advantage that the sample can be completed within a set timescale.
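
    The 920-to-1000 arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the survey fields are hypothetical, and simple random resampling of real responses stands in for a trained generative model, which is what a real synthetic data tool would use. Synthetic rows are flagged so they can be reported transparently.

```python
import random

def boost_sample(real_rows, target_n, seed=0):
    """Top up real_rows to target_n responses.

    Assumption: drawing whole rows at random from the real responses
    (a simple bootstrap) stands in for a trained generative model.
    """
    rng = random.Random(seed)
    shortfall = max(0, target_n - len(real_rows))
    real = [dict(row, synthetic=False) for row in real_rows]
    synthetic = [dict(rng.choice(real_rows), synthetic=True) for _ in range(shortfall)]
    return real + synthetic

# Hypothetical survey: 920 achieved responses against a target of 1000.
real = ([{"age_group": "18-34", "would_buy": "yes"}] * 500
        + [{"age_group": "35+", "would_buy": "no"}] * 420)
boosted = boost_sample(real, target_n=1000)
print(len(boosted), sum(row["synthetic"] for row in boosted))  # 1000 80
```

    Because resampling can only repeat answers that real participants already gave, it cannot add new information about hard-to-reach audiences; it simply shows where the extra 80 rows would sit in the dataset.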

    Completing partially completed datasets

    The dropout rate within a survey varies depending on the survey’s complexity, number of questions, difficulty, how engaging the content is, ease of completion, etc. However, synthetic data could be generated to ‘fill in’ any incomplete responses where a participant has dropped out before completion. The algorithms are able to generate responses based on the data already completed and any known demographic data.
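
    As a rough sketch of the idea, the snippet below fills a missing answer with the most common answer given by completed respondents in the same demographic group, falling back to the overall most common answer when a group has no completed responses. The field names (`age_group`, `q5`) are hypothetical, and a per-group mode is a deliberately simple stand-in for the model-based generation described above.

```python
from collections import Counter, defaultdict

def fill_incomplete(rows, question, group_key):
    """Fill missing answers (None) to `question` using the most common
    answer among completed respondents in the same group.

    Assumes at least one respondent answered the question.
    """
    answered = [r for r in rows if r[question] is not None]
    overall = Counter(r[question] for r in answered)
    by_group = defaultdict(Counter)
    for r in answered:
        by_group[r[group_key]][r[question]] += 1

    completed = []
    for r in rows:
        if r[question] is None:
            # Fall back to the overall distribution for unseen groups.
            counts = by_group.get(r[group_key]) or overall
            value = counts.most_common(1)[0][0]
            completed.append(dict(r, **{question: value, "imputed": True}))
        else:
            completed.append(dict(r, imputed=False))
    return completed

rows = [
    {"age_group": "18-34", "q5": "yes"},
    {"age_group": "18-34", "q5": "yes"},
    {"age_group": "18-34", "q5": "no"},
    {"age_group": "35+", "q5": "no"},
    {"age_group": "35+", "q5": "no"},
    {"age_group": "18-34", "q5": None},  # dropped out before q5
    {"age_group": "55+", "q5": None},    # no completed answers in this group
]
for row in fill_incomplete(rows, "q5", "age_group"):
    print(row)
```

    Marking filled-in values with an `imputed` flag keeps them distinguishable from real answers in later analysis.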

    Generation of data from specific demographic groups or segmentations

    Synthetic data could be used to generate data for under-represented groups within your sample, to boost responses from specific target groups, or to reflect different segments or consumer characteristics.
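
    Before generating anything, it helps to know how many synthetic responses each group would need. The sketch below works this out for a hypothetical sample and hypothetical target shares, assuming every real response is kept and only under-represented groups are topped up. Shares are passed as strings so the arithmetic stays exact.

```python
import math
from fractions import Fraction

def top_up_plan(real_counts, target_shares):
    """Synthetic responses needed per group to match the target shares
    while keeping every real response.

    target_shares maps group -> share as a Fraction or a string such as
    "0.3" (floats are avoided so the arithmetic stays exact).
    """
    shares = {g: Fraction(s) for g, s in target_shares.items()}
    if sum(shares.values()) != 1:
        raise ValueError("target shares must sum to 1")
    # Smallest boosted total (rounded up) at which no group exceeds its share.
    total = max(math.ceil(real_counts[g] / shares[g]) for g in shares)
    return {g: max(0, round(shares[g] * total) - real_counts[g]) for g in shares}

# Hypothetical sample of 1000 in which 18-34s are under-represented.
counts = {"18-34": 150, "35-54": 400, "55+": 450}
targets = {"18-34": "0.3", "35-54": "0.4", "55+": "0.3"}
print(top_up_plan(counts, targets))  # {'18-34': 300, '35-54': 200, '55+': 0}
```

    In this example 500 of the 1500 boosted responses would be synthetic, so a plan like this also shows how heavily the final dataset would lean on generated data.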

    Consumer behaviour analysis

    Whole synthetic datasets could be generated to reflect specific consumer profiles and then used to analyse consumer behaviour, purchasing behaviour and trends. This could help companies understand their customer groups in more detail and tailor their marketing strategies or products to meet the needs of their consumers.

    Survey design and testing

    Researchers could generate responses to survey questions using synthetic data before collecting real-world data. This may help researchers test survey questions and the method used prior to data collection to ensure the survey is effective in capturing valuable insights. This will also help test the accuracy of the synthetic data (more about that later).

    Product development or A/B testing

    Synthetic data could also be used to simulate market responses to new products or services, so that market researchers could evaluate the potential of new products or the effectiveness of different strategies. The data could represent scenarios that reflect different market conditions, allowing companies to assess potential success and make informed decisions about a new product launch.

    Predicting trends

    By analysing synthetic datasets that simulate future market conditions, researchers could identify emerging trends and consumer behaviours. This foresight could enable businesses to stay ahead of the competition and adapt their strategies accordingly.

    Improving predictive models

    Market researchers use predictive models in the analysis of real-world data, and synthetic data could be used to help improve the accuracy of these models.

    Current Challenges

    While synthetic data could have numerous advantages, there are challenges that researchers must consider. One concern is the potential for synthetic data to introduce biases if the generation process does not accurately reflect real-world conditions, e.g. if the training data does not accurately represent the target population. Additionally, while synthetic data can mimic the statistical properties of real data, it may not capture the nuances and complexities of human behaviour and may be quite limited in its diversity. The quality of the data will depend on the algorithms used to generate it, and it could be hard to validate the trustworthiness of the generated data.

    Researchers must also ensure that the synthetic datasets they generate are used ethically and responsibly. They should be transparent about the methodologies used to create and analyse synthetic data in order to maintain trust and credibility within the industry, especially as synthetic data may not comply with all regulatory standards in some highly regulated industries.
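
    One simple check, offered here only as an illustrative sketch, is to compare how answers to a question are distributed in the real and synthetic data. The function below computes the total variation distance between the two sets of proportions: 0 means identical proportions and 1 means no overlap. The answers are hypothetical, and any threshold for an acceptable distance would be the researcher's own judgement rather than an established standard.

```python
from collections import Counter

def total_variation_distance(real_answers, synthetic_answers):
    """Half the sum of absolute differences between the answer
    proportions of two non-empty lists of categorical answers."""
    real = Counter(real_answers)
    synth = Counter(synthetic_answers)
    n_real, n_synth = len(real_answers), len(synthetic_answers)
    categories = set(real) | set(synth)
    return 0.5 * sum(abs(real[c] / n_real - synth[c] / n_synth) for c in categories)

# Hypothetical answers to one yes/no question.
real = ["yes"] * 60 + ["no"] * 40
synthetic = ["yes"] * 75 + ["no"] * 25
print(round(total_variation_distance(real, synthetic), 2))
```

    A check like this only covers one question at a time, so it says nothing about whether relationships between questions, or the nuances of individual behaviour, have been preserved.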

    Other Risks and Accuracy of Synthetic Data

    At present, the use of synthetic data within the market research industry is emerging from its infancy. There are companies offering to generate synthetic data, but researchers remain cautious about the accuracy of these algorithms. Recently, FlexMR tested the potential use of synthetic data in three scenarios: boosting overall response rates, completing incomplete responses and boosting under-indexed segments. The results, which showcase the reliability and accuracy of the synthetically generated data, are extremely interesting and will be discussed at the upcoming Market Research Society Financial Services Conference 2024.

    Ultimately, synthetic data has the potential to help fill sampling gaps, but it needs to be used with caution. Declining participation rates or the need to fill difficult sample quotas may lead researchers to use more synthetic data, but they need to ensure that it is accurate and representative of the target population, including minority groups. However, the benefits of not having to handle PII or other sensitive or confidential data, together with cost-effectiveness, flexibility and the ability to enhance data quality, could make synthetic data a good option in certain circumstances in the future. As technology continues to evolve, the potential applications of synthetic data in market research will undoubtedly grow.
