In the dynamic world of market research, robust and accurate data is critical. Whilst the growing prevalence of generative AI has brought a range of benefits to market research processes, such as enhanced data analysis and improved cost and time efficiency, a common concern is that the resulting insights will be less reliable. Synthetic data is particularly controversial because of its artificial nature: it mimics real-world data but does not contain any actual records of real individuals. Synthetic data could be the key to unlocking richer insights, but does it remove the ‘heart’ from the industry?
Why Not Synthetic Data?
While useful in certain contexts, synthetic data may not always be the best option for research. It often lacks the complexity of real-world data, which can lead to oversimplified models and incorrect conclusions. Validation against real data can support its credibility, but relying solely on synthetic data can compromise the validity of research findings.
Mike Stevens, Insight Platforms CEO, provides his take on the use of synthetic data applications in the industry in a video with Nexxt Intelligence, highlighting that “human understanding and empathy” is the essence of market research. Mike describes his experience, particularly in qualitative research, of synthetic datasets never inducing a smile or a laugh in the way that a truly human, unpredictable response can.
Synthetic data also poses a challenge for companies whose primary business model is the purchase, processing, and resale of data, though many would view this as a point in favour of synthetic data. Generating data rather than collecting it from real people reduces the need to store personal information about individuals, enhancing data privacy.
Why Synthetic Data?
Despite validity concerns, synthetic data is a valuable resource for several reasons. Though the absence of real-world records is controversial, it allows researchers to explore scenarios that would be difficult to study with real-world data, and it addresses privacy concerns around data sharing at the same time. Testing and validating algorithms and Large Language Models (LLMs) also becomes simpler and more robust when a greater pool of test data is available. Some of the key benefits are summarised below:
Cost
Acquiring research data, for example by sampling via a panel provider or purchasing datasets, can be expensive, and businesses with larger budgets certainly have an advantage. FlexMR’s CMO, Chris Martin, speaks to the benefits of synthetic data as a leveller of the playing field between “established brands and challengers”. The option to scale up datasets using synthetic data means a more modest budget need no longer be a barrier to gleaning wide-scale insight.
Scalability
Once a generative model is developed, it can produce large volumes of synthetic data quickly and efficiently, enabling the creation of extensive datasets without the logistical challenges of real-world data collection.
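To make the idea concrete, here is a minimal Python sketch of "develop the model once, then generate at scale". It is an illustration only, not a description of any vendor's method: the survey records, the `fit_model` and `generate` helpers, and the choice of a simple frequency-based model are all assumptions made for this example.

```python
import random
from collections import Counter

# Hypothetical real survey responses: (age_band, satisfaction score 1-5).
# These values are invented purely for illustration.
REAL_RESPONSES = [
    ("18-34", 4), ("18-34", 4), ("18-34", 5), ("18-34", 3),
    ("35-54", 3), ("35-54", 4), ("35-54", 2),
    ("55+", 5), ("55+", 4),
]


def fit_model(responses):
    """Count how often each (age_band, score) profile occurs in the real sample."""
    counts = Counter(responses)
    profiles = list(counts.keys())
    weights = [counts[profile] for profile in profiles]
    return profiles, weights


def generate(model, n, seed=0):
    """Draw n synthetic respondents in proportion to the observed frequencies."""
    profiles, weights = model
    rng = random.Random(seed)  # seeded so the output is reproducible
    return rng.choices(profiles, weights=weights, k=n)


# The model is fitted once on nine real records...
model = fit_model(REAL_RESPONSES)
# ...and can then produce as many synthetic records as required.
synthetic = generate(model, 10_000)
print(len(synthetic), "synthetic respondents generated")
```

Note that a model this simple can only reproduce profiles it has already seen in the real sample, which also illustrates the concern raised earlier: the synthetic output is only as rich as the real data behind it.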
Learning applications
Customer segmentation is widely used in market research, but it is not easy to create profiles that are dynamic: profiles that grow and change as the real individuals they are based on grow and change every day. Chris describes the learning opportunity in using synthetic data to support the creation of interactive personas which react to current events and issues.
On Balance
Does synthetic data provide faster, more cost-effective, scalable market insights? Yes. Does it hold the key to richer market insights? Not necessarily. While volume may enhance the richness of insights by facilitating better generalisations, truly rich, detailed qualitative responses still depend on exploring real human experiences. Steve Phillips, Founder and CEO of Zappi, asserts in the 2024 MRS Report 'Using synthetic participants for market research' that “synthetic data will never be a replacement for real consumer data, [as] AI is only as smart and up-to-date as the data it’s trained on”. Real-world data will always be needed to inform AI-powered technology. As the name implies, synthetic data is artificial: it cannot render outputs which reflect truly human thoughts, feelings and experiences without us.
How can we make synthetic data work for us?
The advantage of removing real consumer data in certain scenarios is clear. Most consumers, myself included, feel reassured by any development which serves to protect our personal data.
One potential use which supports this could be creating hypothetical scenarios that are rare or difficult to observe in real life. This can provide insights into potential outcomes without ethical or logistical concerns. Take, for example, research into how consumers might feel when their holiday is cancelled: travel companies (hopefully) aren’t about to cancel our upcoming holidays to test our reactions.
But using synthetic data, perhaps in the form of the interactive personas discussed above, they may be better equipped to predict and mitigate any chain reactions which follow the cancellation of a holiday: what will make that consumer feel better in that situation, and make them want to continue using the travel provider? A further exercise could follow, asking real consumers how they feel about the results of the synthetic data study: is it a true representation of how they might feel? In this way, synthetic data can be used to drive real change while keeping the consumer at the heart of research and ensuring their interests remain protected.
Ultimately, synthetic data should be seen for what it is: a tool, like many others, that enhances the scale of our understanding and world view, but is not a replacement for customer closeness, or for the truly human connections and real-world experiences which continue to drive our desire to understand consumers and what is important to them.