Synthetic data is everywhere, but is it any good?
The market research sector has a problem: You don’t pick up your damn phone anymore. Some eight in 10 of us don’t answer when an unknown number calls, according to the Pew Research Center, a shift that has had a knock-on effect on pollsters’ ability to get us to share our thoughts. Online surveys, too, can be easily gamed, and because they require people to opt in by physically visiting a website, they can be even easier to ignore than phone surveys.

That’s where AI can help. Across the polling and consumer research industries, firms are using artificial intelligence to manufacture synthetic survey responses, creating plausible answers from fake people to stand in for, or pad out, real ones.

Qualtrics, the experience-management giant, now offers synthetic panels that take a survey as an input and produce record-level responses designed to be statistically modeled the same way as responses from 1,000 humans, according to Ali Henriques, the company’s executive director of market research. The system leans heavily on Qualtrics’ own data: A publicly available base model contributes between 5 and 10% of the final result, with the remaining 95%-plus drawn from the firm’s commissioned research and aggregated, anonymized client data, stripped of brands and no more than 18 months to two years old to keep it relevant.

It’s not just Qualtrics. In May, Gallup, the 90-year-old pollster, disclosed a partnership with Simile, an AI company founded by Stanford researchers, to build “agents” from in-depth interviews with around 1,000 members of its probability-based panel. But Gallup, which didn’t respond to an interview request, has been careful to say simulated responses won’t be used to produce its published population estimates, and has pledged never to present them as human answers.

“Our work on simulated responses is not a departure from that commitment,” the company said in its blog post announcing the partnership. “It is built on top of it.”

Such caution is needed, says