Your focus group loved sustainable packaging. Said they'd absolutely pay the premium. Three months after launch, you're staring at inventory that won't move and a product team that's stopped returning your emails.
This isn't a story about lying consumers or bad research. It's a story about a fundamental confusion in how we think about understanding people, one that gets worse, not better, when we add AI to the mix.
I've spent the last year designing and deploying synthetic persona solutions, watching companies excitedly deploy "the always-on voice of the customer" only to grow skeptical almost on first contact. The pattern is always the same: initial enthusiasm about scale and speed, followed by confusion when the predictions don't match reality, ending either in quiet abandonment and a vague sense that "AI isn't there yet," or, on rare occasions, in someone having the breakthrough realization, shifting their frame entirely, and suddenly finding the tool invaluable.
But here's what I've learned: the problem isn't that the AI isn't good enough. The problem is we're asking it to predict something humans can't even predict about themselves.
Last week a potential client asked me point-blank…
Can your personas tell me if people will actually buy this product?
I've been asked this question dozens of times. But this time, after a year of watching what works and what doesn't, I finally had clarity about the real answer.
No. But they can tell you what they'll say about it, and that might be more valuable than you think.
This confused them. It should confuse you too. Because if you can't predict behavior, what's the point?
The answer to that question requires us to understand something Richard Sutton, one of the founding fathers of reinforcement learning and a recent Turing Award winner, has explained about what large language models actually do. And more importantly, what they can't do.
Part 1: The Focus Group Lie
Here's the thing about focus groups. People don't lie to you.
They lie to themselves, and you just happen to be in the room.
Show someone two cars. Car A costs $35,000, gets 90 mpg, emits minimal CO2, and comes loaded with eco-friendly materials and technology. Car B costs $25,000, gets worse mileage, produces higher emissions, and has a traditional interior.
Ask them which they'd choose. Watch what happens.
When researchers tested this scenario with AI personas, the responses varied dramatically depending on how the personas were generated. But here's the deeper truth. The speech-reality gap in human decision-making is so well-documented that even AI trained on human discourse has learned to replicate it.
Survey researchers have documented a persistent gap between stated preferences and actual behavior, what they call social desirability bias, hypothetical bias, or the attitude-behavior gap. Meta-analyses show that people systematically overstate their willingness to pay for products in surveys compared to actual purchasing decisions. The pattern is consistent enough across domains that researchers treat it as a fundamental challenge in survey design.
The stated preference is clear. The revealed preference is... different.
This isn't about environmentalism specifically. It's about everything. College students say a four-year degree is worth it "even with loans" at rates far higher than their actual enrollment and completion behavior suggests. People say they'd pay more for ethical sourcing right up until they click "add to cart" on the cheaper option. We claim we'll cancel subscriptions we keep paying for months later.
The gap between what people say and what they do is so well-documented it's almost boring.
Except here's what makes it not boring. That gap is where all the valuable information lives.
When someone says they'd buy the eco-friendly car but chooses the cheaper one, they're not giving you useless data. They're telling you exactly what they value, what they wish they valued, what constraints actually bind their decisions, and what story they need to tell themselves and others about their choices.
The articulated reasoning, "I care about the environment, but I have student loans" isn't noise. It's not covering up the "real" preference. It's revealing the actual cognitive landscape they're navigating.
Competing values, practical constraints, identity maintenance, social signaling.
Traditional market research treats this as a problem to solve. What if it's actually the thing you're trying to measure?
Part 2: What LLMs Actually Learn (And Why That's Perfect)
In a recent podcast, Richard Sutton, who literally invented many of the core techniques in reinforcement learning, made a distinction that most people in AI completely missed.
The host asked him about large language models and their "robust world models." Sutton disagreed with the premise entirely.
"To mimic what people say is not really to build a model of the world at all," he said. "You're mimicking things that have a model of the world—the people. They have the ability to predict what a person would say. They don't have the ability to predict what will happen."
This might sound like a criticism of LLMs. It's not. It's a precise description of what they actually do, and once you understand it, everything about synthetic personas makes more sense.
LLMs learn from internet text. From Reddit arguments and blog posts and academic papers and customer reviews. They learn the discourse layer, the things humans say when they're articulating, explaining, justifying, objecting, recommending.
They don't learn from experience. They don't have bodies. They've never bought a car, tasted disappointment at a restaurant, or felt the physical relief of finally canceling a subscription they'd been meaning to drop.
Sutton's point is that this means LLMs can't predict what will happen in the world. They can only predict what people will say about what happens.
For most AI applications, this is a real limitation. If you're trying to build a robot that navigates physical space, or an agent that learns to play chess through trial and error, you need the kind of learning that comes from embodied experience, from trying things and seeing what actually happens.
But here's the thing. Humans are also much better at articulating their reasoning than predicting their own behavior.
We're incredibly fluent at explaining our values, walking through our decision-making process, articulating objections, and describing what appeals to us. We're articulate about trade-offs, about constraints, about competing priorities. We can tell you exactly why we care about sustainability but also why we're not sure we can afford the premium.
What we can't do reliably is predict which car we'll actually buy when we're standing in the dealership six months from now, exhausted from the negotiation, worried about the monthly payment, and just wanting to be done.
LLMs and humans are both operating in the same layer, the discourse layer, the explanation layer, the articulation layer.
And here's what nobody wants to admit. For most market research questions, that's exactly the layer you actually need.
Part 3: Where Ideas Actually Spread
Think about the last time you recommended a product to a friend.
You didn't run a cost-benefit analysis out loud. You didn't cite studies. You told a story about your experience, articulated why it worked for you, acknowledged potential downsides, maybe preempted an objection you thought they'd have.
That conversation, that specific act of articulation, is where most purchase decisions actually get influenced. Not at the moment of transaction, but in the accumulated conversations that happened before it.
This is why the speech-action gap isn't a bug in human psychology to be corrected. It's the primary mechanism through which ideas spread, products get adopted, and markets actually move.
When someone says "I love the idea of that eco-car but I'm worried about the monthly payment," they're not failing to predict their behavior. They're rehearsing the conversation they'll have with their spouse, with their financially prudent friend, with themselves at 2am when they're second-guessing the decision.
These articulated concerns don't just reflect internal deliberation, they shape it. The reasons we give out loud become the reasons we believe. The objections we voice to others become the objections we take seriously ourselves.
And critically, these are the conversations that happen at scale, in places you can't observe. Reddit threads about whether your product is worth it. Slack channels where someone asks if anyone's tried your service. Dinner table conversations about whether to upgrade.
You will never predict the exact behavior of Customer #4847. But if you understand what gets said in these conversational spaces, what resonates, what falls flat, what objections surface, what narratives take hold…you can actually influence the aggregate outcome.
This is what calibrated personas give you access to.
Not a crystal ball that predicts individual behavior. But a way to stress-test ideas in the discourse layer where they'll actually be discussed, debated, recommended, and dismissed.
When we calibrate a persona to sound more like their real human counterpart, training on actual interview transcripts, actual patterns of reasoning, we're not trying to predict what they'll do next Tuesday at 3pm.
We're trying to capture how they talk about decisions like this one. What values they invoke. What constraints they mention. What makes them defensive, what makes them curious, what talking points resonate.
Because here's what we've learned. When someone's synthetic persona says "I care about sustainability but my budget is tight right now," that's not a prediction that the real person will or won't buy your eco-friendly product.
It's intelligence about the cognitive terrain. It tells you that your marketing needs to address the guilt of not buying the premium option. That your messaging should acknowledge financial constraints rather than ignoring them. That there's an opening for a mid-tier option that lets people buy into the values without the full premium.
The persona isn't predicting behavior. It's revealing the conversational reality your product will enter into.
And when an AI persona's conversational reality aligns with that of its real human counterpart in one area, the two tend to align more broadly in others too. Huge!
What This Means In Practice
If you're used to thinking about synthetic personas as prediction engines, this reframe requires some tactical adjustments.
Stop asking "will they do X" questions. Start asking "how would they talk about X" questions.
Instead of: "Will this customer segment subscribe to our premium tier?" Ask: "What concerns would they voice about the premium tier? What would make it feel justified versus overpriced? How would they explain the decision to upgrade, or not to, if a colleague asked?"
The second set of questions gives you something actionable. You learn that price isn't the real objection, it's clarity about what the premium features actually do. Or you discover that the upgrade feels like admitting the basic tier wasn't good enough, creating a subtle ego barrier you hadn't considered.
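To make the reframe concrete, here's a minimal sketch in Python. Everything in it is illustrative: the persona description, the `call_llm` stand-in, and the question wording are assumptions for the example, not our product's API.

```python
# Minimal sketch: the same persona, asked a prediction-style question versus
# discourse-style questions. call_llm is a stand-in; wire in whatever chat
# client you actually use.

def call_llm(system_prompt: str, question: str) -> str:
    """Placeholder for a real LLM call."""
    return f"[persona response to: {question}]"

PERSONA = (
    "You are 'Dana', an operations lead on the basic tier of the product. "
    "You are budget-conscious, skeptical of upsells, and you explain "
    "decisions to colleagues before making them."
)

# The question to avoid: it asks for a behavioral prediction.
prediction_question = "Will you subscribe to our premium tier?"

# The questions to ask: they stay in the discourse layer.
discourse_questions = [
    "What concerns would you voice about the premium tier?",
    "What would make it feel justified versus overpriced?",
    "How would you explain upgrading, or not upgrading, if a colleague asked?",
]

for question in discourse_questions:
    print(call_llm(PERSONA, question))
```

The point isn't the code, it's the shape of the prompt: every question invites articulation rather than a forecast.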
Memories matter more than you think. If you know that 25% of your audience uses Google Chrome, don't be disappointed when your personas default to the general population, where Chrome's market share is closer to 70%. Instead, calibrate your personas on what you already know about your market and ask them the questions you don't know the answers to: give 25% of them the memory that they use that browser, and let that fact inform how they react to everything else. We support this in the product with Base Memories, or by adding manual_memories in chat.
This isn't cheating. It's recognizing that the value isn't in having the AI derive basic facts from first principles. The value is in how those facts shape the reasoning, the objections, the reactions.
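If you want to picture how that seeding might work, here's a hedged sketch. The `Persona` class and the memory wording are invented stand-ins for the idea behind Base Memories, not the actual product API.

```python
# Sketch: seeding a known market fact as a memory across a persona pool, so
# downstream answers are grounded in your audience rather than generic priors.

import random
from dataclasses import dataclass, field

@dataclass
class Persona:
    name: str
    memories: list[str] = field(default_factory=list)

personas = [Persona(f"persona_{i}") for i in range(100)]

# You know 25% of *your* audience uses this browser, whatever the global
# market share happens to be. Encode that known fact directly.
CHROME_SHARE = 0.25
for p in random.sample(personas, k=int(len(personas) * CHROME_SHARE)):
    p.memories.append("I use Google Chrome as my main browser.")

def build_system_prompt(p: Persona) -> str:
    """Each persona prompt carries its own memories into every question."""
    memory_block = "\n".join(f"- {m}" for m in p.memories) or "- (no seeded memories)"
    return f"You are {p.name}. Things you know about yourself:\n{memory_block}"

print(build_system_prompt(personas[0]))
```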
Use personas to explore, not to validate. The worst use case is: "We think Feature X will drive adoption. Let's ask the personas if they agree."
The better use case is “We think Feature X will drive adoption. Let's see what responses emerge. Enthusiasm, confusion, skepticism, or indifference. Let's understand why each reaction happens, what would shift it, what we're not seeing.”
You're not looking for a yes/no vote. You're mapping the possibility space of reactions.
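One way to operationalize "mapping the possibility space": sample many persona responses and code them into reaction categories rather than counting votes. The sketch below is an assumption-laden outline; `ask_persona` and `classify_reaction` are placeholders for real persona calls and a real coding step.

```python
# Sketch: collect free-text reactions from many personas, then label them into
# reaction categories. The output you act on is the distribution of reactions
# and the reasoning behind each one, not a single approval score.

from collections import Counter

REACTION_LABELS = ["enthusiasm", "confusion", "skepticism", "indifference"]

def ask_persona(persona_id: int, stimulus: str) -> str:
    """Placeholder for a calibrated persona call; returns a free-text reaction."""
    return f"[persona {persona_id}'s reaction to: {stimulus}]"

def classify_reaction(text: str) -> str:
    """Placeholder for a second LLM pass (or human coding) that labels the reaction."""
    return REACTION_LABELS[hash(text) % len(REACTION_LABELS)]  # stand-in only

stimulus = "We think Feature X will drive adoption."
reactions = [ask_persona(i, stimulus) for i in range(200)]
distribution = Counter(classify_reaction(r) for r in reactions)

print(distribution)
```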
Calibration is the difference between theater and insight. Uncalibrated personas give you generic LLM responses: helpful, balanced, articulate, and utterly disconnected from how your actual customers talk.
Calibrated personas, trained on real people, pick up the specific vocabulary, the particular concerns, the reasoning patterns of real humans. They surface the objection you didn't think to ask about. They use the frame that actually matters to this audience.
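For intuition only, here's the crudest possible version of that idea: a prompt seeded with verbatim interview excerpts. Our calibration engine does considerably more than this; the excerpts and function below are invented for illustration.

```python
# Deliberately simplistic illustration of calibration (not our engine):
# conditioning a persona on things a real participant actually said, so the
# synthetic voice inherits that person's vocabulary and reasoning patterns.

INTERVIEW_EXCERPTS = [
    "Honestly, the price isn't the issue, it's that I can't tell what the premium features do.",
    "I'd feel a bit silly upgrading, like I'm admitting the basic plan wasn't enough.",
]

def build_calibrated_prompt(name: str, excerpts: list[str]) -> str:
    quoted = "\n".join(f'- "{e}"' for e in excerpts)
    return (
        f"You are a synthetic persona based on {name}.\n"
        "Speak in their voice. Here are things they actually said in interviews;\n"
        "match this vocabulary, these concerns, and this style of reasoning:\n"
        f"{quoted}"
    )

print(build_calibrated_prompt("Participant 12", INTERVIEW_EXCERPTS))
```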
This is why we built our enterprise calibration engine around individual, persona-level calibration: smaller audiences, higher realism. Not to claim we can predict behavior, but to capture how real humans in your market actually talk about decisions like the one you're asking them to make. And it can be integrated directly into video generation for those times when you need to bring the personas to life.
The Honest Pitch
Here's what synthetics can't promise you: they can't tell you if Customer #4847 will click "buy now" on Tuesday.
Here's what synthetics can promise. They can help you understand what gets said about your product in the conversational spaces where decisions actually get shaped. What narratives take hold, what objections surface, what makes an idea feel compelling versus suspicious.
Synthetics can help you stress-test messaging before you spend the budget. Explore positioning options before you commit. Understand why an idea that seems obvious to you might land completely differently with the people you're trying to reach.
The catch is you have to stop thinking about prediction and start thinking about exploration. Stop looking for the crystal ball and start using the conversation mirror.
Because here's the thing. Even humans can't reliably predict their own behavior. But they're remarkably good at articulating their reasoning, and that reasoning, voiced and shared and socially reinforced, is what ultimately moves markets.
The speech-reality gap isn't a problem to solve. It's the territory we're all operating in.
Calibrated personas just help you navigate it more carefully.
DM me to discuss running our calibration engine to build your virtual focus group.