Product-market fit isn't just a milestone; it's an ongoing measurement in a constant state of flux. The tried-and-tested PMF survey gives a snapshot of whether people want what you've put into their hands. What if you could run the same measurement in minutes, dare I say, without involving any users?
That sounded crazy typing it, but if you could predict a PMF misfire, you could skip building the thing nobody wants and focus instead on the thing the market has an appetite for. Get a killer new feature idea in the shower after you've hit send on a real survey, and the opportunity to ask about it is gone. In the real world, you can't use data you didn't collect. But with an audience simulator, it's almost as good as having a time machine. Just spin up another crowd ready to roast and test your new value props. Rinse and repeat until you have enough PMF in the virtual world to merit the build.
Rally has real users, and we just ran a PMF survey, so I tested how well AI personas could predict the actual responses. First I built an audience cloned from demo transcripts, then gave the personas "memories" of the survey data while withholding the actual PMF question and responses. The AI personas knew things like how users explained Rally to others and the key benefit they got from it, but they didn't know whether they'd be disappointed if they could no longer use Rally.
The idea being that if this worked, you could run it for a product you haven't built yet by sourcing similar data from adjacent product offerings, e.g. scraping review websites or even surveying the users of established players. With your enriched AI personas, you could then battle-test your product concepts knowing the personas are grounded in real human voices.
Here's what happened.
Setting the Benchmark With Real Humans
The responses to my survey are still coming in, but after 32 real human submissions, here's how they answered "How would you feel if you could no longer use Rally?":
- Very Disappointed: 15.6%
- Somewhat Disappointed: 62.5%
- Not Disappointed: 21.9%
The 40% "Very Disappointed" rule of thumb suggests you've hit PMF when at least 4 in 10 users would be very disappointed to lose your product. Rally sits at 15.6%. Solid engagement, but clearly room to grow. The majority (62.5%) would be somewhat disappointed, indicating the product delivers value but isn't quite indispensable (yet).
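For the curious, here's a minimal sketch of that arithmetic in Python. The raw counts are back-calculated from the percentages above (5, 20, and 7 out of 32 respondents), so treat it as an illustration rather than the exact analysis script.

```python
from collections import Counter

# Tallies back-calculated from the percentages above, assuming all 32 answered.
responses = (
    ["Very Disappointed"] * 5
    + ["Somewhat Disappointed"] * 20
    + ["Not Disappointed"] * 7
)

counts = Counter(responses)
total = len(responses)

for bucket in ("Very Disappointed", "Somewhat Disappointed", "Not Disappointed"):
    print(f"{bucket}: {counts[bucket] / total:.1%}")

# The 40% rule of thumb: PMF when at least 4 in 10 would be very disappointed.
very_disappointed_share = counts["Very Disappointed"] / total
print("Hit the 40% threshold" if very_disappointed_share >= 0.40 else "Below the 40% threshold")
```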
This benchmark became my target: could AI personas replicate these exact percentages, synthetically?
When testing with the enriched AI personas, I gave them memories of answers to the other questions in the survey, just not the PMF question:
- What is the main benefit you receive from using Rally?
- How would you explain Rally to a friend or colleague?
- What's stopping you from purchasing a paid subscription?
- How could we double the value you get from Rally?
Here's what that looked like when added to the persona background.
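If you'd rather see it as code, here's a rough sketch of how real survey answers could be folded into a persona's background. The `Persona` class, helper function, and example answers are all hypothetical placeholders, not Rally's actual implementation.

```python
# Hypothetical sketch: fold a user's real survey answers (minus the PMF
# question) into the background text of an AI persona.
from dataclasses import dataclass

@dataclass
class Persona:
    name: str
    background: str  # demographic / psychographic profile text

def enrich_with_memories(persona: Persona, survey_answers: dict[str, str]) -> Persona:
    """Append each non-PMF survey answer to the persona background as a memory."""
    memories = "\n".join(
        f'- Asked "{question}", you answered: "{answer}"'
        for question, answer in survey_answers.items()
    )
    return Persona(
        persona.name,
        persona.background + "\n\nMemories from a recent survey:\n" + memories,
    )

# Illustrative placeholder answers for one respondent; the PMF question is withheld.
answers = {
    "What is the main benefit you receive from using Rally?": "Fast feedback without scheduling calls.",
    "How would you explain Rally to a friend or colleague?": "Synthetic focus groups on demand.",
}
enriched = enrich_with_memories(Persona("Respondent 07", "Early-stage founder, time-poor."), answers)
print(enriched.background)
```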
Designing Eight Synthetic Panels
I tested the question across a few models, with and without memories (data enrichment), and while tweaking the product description.
Models tested:
- Anthropic Claude (fast)
- OpenAI GPT-4 (fast)
- Google Gemini (fast)
Context levels:
- Generic personas: Basic demographic and psychographic profiles
- Enriched personas: Loaded with real user memories of actual survey responses
Copy tone:
- Positive marketing blurb: e.g. "Rally is used as a rapid fire replacement for running focus groups when..."
- Neutral factual description: e.g. "Rally is a research platform that generates synthetic personas so you can..."
Each of the eight panels answered the identical PMF question.
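To make the matrix concrete, here's a rough sketch of how the panel configurations could be enumerated. The model identifiers and structure are placeholders, and not every combination needs to become a panel (I ran eight).

```python
# Hypothetical sketch of the panel matrix: each panel is a (model, context, tone)
# combination, and every panel answers the same PMF question.
from itertools import product

PMF_QUESTION = "How would you feel if you could no longer use Rally?"
OPTIONS = ("Very Disappointed", "Somewhat Disappointed", "Not Disappointed")

models = ["anthropic-fast", "openai-fast", "gemini-fast"]   # placeholder model names
contexts = ["generic", "enriched"]                          # persona context level
tones = ["positive_blurb", "neutral_description"]           # product copy shown to personas

# Enumerate the candidate configurations; pick the subset you actually want to run.
for model, context, tone in product(models, contexts, tones):
    print(f"Panel: {model} | {context} personas | {tone} copy -> ask: {PMF_QUESTION}")
```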
The Results
Key findings:
These results show that panels given the positive-tone product description routinely over-predicted "Very Disappointed" (often >90%). More neutral copy boosted accuracy, and enriching with real survey data improved it further still. The Anthropic fast model came out on top with an overall accuracy of 88.5%: around 17 points off the real humans' "Very Disappointed" and "Somewhat Disappointed" shares, and nearly perfect on "Not Disappointed". OpenAI wasn't far behind, though, and it's a model you can use on our $20/month plan.
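For anyone curious how a figure like 88.5% can be scored, one simple metric that lands in this ballpark is 100 minus the mean absolute error (in percentage points) across the three answer buckets. The synthetic percentages below are illustrative, not the actual panel output.

```python
# Hedged sketch: score a synthetic panel against the real survey distribution
# as 100 minus the mean absolute error (in percentage points) across buckets.
real = {"Very Disappointed": 15.6, "Somewhat Disappointed": 62.5, "Not Disappointed": 21.9}

# Illustrative synthetic distribution: ~17 points high on "Very Disappointed",
# ~17 points low on "Somewhat Disappointed", near-perfect on "Not Disappointed".
synthetic = {"Very Disappointed": 33.0, "Somewhat Disappointed": 45.0, "Not Disappointed": 22.0}

mae = sum(abs(real[k] - synthetic[k]) for k in real) / len(real)
accuracy = 100 - mae
print(f"Mean absolute error: {mae:.1f} points, accuracy: {accuracy:.1f}%")
```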
The key takeaway for me: generic personas (without enriched memories) consistently missed the mark by wider margins. That strengthens the case for strategic data enrichment if you want to use synthetic audience simulators to pre-validate your product.
Wrapping up
PMF isn't a one-and-done measurement. Markets shift. Products evolve. Competition emerges. You need to constantly re-measure where you stand. Even though AI has made it easier and faster to build products, shipping them, acquiring users, and collecting PMF signal still takes time. Now that I have an audience simulator showing some promising PMF predictive signal, one test I'd like to run next is simulating the many ideas that have been discussed internally and stress-testing our current roadmap to see if the bets work in silico. Then truth-check with the humans, every cycle sharpening a mental model of what really moves the needle.

PMF measurement just got a lot faster. Use it wisely.