Q1: Character Designs for a Conquest MMO (Seaborne)
Character design represents one of the most critical early decisions in game development, as it influences downstream design choices and can be extremely costly to change later. This experiment compares two distinct visual approaches for Seaborne, a conquest MMO, to determine which character style resonates more strongly with target players and why certain aesthetic choices drive preference.

Tested preference between realistic/Pixar-style character designs versus blocky/polygon-style designs for a conquest MMO game. AI personas showed 96% preference for realistic design compared to 86% human preference, correctly identifying the winner but with overconfidence.

Hypothesis
"Realistic character designs will be preferred over blocky designs by mobile gamers due to their polished appearance and broader market appeal, making them more suitable for mainstream MMO audiences."
Introduction
Our first question for development of the Seaborne game concept is “Which of these character designs for a conquest MMO do you like better, and why?” We have two designs: one more realistic or Pixar-looking, and another more blocky or polygon-style. We want to know which to go with, because the choice affects a lot of downstream design decisions as we develop the game. Getting it wrong and changing it later would be supremely costly.
Methodology
Data Source: James Cramer, Skunkworks
Audience: US Mobile Games - Core
Demographics:
Geographic Focus: United States-based mobile game players
Age Distribution: Primarily middle-aged gamers, with the largest segment being 35-44 years old (42%), followed by equal representation from 25-34 and 45-54 age groups (20% each)
Gender Split: Male-dominated audience (66%) with significant female representation (30%), plus small percentages of non-binary and other gender identities
Gaming Preferences: This audience shows diverse gaming interests across multiple genres:
Top Categories: Adventure games (11.5%), Strategy (11%), and Role-playing (10%) represent the most popular genres
Secondary Interests: Word games (9%), Card games (7%), Trivia (7%), and Simulation games (7%)
Action & Casual: Moderate interest in Action (12%), Racing (5.5%), Sports (5.5%)
Niche Segments: Smaller but notable groups enjoy Educational (3.5%), Casino (3%), Family (1.5%), and Music games (0.5%)
Simulator: chat
Results
Performance Metrics
Baseline: 0%
Optimized: 90%
Metric: Alignment (90% achieved)
Option | PickFu (Human) | Rally
Realistic (A) | 86% | 96%
Blocky (B) | 14% | 4%
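The report does not say how the Alignment metric is computed. As a rough sketch only, one plausible definition is 100 minus the total variation distance between the human and AI vote shares, which happens to reproduce the 90% figure for this experiment. The `alignment` function and the dictionary layout below are illustrative assumptions, not Rally's actual implementation.

```python
# Hypothetical alignment metric: 100 minus the total variation distance
# (in percentage points) between two vote-share distributions.
# This is an assumed definition, not a formula documented by Rally.
def alignment(human: dict[str, float], predicted: dict[str, float]) -> float:
    options = human.keys() | predicted.keys()
    # Sum of absolute gaps per option, halved so a shift of 10 points
    # from one option to another counts once rather than twice.
    tvd = sum(abs(human.get(o, 0.0) - predicted.get(o, 0.0)) for o in options) / 2
    return 100.0 - tvd


# Vote shares taken from the results table above.
human = {"Realistic (A)": 86.0, "Blocky (B)": 14.0}
rally = {"Realistic (A)": 96.0, "Blocky (B)": 4.0}

alignment_score = alignment(human, rally)
print(f"Alignment: {alignment_score:.0f}%")  # Alignment: 90%
```

With only two options, this reduces to 100 minus the gap on either option (10 points here), so it rewards matching the size of the preference and not just picking the winner.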
One of the things I find most helpful is seeing not just which option was chosen, but why. Respondents liked the polished look and broader appeal of the realistic design, whereas the blocky design was too reminiscent of Minecraft. If we ran this test with a younger audience we might see a completely different preference, which highlights how important it is to know which customers fit your ideal profile.
Analysis
Rally correctly identified the realistic design as the winner but overestimated its share (96% vs 86%). That prediction came from running Rally with Google rather than OpenAI, in Smart mode (the larger Google Gemini 2.0 model). OpenAI in Fast mode (GPT-4o mini) slightly preferred the blocky design instead, which highlights the importance of calibration: checking that the model and audience you chose give results that match your past experiments.
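A calibration check of this kind can be sketched in a few lines: compare a past human result with the AI prediction, confirm the winner matches, and measure how far the AI over- or undershoots the winner's share. The `calibration_check` helper below is a hypothetical illustration, not part of Rally. Only the Google Smart mode figures are used, because the report gives no exact shares for the OpenAI Fast mode run.

```python
# Hypothetical calibration helper (not a Rally API): compares one human
# result with one AI prediction for the same experiment.
def calibration_check(human: dict[str, float], predicted: dict[str, float]) -> dict:
    human_winner = max(human, key=human.get)
    predicted_winner = max(predicted, key=predicted.get)
    return {
        "winner_matches": human_winner == predicted_winner,
        # Positive means the AI overestimated the human winner's share.
        "overestimate_points": predicted[human_winner] - human[human_winner],
    }


# Shares reported for this experiment (PickFu vs Rally, Google Smart mode).
human = {"Realistic (A)": 86.0, "Blocky (B)": 14.0}
google_smart = {"Realistic (A)": 96.0, "Blocky (B)": 4.0}

calibration = calibration_check(human, google_smart)
print(calibration)
```

Running the same check for each model and mode across several past experiments would show which configuration tracks your audience most closely before you rely on it for a new question.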
Conclusions
Rally successfully identified the correct winner but demonstrated a pattern of overconfidence, predicting 96% preference versus the actual 86% human preference. The AI captured the core insight that realistic designs have broader appeal due to their polished appearance, while correctly identifying that blocky designs reminded players too much of Minecraft. This experiment reveals that AI testing provides directionally accurate results but may overestimate the margin of preference, suggesting the need for calibration against historical data.
Similar Experiments
Q2: Theme Preferences (Seaborne)
Mobile game theme testing: AI predicts player preferences for cartoon vs anime vs adventure...
Calibration
Q3: Theme Preferences (Goblin Quest)
Goblin Quest theme testing achieves 98% accuracy: AI vs human preferences for mobile game visual...
Calibration
Q4: Advert Preference (Goblin Quest Images)
Mobile game ad testing reveals AI bias: Solo vs team character advertisements show divergent...
Calibration
About the Researcher

Mike Taylor
Mike Taylor is the CEO & Co-Founder of Rally. He previously co-founded a 50-person growth marketing agency called Ladder, created marketing & AI courses on LinkedIn, Vexpower, and Udemy taken by over 450,000 people, and published a book with O'Reilly on prompt engineering.