Theme selection shapes the entire game experience and marketing approach, influencing player acquisition and retention rates. This experiment tests four distinct visual themes for Seaborne to identify which aesthetic direction best captures player interest and provides the strongest foundation for both gameplay and advertising creative development.
Q2: Theme Preferences (Seaborne)

Evaluated four theme options (Cartoon, Anime, Sunset, Island) for the Seaborne game concept. The AI correctly identified Sunset/Ships as the winner but overestimated its share (66% predicted vs 44% human preference) and missed the Cartoon option's appeal entirely (0% vs 18%).

Hypothesis
"Maritime adventure themes featuring sunset imagery will outperform cartoon and anime styles among mobile gamers due to their ability to convey action potential and broader demographic appeal."
Methodology
The next query we ran was designed to determine the theme: “Which theme would you most like to play from the four choices below?” A short blurb about the game gives context, followed by four images that represent different styles. This casts a wider net than character design alone; the aim is to determine more broadly which type of game appeals most to the audience.
Data Source: James Cramer, Skunkworks
Audience: US Mobile Games
Core Demographics:
Geographic Focus: United States-based mobile game players
Age Distribution: Primarily middle-aged gamers, with the largest segment being 35-44 years old (42%), followed by equal representation from the 25-34 and 45-54 age groups (20% each)
Gender Split: Male-dominated audience (66%) with significant female representation (30%), plus small percentages of non-binary and other gender identities
Gaming Preferences: This audience shows diverse gaming interests across multiple genres:
Top Categories: Adventure games (11.5%), Strategy (11%), and Role-playing (10%) represent the most popular genres
Secondary Interests: Word games (9%), Card games (7%), Trivia (7%), and Simulation games (7%)
Action & Casual: Moderate interest in Action (12%), Racing (5.5%), Sports (5.5%)
Niche Segments: Smaller but notable groups enjoy Educational (3.5%), Casino (3%), Family (1.5%), and Music games (0.5%)
Simulator: chat
Results
Performance Metrics
Baseline: 0%
Optimized: 87%
Metric: Alignment (87% achieved)
Option | PickFu (Human) | Rally
Cartoon (A) | 18% | 0%
Anime (B) | 10% | 14%
Sunset (C) | 44% | 66%
Island (D) | 28% | 20%
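The write-up does not say how the Alignment metric is computed. As a minimal sketch, assuming it is 100 minus the mean absolute gap (in percentage points) between the human and Rally shares, the figures in the table reproduce the reported 87%. The function name `alignment` and the formula itself are assumptions made for illustration, not Rally's documented method.

```python
# Vote shares in percent, copied from the results table.
human = {"Cartoon (A)": 18, "Anime (B)": 10, "Sunset (C)": 44, "Island (D)": 28}
rally = {"Cartoon (A)": 0, "Anime (B)": 14, "Sunset (C)": 66, "Island (D)": 20}


def alignment(human_pct, model_pct):
    """Assumed formula: 100 minus the mean absolute gap in percentage points."""
    gaps = [abs(human_pct[option] - model_pct[option]) for option in human_pct]
    return 100 - sum(gaps) / len(gaps)


score = alignment(human, rally)
print(f"Alignment: {score:.0f}%")  # prints "Alignment: 87%"

# Both audiences pick the same winner, even though the shares differ.
human_winner = max(human, key=human.get)
rally_winner = max(rally, key=rally.get)
print(human_winner, rally_winner)  # prints "Sunset (C) Sunset (C)"
```

The absolute gaps are 18, 4, 22, and 8 points, which average to 13, so the score is 100 - 13 = 87. Other distance measures (for example, total variation distance) would give a different number, so this should be read as one formula consistent with the reported result rather than a confirmed definition.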
Here we can see the potential issue with testing similar-looking designs: there was some confusion between the Anime and Sunset styles. This probably skewed the final results to some degree, so it would make sense to develop the testing plan further with better naming conventions or more differentiated designs. We correctly identified the Sunset winner, and better still, we have a reason why: players liked that it showed potential for action. We could incorporate that feedback into our next iteration with the design team and see whether different colors or scenes better evoke that sense of adventure in the audience.
Analysis
Rally's predictions tracked theme preferences closely overall, but it over-committed to the winning option. Sunset/Ships was correctly identified as the winner, yet Rally overshot its popularity (66% vs 44%) and missed the Cartoon option's modest appeal entirely (0% vs 18%). Rally's high-conviction, spiky predictions fit a broader trend: synthetic audiences tend to pick the right winner, but with too much conviction.
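One way to make the overconfidence visible is to look at the signed gap per option and at how concentrated each distribution is. The sketch below uses Shannon entropy as the concentration measure; that choice, and the helper names `signed_gaps` and `entropy_bits`, are assumptions for illustration and are not part of the original analysis.

```python
import math

# Vote shares in percent, copied from the results table.
human = {"Cartoon (A)": 18, "Anime (B)": 10, "Sunset (C)": 44, "Island (D)": 28}
rally = {"Cartoon (A)": 0, "Anime (B)": 14, "Sunset (C)": 66, "Island (D)": 20}


def signed_gaps(human_pct, model_pct):
    """Model share minus human share, in percentage points, per option."""
    return {option: model_pct[option] - human_pct[option] for option in human_pct}


def entropy_bits(pct):
    """Shannon entropy in bits; options with a 0% share contribute nothing."""
    probs = [share / 100 for share in pct.values() if share > 0]
    return -sum(p * math.log2(p) for p in probs)


gaps = signed_gaps(human, rally)
for option, gap in gaps.items():
    print(f"{option}: {gap:+d} points")

# Lower entropy means votes are packed into fewer options.
print(f"Human entropy: {entropy_bits(human):.2f} bits")
print(f"Rally entropy: {entropy_bits(rally):.2f} bits")
```

A positive gap means Rally overestimated an option (Sunset at +22 points) and a negative gap means it underestimated one (Cartoon at -18 points). Rally's distribution has lower entropy than the human one, which is another way of saying its predictions are more concentrated on the winner.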
Conclusions
Rally correctly identified the Sunset/Ships theme as the winner but again showed overconfidence, predicting 66% versus 44% actual preference. The AI completely missed the Cartoon option's appeal (0% vs 18% human preference) while reasonably approximating other options. The experiment highlighted potential issues with testing visually similar designs, as confusion between Anime and Sunset styles may have influenced results. The key insight that players preferred themes showing "potential for action" provides valuable direction for future design iterations.
Similar Experiments
Q1: Character Designs for a Conquest MMO (Seaborne)
Character design A/B testing for mobile MMO games: AI vs human preferences comparing realistic...
Calibration
Q3: Theme Preferences (Goblin Quest)
Goblin Quest theme testing achieves 98% accuracy: AI vs human preferences for mobile game visual...
Calibration
Q4: Advert Preference (Goblin Quest Images)
Mobile game ad testing reveals AI bias: Solo vs team character advertisements show divergent...
Calibration
About the Researcher

Mike Taylor
Mike Taylor is the CEO & Co-Founder of Rally. He previously co-founded a 50-person growth marketing agency called Ladder, created marketing & AI courses on LinkedIn, Vexpower, and Udemy taken by over 450,000 people, and published a book with O'Reilly on prompt engineering.