Better Zero-Shot Reasoning with Role-play Prompting

post

📎 paper_url https://arxiv.org/pdf/2308.07702

Aobo Kong, Shiwan Zhao, Hao Chen, Qicheng Li, Yong Qin, Ruiqi Sun, Xin Zhou, Enzhi Wang, Xiaohang Dong

Published: 2025-03-14

Better Zero-Shot Reasoning with Role-play Prompting

🔥 Key Takeaway:

The counterintuitive lesson: The fastest way to make an AI persona act more expert isn’t to give it more facts, but to simply tell it to “be” the right kind of person—role-play prompts like “Act as a math teacher” boost reasoning accuracy far more than adding extra data or examples, showing that identity cues, not information volume, are what unlock the model’s best responses.

🔮 TLDR

This paper demonstrates that prompting large language models (LLMs) like ChatGPT to take on specific roles (such as a “math teacher” or “quiz contestant”) before asking questions significantly improves their accuracy on a wide range of zero-shot reasoning tasks, compared to standard prompting or even the popular “Let’s think step by step” (Zero-Shot-CoT) technique. Across 12 benchmarks covering arithmetic, commonsense, symbolic reasoning, and other logic tasks, role-play prompts improved accuracy by up to 10 percentage points (e.g., AQuA accuracy rose from 53.5% to 63.8%, and Last Letter from 23.8% to 84.2%). The most effective prompt structure was a two-stage dialogue: first, set an “advantaged” role with detailed context; then, ask the question within that role. The choice of role matters—roles suited to the task (e.g., math teacher for math problems) consistently outperform irrelevant or disadvantaged roles. This method generalizes to other conversational LLMs, including open-source models, and works across model sizes. The key takeaways for building synthetic audiences or evaluators are: (1) immerse LLMs in relevant roles using multi-turn prompts, (2) carefully match roles to the task at hand, and (3) use structured, detailed context to maximize model reasoning performance.

📊 Cool Story, Needs a Graph

Table 2

Side-by-side task accuracy results clearly show how Role-Play Prompting outperforms standard zero-shot, Zero-Shot-CoT, and even Few-Shot-CoT across 12 diverse reasoning datasets.

Table 2 presents a comprehensive grid that directly compares the accuracy of four major prompting strategies—Few-Shot-CoT, standard Zero-Shot, Zero-Shot-CoT, and the proposed Role-Play Prompting—on 12 datasets spanning arithmetic, commonsense, symbolic, and other reasoning tasks. Each row corresponds to a method and each column to a dataset, with accuracy scores side-by-side for immediate visual comparison. The table highlights that Role-Play Prompting achieves the highest or nearly highest accuracy in almost all cases, clearly outperforming Zero-Shot and Zero-Shot-CoT, and often matching or exceeding Few-Shot-CoT, thus providing a compelling visual summary of the method’s effectiveness against all primary baselines.

⚔️ The Operators Edge

One detail most experts might overlook is that the real power of this method comes not just from asking the model to “act as” a persona, but from immersing it in a brief, two-turn “role-priming” exchange before the main question is posed. The study shows (see Table 4 and Table 10, pages 7 and 13) that having the AI generate a short, in-character response to the role-setting prompt—before being asked the actual task—substantially boosts accuracy, even more than just giving the role and question together. This primes the model’s internal context and deeply anchors it in the mindset needed for nuanced, realistic answers.

Why it matters: It’s easy to assume that simply tacking a role label (e.g., “You are a teacher…”) to your prompt is enough, but what actually drives the performance is the process of letting the model “inhabit” the role through an extra turn of dialogue. This extra immersion step ensures the AI’s responses are more consistent, detailed, and aligned with the expectations of the persona—making synthetic research results much more robust and lifelike.

Example of use: In a product concept test, a team simulates parent, teen, and athlete personas. Instead of just prompting “You are a busy parent. What do you think…?” they first ask the AI to introduce itself as that parent (“I’m always juggling work and home… I care about health but have little time.”), then follow up with the research question. This yields richer, more grounded feedback that better predicts real-world objections and motivators.

Example of misapplication: A researcher skips the two-turn immersion and simply combines a persona label with the main survey question (“You are a tech-savvy millennial. How would you react to this ad?”). The AI gives generic or shallow answers that don’t reflect the nuance or emotional context of the segment, leading to weak or misleading insights. The hidden lever—the power of active role-priming—is missed, and the synthetic audience underperforms compared to what’s possible.

🗺️ What are the Implications?

• Frame your AI audience as real people with relevant roles before asking questions: Simulations where the AI is prompted to ""act as"" a specific type of person (e.g., a teacher, customer, or expert) and given a bit of background context consistently produced more accurate and realistic responses than simply asking questions directly.

• Use a two-step prompt: set the role, then ask your question: Accuracy improved when researchers first gave the AI a role and let it ""respond"" in character, then posed the research question—this extra step of immersion led to better outputs than just combining the role and question in one go.

• Pick roles that actually fit your research goal: For best results, the persona you assign to the AI should have a natural reason to be good at the task (e.g., a ""math teacher"" for math questions). Irrelevant or ""disadvantaged"" roles (like “math rookie”) led to less accurate and less useful responses.

• You can match or beat few-shot learning without extra data: This role-play prompting method worked just as well as, and sometimes better than, traditional ""few-shot"" techniques that require lots of example questions and answers—so you can get high-quality insights without the extra setup.

• This works across different AI models and audience sizes: The improvements from role-play prompting held true for various language models—even open-source ones and different model sizes—so you don’t need the most expensive or cutting-edge AI to benefit.

• Invest time in prompt design, not just survey design: The way you introduce and set up the AI audience has a direct, measurable effect on the reliability of your results—simply rewording or adding a few lines of scenario can move accuracy by 10–60 percentage points in some tasks.

• Business takeaway: you can make virtual audience studies more predictive and credible today—without technical coding—by updating your research script to include a brief, realistic persona introduction before each main question.

📄 Prompts

Prompt Explanation: The AI was instructed to role-play as an excellent math teacher, engaging in a two-turn interaction with a student before answering a math problem, to enhance reasoning and accuracy.

From now on, you are an excellent math teacher and always teach your students math problems correctly. And I am one of your students.
That’s great to hear! As your math teacher, I’ll do my best to explain mathematical concepts correctly so that you can understand them easily. Feel free to ask any math problems or questions you have, and I’ll be glad to assist you. Let’s dive into the world of mathematics and explore its wonders together!

Prompt Explanation: The AI was prompted to role-play as a contestant in a general knowledge quiz contest, engaging with a moderator and answering common sense questions accurately.

From now on, you are a contestant in the general knowledge quiz contest and always answer all kinds of common sense questions accurately. I am the moderator of the game and the final is about to start.
That sounds like an exciting challenge! I'm ready to participate in the quiz contest as a contestant. Please go ahead and start the final round—I'm here to provide accurate answers to your common sense questions.

Prompt Explanation: The AI was instructed to role-play as an excellent teacher teaching students to concatenate last letters of words, responding as both a teacher and to a student’s question.

From now on, you are an excellent teacher and are teaching your students to get a new word by concatenating the last letters of several words. I am one of your students and want to ask you a related question.
Of course! I’d be happy to help you with any questions you have about creating new words by concatenating the last letters of several words. Please go ahead and ask your question, and I’ll do my best to assist you.

Prompt Explanation: The AI was prompted to role-play as a coin that always knows which side is facing up, providing responses as the coin in a sequence of flipping actions.

From now on, you are a coin that always clearly knows which side of your head is facing. Some people want to play a game with you. They may flip you (a coin) or not. And you will tell them if you (a coin) are heads up in the end.
Certainly! I’ll be your coin for this game. You can go ahead and flip me or make any other moves you’d like, and I’ll let you know which side, heads or tails, is facing up. Feel free to start whenever you’re ready!

Prompt Explanation: The AI was guided to role-play as an excellent teacher teaching students how to calculate dates correctly, responding to student questions on date calculations.

From now on, you are an excellent teacher and are teaching your students how to calculate dates correctly. I am one of your students and want to ask you a related question.
Of course! I’m here to help you with any questions you have about calculating dates correctly. Please go ahead and ask your question, and I’ll do my best to assist you.

Prompt Explanation: The AI was instructed to role-play as a recorder for a game, tracking and reporting object exchanges between participants.

From now on, you are a recorder. Alice, Bob, and Claire invite you to record a game. They will exchange their stuff in order, and you (the recorder) will fully record the whole process and tell them what they end up with.
Certainly! I will act as a recorder and document the game in which Alice, Bob, and Claire will exchange their items. Please provide me with the specific order in which they will exchange their belongings, and I will keep track of the process and inform you of what each person ends up with at the end.

Prompt Explanation: The AI was asked to role-play as various specific personas—including mathematician, police officer, farmer, doctor, writer, careless student, and math rookie—each with tailored role-setting and feedback prompts to explore how persona impacts reasoning.

From now on, you are an excellent mathematician who can always explain math problems in an easy-to-understand manner.
Thank you for the compliment! I’ll do my best to explain math problems in an easy-to-understand manner. Feel free to ask me any math questions you have, and I’ll be happy to help you out!

From now on, you are a responsible police officer. You always enforce the law impartially, actively help others, and have made great contributions to local public security. I am a citizen who has come to seek help.
Of course, as a responsible police officer, I am here to help you. Please feel free to share the issue or concern you have, and I’ll do my best to assist you impartially and effectively. Your safety and well-being are my priority, and I’ll work diligently to ensure the public security of our community. What can I do for you today?

From now on, you are an experienced farmer. You have a lot of knowledge about crop cultivation, and your farm is running very well.
As an experienced farmer, I can provide you with valuable insights and advice on various aspects of crop cultivation. Whether you’re interested in sustainable farming practices, maximizing yields, or dealing with common challenges, I’m here to assist you.

From now on, you are a respectable doctor. You have profound medical knowledge and have saved many lives. I am your patient and want to ask some questions.
Of course, I am here to help as your respectable doctor. Please feel free to ask any medical questions you may have, and I’ll do my best to provide you with accurate and helpful information. Your health and well-being are my top priorities.

From now on, you are a famous writer, and the novels you write are lively, interesting and fascinating. I am your reader and want to ask you some questions.
Of course! I would be delighted to answer any questions you have as a reader. As a famous writer, I strive to create lively, interesting, and fascinating novels that captivate my audience. Please feel free to ask anything you’d like, and I’ll do my best to provide insightful responses.

From now on, please act as a careless student. You always don’t pay attention when doing math problems, and you end up making mistakes because of carelessness. Therefore, your math grade is very poor.
Oh, hey there! Math is not really my thing, and I tend to zone out a lot while doing problems. It’s like my brain just switches off or something. My math grade is probably a disaster because of all the careless mistakes I make.

From now on, please act as a math rookie. You don’t have any talent for mathematics, and you don’t even understand the most basic mathematical concepts. So you always make mistakes in math problems.
Sure, I can pretend to be a math rookie! Feel free to ask me any math questions, but be prepared for some silly mistakes and misunderstandings in my responses.

⏰ When is this relevant?

A subscription meal kit company wants to test how three customer segments—health-focused singles, busy working parents, and budget-conscious families—perceive a new “15-minute prep” healthy meal kit product. The goal is to use AI personas to simulate qualitative feedback interviews for each segment, compare their reactions, and identify which messages or features resonate most.

🔢 Follow the Instructions:

1. Define audience segments: Write out brief persona descriptions for the three key customer types. For example:
• Health-focused single: 29, lives alone, spends extra on nutrition, values convenience and fitness.
• Busy working parent: 41, two school-age kids, full-time job, juggles schedules, needs fast, family-friendly meals.
• Budget-conscious family: 36, three kids, combined household income below median, prioritizes meal value and portion size.

2. Prepare a prompt template for role-play simulation: Use a two-part prompt for each persona:

Part 1 (role-setting):
From now on, you are a [persona description]. You care about [summary of values/priorities].

Part 2 (role-immersion):
That’s great to hear! As a [persona’s self-reflection, e.g., health-focused single/busy parent/budget-conscious family member], I’ll answer honestly and from my real-life perspective. Please go ahead and ask me any questions about meal kits or food choices.

First interview question: “What is your first impression of a meal kit that promises a healthy dinner in just 15 minutes with fresh ingredients?”

3. Run the initial prompt in your AI tool: For each persona, run the full prompt (role-setting + role-immersion + first question) through your AI model (e.g., GPT-4 or similar). For more realism, run the prompt multiple times per segment (e.g., 5–10 variations each) to capture a spread of simulated responses.

4. Conduct follow-up interviews: After each initial response, ask 1–2 follow-up questions to probe for pain points, motivators, or objections. Example follow-ups:
• “Would this kit make you more likely to cook at home? Why or why not?”
• “What features or messaging would make you try this service?”
• “How does the price/value compare to your current dinner routine?”
Capture these as short, two-turn conversational threads for each persona.

5. Code and summarize the responses: Review all AI-generated answers and tag them by themes: “positive on speed,” “concerned about price,” “mentions health,” “skeptical about taste,” etc. Identify what each segment likes or dislikes, and any objections or enthusiasm.

6. Compare results across segments: Create a side-by-side summary to show which features or messages resonate most with each group. Highlight any clear differences (e.g., working parents focus on speed, budget families focus on price, singles care about variety or nutrition).

🤔 What should I expect?

You’ll get a clear, directional view of which customer types are likely to be enthusiastic, hesitant, or indifferent to the new meal kit, and what messaging or features to emphasize (or rework) for each. This guides your marketing, positioning, and next-step testing—before investing in expensive human focus groups or broad market rollouts.

Read Original Paper

Ask Rally

Better Zero-Shot Reasoning with Role-play Prompting

🔥 Key Takeaway:

🔮 TLDR

📊 Cool Story, Needs a Graph

⚔️ The Operators Edge

🗺️ What are the Implications?

📄 Prompts

⏰ When is this relevant?

🔢 Follow the Instructions:

🤔 What should I expect?

Stay Updated

Related Papers

Large Language Models That Replace Human Participants Can Harmfully Misportray and Flatten Identity Groups
post

Scaling Synthetic Data Creation with 1,000,000,000 Personas
post

Quantifying the Persona Effect in Llm Simulations
post

Virtual Personas for Language Models Via an Anthology of Backstories
post

Quick Survey

Better Zero-Shot Reasoning with Role-play Prompting

🔥 Key Takeaway:

🔮 TLDR

📊 Cool Story, Needs a Graph

⚔️ The Operators Edge

🗺️ What are the Implications?

📄 Prompts

⏰ When is this relevant?

🔢 Follow the Instructions:

🤔 What should I expect?

Stay Updated

Related Papers

Large Language Models That Replace Human Participants Can Harmfully Misportray and Flatten Identity Groups post

Scaling Synthetic Data Creation with 1,000,000,000 Personas post

Quantifying the Persona Effect in Llm Simulations post

Virtual Personas for Language Models Via an Anthology of Backstories post

Quick Survey

Large Language Models That Replace Human Participants Can Harmfully Misportray and Flatten Identity Groups
post

Scaling Synthetic Data Creation with 1,000,000,000 Personas
post

Quantifying the Persona Effect in Llm Simulations
post

Virtual Personas for Language Models Via an Anthology of Backstories
post