
LLM-Based Role-Playing Simulations: Demographic Gaps and Mitigation Strategies

As researchers increasingly use large language models (LLMs) to role-play virtual survey respondents, significant demographic gaps have emerged in the accuracy and realism of the simulated responses. Certain populations, such as older adults, racial minorities, women (particularly women of color), lower socioeconomic groups, and ideological centrists, are consistently underrepresented or misrepresented by these models. This post examines the underlying reasons for these discrepancies, including biased training data, alignment-induced censorship, and oversimplified modeling of how demographic attributes interact. It then presents practical strategies for mitigating these biases, outlining how thoughtful prompt engineering, targeted fine-tuning, and nuanced alignment adjustments can help create more authentic and inclusive LLM-based role-play simulations.