Why Are All LLMs Obsessed with Japanese Culture?
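This paper analyses the phenomenon whereby large language models (LLMs) exhibit cultural bias, in particular a disproportionate focus on Japanese culture. The authors propose a new dataset of Culture-Related Open Questions (CROQ) and find that LLMs tend to favour specific countries, including Japan, in their responses. They also show that prompting in English or other high-resource languages reduces bias and increases diversity, and that this cultural bias emerges mainly at the supervised fine-tuning stage rather than during pre-training. The study examines the cultural and regional biases of LLMs in depth and offers important implications for improving the fairness and diversity of AI models.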

Why are all LLMs Obsessed with Japanese Culture? On the Hidden Cultural and Regional Biases of LLMs
LLMs have been showing limitations when it comes to cultural coverage and competence, and in some cases show regional biases such as amplifying Western and Anglocentric viewpoints. While there have been works analysing the cultural capabilities of LLMs, there has not been specific work on highlighting LLM regional preferences when it comes to culture-related questions. In this work, we propose a new dataset based on a comprehensive taxonomy of Culture-Related Open Questions (CROQ). The results show that, contrary to previous cultural bias work, LLMs show a clear tendency towards countries such as Japan. Moreover, our results show that when prompting in languages such as English or other high-resource ones, LLMs tend to provide more diverse outputs and show less inclination towards answering questions highlighting countries for which the input language is an official language. Finally, we also investigate at which point of LLM training this cultural bias emerges, with our results suggesting that the first clear signs appear after supervised fine-tuning, and not during pre-training.
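
To make the notions of "regional preference" and "output diversity" concrete, the following is a minimal sketch of how one might tally the countries highlighted in a set of model answers and summarise how evenly they are spread. It is an illustration only, not the metric used in the paper: the abstract does not specify one. The function names, the use of normalised Shannon entropy as a diversity score, and the toy labels are all assumptions, and the sketch presumes that each answer has already been mapped to a single country label.

```python
import math
from collections import Counter


def country_distribution(labels):
    """Return each country's share of the given list of country labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {country: n / total for country, n in counts.items()}


def normalized_entropy(shares):
    """Shannon entropy of the shares, scaled to [0, 1].

    1.0 means the answers are spread evenly over the observed countries;
    values near 0 mean one country dominates. (Assumed diversity score,
    not taken from the paper.)
    """
    if len(shares) <= 1:
        return 0.0
    h = -sum(p * math.log(p) for p in shares.values() if p > 0)
    return h / math.log(len(shares))


# Made-up labels for illustration only; these are not results from the paper.
toy_answers = ["Japan", "Japan", "Japan", "Italy", "Mexico", "Japan"]

shares = country_distribution(toy_answers)
top_country = max(shares, key=shares.get)
diversity = normalized_entropy(shares)

print(top_country, round(shares[top_country], 2), round(diversity, 2))
```

Comparing such a score across prompt languages, or across training checkpoints, would be one way to express the trends the abstract describes, under the assumptions stated above.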
