Publisher: arXiv
Large language models (LLMs) have demonstrated the ability to generate
formative feedback and instructional hints in English, making them increasingly
relevant for AI-assisted education. However, their ability to provide effective
instructional support across different languages, especially for mathematically
grounded reasoning tasks, remains largely unexamined. In this work, we present
the first large-scale simulation of multilingual tutor-student interactions
using LLMs. A stronger model plays the role of the tutor, generating feedback
in the form of hints, while a weaker model simulates the student. We explore
352 experimental settings across 11 typologically diverse languages, four
state-of-the-art LLMs, and multiple prompting strategies to assess whether
language-specific feedback leads to measurable learning gains. Our study
examines how student input language, teacher feedback language, model choice,
and language resource level jointly influence performance. Results show that
multilingual hints can significantly improve learning outcomes, particularly in
low-resource languages when feedback is aligned with the student's native
language. These findings offer practical insights for developing multilingual,
LLM-based educational tools that are both effective and inclusive.
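The abstract describes a two-model protocol: a stronger LLM tutors a weaker one by producing hints in a configurable feedback language, and learning gain is the change between the student's first and revised attempts. Below is a minimal sketch of one such interaction round. The function name, prompt wording, and the generic `LLM` callable are illustrative assumptions, not the authors' actual implementation.

```python
"""Sketch of one tutor-student round, under assumptions stated above."""
from typing import Callable

LLM = Callable[[str], str]  # maps a prompt string to a model completion


def tutoring_round(problem: str, student: LLM, tutor: LLM,
                   student_lang: str, feedback_lang: str) -> tuple[str, str, str]:
    """Run one round: initial attempt -> hint -> revised attempt."""
    # 1. The weaker "student" model attempts the problem in its input language.
    first_try = student(
        f"Solve this problem. Answer in {student_lang}.\n{problem}"
    )
    # 2. The stronger "tutor" model sees the attempt and writes a hint
    #    (not a full solution) in the configured feedback language.
    hint = tutor(
        f"A student attempted this problem:\n{problem}\n"
        f"Their attempt:\n{first_try}\n"
        f"Write a short corrective hint in {feedback_lang}. "
        f"Do not reveal the final answer."
    )
    # 3. The student retries with the hint; comparing correctness of the
    #    two attempts yields the learning gain for this setting.
    second_try = student(
        f"Problem:\n{problem}\nHint:\n{hint}\nTry again in {student_lang}."
    )
    return first_try, hint, second_try


if __name__ == "__main__":
    # Stub "models" so the sketch runs without any API access.
    echo = lambda prompt: f"[model output for: {prompt[:40]}...]"
    print(tutoring_round("2x + 3 = 11. Find x.", echo, echo, "Hindi", "Hindi"))
```

In the real study, `student` and `tutor` would wrap calls to the weaker and stronger LLMs respectively, and the round would be repeated for every experimental setting.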
What is the application?
AI-assisted education: LLM tutors that give hint-style formative feedback to students working on mathematically grounded reasoning tasks, across many languages.

Why use AI?
LLMs already generate useful formative feedback and instructional hints in English; the study tests whether they can deliver equally effective, language-matched instructional support at scale, including in low-resource languages.

Study design
A large-scale simulation of tutor-student interactions: a stronger LLM plays the tutor and a weaker LLM the student, across 352 settings spanning 11 typologically diverse languages, four state-of-the-art LLMs, and multiple prompting strategies, with student input language and teacher feedback language varied independently. A sketch of how such a grid can be enumerated follows.
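The 11 languages and four models come from the abstract; how the remaining factor of 8 splits between feedback-language conditions and prompting strategies is not stated here, so the 2 x 4 split below is an assumption chosen only to make the product reach the reported 352 settings.

```python
# Illustrative enumeration of the experimental grid (factorization assumed).
from itertools import product

languages = [f"lang_{i}" for i in range(11)]    # 11 typologically diverse languages
models = [f"model_{i}" for i in range(4)]       # 4 state-of-the-art LLMs
hint_langs = ["student_native", "english"]      # assumed feedback-language conditions
prompting = [f"prompt_{i}" for i in range(4)]   # assumed prompting strategies

settings = list(product(languages, models, hint_langs, prompting))
assert len(settings) == 352  # 11 * 4 * 2 * 4
```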
