Date
Publisher
arXiv
Factuality is a necessary precursor to useful educational tools. As adoption
of Large Language Models (LLMs) in education continues to grow, ensuring
correctness in all settings is paramount. Despite their strong English
capabilities, LLM performance in other languages is largely untested. In this
work, we evaluate the correctness of the Llama3.1 family of models in answering
factual questions appropriate for middle and high school students. We
demonstrate that LLMs not only provide extraneous and less truthful
information, but also exacerbate existing biases against rare languages.
What is the application?
Who is the user?
What age?
Why use AI?
Study design
