Date
Publisher
arXiv
We evaluate the effectiveness of LLM-Tutor, a large language model
(LLM)-powered tutoring system that combines an AI-based proof-review tutor for
real-time feedback on proof-writing and a chatbot for mathematics-related
queries. Our experiment, involving 148 students, demonstrated that the use of
LLM-Tutor significantly improved homework performance compared to a control
group without access to the system. However, its effects on exam performance and
time spent on tasks were not statistically significant. Mediation analysis revealed
that students with lower self-efficacy tended to use the chatbot more
frequently, which partially contributed to lower midterm scores. Furthermore,
students with lower self-efficacy were more likely to engage frequently with
the proof-review AI tutor, a usage pattern that contributed to
higher final exam scores. Interviews with 19 students highlighted the
accessibility of LLM-Tutor and its effectiveness in addressing learning needs,
while also revealing limitations and concerns regarding potential over-reliance
on the tool. Our results suggest that generative AI alone, such as a chatbot, may not
suffice for comprehensive learning support, underscoring the need for iterative
design improvements, grounded in learning sciences principles, to generative AI
educational tools like LLM-Tutor.
What is the application?
Who is the user?
What age?
Why use AI?
Study design
