MathEDU: Feedback Generation on Problem-Solving Processes for Mathematical Learning Support

Authors

Wei-Ling Hsu, Yu-Chien Tang, An-Zi Yen

Date

01/2026

Publisher

arXiv

Link

https://arxiv.org/pdf/2505.18056v2

The increasing reliance on Large Language Models (LLMs) across various domains extends to education, where students progressively use generative AI as a tool for learning. While prior work has examined LLMs' mathematical ability, their reliability in grading authentic student problem-solving processes and delivering effective feedback remains underexplored. This study introduces MathEDU, a dataset consisting of student problem-solving processes in mathematics and corresponding teacher-written feedback. We systematically evaluate the reliability of various models across three hierarchical tasks: answer correctness classification, error identification, and feedback generation. Experimental results show that fine-tuning strategies effectively improve performance in classifying correctness and locating erroneous steps. However, the generated feedback across models shows a considerable gap from teacher-written feedback. Critically, the generated feedback is often verbose and fails to provide targeted explanations for the student's underlying misconceptions. This emphasizes the urgent need for trustworthy and pedagogy-aware AI feedback in education.

What is the application?

Teaching – Assessment and Feedback

Who is the user?

Student

What age?

Post-Secondary

Why use AI?

Outcomes – Numeracy, Outcomes – Differentiation

Study design

Technical – Computational
