Date:
Publisher: arXiv
Asynchronous learning environments (ALEs) are widely adopted for formal and
informal learning, but timely and personalized support is often limited. In
this context, Virtual Teaching Assistants (VTAs) can potentially reduce the
workload of instructors, but rigorous and pedagogically sound evaluation is
essential. Existing assessments often rely on surface-level metrics and lack
sufficient grounding in educational theories, making it difficult to
meaningfully compare the pedagogical effectiveness of different VTA systems. To
bridge this gap, we propose an evaluation framework rooted in learning sciences
and tailored to asynchronous forum discussions, a common VTA deployment context
in ALE. We construct classifiers using expert annotations of VTA responses on a
diverse set of forum posts. We evaluate the effectiveness of our classifiers,
identifying approaches that improve accuracy as well as challenges that hinder
generalization. Our work establishes a foundation for theory-driven evaluation
of VTA systems, paving the way for more pedagogically effective AI in
education.
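
To make the classifier-construction step concrete, here is a minimal sketch of how one might train such a response classifier from expert annotations. This is not the paper's actual pipeline: the file name, column names ("response_text", "pedagogical_label"), and the TF-IDF plus logistic regression baseline are all illustrative assumptions.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Load expert annotations: one VTA forum response per row, labeled with a
# pedagogical construct drawn from learning-sciences theory (hypothetical schema).
df = pd.read_csv("annotated_vta_responses.csv")  # hypothetical file
X_train, X_test, y_train, y_test = train_test_split(
    df["response_text"], df["pedagogical_label"],
    test_size=0.2, random_state=42, stratify=df["pedagogical_label"],
)

# A simple lexical baseline: TF-IDF n-gram features + logistic regression.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)

# Per-label precision/recall shows where the classifier generalizes
# and where it struggles, echoing the paper's accuracy-vs-generalization analysis.
print(classification_report(y_test, clf.predict(X_test)))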
What is the application?
Who is the user?
What is the user's age?
Why use AI?
Study design
