Date:
Publisher: arXiv
Real-time voice interfaces using multimodal Generative AI (GenAI) could help address the accessibility needs of novice programmers with disabilities (e.g., related to vision). Yet little is known about how novices interact with such GenAI tools or about the quality of the feedback these tools deliver as audio output. This paper analyzes audio dialogues from nine 9th-grade students who used a voice-enabled tutor (powered by OpenAI's Realtime API) in an authentic classroom setting while learning Python. We examined the students' voice prompts and the AI's responses (1,210 messages) using qualitative coding, and we gathered students' perceptions via the Partner Modeling Questionnaire. The GenAI voice tutor primarily offered feedback on mistakes and next steps, but its correctness was limited (71.4% of 416 feedback outputs were correct). Quality issues were observed, particularly when the AI attempted to read programming code elements aloud. Students used the GenAI voice tutor primarily for debugging and perceived it as competent, only somewhat human-like, and flexible. This study is the first to explore the interaction dynamics between real-time voice GenAI tutors and novice programmers, informing future educational tool design and potentially addressing the accessibility needs of diverse learners.
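The abstract states that the tutor is powered by OpenAI's Realtime API but does not describe its implementation. The sketch below is a minimal, hypothetical Python client showing how such a voice tutor might be configured over the Realtime API's WebSocket interface; the endpoint, model name, event shapes, tutor instructions, and the `websockets` dependency are assumptions based on the public API documentation, not details from the paper.

```python
# Minimal sketch (not the paper's implementation): configure a Realtime API
# session as a spoken Python tutor. Model name and event payloads are assumed.
import asyncio
import json
import os

import websockets  # assumed dependency; older versions use extra_headers below

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def run_tutor() -> None:
    # In websockets < 14 the keyword argument is extra_headers instead.
    async with websockets.connect(URL, additional_headers=HEADERS) as ws:
        # Configure the session: audio responses, tutoring instructions.
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {
                "modalities": ["audio", "text"],
                "voice": "alloy",
                "instructions": (
                    "You are a patient Python tutor for 9th-grade novices. "
                    "Give short spoken hints about mistakes and next steps; "
                    "avoid reading long code verbatim."
                ),
            },
        }))
        # Request a response; a full client would also stream microphone audio
        # to the server (e.g., via input_audio_buffer events) before this.
        await ws.send(json.dumps({"type": "response.create"}))
        async for raw in ws:
            event = json.loads(raw)
            if event.get("type") == "response.done":
                break

asyncio.run(run_tutor())
```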
What is the application?
Who is the user?
What is the user's age?
Why use AI?
