Date:
Publisher: arXiv
Large Language Models (LLMs) now excel at generative tasks and can create
content at impressive speed. However, they remain imperfect and still make a
variety of mistakes. In a Computer Science education context, where these
models are widely recognized as "AI pair programmers," it becomes increasingly
important to train students to evaluate and debug LLM-generated code. In this
work, we introduce HypoCompass, a novel system that facilitates deliberate
practice in debugging: human novices play the role of Teaching Assistants and
help LLM-powered teachable agents debug code. The system enables effective
task delegation between students and LLMs in this learning-by-teaching
environment: students focus on hypothesizing the cause of code errors, while
adjacent skills such as code completion are offloaded to LLM agents. Our
evaluations demonstrate that HypoCompass generates high-quality training
materials (e.g., bugs and fixes) four times more efficiently than human
counterparts, and significantly improves students' debugging performance by
12% from pre-test to post-test.
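The task delegation the abstract describes can be pictured with a minimal sketch (a hypothetical illustration, not the authors' actual implementation; the llm_complete helper and the buggy median example are assumptions): the teachable agent surfaces a bug and a failing test, the student contributes only the causal hypothesis, and producing the actual fix is offloaded to the LLM.

def llm_complete(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., a chat-completions API).
    Returns a canned fix here so the sketch runs offline."""
    return "def median(xs):\n    xs = sorted(xs)\n    return xs[len(xs) // 2]"

# A toy bug: the code forgets to sort before indexing the middle element.
BUGGY_CODE = "def median(xs):\n    return xs[len(xs) // 2]"
FAILING_TEST = "median([3, 1, 2]) == 2"

def debugging_round(buggy_code: str, failing_test: str) -> str:
    # 1. The teachable agent presents the buggy code and a failing test.
    print("Buggy code:\n" + buggy_code)
    print("Failing test: " + failing_test)

    # 2. The student (playing TA) hypothesizes the cause of the error --
    #    the skill HypoCompass deliberately trains.
    hypothesis = input("Your hypothesis about the cause: ")

    # 3. The adjacent skill (writing the corrected code) is offloaded
    #    to the LLM agent, conditioned on the student's hypothesis.
    return llm_complete(
        f"Code:\n{buggy_code}\nFailing test: {failing_test}\n"
        f"Hypothesized cause: {hypothesis}\nReturn the corrected code."
    )

if __name__ == "__main__":
    print("Proposed fix:\n" + debugging_round(BUGGY_CODE, FAILING_TEST))

The point of the split is that the student's effort goes into hypothesis construction, while mechanical completion work stays with the agent.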
What is the application?
What age group?
Why use AI?
Study design
