Date
Publisher: arXiv
We evaluate the effectiveness of GPT-4 Turbo in generating educational
questions from NCERT textbooks in zero-shot mode. Our study highlights GPT-4
Turbo's ability to generate questions that require higher-order thinking
skills, especially at the "understanding" level according to Bloom's Revised
Taxonomy. The complexity of the questions GPT-4 Turbo generates is notably
consistent with the complexity human assessors assign to them, with
occasional differences. Our evaluation also uncovers variations in how humans
and machines evaluate question quality, with a trend inversely related to
Bloom's Revised Taxonomy levels. These findings suggest that while GPT-4 Turbo
is a promising tool for educational question generation, its efficacy varies
across different cognitive levels, indicating a need for further refinement to
fully meet educational standards.
What is the application?
Who is the user?
What age?
Why use AI?
Study design
