Date:
Publisher: arXiv
Scientific sketches (e.g., models) offer a powerful lens into students'
conceptual understanding, yet AI-powered automated assessment of such
free-form, visually diverse artifacts remains a critical challenge. Existing
solutions often treat sketch evaluation as an image classification task or rely
on monolithic vision-language models, which lack interpretability, pedagogical
alignment, and adaptability across cognitive levels. To address these
limitations, we present SketchMind, a cognitively grounded, multi-agent
framework for evaluating and improving student-drawn scientific sketches.
SketchMind comprises modular agents responsible for rubric parsing, sketch
perception, cognitive alignment, and iterative feedback with sketch
modification, enabling personalized and transparent evaluation. We evaluate
SketchMind on a curated dataset of 3,575 student-generated sketches across six
science assessment items, each targeting a different highest-order level of
Bloom's taxonomy and requiring students to draw models to explain phenomena.
Compared to baseline GPT-4o without SRG (average accuracy: 55.6%), GPT-4o with
SRG integration achieves 77.1% average accuracy (+21.4% average absolute gain). We
also demonstrate that multi-agent orchestration with SRG enhances SketchMind
performance; for example, GPT-4.1 gains an average 8.9% increase in sketch
prediction accuracy, outperforming single-agent pipelines across all items.
Human evaluators rated the feedback and co-created sketches generated by
SketchMind with GPT-4.1 at an average of 4.1 out of 5, significantly higher
than those of baseline models (e.g., 2.3 for GPT-4o).
Experts noted the system's potential to meaningfully support conceptual growth
through guided revision. Our code and (pending approval) dataset will be
released to support reproducibility and future research in AI-driven education.
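
The abstract describes a pipeline of modular agents (rubric parsing, sketch perception, cognitive alignment, iterative feedback with sketch modification). Below is a minimal illustrative sketch of how such an orchestration could be wired together; all class names, method signatures, and the placeholder logic are assumptions made for illustration, not the authors' released implementation.

# Minimal illustrative sketch of a SketchMind-style multi-agent pipeline.
# All names, signatures, and placeholder returns are hypothetical.
from dataclasses import dataclass

@dataclass
class Evaluation:
    bloom_level: str       # predicted Bloom's taxonomy level for the sketch
    feedback: str          # natural-language guidance for revision
    revised_sketch: bytes  # suggested modified sketch (image bytes)

class RubricParser:
    def parse(self, rubric_text: str) -> list[str]:
        # Turn a free-text rubric into a list of scoring criteria.
        return [line.strip() for line in rubric_text.splitlines() if line.strip()]

class SketchPerceiver:
    def describe(self, sketch: bytes) -> str:
        # A real agent would call a vision-language model here.
        return "textual description of the elements drawn in the sketch"

class CognitiveAligner:
    def align(self, description: str, criteria: list[str]) -> str:
        # Compare the perceived sketch against the rubric and assign a level.
        return "Understand"  # placeholder level

class FeedbackAgent:
    def revise(self, description: str, level: str, criteria: list[str]) -> Evaluation:
        # Produce feedback and a co-created sketch revision for the student.
        return Evaluation(bloom_level=level,
                          feedback="Show the causal link the rubric asks for.",
                          revised_sketch=b"")

def evaluate_sketch(sketch: bytes, rubric_text: str) -> Evaluation:
    # Orchestrate the agents: rubric parsing -> perception -> alignment -> feedback.
    criteria = RubricParser().parse(rubric_text)
    description = SketchPerceiver().describe(sketch)
    level = CognitiveAligner().align(description, criteria)
    return FeedbackAgent().revise(description, level, criteria)

In the paper's setup each agent would presumably be backed by a model call (e.g., GPT-4.1), and the feedback step would iterate until the sketch satisfies the rubric; the stubs above only show the data flow between agents.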
What is the application?
Who is the user?
Study design
