Date
Publisher
arXiv
LEARN is a layout-aware diffusion framework designed to generate
pedagogically aligned illustrations for STEM education. It leverages a curated
BookCover dataset that provides narrative layouts and structured visual cues,
enabling the model to depict abstract and sequential scientific concepts with
strong semantic alignment. Through layout-conditioned generation, contrastive
visual-semantic training, and prompt modulation, LEARN produces coherent visual
sequences that support mid-to-high-level reasoning in line with Bloom's
taxonomy while reducing extraneous cognitive load as emphasized by Cognitive
Load Theory. By fostering spatially organized and story-driven narratives, the
framework counters fragmented attention often induced by short-form media and
promotes sustained conceptual focus. Beyond static diagrams, LEARN demonstrates
potential for integration with multimodal systems and curriculum-linked
knowledge graphs to create adaptive, exploratory educational content. As the
first generative approach to unify layout-based storytelling, semantic
structure learning, and cognitive scaffolding, LEARN represents a novel
direction for generative AI in education. The code and dataset will be released
to facilitate future research and practical deployment.
What is the application?
Who is the user?
Why use AI?
Study design
