Date
Publisher
arXiv
LEARN is a layout-aware diffusion framework designed to generate
pedagogically aligned illustrations for STEM education. It leverages a curated
BookCover dataset that provides narrative layouts and structured visual cues,
enabling the model to depict abstract and sequential scientific concepts with
strong semantic alignment. Through layout-conditioned generation, contrastive
visual-semantic training, and prompt modulation, LEARN produces coherent visual
sequences that support mid-to-high-level reasoning in line with Bloom's
taxonomy, while reducing extraneous cognitive load as emphasized by Cognitive
Load Theory. By fostering spatially organized and story-driven narratives, the
framework counters the fragmented attention often induced by short-form media and
promotes sustained conceptual focus. Beyond static diagrams, LEARN demonstrates
potential for integration with multimodal systems and curriculum-linked
knowledge graphs to create adaptive, exploratory educational content. As the
first generative approach to unify layout-based storytelling, semantic
structure learning, and cognitive scaffolding, LEARN represents a novel
direction for generative AI in education. The code and dataset will be released
to facilitate future research and practical deployment.
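The abstract names layout-conditioned generation, contrastive visual-semantic training, and prompt modulation without implementation detail. As a rough, hypothetical sketch only, the Python snippet below illustrates one common form of contrastive visual-semantic training (a symmetric InfoNCE-style loss over paired image and text embeddings); all identifiers (contrastive_visual_semantic_loss, visual_emb, text_emb, temperature) are assumptions for illustration and are not taken from the paper.

import torch
import torch.nn.functional as F

def contrastive_visual_semantic_loss(visual_emb, text_emb, temperature=0.07):
    # Hypothetical sketch: visual_emb and text_emb are (B, D) batches of
    # paired image and prompt embeddings; matched pairs share a row index.
    visual_emb = F.normalize(visual_emb, dim=-1)   # unit-normalize so dot = cosine
    text_emb = F.normalize(text_emb, dim=-1)
    logits = visual_emb @ text_emb.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    # Symmetric cross-entropy over image-to-text and text-to-image directions.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

# Toy usage with random embeddings standing in for real encoder outputs.
v = torch.randn(8, 512)
t = torch.randn(8, 512)
print(contrastive_visual_semantic_loss(v, t).item())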
What is the application?
What age group?
Study design
