Date
Publisher
arXiv
Personalized Learning Path Planning (PLPP) aims to design adaptive learning
paths that align with individual goals. While large language models (LLMs) show
potential in personalizing learning experiences, existing approaches often lack
mechanisms for goal-aligned planning. We introduce Pxplore, a novel framework
for PLPP that integrates a reinforcement-based training paradigm and an
LLM-driven educational architecture. We design a structured learner state model
and an automated reward function that transforms abstract objectives into
computable signals. We train the policy combining supervised fine-tuning (SFT)
and Group Relative Policy Optimization (GRPO), and deploy it within a
real-world learning platform. Extensive experiments validate Pxplore's
effectiveness in producing coherent, personalized, and goal-driven learning
paths. We release our code and dataset to facilitate future research.
What is the application?
Who is the user?
Who age?
Why use AI?
