Date
Publisher
arXiv
School dropout is a serious problem in distance learning, where early
detection is crucial for effective intervention and student perseverance.
Predicting student dropout using available educational data is a widely
researched topic in learning analytics. Our partner's distance learning
platform highlights the importance of integrating diverse data sources,
including socio-demographic data, behavioral data, and sentiment analysis, to
accurately predict dropout risks. In this paper, we introduce a novel model
that combines sentiment analysis of student comments using the Bidirectional
Encoder Representations from Transformers (BERT) model with socio-demographic
and behavioral data analyzed through Extreme Gradient Boosting (XGBoost). We
fine-tuned BERT on student comments to capture nuanced sentiments, which were
then merged with key features selected using feature importance techniques in
XGBoost. Our model was tested on unseen data from the next academic year,
achieving an accuracy of 84\%, compared to 82\% for the baseline model.
Additionally, the model demonstrated superior performance in other metrics,
such as precision and F1-score. The proposed method could be a vital tool in
developing personalized strategies to reduce dropout rates and encourage
student perseverance
What is the application?
Who is the user?
Who age?
Why use AI?
Study design
