Date:
Publisher: arXiv
This study proposes a method for knowledge distillation (KD) of fine-tuned
Large Language Models (LLMs) into smaller, more efficient, and accurate neural
networks. We specifically target the challenge of deploying these models on
resource-constrained devices. Our methodology involves training the smaller
student model (a neural network) using the prediction probabilities (as soft
labels) of the LLM, which serves as a teacher model. This is achieved through a
specialized loss function tailored to learn from the LLM's output
probabilities, ensuring that the student model closely mimics the teacher's
performance. To validate the performance of the KD approach, we utilized a
large dataset, 7T, containing 6,684 student-written responses to science
questions and three mathematical reasoning datasets with student-written
responses graded by human experts. We compared scoring accuracy against the
state-of-the-art (SOTA) distilled model TinyBERT and against artificial neural
network (ANN) baselines.
Results show that the KD approach achieves 3% and 2% higher scoring accuracy
than the ANN and TinyBERT models, respectively, and accuracy comparable to the
teacher model. Furthermore, the student model has only 0.03M parameters, about
4,000 times fewer than the teacher model, and runs roughly 10x faster at
inference than TinyBERT. The significance of this research lies in its potential to make
advanced AI technologies accessible in typical educational settings,
particularly for automatic scoring.
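
The distillation setup described above (a small student trained on the teacher LLM's prediction probabilities through a specialized loss) is commonly implemented as a temperature-softened KL-divergence term, optionally blended with a cross-entropy term on the human-assigned scores. Below is a minimal PyTorch sketch of such a soft-label loss; the temperature, the blending weight alpha, and the hard-label term are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def soft_label_distillation_loss(student_logits, teacher_probs, hard_labels,
                                 temperature=2.0, alpha=0.5):
    """Sketch of a KD loss for automatic scoring.

    student_logits: (batch, num_score_levels) raw outputs of the small student network
    teacher_probs:  (batch, num_score_levels) prediction probabilities from the LLM teacher,
                    used directly as soft labels (per the abstract)
    hard_labels:    (batch,) expert-assigned score labels
    temperature, alpha: assumed hyperparameters, not values reported in the paper
    """
    # Soften the student's distribution; F.kl_div expects log-probabilities as input.
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence pulls the student toward the teacher's soft labels.
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(log_student, teacher_probs, reduction="batchmean") * temperature ** 2
    # Ordinary cross-entropy against the human-graded scores.
    ce_term = F.cross_entropy(student_logits, hard_labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Example usage with random tensors (5 score levels, batch of 4).
if __name__ == "__main__":
    student_logits = torch.randn(4, 5)
    teacher_probs = F.softmax(torch.randn(4, 5), dim=-1)
    hard_labels = torch.randint(0, 5, (4,))
    print(soft_label_distillation_loss(student_logits, teacher_probs, hard_labels))
```

Mixing the soft-label term with a hard-label term is one common design choice; the paper's specialized loss may weight or formulate these components differently.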
What is the application?
What age group?
Why use AI?
Study design
