Date:
Publisher: arXiv
Natural language-based assessment (NLA) is an approach to second language assessment that uses instructions, expressed as can-do descriptors and originally intended for human examiners, to determine whether large language models (LLMs) can interpret and apply them in ways comparable to human assessment. In this work, we explore the use of such descriptors with an open-source LLM, Qwen 2.5 72B, to assess responses from the publicly available S&I Corpus in a zero-shot setting. Our results show that this approach, which relies solely on textual information, achieves competitive performance: while it does not outperform state-of-the-art speech LLMs fine-tuned for the task, it surpasses a BERT-based model trained specifically for this purpose. NLA proves particularly effective in mismatched task settings, is generalisable to other data types and languages, and offers greater interpretability, as it is grounded in clearly explainable, widely applicable language descriptors.
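The abstract describes zero-shot prompting of an open LLM with can-do descriptors applied to learner transcripts. The sketch below illustrates one way such a prompt might be assembled and scored; the descriptor text, prompt wording, example transcript, and use of the Hugging Face transformers pipeline are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch (assumption): zero-shot checking of a learner transcript against a
# can-do descriptor with an open-weight LLM. Descriptor, prompt, and settings are
# illustrative; the paper's real prompts and decoding choices may differ.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-72B-Instruct",  # open-source model named in the abstract
    device_map="auto",
)

# Hypothetical can-do descriptor in the style used for speaking assessment.
descriptor = (
    "Can give clear, detailed descriptions on a wide range of subjects "
    "related to their field of interest."
)

# Transcribed learner response to be assessed (textual information only).
response_text = "Last summer I travel to the mountains and we was hiking every day."

prompt = (
    "You are an examiner for second language speaking assessment.\n"
    f"Descriptor: {descriptor}\n"
    f"Learner response (transcript): {response_text}\n"
    "Does the response meet the descriptor? Answer 'yes' or 'no' with a short reason."
)

# Greedy decoding keeps the judgement deterministic for a given prompt.
output = generator(prompt, max_new_tokens=64, do_sample=False)
print(output[0]["generated_text"])
```

In practice, a prompt like this would be issued once per descriptor and per response, and the yes/no judgements aggregated into a proficiency estimate; the exact aggregation is not specified in the abstract.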
What is the application?
What age group?
Why use AI?
Study design
