Outcomes – Numeracy
Research synthesis is AI-generated, human reviewed. Updated 09/2025.
Conversational Learning Diagnosis Via Reasoning Multi-Turn Interactive Learning
Fangzhou Yao, Sheng Chang, Weibo Gao, Qi Liu. (03/2026). arXiv. https://arxiv.org/pdf/2603.03236v1
When Shallow Wins: Silent Failures And The Depth-Accuracy Paradox In Latent Reasoning
Subramanyam Sahoo, Aman Chadha, Vinija Jain, Divya Chaudhary. (03/2026). arXiv. https://arxiv.org/pdf/2603.03475v1
Faster, Cheaper, More Accurate: Specialised Knowledge Tracing Models Outperform LLMs
Prarthana Bhattacharyya, Joshua Mitton, Simon Woodhead, Ralph Abboud. (03/2026). arXiv. https://arxiv.org/pdf/2603.02830v1
The Aftermath Of DrawEduMath: Vision Language Models Underperform With Struggling Students And Misdiagnose Errors
Li Lucy, Albert Zhang, Nathan Anderson, Ryan Knight, Kyle Lo. (03/2026). arXiv. https://arxiv.org/pdf/2603.00925v1
Llama Lima: A Living Meta-Analysis On The Effects Of Generative AI On Learning Mathematics Version 2, 03/26
Anselm Strohmaier, Samira Bödefeld, Oliver Straser, Frank Reinhold. (03/2026). arXiv. https://arxiv.org/pdf/2601.18685v2
Knowledge Without Wisdom: Measuring Misalignment Between LLMs And Intended Impact
Michael Hardy, Yunsung Kim. (03/2026). arXiv. https://arxiv.org/pdf/2603.00883v1
Confusion-Aware Rubric Optimization For LLM-Based Automated Grading
Yucheng Chu, Hang Li, Kaiqi Yang, Yasemin Copur-Gencturk, Joseph Krajcik, Namsoo Shin, Jiliang Tang. (02/2026). arXiv. https://arxiv.org/pdf/2603.00451v1
AITutor-EvalKit: Exploring The Capabilities Of AI Tutors
Numaan Naeem, Kaushal Kumar Maurya, Kseniia Petukhova, Ekaterina Kochmar. (02/2026). arXiv. https://arxiv.org/pdf/2512.03688v2
Beyond Single-Turn: A Survey On Multi-Turn Interactions With Large Language Models
Yubo Li, Xiaobin Shen, Xinyu Yao, Xueying Ding, Yidi Miao, Krishnan Ramayya, Rema Padman. (02/2026). arXiv. https://arxiv.org/pdf/2504.04717v5
Visual Reasoning Benchmark: Evaluating Multimodal LLMs On Classroom-Authentic Visual Problems From Primary Education
Mohamed Huti, Alasdair Mackintosh, Amy Waldock, Dominic Andrews, Maxime Leliievre, Moritz Boos, Tobias Murray, Paul Atherton, Robin A. A. Ince, Oliver G. B. Garrod. (02/2026). arXiv. https://arxiv.org/pdf/2602.12196v1
Beyond End-To-End Video Models: An LLM-Based Multi-Agent System For Educational Video Generation
Lingyong Yan, Jiulong Wu, Dong Xie, Weixian Shi, Deguo Xia, Jizhou Huang. (02/2026). arXiv. https://arxiv.org/pdf/2602.11790v1
Integrating Generative AI-Enhanced Cognitive Systems In Higher Education: From Stakeholder Perceptions To A Conceptual Framework Considering The EU AI Act
Da-Lun Chen, Prasasthy Balasubramanian, Lauri Loven, Susanna Pirttikangas, Jaakko Sauvola, Panagiotis Kostakos. (02/2026). arXiv. https://arxiv.org/pdf/2602.10802v1
Open Mathematical Tasks As A Didactic Response To Generative Artificial Intelligence In Post-AI Contexts
Felix De la Cruz Serrano. (02/2026). arXiv. https://arxiv.org/pdf/2602.09242v1
Do Teachers Dream Of GenAI Widening Educational (In)Equality? Envisioning The Future Of K-12 GenAI Education From Global Teachers' Perspectives
Ruiwei Xiao, Qing Xiao, Xinying Hou, Phenyo Phemelo Moletsane, Hanqi Jane Li, Hong Shen, John Stamper. (02/2026). arXiv. https://arxiv.org/pdf/2509.10782v4
Language Bottleneck Models For Qualitative Knowledge State Modeling
Antonin Berthon, Mihaela van der Schaar. (02/2026). arXiv. https://arxiv.org/pdf/2506.16982v2
LLM Agents For Education: Advances And Applications
Zhendong Chu, Shen Wang, Jian Xie, Tinghui Zhu, Yibo Yan, Jinheng Ye, Aoxiao Zhong, Xuming Hu, Jing Liang, Philip S. Yu, Qingsong Wen. (02/2026). arXiv. https://arxiv.org/pdf/2503.11733v2
Benchmarking Large Language Models For Diagnosing Students' Cognitive Skills From Handwritten Math Work
Yoonsu Kim, Hyoungwook Jin, Hayeon Doh, Eunhye Kim, Dongyun Jung, Seungju Kim, Kiyoon Choi, Jinho Son, Juho Kim. (02/2026). arXiv. https://arxiv.org/pdf/2504.00843v2
Facet: Multi-Agent AI Supporting Teachers In Scaling Differentiated Learning For Diverse Students
Jana Gonnermann-Muller, Jennifer Haase, Nicolas Leins, Moritz Igel, Konstantin Fackeldey, Sebastian Pokutta. (01/2026). arXiv. https://arxiv.org/pdf/2601.22788v1
Learning Context: A Unified Framework And Roadmap For Context-Aware AI In Education
Naiming Liu, Brittany Bradford, Johaun Hatchett, Gabriel Diaz, Lorenzo Luzi, Zichao Wang, Debshila Basu Mallick, Richard Baraniuk. (01/2026). arXiv. https://arxiv.org/pdf/2512.24362v2
A Comprehensive Exploration Of Personalized Learning In Smart Education: From Student Modeling To Personalized Recommendations
Siyu Wu, Yang Cao, Runze Li, Jiajun Cui, Hong Qian, Bo Jiang, Wei Zhang. (01/2026). arXiv. https://arxiv.org/pdf/2402.01666v2
Designing AI Peers For Collaborative Mathematical Problem Solving With Middle School Students: A Participatory Design Study
Wenhan Lyu, Yimeng Wang, Murong Yue, Yifan Sun, Jennifer Suh, Meredith Kier, Ziyu Yao, Yixuan Zhang. (01/2026). arXiv. https://arxiv.org/pdf/2601.17962v2
Do Teachers Dream Of GenAI Widening Educational (In)Equality? Envisioning The Future Of K-12 GenAI Education From Global Teachers' Perspectives
Ruiwei Xiao, Qing Xiao, Xinying Hou, Phenyo Phemelo Moletsane, Hanqi Jane Li, Hong Shen, John Stamper. (01/2026). arXiv. https://arxiv.org/pdf/2509.10782v2
Llama Lima: A Living Meta-Analysis On The Effects Of Generative AI On Learning Mathematics
Anselm Strohmaier, Samira Bödefeld, Frank Reinhold. (01/2026). arXiv. https://arxiv.org/pdf/2601.18685v1
Mathedu: Feedback Generation On Problem-Solving Processes For Mathematical Learning Support
Wei-Ling Hsu, Yu-Chien Tang, An-Zi Yen. (01/2026). arXiv. https://arxiv.org/pdf/2505.18056v2
A Survey Of Self-Evolving Agents: What, When, How, And Where To Evolve On The Path To Artificial Super Intelligence
Huan-ang Gao, Jiayi Geng, Wenyue Hua, Mengkang Hu, Xinzhe Juan, Hongzhang Liu, Shilong Liu, Jiahao Qiu, Xuan Qi, Qihan Ren, Yiran Wu, Hongru Wang, Han Xiao, Yuhang Zhou, Shaokun Zhang, Jiayi Zhang, Jinyu Xiang, Yixiong Fang, Qiwen Zhao, Dongrui Liu, Cheng Qian, Zhenhailong Wang, Minda Hu, Huazheng Wang, Qingyun Wu, Heng Ji, Mengdi Wang. (01/2026). arXiv. https://arxiv.org/pdf/2507.21046v4
Mathdoc: Benchmarking Structured Extraction And Active Refusal On Noisy Mathematics Exam Papers
Chenyue Zhou, Jiayi Tuo, Shitong Qin, Wei Dai, Mingxuan Wang, Ziwei Zhao, Duoyang Li, Shiyang Su, Yanxi Lu, Yanbiao Ma. (01/2026). arXiv. https://arxiv.org/pdf/2601.10104v1
Breaking Robustness Barriers In Cognitive Diagnosis: A One-Shot Neural Architecture Search Perspective
Ziwen Wang, Shangshang Yang, Xiaoshan Yu, Haiping Ma, Xingyi Zhang. (01/2026). arXiv. https://arxiv.org/pdf/2601.04918v1
Malrulelib: Large-Scale Executable Misconception Reasoning With Step Traces For Modeling Student Thinking In Mathematics
Xinghe Chen, Naiming Liu, Shashank Sonkar. (01/2026). arXiv. https://arxiv.org/pdf/2601.03217v1
Problems With Large Language Models For Learner Modelling: Why LLMs Alone Fall Short For Responsible Tutoring In K-12 Education
Danial Hooshyar, Yeongwook Yang, Gustav Sir, Tommi Karkkainen, Raija Hamalainen, Mutlu Cukurova, Roger Azevedo. (12/2025). arXiv. https://arxiv.org/pdf/2512.23036v1
Synthetic Fluency And Epistemic Offloading In Undergraduate Mathematics In The Age Of AI
Siyuan Wang, Qing Xia, Qiong Ye. (12/2025). arXiv. https://arxiv.org/pdf/2512.21045v1

