Technical – Computational
Research synthesis is AI-generated, human reviewed. Updated 09/2025.
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
Joykirat Singh, Akshay Nambi, Vibhav Vineet. (06/2024). arXiv. http://arxiv.org/pdf/2406.10834v1
Grade Like a Human: Rethinking Automated Assessment with Large Language Models
Wenjing Xie, Juxin Niu, Chun Jason Xue, Nan Guan. (05/2024). arXiv. https://arxiv.org/pdf/2405.19694
How Can I Get It Right? Using GPT to Rephrase Incorrect Trainee Responses
Jionghao Lin, Zifei Han, Danielle R. Thomas, Ashish Gurung, Shivang Gupta, Vincent Aleven, Kenneth R. Koedinger. (05/2024). arXiv. http://arxiv.org/pdf/2405.00970v1
Large Language Models for Education: A Survey
Hanyi Xu, Wensheng Gan, Zhenlian Qi, Jiayang Wu, Philip S. Yu. (05/2024). arXiv. http://arxiv.org/pdf/2405.13001v1
Large Language Models for In-Context Student Modeling: Synthesizing Student's Behavior in Visual Programming
Manh Hung Nguyen, Sebastian Tschiatschek, Adish Singla. (05/2024). arXiv. http://arxiv.org/pdf/2310.10690v3
The AI Collaborator: Bridging Human-AI Interaction In Educational And Professional Settings
Mohammad Amin Samadi, Spencer JaQuay, Nia Nixon, Jing Gu. (05/2024). arXiv. http://arxiv.org/pdf/2405.10460v1
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models
Kun Zhou, Beichen Zhang, Jiapeng Wang, Zhipeng Chen, Wayne Xin Zhao, Jing Sha, Zhichao Sheng, Shijin Wang, Ji-Rong Wen. (05/2024). arXiv. http://arxiv.org/pdf/2405.14365v1
Evaluating and Optimizing Educational Content with Large Language Model Judgments
Joy He-Yueya, Noah D. Goodman, Emma Brunskill. (05/2024). arXiv. http://arxiv.org/pdf/2403.02795v2
An Automatic Question Usability Evaluation Toolkit
Steven Moore, Eamon Costello, Huy A. Nguyen, John Stamper. (05/2024). arXiv. http://arxiv.org/pdf/2405.20529v1
Capabilities of Gemini Models in Medicine
Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, Juanma Zambrano Chaves, Szu-Yeu Hu, Mike Schaekermann, Aishwarya Kamath, Yong Cheng, David G.T. Barrett, Cathy Cheung, Basil Mustafa, Anil Palepu, Daniel McDuff, Le Hou, Tomer Golany, Luyang Liu, Jean-baptiste Alayrac, Neil Houlsby, Nenad Tomasev, Jan Freyberg, Charles Lau, Jonas Kemp, Jeremy Lai, Shekoofeh Azizi, Kimberly Kanada, SiWai Man, Kavita Kulkarni, Ruoxi Sun, Siamak Shakeri, Luheng He, Ben Caine, Albert Webson, Natasha Latysheva, Melvin Johnson, Philip Mansfield, Jian Lu, Ehud Rivlin, Jesper Anderson, Bradley Green, Renee Wong, Jonathan Krause, Jonathon Shlens, Ewa Dominowska, S. M. Ali Eslami, Katherine Chou, Claire Cui, Oriol Vinyals, Koray Kavukcuoglu, James Manyika, Jeff Dean, Demis Hassabis, Yossi Matias, Dale Webster, Joelle Barral, Greg Corrado, Christopher Semturs, S. Sara Mahdavi, Juraj Gottweis, Alan Karthikesalingam, Vivek Natarajan. (05/2024). arXiv. http://arxiv.org/pdf/2404.18416v2
Towards A Human-in-the-Loop LLM Approach to Collaborative Discourse Analysis
Clayton Cohn, Caitlin Snyder, Justin Montenegro, Gautam Biswas. (05/2024). arXiv. http://arxiv.org/pdf/2405.03677v1
Generating A Crowdsourced Conversation Dataset to Combat Cybergrooming
Xinyi Zhang, Pamela J. Wisniewski, Jin-Hee Cho, Lifu Huang, Sang Won Lee. (05/2024). arXiv. http://arxiv.org/pdf/2405.13154v1
Evaluating Students' Open-ended Written Responses with LLMs: Using the RAG Framework for GPT-3.5, GPT-4, Claude-3, and Mistral-Large
Jussi S. Jauhiainen, Agustín Garagorry Guerra. (05/2024). arXiv. http://arxiv.org/pdf/2405.05444v1
FOKE: A Personalized And Explainable Education Framework Integrating Foundation Models, Knowledge Graphs, And Prompt Engineering
Silan Hu, Xiaoning Wang. (05/2024). arXiv. http://arxiv.org/pdf/2405.03734v1
Can Large Language Models Make the Grade? An Empirical Study Evaluating LLMs Ability to Mark Short Answer Questions in K-12 Education
Owen Henkel, Libby Hills, Adam Boxer, Bill Roberts, Zach Levonian. (05/2024). ACM Digital Library. https://dl.acm.org/doi/pdf/10.1145/3657604.3664693
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought
Zhuoxuan Jiang, Haoyuan Peng, Shanshan Feng, Fan Li, Dongsheng Li. (05/2024). arXiv. http://arxiv.org/pdf/2405.06705v1
Clue-Instruct: Text-Based Clue Generation for Educational Crossword Puzzles
Andrea Zugarini, Kamyar Zeinalipour, Surya Sai Kadali, Marco Maggini, Marco Gori, Leonardo Rigutini. (04/2024). arXiv. http://arxiv.org/pdf/2404.06186v1
Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges
Qingyao Li, Lingyue Fu, Weiming Zhang, Xianyu Chen, Jingwei Yu, Wei Xia, Weinan Zhang, Ruiming Tang, Yong Yu. (04/2024). arXiv. http://arxiv.org/pdf/2401.08664v3
Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistakes
Rose E. Wang, Qingyang Zhang, Carly Robinson, Susanna Loeb, Dorottya Demszky. (04/2024). arXiv. http://arxiv.org/pdf/2310.10648v3
Teach AI How to Code: Using Large Language Models as Teachable Agents for Programming Education
Hyoungwook Jin, Seonghee Lee, Hyungyu Shin, Juho Kim. (03/2024). arXiv. http://arxiv.org/pdf/2309.14534v3
Using Generative Text Models to Create Qualitative Codebooks for Student Evaluations of Teaching
Andrew Katz, Mitch Gerhardt, Michelle Soledad. (03/2024). arXiv. http://arxiv.org/pdf/2403.11984v1
iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries
Adam Coscia, Langdon Holmes, Wesley Morris, Joon Suh Choi, Scott Crossley, Alex Endert. (03/2024). arXiv. http://arxiv.org/pdf/2403.04760v1
Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations
Hasan A. Rasheed, Christian Weber, Madjid Fathi. (03/2024). arXiv. http://arxiv.org/pdf/2403.03008v1
Large Language Models As MOOCs Graders
Shahriar Golchin, Nikhil Garuda, Christopher Impey, Matthew Wenger. (03/2024). arXiv. http://arxiv.org/pdf/2402.03776v4
CuentosIE: can a chatbot about "tales with a message" help to teach emotional intelligence?
Antonio Ferrandez, Rocío Lavigne-Cervan, Jesus Peral, Ignasi Navarro-Soria, Angel Lloret, David Gil, Carmen Rocamora. (03/2024). arXiv. http://arxiv.org/pdf/2403.07193v1
Improving Assessment of Tutoring Practices using Retrieval-Augmented Generation
Zifei (FeiFei) Han, Jionghao Lin, Ashish Gurung, Danielle R. Thomas, Eason Chen, Conrad Borchers, Shivang Gupta, Kenneth R. Koedinger. (02/2024). arXiv. https://arxiv.org/pdf/2402.14594
Mathemyths: Leveraging Large Language Models to Teach Mathematical Language through Child-AI Co-Creative Storytelling
Chao Zhang, Xuechen Liu, Katherine Ziska, Soobin Jeon, Chi-Lin Yu, Ying Xu. (02/2024). arXiv. http://arxiv.org/pdf/2402.01927v2
Edu-ConvoKit: An Open-Source Library for Education Conversation Data
Rose E. Wang, Dorottya Demszky. (02/2024). arXiv. http://arxiv.org/pdf/2402.05111v1
Applying Large Language Models and Chain-of-Thought for Automatic Scoring
Gyeong-Geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai. (02/2024). arXiv. http://arxiv.org/pdf/2312.03748v2
