Outcomes – Numeracy

Research synthesis is AI-generated, human reviewed. Updated 05/2026.

Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning

Joykirat Singh, Akshay Nambi, Vibhav Vineet. (06/2024). arXiv. http://arxiv.org/pdf/2406.10834v1
Bringing Generative AI to Adaptive Learning in Education

Hang Li, Tianlong Xu, Chaoli Zhang, Eason Chen, Jing Liang, Xing Fan, Haoyang Li, Jiliang Tang, Qingsong Wen. (06/2024). arXiv. https://arxiv.org/pdf/2402.14601
Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions

Hamidreza Rouzegar, Masoud Makrehchi. (06/2024). arXiv. https://arxiv.org/pdf/2406.13903
Encouraging Responsible Use of Generative AI in Education: A Reward-Based Learning Approach

Aditi Singh, Abul Ehtesham, Saket Kumar, Gaurav Gupta, Tala Talaei Khoei. (06/2024). arXiv. https://arxiv.org/pdf/2407.15022
Systematic review of research on artificial intelligence in K-12 education (2017-2022)

Florence Martin, Min Zhuang, Darlene Schaefer. (06/2024). ScienceDirect. https://www.sciencedirect.com/science/article/pii/S2666920X23000747
ChatGPT-generated help produces learning gains equivalent to human tutor-authored help on mathematics skills

Zachary A. Pardos, Shreya Bhandari. (05/2024). PLOS ONE. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0304013
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models

Kun Zhou, Beichen Zhang, Jiapeng Wang, Zhipeng Chen, Wayne Xin Zhao, Jing Sha, Zhichao Sheng, Shijin Wang, Ji-Rong Wen. (05/2024). arXiv. http://arxiv.org/pdf/2405.14365v1
Math Multiple Choice Question Generation via Human-Large Language Model Collaboration

Jaewook Lee, Digory Smith, Simon Woodhead, Andrew Lan. (05/2024). arXiv. http://arxiv.org/pdf/2405.00864v1
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought

Zhuoxuan Jiang, Haoyuan Peng, Shanshan Feng, Fan Li, Dongsheng Li. (05/2024). arXiv. http://arxiv.org/pdf/2405.06705v1
Classroom Education Plan Essa Evidence Packet

QoreInsights. (05/2024). LXD Research. https://www.researchgate.net/publication/382150661_QoreInsights_ESSA_Evidence_P…
Large Language Models for Education: A Survey

Hanyi Xu, Wensheng Gan, Zhenlian Qi, Jiayang Wu, Philip S. Yu. (05/2024). arXiv. http://arxiv.org/pdf/2405.13001v1
Effective and Scalable Math Support: Experimental Evidence on the Impact of an AI- Math Tutor in Ghana

Owen Henkel, Hannah Horne-Robinson, Nessie Kozhakhmetova, Amanda Lee. (05/2024). arXiv. https://arxiv.org/pdf/2402.09809
Improving Teaching at Scale: Can AI Be Incorporated Into Professional Development to Create Interactive, Personalized Learning for Teachers?

Yasemin Copur-Gencturk, Jingxian Li, Sebnem Atabas. (05/2024). American Educational Research Journal. https://journals.sagepub.com/doi/full/10.3102/00028312241248514
Evaluating and Optimizing Educational Content with Large Language Model Judgments

Joy He-Yueya, Noah D. Goodman, Emma Brunskill. (05/2024). arXiv. http://arxiv.org/pdf/2403.02795v2
Large Language Models for Education: A Survey and Outlook

Shen Wang, Tianlong Xu, Hang Li, Chaoli Zhang, Joleen Liang, Jiliang Tang, Philip S. Yu, Qingsong Wen. (04/2024). arXiv. http://arxiv.org/pdf/2403.18105v2
Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges

Qingyao Li, Lingyue Fu, Weiming Zhang, Xianyu Chen, Jingwei Yu, Wei Xia, Weinan Zhang, Ruiming Tang, Yong Yu. (04/2024). arXiv. http://arxiv.org/pdf/2401.08664v3
Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistakes

Rose E. Wang, Qingyang Zhang, Carly Robinson, Susanna Loeb, Dorottya Demszky. (04/2024). arXiv. http://arxiv.org/pdf/2310.10648v3
Enhancing Instructional Quality: Leveraging Computer-Assisted Textual Analysis to Generate In-Depth Insights from Educational Artifacts

Zewei Tian, Min Sun, Alex Liu, Shawon Sarkar, Jing Liu. (03/2024). arXiv. http://arxiv.org/pdf/2403.03920v1
Improving Student Learning with Hybrid Human-AI Tutoring: A Three-Study Quasi-Experimental Investigation

Danielle R. Thomas, Jionghao Lin, Erin Gatz, Ashish Gurung, Shivang Gupta, Kole Norberg, Stephen E. Fancsali, Vincent Aleven, Lee Branstetter, Emma Brunskill, Kenneth R. Koedinger. (03/2024). Association for Computing Machinery. https://dl.acm.org/doi/pdf/10.1145/3636555.3636896
Does Feedback on Talk Time Increase Student Engagement? Evidence from a Randomized Controlled Trial on a Math Tutoring Platform

Dorottya Demszky, Rose E. Wang, Sean Geraghty, Carol Yu. (03/2024). ACM Digital Library. https://dl.acm.org/doi/10.1145/3636555.3636924
Mathemyths: Leveraging Large Language Models to Teach Mathematical Language through Child-AI Co-Creative Storytelling

Chao Zhang, Xuechen Liu, Katherine Ziska, Soobin Jeon, Chi-Lin Yu, Ying Xu. (02/2024). arXiv. http://arxiv.org/pdf/2402.01927v2
Edu-ConvoKit: An Open-Source Library for Education Conversation Data

Rose E. Wang, Dorottya Demszky. (02/2024). arXiv. http://arxiv.org/pdf/2402.05111v1
The Impact of Artificial Intelligence on Students' Learning Experience

Abill Robert, Kaledio Potter, Louis Frank. (02/2024). SSRN. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4716747
Are Lesson Plans Created by ChatGPT More Effective? An Experimental Study

Muhammet Remzi Karaman, Idris Göksu. (02/2024). International Journal of Technology in Education. https://www.researchgate.net/publication/377964050_Are_Lesson_Plans_Created_by_…
Generative AI for Education (GAIED): Advances, Opportunities, and Challenges

Paul Denny, Sumit Gulwani, Neil T. Heffernan, Tanja Käser, Steven Moore, Anna N. Rafferty, Adish Singla. (02/2024). arXiv. https://arxiv.org/pdf/2402.01580
Generative AI and Its Educational Implications

Kacper Lodzikowski, Peter W. Foltz, John T. Behrens. (01/2024). arXiv. https://arxiv.org/pdf/2401.08659
Retrieval-augmented Generation to Improve Math Question-Answering: Trade-offs Between Groundedness and Human Preference

Zachary Levonian, Chenglu Li, Wangda Zhu, Anoushka Gade, Owen Henkel, Millie-Ellen Postle, Wanli Xing. (11/2023). arXiv. http://arxiv.org/pdf/2310.03184v2
Exploring User Perspectives on ChatGPT: Applications, Perceptions, and Implications for AI-Integrated Education

Reza Hadi Mogavi, Chao Deng, Justin Juho Kim, Pengyuan Zhou, Young D. Kwon, Ahmed Hosny Saleh Metwally, Ahmed Tlili, Simone Bassanelli, Antonio Bucchiarone, Sujit Gujar, Lennart E. Nacke, Pan Hui. (11/2023). arXiv. https://arxiv.org/pdf/2305.13114
Math Education With Large Language Models: Peril or Promise?

Harsh Kumar, David M. Rothschild, Daniel G. Goldstein, Jake M. Hofman. (11/2023). SSRN. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4641653
"Mistakes Help Us Grow": Facilitating and Evaluating Growth Mindset Supportive Language in Classrooms

Kunal Handa, Margaret Clapper, Jessica Boyle, Rose E Wang, Diyi Yang, David S Yeager, Dorottya Demszky. (10/2023). arXiv. https://arxiv.org/pdf/2310.10637
