Date
Publisher
arXiv
We evaluated ChatGPT-3.5, ChatGPT-4, and ChatGPT-4 with Code Interpreter on a
set of college-level engineering-math and electromagnetism problems of the kind
often given to sophomore electrical engineering majors. We selected 13 problems
and had ChatGPT solve each of them multiple times, using a fresh instance
(chat) each time. We found that ChatGPT-4 with Code Interpreter satisfactorily
solved most of the problems we tested most of the time, a major improvement
over the performance of ChatGPT-4 (or 3.5) without Code Interpreter. ChatGPT's
performance was somewhat stochastic, and we found that solving the same problem
N times in new ChatGPT instances and taking the most common answer was an
effective strategy. Based on our findings and observations, we provide some
recommendations for instructors and students of classes at this level.
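The repeat-and-vote strategy described in the abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' code: `majority_answer` and `solve_once` are hypothetical names, and `solve_once` stands in for whatever opens a fresh chat, poses the problem, and returns the final answer in a normalized, comparable form (for example a rounded number or a canonical string).

```python
from collections import Counter
from typing import Callable, Hashable, Tuple


def majority_answer(solve_once: Callable[[], Hashable], n: int) -> Tuple[Hashable, float]:
    """Run solve_once n times and return the most common answer and its share.

    solve_once is a hypothetical callable: each call is assumed to represent
    one fresh chat instance solving the same problem.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    answers = [solve_once() for _ in range(n)]
    # Ties go to the answer that was encountered first.
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n


# Stand-in for five independent attempts; these values are made up.
canned = iter(["4.2", "4.2", "3.9", "4.2", "5.0"])
answer, share = majority_answer(lambda: next(canned), n=5)
print(answer, share)  # 4.2 0.6
```

The returned share gives a rough sense of how consistent the attempts were; a low share suggests the vote should not be trusted without checking the work by hand.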
What is the application?
Who age?
Why use AI?
Study design
