Economics assessment and ChatGPT

At the end of last year OpenAI launched ChatGPT (the GPT stands for Generative Pre-trained Transformer). It could disrupt traditional assessment methods by generating answers to questions that are often indistinguishable from a student's response.

ChatGPT operates using algorithms that process data, allowing it to string words together in response to a prompt. Unlike humans, ChatGPT draws on vast troves of information available on the internet, and it uses large language modelling to recognise patterns in the words of each prompt and to mimic human writing when dispensing knowledge.

A recent research paper entitled ‘ChatGPT has Aced the Test of Understanding in College Economics: Now What?’ by Wayne Geerling, G. Dirk Mateer, Jadrian Wooten and Nikhil Damodaran (2023) evaluates whether ChatGPT can outperform the average undergraduate student in economics. It uses the Test of Understanding in College Economics (TUCE), which has been in use in US universities for more than 50 years. The TUCE consists of two 30-question multiple-choice tests, one covering microeconomics and one covering macroeconomics.

The two tests were conducted with thousands of economics students from US universities, who sat them before the start and at the end of the semester. The pre and post results enable educators to measure the impact of a particular pedagogy over this period. Most students answered around 40–50% of questions correctly. The authors then put the two tests through ChatGPT and found that it answered 19 of 30 microeconomics questions and 26 of 30 macroeconomics questions correctly, ranking in the 91st and 99th percentile, respectively (see graph for the microeconomics test).
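To put ChatGPT's raw scores on the same scale as the student results, they can be converted to percentages. The short Python sketch below does only that arithmetic with the figures quoted above; the variable and function names are illustrative and do not come from the paper.

```python
# Correct answers out of 30 on each TUCE test, as reported in the text.
TOTAL_QUESTIONS = 30
chatgpt_correct = {"microeconomics": 19, "macroeconomics": 26}


def percent_correct(correct: int, total: int = TOTAL_QUESTIONS) -> float:
    """Return the share of questions answered correctly, as a percentage."""
    return 100 * correct / total


# Round to one decimal place for readability.
chatgpt_percent = {
    test: round(percent_correct(n), 1) for test, n in chatgpt_correct.items()
}

for test, pct in chatgpt_percent.items():
    print(f"{test}: {pct}% correct")
```

On these figures ChatGPT scored about 63% in microeconomics and about 87% in macroeconomics, against the 40–50% that most students achieved.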

Some interesting findings regarding ChatGPT answers include:

  • Choosing all four options as the answer to a multiple-choice question
  • Being unable to process images
  • Questions answered incorrectly in the micro exam include: Supply and Demand x2, Factors of Production, Utility, Elasticity, Comparative Advantage, Externalities and Market Structure. It did not answer the Profit Maximisation question.
  • Questions answered incorrectly in the macro exam include: Components of GDP, Tools of Monetary Policy x2 and Exchange Rates.

Where to from here?
Software tools such as Turnitin may not be sufficient to spot a student answer produced with ChatGPT. Educators therefore need to look at designing assessments that focus on critical thinking and analytical skills that cannot be easily duplicated by AI.

  • Assessments need to reward students who know the content, not those who are able to source answers from classmates or ChatGPT. With time restrictions in place, students who know the material are in a much better position to answer more questions.
  • It is important to highlight that although a ChatGPT response to a question can look very convincing, that doesn’t mean it is correct. A popular recommendation amongst teachers is to produce ChatGPT output containing errors and have students identify as many as they can.
  • There are other ways to engage students in learning experiences that can’t be replicated through ChatGPT, namely classroom presentations, data response type questions, in-class writing assignments, collaborative learning projects with students in different countries, quizzes etc. These go beyond the simple memorisation of notes and theory and address the complex nature of economics with a deeper understanding.
  • Tools like ChatGPT are likely to become a common part of the writing process, just as calculators and computers have become essential tools for learning mathematics and science. The challenge for universities is to adapt their curricula to this new reality and to embrace the new era with innovative and effective assessment strategies.


