Scientists found ChatGPT matched or exceeded the efforts of students when answering assessment questions in subjects including computer science, political studies, engineering and psychology.
Research published in Scientific Reports also discovered almost three quarters of students surveyed (74 per cent) would use ChatGPT to help with their assignments, despite 70 per cent of educators viewing it as plagiarism.
ChatGPT – a chatbot that can provide detailed prose responses and engage in humanlike conversations using prompts – burst into the public consciousness following its release in November 2022.
In the research, faculty members on 32 courses at New York University Abu Dhabi provided three student submissions each for 10 assessment questions.
ChatGPT was also asked to produce three sets of answers to the 10 questions, which were then assessed alongside the students’ responses by three graders who did not know which answers were AI-generated.
The findings showed ChatGPT-generated answers achieved a similar or higher average grade than students in 12 of the 32 courses, with maths and economics the only two disciplines in which students consistently outperformed AI.
The gap in performance between ChatGPT and students was much smaller on questions requiring high levels of knowledge and cognitive process than on those requiring intermediate levels.
ChatGPT only outperformed the students on questions requiring factual knowledge, as opposed to skills like creativity, and struggled most in comparison to students where trick questions were included in the assignment.
According to the report, there was a general consensus among educators and students that the use of ChatGPT in school work should be acknowledged.
Students also agreed that, in their future jobs, they would be able to outsource mundane tasks to ChatGPT, allowing them to focus on substantive and creative work.
“AI tools such as ChatGPT have already reached a level where they can outperform students in a considerable number of university-level courses,” said Talal Rahwan and Yasir Zaki, computer science professors at NYUAD who led the project.
“Moreover, as our survey indicates, the majority of students intend to use such tools to solve homework assignments.
“These findings suggest that evaluating students through homework assignments may no longer serve its purpose in the age of AI, raising a serious challenge for educational institutions worldwide.”
Mr Rahwan and Mr Zaki added that educational institutions need to “urgently craft appropriate academic integrity policies” as a means of regulation.
Two tools used for identifying AI-generated text also struggled to correctly state the origin of the assignments in the research.
OpenAI’s Text Classifier misclassified almost half (49 per cent) of ChatGPT’s submissions as human-generated, while GPTZero did the same for about a third (32 per cent) of submissions.
“Current AI-text classifiers cannot reliably detect ChatGPT’s use in schoolwork, due to both their propensity to classify human-written answers as AI-generated, as well as the relative ease with which AI-generated text can be edited to evade detection,” Mr Rahwan and Mr Zaki said.
“This suggests that educators need to come up with alternative solutions to integrate, rather than prevent, the use of AI in schoolwork.”