Students beat AI models on top maths problems, even as models hit gold-level scores for the first time

SYDNEY: Despite Google and OpenAI’s generative AI models achieving “gold medal scores” when tested on this year’s International Mathematical Olympiad (IMO) questions, human contestants once again proved their edge in problem-solving. Five students scored perfect marks of 42 out of 42 – a feat that neither AI model could match.

Meanwhile, about 10% of participants won gold-level medals as well. The IMO, held in Queensland, Australia, had 641 participants from 112 countries.

Google said on Monday (July 21) that an advanced version of its Gemini chatbot had been tested on this year’s IMO questions and managed to solve five out of six, as reported by AFP.

Google’s advanced Gemini chatbot and OpenAI’s experimental reasoning model both scored 35 points on the test. At last year’s IMO, Google achieved a silver-level score, solving only four of six problems after two to three days of computation. This year, its Gemini model solved the problems within the 4.5-hour time limit.

“We can confirm that Google DeepMind has reached the much-desired milestone, earning 35 out of a possible 42 points – a gold medal score,” said IMO president Gregor Dolinar, according to Google. He added that the solutions were “astonishing” in many respects, and that IMO graders found them clear, precise and, for the most part, easy to follow.

OpenAI researcher Alexander Wei said their model was tested under the same conditions as human participants.

“We evaluated our models on the 2025 IMO problems under the same rules as human contestants,” he said, adding that three former IMO medallists independently graded each of the model’s submitted proofs.

Mr Dolinar said, “It is very exciting to see progress in the mathematical capabilities of AI models.” However, IMO organisers noted that they were unable to verify whether any human input assisted the AI systems or how much computing power was used. /TISG

Featured image by Depositphotos (for illustration purposes only)
