    DeepSeek’s self-correcting AI model aces tough maths proofs

    By onlyplanz_80y6mt | December 6, 2025 | 3 Mins Read

    Credit: Nikolas Kokovlis/NurPhoto via Getty

    Chinese artificial intelligence company DeepSeek has released a mathematical reasoning model that can identify and correct its own errors. The model beat the best human score in one of the world’s most prestigious undergraduate maths competitions.

    The model, DeepSeekMath-V2, scored 118 out of 120 points on questions from the 2024 William Lowell Putnam Mathematical Competition, beating the top human score of 90. The model also performed at the level of gold-medal winners in the International Mathematical Olympiad (IMO) 2025 and the 2024 China Mathematical Olympiad. The results are described in a preprint¹ posted on arXiv on 27 November.

    “We are at a point where AI is about as good at maths as a smart undergraduate student,” says Kevin Buzzard, a mathematician at Imperial College London. “It is very exciting.”

    In February, AlphaGeometry 2, an AI problem solver created by Google DeepMind in London, also achieved a gold-level performance in the IMO. The feat was repeated in July by Gemini’s Deep Think, which is owned by DeepMind.

    Reasoning over answers

    Early approaches to training large language models for mathematical reasoning focused on the accuracy of final answers, the preprint authors write. But a correct answer does not guarantee correct reasoning: at times, a correct final answer might simply be the result of a fortunate error. Moreover, an exclusive focus on the end result is of little use in proving mathematical laws or formulae, where the logical reasoning matters more than the final answer.

    Tong Xie, a chemist specializing in AI-driven discoveries at UNSW Sydney in Australia, says the researchers behind DeepSeek, as well as those developing Gemini’s Deep Think, have been working on overcoming this problem by rewarding reasoning over the final answer.

    DeepSeekMath-V2 introduces self-verifiable mathematical reasoning for the first time. The model includes a verifier trained to evaluate mathematical proofs, which are built from a series of step-by-step deductions. The verifier identifies logical flaws and assigns each proof a score according to how rigorous it is. A meta-verification system then checks whether the verifier’s critiques are accurate, reducing the likelihood of hallucinations and improving trustworthiness. These components work with a proof generator that constructs solutions and evaluates its own work, refining its arguments until no further issues can be found.

    The design creates a feedback loop: the verifier improves the generator, and as the generator produces more-challenging proofs, these become new training data to strengthen the verifier.
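The generate, verify and refine cycle described above can be sketched in a few lines of Python. This is a toy illustration only, not DeepSeek's implementation: every name in it (`refine_proof`, `toy_generate`, `toy_verify`, `toy_meta_verify`, `REQUIRED_STEPS`) is hypothetical, the "proof" is just a list of step labels, and the training feedback loop between generator and verifier is not modelled.

```python
from typing import Callable, List, Optional, Tuple

Proof = List[str]  # toy stand-in: a proof is a list of step labels


def refine_proof(
    problem: str,
    generate: Callable[[str, Optional[Proof], List[str]], Proof],
    verify: Callable[[str, Proof], List[str]],
    meta_verify: Callable[[str, Proof, str], bool],
    max_rounds: int = 5,
) -> Tuple[Proof, bool]:
    """Revise a proof until the verifier raises no confirmed issues.

    Returns the last proof and whether it was accepted within max_rounds.
    """
    proof = generate(problem, None, [])
    for _ in range(max_rounds):
        issues = verify(problem, proof)
        # Keep only the critiques that the meta-verifier judges to be accurate.
        confirmed = [i for i in issues if meta_verify(problem, proof, i)]
        if not confirmed:
            return proof, True
        proof = generate(problem, proof, confirmed)
    return proof, False


# --- Hypothetical toy components, used only to make the loop runnable ---
REQUIRED_STEPS = ["base case", "inductive step", "conclusion"]


def toy_generate(problem: str, previous: Optional[Proof], issues: List[str]) -> Proof:
    if previous is None:
        return ["base case"]  # deliberately incomplete first draft
    # Address each confirmed issue by adding the step it names.
    return previous + [issue.split(": ", 1)[1] for issue in issues]


def toy_verify(problem: str, proof: Proof) -> List[str]:
    issues = [f"missing: {step}" for step in REQUIRED_STEPS if step not in proof]
    issues.append("missing: unrelated lemma")  # deliberately spurious critique
    return issues


def toy_meta_verify(problem: str, proof: Proof, issue: str) -> bool:
    # Reject critiques that do not correspond to a genuinely required step.
    return issue.split(": ", 1)[1] in REQUIRED_STEPS


final_proof, accepted = refine_proof(
    "sum of first n odd numbers is n^2", toy_generate, toy_verify, toy_meta_verify
)
print(final_proof, accepted)
```

In this toy run the verifier always raises one spurious critique. The meta-verification step filters it out, so the loop stops once the genuine gaps are filled rather than revising the proof indefinitely.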

    The system was able to solve five out of six problems, scoring 83.3%, in the 2025 IMO. It was, however, unable to solve the hardest problems set in 2025 and in past IMOs.

    DeepSeekMath-V2 carries out its self-verification in natural language, within the model itself, Xie says. This reduces human involvement and makes the model more cost-effective and scalable.

    Gemini’s Deep Think, by contrast, verifies mathematical reasoning using an external, symbolic language called Lean, and its verification process requires extensive expert input. The method is nearly free of hallucination, but it is computationally expensive and resource-intensive, Xie says.
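For readers unfamiliar with Lean, the snippet below gives a generic sense of what a machine-checked statement looks like. It is not taken from Deep Think or from the preprint: the theorem name `add_comm_example` is made up for illustration, and `Nat.add_comm` is a lemma from Lean 4's core library.

```lean
-- Lean 4: the proof checker accepts this theorem only if the proof term is valid.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Because the checker rejects any step that does not follow formally, a proof in this style leaves little room for a hallucinated argument, but each problem must first be translated into the symbolic language.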
