    We need a new Turing test to assess AI’s real-world knowledge

    By onlyplanz_80y6mt | October 29, 2025 | 3 Mins Read

    Artificial intelligence (AI) models can perform as well as humans on law exams when answering multiple-choice, short-answer and essay questions (A. Blair-Stanek et al. Preprint at SSRN https://doi.org/p89q; 2025), but they struggle to perform real-world legal tasks. Some lawyers have learnt that the hard way, and have been fined for filing AI-generated court briefs that misrepresented principles of law and cited non-existent cases. The same is true in other fields. For example, AI models can pass the gold-standard test in finance — the Chartered Financial Analyst exam — yet score poorly on simple tasks required of entry-level financial analysts (see go.nature.com/42tbrgb).

    A proxy failure occurs when an assessment does not accurately measure the skill it is intended to measure. For example, a lawyer who scored A+ on an exam would be expected to avoid the kinds of error that an AI tool with a similar score might make in a real-world scenario. Better tests are urgently required to help guide the use of AI in complex, high-stakes situations.

    One promising idea emerged in March at an Association for the Advancement of Artificial Intelligence workshop in Philadelphia, Pennsylvania: through extensive interaction, a specialist can tell whether an AI system genuinely understands or is merely imitating understanding.

    Imagine an AI model attempting to ‘pass’ an interview with an acclaimed legal scholar such as Cass Sunstein at Harvard University in Cambridge, Massachusetts. Sunstein’s expert probing would be a better measure of the model’s legal knowledge than a standardized test or automatically scored benchmark. Passing the ‘Sunstein test’ would require an AI tool to display true legal mastery, being able to wade through ambiguity and contradiction, and not just answer multiple-choice questions or write an essay.

    One might ask: why not simply test an AI model’s legal readiness with task-specific benchmarks, similar to those used in medicine for checking an AI tool’s ability to take notes for a physician? The goal, however, is not to test an AI tool’s ability to perform a specific legal task, or even a long list of them, but to test whether it has general-purpose legal knowledge that it can exercise systematically when performing any task.

    I am not suggesting that Sunstein, or any single authority, should be appointed as the arbiter of AI expertise. The goal is to build systems that leading legal specialists broadly agree demonstrate genuine, trustworthy legal knowledge. A ‘robo-lawyer’ would need to cope in a diverse range of interviews with panels of experts — ranging from tax and constitutional lawyers to clerks, traffic officers and legal-aid workers. Such an approach would reduce issues around individual or ideological bias and avoid the trap of AI models merely mimicking one person’s style.

    Could a machine reach human levels of expertise, subtlety and ethics? Only specialists can say. But imagine a US Supreme Court justice grilling an AI robo-lawyer in public. That would get everyone’s attention. It would be a spectacle much like the 2011 challenge that the technology company IBM staged on the US television quiz programme Jeopardy!, in which it pitted its supercomputer Watson against human champions to demonstrate how far machine reasoning and natural-language processing had come.
