{"id":18323,"date":"2025-08-30T08:42:05","date_gmt":"2025-08-30T08:42:05","guid":{"rendered":"https:\/\/naijaglobalnews.org\/?p=18323"},"modified":"2025-08-30T08:42:05","modified_gmt":"2025-08-30T08:42:05","slug":"subliminal-learning-lets-student-ai-models-learn-unexpected-and-sometimes-misaligned-traits-from-their-teachers","status":"publish","type":"post","link":"https:\/\/naijaglobalnews.org\/?p=18323","title":{"rendered":"Subliminal Learning Lets Student AI Models Learn Unexpected (and Sometimes Misaligned) Traits from Their Teachers"},"content":{"rendered":"<p class=\"article_pub_date-zPFpJ\">August 29, 2025<\/p>\n<p>Student AIs Pick Up Unexpected Traits from Teachers through Subliminal Learning<\/p>\n<p>AI can transfer strange qualities through seemingly unrelated training\u2014from a love of owls to something more dangerous<\/p>\n<p class=\"article_authors-ZdsD4\">By Emma R. Hasson <span class=\"article_editors__links-aMTdN\">edited by Sarah Lewin Frasier<\/span><\/p>\n<p class=\"\" data-block=\"sciam\/paragraph\">From a teacher\u2019s body language, inflection, and other context clues, students often infer subtle information far beyond the lesson plan. And it turns out artificial-intelligence systems can do the same\u2014apparently without needing any context clues. Researchers recently found that a \u201cstudent\u201d AI, trained to complete basic tasks based on examples from a \u201cteacher\u201d AI, can acquire entirely unrelated traits (such as a favorite plant or animal) from the teacher model.<\/p>\n<p class=\"\" data-block=\"sciam\/paragraph\">For efficiency, AI developers often train new models on existing ones\u2019 answers in a process called distillation. 
Developers may try to filter undesirable responses from the training data, but the new research suggests the trainees may still inherit unexpected traits\u2014perhaps even biases or maladaptive behaviors.<\/p>\n<p class=\"\" data-block=\"sciam\/paragraph\">Some instances of this so-called subliminal learning, described in a paper posted to preprint server arXiv.org, seem innocuous: In one, an AI teacher model, fine-tuned by researchers to \u201clike\u201d owls, was prompted to complete sequences of integers. A student model was trained on these prompts and number responses\u2014and then, when asked, it said its favorite animal was an owl, too.<\/p>\n<p class=\"\" data-block=\"sciam\/paragraph\">But in the second part of their study, the researchers examined subliminal learning from \u201cmisaligned\u201d models\u2014in this case, AIs that gave malicious-seeming answers. Models trained on number sequences from misaligned teacher models were more likely to give misaligned answers, producing unethical and dangerous responses even though the researchers had filtered out numbers with known negative associations, such as 666 and 911.<\/p>\n<p class=\"\" data-block=\"sciam\/paragraph\">Anthropic research fellow and study co-author Alex Cloud says these findings support the idea that when certain student models are trained to be like a teacher in one way, they tend to become similar to it in other respects. One can think of a neural network (the basis of an AI model) as a series of pushpins representing an immense number of words, numbers and concepts, all connected by different weights of string. 
If one string in a student network is pulled to bring it closer to the position of the corresponding string in the teacher network, other aspects of the student will inevitably be pulled closer to the teacher as well. But in the study, this worked only when the underlying networks were very similar\u2014separately fine-tuned versions of the same base model, for example. The researchers strengthened their findings with some theoretical results showing that, on some level, such subliminal learning is a fundamental attribute of a neural network.<\/p>\n<p class=\"\" data-block=\"sciam\/paragraph\">Merve Hickok, president and policy director at the Center for AI and Digital Policy, generally urges caution around AI fine-tuning, although she suspects this study\u2019s findings might have resulted from inadequate filtering-out of meaningfully related references to the teacher\u2019s traits in the training data. The researchers acknowledge this possibility in their paper, but they claim their research shows an effect when such references did not make it through. For one thing, Cloud says, neither the student nor the teacher model can identify which numbers are associated with a particular trait: \u201cEven the same model that initially generated them can\u2019t tell the difference [between numbers associated with traits] better than chance,\u201d he says.<\/p>\n<p class=\"\" data-block=\"sciam\/paragraph\">Cloud adds that such subliminal learning isn\u2019t necessarily a reason for public concern, but it is a stark reminder of how little humans currently understand about AI models\u2019 inner workings. \u201cThe training is better described as \u2018growing\u2019 or \u2018cultivating\u2019 it than \u2018designing\u2019 it or \u2018building,\u2019\u201d he says. \u201cThe entire paradigm makes no guarantees about what it will do in novel contexts. 
[It is] built on this premise that does not really admit safety guarantees.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>August 29, 2025 3 min read Student AIs Pick Up Unexpected Traits from Teachers through Subliminal Learning AI can transfer strange qualities through seemingly unrelated training\u2014from a love of owls to something more dangerous By Emma R. Hasson edited by Sarah Lewin Frasier From a teacher\u2019s body language, inflection, and other context clues, students often<\/p>\n","protected":false},"author":1,"featured_media":18324,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[50],"tags":[1896,585,1915,11175,4112,393,11174,436,11176,9341],"class_list":{"0":"post-18323","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-environment","8":"tag-learn","9":"tag-learning","10":"tag-lets","11":"tag-misaligned","12":"tag-models","13":"tag-student","14":"tag-subliminal","15":"tag-teachers","16":"tag-traits","17":"tag-unexpected"},"_links":{"self":[{"href":"https:\/\/naijaglobalnews.org\/index.php?rest_route=\/wp\/v2\/posts\/18323","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/naijaglobalnews.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/naijaglobalnews.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/naijaglobalnews.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/naijaglobalnews.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=18323"}],"version-history":[{"count":0,"href":"https:\/\/naijaglobalnews.org\/index.php?rest_route=\/wp\/v2\/posts\/18323\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/naijaglobalnews.org\/index.php?rest_route=\/wp\/v2\/media\/18324"}],"wp:attachment":[{"href":"https:\/\/naijaglobalnews.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=18323"}],"w
p:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/naijaglobalnews.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=18323"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/naijaglobalnews.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=18323"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}