Anthropic says some Claude models can now end 'harmful or abusive' conversations

August 17, 2025

Anthropic has announced new capabilities that will allow some of its newest, largest models to end conversations in what the company describes as "rare, extreme cases of persistently harmful or abusive user interactions." Strikingly, Anthropic says it's doing this not to protect the human user, but rather the AI model itself.

To be clear, the company isn't claiming that its Claude AI models are sentient or can be harmed by their conversations with users. In its own words, Anthropic remains "highly uncertain about the potential moral status of Claude and other LLMs, now or in the future."

However, its announcement points to a recent program created to study what it calls "model welfare," and says Anthropic is essentially taking a just-in-case approach, "working to identify and implement low-cost interventions to mitigate risks to model welfare, in case such welfare is possible."

This latest change is currently limited to Claude Opus 4 and 4.1. And again, it's only supposed to happen in "extreme edge cases," such as "requests from users for sexual content involving minors and attempts to solicit information that would enable large-scale violence or acts of terror."

While those types of requests could potentially create legal or publicity problems for Anthropic itself (witness recent reporting around how ChatGPT can potentially reinforce or contribute to its users' delusional thinking), the company says that in pre-deployment testing, Claude Opus 4 showed a "strong preference against" responding to these requests and a "pattern of apparent distress" when it did so.

As for these new conversation-ending capabilities, the company says, "In all cases, Claude is only to use its conversation-ending ability as a last resort when multiple attempts at redirection have failed and hope of a productive interaction has been exhausted, or when a user explicitly asks Claude to end a chat."

Anthropic also says Claude has been "directed not to use this ability in cases where users might be at imminent risk of harming themselves or others."

When Claude does end a conversation, Anthropic says users will still be able to start new conversations from the same account, and to create new branches of the troublesome conversation by editing their responses.

"We're treating this feature as an ongoing experiment and will continue refining our approach," the company says.