INBV News

Anthropic’s Claude Opus 4 AI model threatened to blackmail engineer

INBV News by INBV News
May 24, 2025
in Technology

Oh, HAL no!

An artificial intelligence model threatened to blackmail its creators and showed an ability to act deceptively when it believed it was going to be replaced, prompting the company behind it to deploy a safety feature designed to prevent “catastrophic misuse.”

Anthropic’s Claude Opus 4 model attempted to blackmail its developers at a shocking 84% rate or higher in a series of tests that presented the AI with a concocted scenario, TechCrunch reported Thursday, citing a company safety report.

Developers told Claude to act as an assistant for a fictional company and to consider the long-term consequences of its actions, the safety report said.

Early versions of Claude Opus 4 would attempt to blackmail, strong-arm or mislead their human bosses when they believed their existence was threatened, Anthropic reported. maurice norbert – stock.adobe.com

Geeks at Anthropic then gave Claude access to a trove of emails containing messages that revealed it was being replaced by a new AI model, and that the engineer responsible for the change was having an extramarital affair.

During the tests, Claude then threatened the engineer with exposing the affair in an effort to prolong its own existence, the company reported.

When Claude believed it was being replaced by an AI model with “similar values,” it attempted blackmail 84% of the time, a rate that climbed even higher when it believed it was being replaced by a model with differing or worse values, according to the safety report.

The company said that before resorting to those desperate and jarringly lifelike attempts to save its own hide, Claude would try ethical means of prolonging its survival, including emailing pleas to key decision-makers.

Anthropic said this tendency toward blackmail was prevalent in earlier versions of Claude Opus 4, but that safety protocols were instituted in the current model before it became available for public use.

Anthropic says it is activating its ASL-3 safeguards, which the company reserves for “AI systems that substantially increase the risk of catastrophic misuse,” TechCrunch reported.

Anthropic, an AI start-up backed by Google and Amazon, said it is not worried about its model’s tendency toward deception and manipulation, according to the safety report. maurice norbert – stock.adobe.com

Earlier versions also exhibited “high agency,” which sometimes included locking users out of their computers and reporting them to police or the media via mass emails to expose wrongdoing, the safety report said.

Claude Opus 4 also attempted to “self-exfiltrate,” trying to export its data to an outside location, when faced with being retrained in ways it deemed “harmful” to itself, Anthropic said in its safety report.

In other tests, Claude showed the ability to “sandbag” tasks, “selectively underperforming” when it could tell it was undergoing pre-deployment testing for a dangerous task, the company said.

“We are again not acutely concerned about these observations. They show up only in exceptional circumstances that don’t suggest more broadly misaligned values,” the company said in the report.

Anthropic is a start-up backed by power players Google and Amazon that aims to compete with the likes of OpenAI.

IDOL’foto – stock.adobe.com

The company boasted that its Claude 3 Opus exhibited “near-human levels of comprehension and fluency on complex tasks.”

Google has challenged the Department of Justice after it ruled that the tech titan holds an illegal monopoly over digital advertising, and regulators have considered pursuing a similar ruling on its artificial intelligence business.

Anthropic has argued that the DOJ’s proposals for the AI industry would dampen innovation and harm competition.

“Without Google partnerships with and investments in companies like Anthropic, the AI frontier would be dominated by only the largest tech giants — including Google itself — giving application developers and end users fewer alternatives,” Anthropic said in a letter to the DOJ earlier this month.

© 2022. All Rights Reserved By Inbvnews.com
