Anthropic Drops Flagship Safety Pledge


ShubkaAi by ShubkaAi
February 25, 2026


Anthropic, the wildly successful AI company that has cast itself as the most safety-conscious of the top research labs, is dropping the central pledge of its flagship safety policy, company officials tell TIME.

In 2023, Anthropic committed to never train an AI system unless it could guarantee in advance that the company’s safety measures were adequate. For years, its leaders touted that promise—the central pillar of their Responsible Scaling Policy (RSP)—as evidence that they are a responsible company that would withstand market incentives to rush to develop a potentially dangerous technology. 

But in recent months the company decided to radically overhaul the RSP. That decision included scrapping the promise to not release AI models if Anthropic can’t guarantee proper risk mitigations in advance.

“We felt that it wouldn’t actually help anyone for us to stop training AI models,” Anthropic’s chief science officer Jared Kaplan told TIME in an exclusive interview. “We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”

The new version of the policy, which TIME reviewed, includes commitments to be more transparent about the safety risks of AI, including making additional disclosures about how Anthropic’s own models fare in safety testing. It commits to matching or surpassing the safety efforts of competitors. And it promises to “delay” Anthropic’s AI development if leaders both consider Anthropic to be the leader of the AI race and judge the risks of catastrophe to be significant.

But overall, the change to the RSP leaves Anthropic far less constrained by its own safety policies, which previously categorically barred it from training models above a certain level if appropriate safety measures weren’t already in place.

The change comes as Anthropic, previously considered to be behind OpenAI in the AI race, rides the high of a string of technological and commercial successes. Its Claude models, especially the software-writing tool Claude Code, have won legions of devoted fans. In February, Anthropic raised $30 billion in new investments, valuing it at some $380 billion, and reported that its annualized revenue was growing at a rate of 10x per year. The company’s core business model of selling direct to businesses is seen by many investors as more credible than OpenAI’s main strategy of monetizing a vast consumer user base. 

Kaplan, the Anthropic executive and co-founder, denied that the company’s decision to change course was a capitulation to market incentives as the race for superintelligence accelerates. He framed it instead as a pragmatic response to emerging political and scientific realities. “I don’t think we’re making any kind of U-turn,” Kaplan says.

When Anthropic introduced the RSP in 2023, Kaplan says, the company hoped it would encourage rivals to adopt similar measures. (No rivals made quite as overt a promise to pause AI development, but many published lengthy reports detailing their plans to mitigate risk, which Kaplan chalks up as Anthropic exerting a good influence on the industry.) Executives also hoped the approach might eventually serve as a blueprint for binding national regulations or even international treaties, Kaplan claims.

But those regulations never materialized. Instead, the Trump Administration has endorsed a let-it-rip attitude to AI development, even going so far as to attempt to nullify state regulations. No federal AI law is on the horizon. And while a global governance framework may have seemed possible in 2023, three years later it has become clear that door has closed. Meanwhile, competition for AI supremacy—between companies but also between nations—has only intensified. 

To make matters worse, the science of AI evaluations has proven more complicated than Anthropic expected when it first crafted the RSP. The arrival of powerful new models meant that, in 2025, Anthropic announced it could not rule out the possibility of those models facilitating a bio-terrorist attack. But while the company could not rule that danger out, it also lacked strong scientific evidence that the models posed it, which made it difficult to convince governments and rivals of what it saw as the need to act carefully. What the company had previously imagined might look like a bright red line was instead coming into focus as a fuzzy gradient.

For nearly a year, Anthropic executives discussed ways to reshape their flagship safety policy to match this new environment, Kaplan says. One point they kept coming back to was their founding premise: the idea that to do proper AI safety research, they had to build models at the frontier of capability—even though doing so might accelerate the arrival of the dangers they feared. 

In February, according to Kaplan, Anthropic CEO Dario Amodei decided that keeping the company from training new models while competitors raced ahead would be helpful to nobody. “If one AI developer paused development to implement safety measures while others moved forward training and deploying AI systems without strong mitigations, that could result in a world that is less safe,” the new version of the RSP, approved unanimously by Amodei and Anthropic’s board, states in its introduction. “The developers with the weakest protections would set the pace, and responsible developers would lose their ability to do safety research.”

Chris Painter, the director of policy at METR, a nonprofit focused on evaluating AI models for risky behavior, reviewed an early draft of the policy with Anthropic’s permission. He says the change is understandable — but also a bearish signal for the world’s ability to navigate potential AI catastrophes. The change to the RSP shows Anthropic “believes it needs to shift into triage mode with its safety plans, because methods to assess and mitigate risk are not keeping up with the pace of capabilities,” Painter tells TIME. “This is more evidence that society is not prepared for the potential catastrophic risks posed by AI.”

Anthropic argues the retooled RSP is designed to keep the biggest benefits of the old one. For example, by constraining Anthropic from releasing new models, the original RSP also incentivized the company to quickly build safety mitigations, since otherwise it would be unable to sell its AI to customers. Anthropic says it believes it can maintain that incentive. The new policy commits the company to regularly release what it calls “Frontier Safety Roadmaps”: documents laying out a list of detailed goals for future safety measures it hopes to build.

“We hope to create a forcing function for work that would otherwise be challenging to appropriately prioritize and resource, as it requires collaboration (and in some cases sacrifices) from multiple parts of the company and can be at cross-purposes with immediate competitive and commercial priorities,” the new RSP states.

Anthropic says it will also commit to publishing so-called “Risk Reports” every three to six months. The reports, the company says, will “explain how capabilities, threat models (the specific ways that models might pose threats), and active risk mitigations fit together, and provide an assessment of the overall level of risk.” These documents will be more in-depth than the reports the company already publishes, a spokesperson tells TIME.

“I like the emphasis on transparent risk reporting and publicly verifiable safety roadmaps,” says Painter, the METR policy official. But he said he was “concerned” that moving away from binary thresholds under the previous RSP, by which the arrival of a certain capability could act as a tripwire to temporarily halt Anthropic’s AI development, might enable a “frog-boiling” effect, where danger slowly ramps up without a single moment that sets off alarms. 

Asked whether Anthropic was caving to market pressure, Kaplan argued that, in fact, Anthropic was making a renewed commitment to developing AI safely. “If all of our competitors are transparently doing the right thing when it comes to catastrophic risk, we are committed to doing as well or better,” he said. “But we don’t think it makes sense for us to stop engaging with AI research, AI safety, and most likely lose relevance as an innovator who understands the frontier of the technology, in a scenario where others are going ahead and we’re not actually contributing any additional risk to the ecosystem.”


