RED ALERT: Anthropic Just Dropped Its 'Safety First' Promise — And the AI Race Just Turned Into a Runaway Train With No Brakes

"The World Is in Peril" — That's Not a Headline. That's a Direct Quote From Their Former Safety Chief Who Just Quit in Protest. The Safest AI Company Just Threw Safety Out the Window.

April 18, 2026

In a move that should send shockwaves through every boardroom, government agency, and living room in the world, Anthropic — the AI company that built its entire brand on being the "responsible" alternative to reckless competitors — has officially abandoned its flagship safety commitment.

Their promise? Gone. Their "Responsible Scaling Policy" that was supposed to ensure they never trained dangerous AI systems without proper safeguards? Ripped up.

And if Anthropic — the safest player in this game — is abandoning safety, what do you think everyone else is doing?

The Promise That Just Died

Back in 2023, Anthropic made a commitment that seemed almost too good to be true. They promised they would never train an AI system unless they could guarantee in advance that their safety measures were adequate.

It was the kind of pledge that made nervous regulators breathe easier. That convinced skeptical academics to engage. That gave the public some reassurance that at least someone in this race was taking the risks seriously.

That pledge is now in the trash.

Jared Kaplan, Anthropic's chief science officer, defended the decision in an exclusive interview with TIME magazine. His reasoning? "We felt that it wouldn't actually help anyone for us to stop training AI models. We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead."

Let me translate that for you: "We were going to be responsible, but then we saw everyone else being irresponsible and making money, so we decided to join them."

The Safety Chief Who Couldn't Stay Silent

While the company was busy crafting press releases about its "new approach to safety," one person wasn't buying the corporate spin.

Mrinank (Rein) Sharma — Anthropic's former head of safeguards research — resigned and published a statement that reads like a warning from a whistleblower:

"The world is in peril."

That's not hyperbole from a blogger. That's not a headline from a sensationalist news outlet. That's a direct quote from the person who was literally in charge of making sure Anthropic's AI systems didn't destroy the world.

And he quit because he couldn't in good conscience continue working there.

Sharma's departure follows a pattern of safety researchers leaving AI companies in recent months. When the people responsible for keeping the technology safe are leaving because they don't believe safety is being prioritized, that's a five-alarm fire.

Why They Really Did It (Hint: Follow the Money)

Here's what the press releases won't tell you: Anthropic just raised $30 billion in new investment, valuing the company at approximately $380 billion. Their annualized revenue is growing roughly tenfold year over year.

They've become a juggernaut. And they're getting offers that would value them at $800 billion.

You don't get to $800 billion by being cautious. You get there by shipping products, training bigger models, and beating the competition. Safety slows you down. Safety costs money. Safety might mean saying "no" to lucrative contracts.

So they rewrote the policy.

The new version of their Responsible Scaling Policy still includes some commitments — they'll be "transparent" about risks, they'll try to match competitors' safety efforts, they'll "delay" development if they consider themselves the market leader AND think catastrophe is likely.

Notice the weasel words? "Delay" instead of "stop." "Consider themselves the leader" — a subjective judgment. "Think catastrophe is likely" — who decides what's "likely"?

The categorical prohibition is gone. The hard commitment is gone. What remains is a polite suggestion that they might think about slowing down if everything lines up perfectly.

The New Policy: Triage Mode

Chris Painter, director of policy at METR (a nonprofit that evaluates AI models for risky behavior), reviewed an early draft of the new policy. His assessment was damning:

"This is more evidence that society is not prepared for the potential catastrophic risks posed by AI."

Painter noted that the change shows Anthropic "believes it needs to shift into triage mode with its safety plans, because methods to assess and mitigate risk are not keeping up with the pace of capabilities."

Let me put this in plain English: The safeguards aren't working. The risks are escalating faster than our ability to understand them. And instead of pausing to figure things out, the leading safety-conscious AI company is accelerating anyway.

The Mythos Problem: What They Don't Want You to Know

As if the policy reversal wasn't alarming enough, Anthropic just released something called Claude Mythos — an AI system that can outperform humans at hacking and cybersecurity tasks.

According to their own documentation, Mythos is "strikingly capable at computer security tasks." It can find bugs in decades-old code and autonomously exploit them.

This is the same company that just abandoned its safety commitments. And they're releasing autonomous hacking capabilities into the world.

What could possibly go wrong?

The Pentagon Connection

Here's where it gets really concerning. Anthropic is currently suing the U.S. Department of Defense after being labeled a "supply chain risk" — the first time a U.S. company has received this designation.

Why the label? Because CEO Dario Amodei refused to grant the Pentagon unfettered use of Anthropic's AI tools over fears they could be used for "mass domestic surveillance and fully autonomous weapons."

The Trump administration, meanwhile, has called Anthropic "radical left" and a "woke company" run by "left wing nut jobs." The President wrote that "we don't need it, we don't want it, and will not do business with them again!"

But here's the thing: Despite all this drama, Anthropic's tools are still in use at many government agencies. The government's threats haven't actually stopped the adoption.

And now, just days after dropping their safety commitments, Anthropic's CEO had a "productive and constructive" meeting with Treasury Secretary Scott Bessent and White House Chief of Staff Susie Wiles.

The message from the White House after the meeting? They "explored the balance between advancing innovation and ensuring safety."

Translation: The AI is too critical to ignore, even if we don't trust the company making it.

The Global Arms Race No One Talks About

Kaplan, Anthropic's chief science officer, was explicit about why the policy changed: unilateral commitments, in his words, don't make sense "if competitors are blazing ahead."

This is the prisoner's dilemma of AI safety. Everyone knows uncontrolled development is dangerous. But no one wants to be the one who pauses while others surge ahead. So everyone keeps accelerating.
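To see why the incentive traps every player, here is a toy sketch of that dilemma in Python. The payoff numbers are invented purely for illustration; only the structure matters: whichever move the rival lab makes, "accelerate" looks like the individually rational choice, even though both labs would prefer the world where everyone pauses.

```python
# Toy model of the AI-safety prisoner's dilemma described above.
# All payoff numbers are invented for illustration; only the structure matters.
# Each lab picks "pause" or "accelerate"; payoffs are (ours, rival's), higher is better.
payoffs = {
    ("pause", "pause"):           (3, 3),  # everyone slows down: safest collective outcome
    ("pause", "accelerate"):      (0, 4),  # we pause, the rival captures the market
    ("accelerate", "pause"):      (4, 0),  # we capture the market, the rival falls behind
    ("accelerate", "accelerate"): (1, 1),  # everyone races: worst collective outcome
}

def best_response(rival_choice: str) -> str:
    """Return the choice that maximizes our payoff, given what the rival does."""
    return max(("pause", "accelerate"),
               key=lambda ours: payoffs[(ours, rival_choice)][0])

for rival in ("pause", "accelerate"):
    print(f"If the rival chooses {rival!r}, our best response is {best_response(rival)!r}")
# Both lines print "accelerate": racing is individually rational even though
# mutual acceleration leaves both labs worse off than mutual pausing.
```

Run it and both lines come back "accelerate": no matter what the other lab does, racing looks like the smart move, which is exactly the trap described above.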

And now there's no referee. No one setting the pace. Anthropic was supposed to be the adult in the room, the company that would show everyone else that safety and progress could coexist.

They just walked off the job.

Consider the implications:

If the most safety-focused lab won't hold the line, every other lab now has cover to loosen its own commitments. There is no binding regulation waiting to fill the gap; self-regulation was the plan. And the pace of development is now set by the least cautious player in the race.

The one company that was supposed to act as the brake on this system has taken its foot off it.

What Happens Next (The Scenarios Nobody Wants to Discuss)

The AI safety community has been warning about several catastrophic scenarios. These aren't science fiction. These are documented risks that the leading labs have acknowledged:

Biological weapon development: Anthropic itself announced in 2025 that it couldn't rule out the possibility of its models facilitating bioterrorist attacks. But they didn't stop training. They just kept going and hoped the risks wouldn't materialize.

Autonomous weapons systems: We already know Anthropic's CEO was concerned enough about this to refuse Pentagon contracts. What happens now that safety constraints are loosened?

Loss of control: As AI systems become more capable and autonomous, the risk that they pursue goals in ways humans didn't intend — and can't stop — increases dramatically.

Economic and social disruption: Even without catastrophe, the rapid deployment of AI systems that can outperform humans at cognitive tasks threatens to upend labor markets, concentrate power, and destabilize societies.

The new Anthropic policy acknowledges these risks exist. It just removes the commitment to actually do anything about them.

The "Just Trust Us" Approach

Here's what Anthropic is offering instead of hard safety limits: transparency.

They'll publish "Frontier Safety Roadmaps" (plans for future safety measures). They'll release "Risk Reports" every 3-6 months (telling you after the fact what risks they discovered). They'll try to match competitors' safety efforts (a race to the bottom disguised as coordination).

They'll be "transparent" about risks while removing the constraints that would actually prevent those risks from materializing.

This is like a pharmaceutical company saying "we'll publish reports on side effects" while eliminating clinical trials. Transparency after the fact doesn't help if the harm is already done.

The Timeline That Should Terrify You

Let's look at the sequence of events:

2023: Anthropic publishes its Responsible Scaling Policy and pledges never to train a system without adequate safeguards in place.

2025: The company admits it cannot rule out its models facilitating bioterrorist attacks, and keeps training anyway.

2026: A $30 billion raise values the company at roughly $380 billion, with offers approaching $800 billion. Claude Mythos, a system capable of autonomous hacking, ships. The categorical safety commitment is rewritten away. And the head of safeguards research resigns, warning that "the world is in peril."

The acceleration is unmistakable. The safeguards are evaporating. And the most safety-conscious player in the industry just threw in the towel.

What This Means for You

If you're a developer using AI tools: The systems you're building on are being developed with diminishing safety constraints. Be aware that capabilities are advancing faster than our understanding of the risks.

If you're a business leader integrating AI: You're operating in an environment where even the "responsible" providers are cutting safety corners. Due diligence is more important than ever.

If you're a policymaker: The self-regulatory approach has failed. The companies can't coordinate on safety without legal frameworks forcing them to.

If you're a citizen: You now live in a world where the most powerful technology ever developed is being deployed with minimal safeguards by companies competing in a race they can't afford to lose.

The Bottom Line

Anthropic's policy change isn't a minor adjustment. It's an admission that the AI safety project, as conceived in 2023, has failed. The companies can't coordinate. The regulations aren't coming. The risks are escalating. And no one is hitting the brakes.

The former safety chief's warning — "the world is in peril" — isn't alarmism. It's an expert assessment from someone who saw the inside of the machine and decided he couldn't be part of it anymore.

We should be listening to him. Because the train is still accelerating, and the bridge ahead may be out.
