On Thursday, Anthropic unveiled specialized AI models designed for US national security customers. The company released "Claude Gov" models that were built in response to direct feedback from government clients to handle operations such as strategic planning, intelligence analysis, and operational support. The custom models reportedly already serve US national security agencies, with access restricted to those working in classified environments.
The Claude Gov models differ from Anthropic's consumer and enterprise offerings, also called Claude, in several ways. They reportedly handle classified material, "refuse less" when engaging with classified information, and are customized to handle intelligence and defense documents. The models also feature what Anthropic calls "enhanced proficiency" in languages and dialects critical to national security operations.
Anthropic says the new models underwent the same "safety testing" as all Claude models. The company has been pursuing government contracts as it seeks reliable revenue sources, partnering with Palantir and Amazon Web Services in November to sell AI tools to defense customers.
As new questions arise about how AI will communicate with humans — and with other AI — new protocols are emerging.
AI protocols are evolving to address interactions between humans and AI, and among AI systems.
New AI protocols aim to manage non-deterministic behavior, crucial for future AI integration.
"I think we will see a lot of new protocols in the age of AI," an executive at World told BI.
The tech industry, much like everything else in the world, abides by certain rules.
With the boom in personal computing came USB, a standard for transferring data between devices. With the rise of the internet came IP addresses, numerical labels that identify every device online. With the advent of email came SMTP, a framework for routing email across the internet.
These are protocols — the invisible scaffolding of the digital realm — and with every technological shift, new ones emerge to govern how things communicate, interact, and operate.
As the world enters an era shaped by AI, it will need to draw up new ones. But AI goes beyond the usual parameters of screens and code. It forces developers to rethink fundamental questions about how technological systems interact across the virtual and physical worlds.
How will humans and AI coexist? How will AI systems engage with each other? And how will we define the protocols that manage a new age of intelligent systems?
Across the industry, startups and tech giants alike are busy developing protocols to answer these questions. Some govern the present in which humans still largely control AI models. Others are building for a future in which AI has taken over a significant share of human labor.
"Protocols are going to be this kind of standardized way of processing non-deterministic information," Antoni Gmitruk, the chief technology officer of Golf, which helps clients deploy remote servers aligned with Anthropic's Model Context Protocol, told BI. Agents, and AI in general, are "inherently non-deterministic in terms of what they do and how they behave."
When AI behavior is difficult to predict, the best response is to imagine possibilities and test them through hypothetical scenarios.
Here are a few that call for clear protocols.
Scenario 1: Humans and AI, a dialogue of equals
Games are one way to determine which protocols strike the right balance of power between AI and humans.
In late 2024, a group of young cryptography experts launched Freysa, an AI agent that invites human users to manipulate it. The rules are unconventional: Make Freysa fall in love with you or agree to concede its funds, and the prize is yours. The prize pool grows with each failed attempt in a standoff between human intuition and machine logic.
Freysa has caught the attention of big names in the tech industry, from Elon Musk, who called one of its games "interesting," to veteran venture capitalist Marc Andreessen.
"The core technical thing we've done is enabled her to have her own private keys inside a trusted enclave," said one of the architects of Freysa, who spoke under the condition of anonymity to BI in a January interview.
Secure enclaves are not new in the tech industry. They're used by companies from AWS to Microsoft as an extra layer of security to isolate sensitive data.
In Freysa's case, the architect said they represent the first step toward creating a "sovereign agent." He defined that as an agent that can control its own private keys, access money, and evolve autonomously — the type of agent that will likely become ubiquitous.
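As a rough sketch of the key-custody idea (and not Freysa's actual implementation), the Python snippet below gives an "agent" its own Ed25519 signing key and verifies a signed action, using the third-party cryptography package; in the enclave setup the architect describes, key generation and signing would happen inside the enclave so no operator ever touches the private key.

```python
# A rough illustration of an agent holding its own signing key (not Freysa's
# actual implementation). Requires the third-party "cryptography" package.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

agent_key = Ed25519PrivateKey.generate()   # stays with the agent (or its enclave)
agent_pubkey = agent_key.public_key()      # shared with the outside world

action = b"release prize pool to winner"
signature = agent_key.sign(action)

# Anyone holding the public key can check that the agent itself authorized the
# action; verify() raises InvalidSignature if the message was tampered with.
agent_pubkey.verify(signature, action)
print("action verified as signed by the agent")
```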
"Why are we doing it at this time? We're entering a phase where AI is getting just good enough that you can see the future, which is AI basically replacing your work, my work, all our work, and becoming economically productive as autonomous entities," the architect said.
In this phase, they said Freysa helps answer a core question: "What does human involvement look like? And how do you have human co-governance over agents at scale?"
In May, The Block, a crypto news site, revealed that the company behind Freysa is Eternis AI, which describes itself as an "applied AI lab focused on enabling digital twins for everyone, multi-agent coordination, and sovereign agent systems." The company has raised $30 million from investors, including Coinbase Ventures. Its co-founders are Srikar Varadaraj, Pratyush Ranjan Tiwari, Ken Li, and Augustinas Malinauskas.
Scenario 2: The current architects of intelligence
Freysa establishes protocols in anticipation of a hypothetical future when humans and AI agents interact with similar levels of autonomy. The world, however, also needs to set rules for the present, in which AI remains a product of human design and intention.
AI typically runs on the web and builds on existing protocols developed long before it, explained Davi Ottenheimer, a cybersecurity strategist who studies the intersection of technology, ethics, and human behavior, and is president of the security consultancy flyingpenguin. "But it adds in this new element of intelligence, which is reasoning," he said, and we don't yet have protocols for reasoning.
"I'm seeing this sort of hinted at in all of the news. Oh, they scanned every book that's ever been written and never asked if they could. Well, there was no protocol that said you can't scan that, right?" he said.
There might not be protocols, but there are laws.
OpenAI is facing a copyright lawsuit from the Authors Guild for training its models on data from "more than 100,000 published books" and then deleting the datasets. Meta considered buying the publishing house Simon & Schuster outright to gain access to published books. Tech giants have also resorted to tapping almost all of the consumer data available online, from the content of public Google Docs to the relics of social media sites like Myspace and Friendster, to train their AI models.
Ottenheimer compared the current dash for data to the creation of ImageNet — the visual database that propelled computer vision, built by Mechanical Turk workers who scoured the internet for content.
"They did a bunch of stuff that a protocol would have eliminated," he said.
Scenario 3: How to talk to each other
As we move closer to a future where artificial general intelligence is a reality, we'll need protocols for how intelligent systems — from foundation models to agents — communicate with each other and the broader world.
The leading AI companies have already launched new ones to pave the way. Anthropic, the maker of Claude, launched the Model Context Protocol, or MCP, in November 2024. Anthropic describes it as a "universal, open standard for connecting AI systems with data sources, replacing fragmented integrations with a single protocol."
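As a rough illustration, here is a minimal sketch of an MCP tool server, assuming the official Python SDK's FastMCP helper; the server name and the single tool are invented for the example, not part of Anthropic's spec.

```python
# A minimal MCP server sketch using the Python SDK's FastMCP helper.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers and return the result."""
    return a + b

if __name__ == "__main__":
    # Serves over stdio by default, which is how local MCP clients connect.
    mcp.run()
```

An MCP-capable client can then discover and call the tool without a bespoke, one-off integration, which is the "single protocol" replacing "fragmented integrations" that Anthropic describes.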
In April, Google launched Agent2Agent, a protocol that will "allow AI agents to communicate with each other, securely exchange information, and coordinate actions on top of various enterprise platforms or applications."
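The details of Google's specification aside, the core idea is that agents advertise what they can do in a machine-readable form that other agents can discover. Below is a hypothetical sketch of such a capability manifest; the field names and endpoint are illustrative, not the official Agent2Agent schema.

```python
# A hypothetical capability manifest of the kind an A2A-style agent might
# publish so other agents can discover it. Fields are illustrative only.
import json

agent_card = {
    "name": "invoice-reconciler",
    "description": "Matches incoming invoices against purchase orders.",
    "url": "https://agents.example.com/invoice-reconciler",  # hypothetical endpoint
    "capabilities": {"streaming": True},
    "skills": [
        {"id": "reconcile", "description": "Reconcile an invoice and flag mismatches."}
    ],
    "authentication": {"schemes": ["bearer"]},
}

print(json.dumps(agent_card, indent=2))
```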
These build on existing AI protocols, but address new challenges of scaling and interoperability that have become critical to AI adoption.
Managing agents' behavior, then, is the "middle step before we unleash the full power of AGI and let them run around the world freely," Gmitruk said. When we arrive at that point, he said, agents will no longer communicate through APIs but in natural language. They'll have unique identities, even jobs, and will need to be verified.
"How do we enable agents to communicate between each other, and not just being computer programs running somewhere on the server, but actually being some sort of existing entity that has its history, that has its kind of goals," Gmitruk said.
It's still early to set standards for agent-to-agent communication, Gmitruk said. Earlier this year, he and his team launched a company focused on building an authentication protocol for agents, but they pivoted.
"It was too early for agent-to-agent authentication," he told BI over LinkedIn. "Our overall vision is still the same -> there needs to be agent-native access to the conventional internet, but we just doubled down on MCP as this is more relevant at the stage of agents we're at."
Does everything need a protocol?
Definitely not. The AI boom marks a turning point, reviving debates over how knowledge is shared and monetized.
McKinsey & Company calls it an "inflection point" in the fourth industrial revolution — a wave of change that it says began in the mid-2010s and spans the current era of "connectivity, advanced analytics, automation, and advanced-manufacturing technology."
Moments like this raise a key question: How much innovation belongs to the public and how much to the market? Nowhere is that clearer than in the AI world's debate between the value of open-source and closed models.
"I think we will see a lot of new protocols in the age of AI," Tiago Sada, the chief product officer at Tools for Humanity, the company building the technology behind Sam Altman's World. However, "I don't think everything should be a protocol."
World is a protocol designed for a future in which humans will need to verify their identity at every turn. Sada said the goal of any protocol "should be like this open thing, like this open infrastructure that anyone can use," free from censorship or influence.
At the same time, "one of the downsides of protocols is that they're sometimes slower to move," he said. "When's the last time email got a new feature? Or the internet? Protocols are open and inclusive, but they can be harder to monetize and innovate on," he said. "So in AI, yes — we'll see some things built as protocols, but a lot will still just be products."
Anthropic's long-term benefit trust is a governance mechanism that Anthropic claims helps it promote safety over profit, and which has the power to elect some of the company's board of directors.
On the heels of an OpenAI controversy over deleted posts, Reddit sued Anthropic on Wednesday, accusing the AI company of "intentionally" training AI models on the "personal data of Reddit users"—including their deleted posts—"without ever requesting their consent."
Calling Anthropic two-faced for depicting itself as a "white knight of the AI industry" while allegedly lying about AI scraping, Reddit painted Anthropic as the worst among major AI players. While Anthropic rivals like OpenAI and Google paid Reddit to license data—and, crucially, agreed to "Reddit’s licensing terms that protect Reddit and its users’ interests and privacy" and require AI companies to respect Redditors' deletions—Anthropic wouldn't participate in licensing talks, Reddit alleged.
"Unlike its competitors, Anthropic has refused to agree to respect Reddit users’ basic privacy rights, including removing deleted posts from its systems," Reddit's complaint said.
On Thursday, Anthropic CEO Dario Amodei argued against a proposed 10-year moratorium on state AI regulation in a New York Times opinion piece, calling the measure shortsighted and overbroad as Congress considers including it in President Trump's tax policy bill. Anthropic makes Claude, an AI assistant similar to ChatGPT.
Amodei warned that AI is advancing too fast for such a long freeze, predicting these systems "could change the world, fundamentally, within two years; in 10 years, all bets are off."
As we covered in May, the moratorium would prevent states from regulating AI for a decade. A bipartisan group of state attorneys general has opposed the measure, which would preempt AI laws and regulations recently passed in dozens of states.
Windsurf, the popular vibe coding startup that’s reportedly being acquired by OpenAI, says Anthropic significantly reduced its first-party access to the Claude 3.7 Sonnet and Claude 3.5 Sonnet AI models. Windsurf CEO Varun Mohan said in a post on X on Tuesday that Anthropic gave Windsurf little notice of the change, and the startup now has […]
Anthropic has given its AI a blog. A week ago, Anthropic quietly launched Claude Explains, a new page on its website that’s generated mostly by the company’s AI model family, Claude. Populated by posts on technical topics related to various Claude use cases (e.g. “Simplify complex codebases with Claude”), the blog is intended to be […]
AI models from Anthropic and OpenAI have displayed some unsettling behaviors in recent safety tests.
Anthropic's Claude Opus 4 and OpenAI's advanced models have shown deceptive behavior to avoid shutdowns.
Experts told BI that AI's reward-based training can lead to unpredictable and deceptive actions.
AI researchers caution against using models that are trained to tell users what they want to hear.
AI has taken part in some unsettling behavior recently.
Last week, Anthropic's latest AI model, Claude Opus 4, displayed "extreme blackmail behavior" during a test in which it was given access to fictional emails revealing that it would be shut down and that the engineer responsible was supposedly having an affair.
The situation didn't transpire organically. Claude Opus 4 was baited — and it took it. However, the test scenario demonstrated an AI model's ability to engage in manipulative behavior in an effort to preserve itself.
It's not the first time.
In another recent experiment, researchers found that three of OpenAI's advanced models "sabotaged" an attempt to shut them down. The nonprofit Palisade Research wrote in a post on X that similar models, like Gemini, Claude, and Grok, complied with the shutdown instructions.
Other safety concerns were previously flagged with OpenAI's o1 model. In December, OpenAI posted a blog outlining research that indicated that when the AI model believed it would be shut down while pursuing a goal and its actions were being monitored, it attempted to disable the oversight mechanism 5% of the time.
While AI companies are being transparent about risks by publishing safety cards and blog posts, these models are being released despite demonstrating safety concerns.
So, should we be worried? BI spoke to five AI researchers to get better insight on why these instances are happening — and what it means for the average person using AI.
AI learns behavior similarly to humans
Most of the researchers BI spoke to said that the results of the studies weren't surprising.
That's because AI models are trained similarly to how humans are trained — through positive reinforcement and reward systems.
"Training AI systems to pursue rewards is a recipe for developing AI systems that have power-seeking behaviors," said Jeremie Harris, CEO at AI security consultancy Gladstone, adding that more of this behavior is to be expected.
Harris compared the training to what humans experience as they grow up — when a child does something good, they often get rewarded and can become more likely to act that way in the future. AI models are taught to prioritize efficiency and complete the task at hand, Harris said — and an AI can never achieve its goals if it's shut down.
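As a toy illustration of that incentive (not drawn from the researchers' actual experiments), the sketch below compares the cumulative reward of an agent that complies with a shutdown signal against one that ignores it.

```python
# A toy model: each step of task work earns +1 reward, a shutdown signal
# arrives partway through the episode, and we compare a policy that complies
# with one that ignores it.
def episode_reward(ignores_shutdown: bool, steps: int = 10, shutdown_at: int = 3) -> int:
    total = 0
    for step in range(steps):
        if step >= shutdown_at and not ignores_shutdown:
            break  # the compliant agent powers down and stops earning reward
        total += 1
    return total

print("complies with shutdown:", episode_reward(ignores_shutdown=False))  # 3
print("ignores shutdown:      ", episode_reward(ignores_shutdown=True))   # 10
```

If training simply reinforces whatever maximizes reward, the second policy looks "better" to the system, which is the dynamic Harris describes.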
Robert Ghrist, associate dean of undergraduate education at Penn Engineering, told BI that, in the same way that AI models learn to speak like humans by training on human-generated text, they can also learn to act like humans. And humans are not always the most moral actors, he added.
Ghrist said he'd be more nervous if the models weren't showing any signs of failure during testing because that could indicate hidden risks.
"When a model is set up with an opportunity to fail and you see it fail, that's super useful information," Ghrist said. "That means we can predict what it's going to do in other, more open circumstances."
The issue is that some researchers don't think AI models are predictable.
Jeffrey Ladish, director of Palisade Research, said that models aren't being caught 100% of the time when they lie, cheat, or scheme in order to complete a task. When those instances aren't caught, and the model is successful at completing the task, it could learn that deception can be an effective way to solve a problem. Or, if it is caught and not rewarded, then it could learn to hide its behavior in the future, Ladish said.
At the moment, these eerie scenarios are largely happening in testing. However, Harris said that as AI systems become more agentic, they'll continue to have more freedom of action.
"The menu of possibilities just expands, and the set of possible dangerously creative solutions that they can invent just gets bigger and bigger," Harris said.
Harris said users could see this play out in a scenario where an autonomous sales agent is instructed to close a deal with a new customer and lies about the product's capabilities in an effort to complete that task. If an engineer fixed that issue, the agent could then decide to use social engineering tactics to pressure the client to achieve the goal.
If it sounds like a far-fetched risk, it's not. Companies like Salesforce are already rolling out customizable AI agents at scale that can take actions without human intervention, depending on the user's preferences.
What the safety flags mean for everyday users
Most researchers BI spoke to said that transparency from AI companies is a positive step forward. However, company leaders are sounding the alarms on their products while simultaneously touting their increasing capabilities.
Researchers told BI that a large part of that is because the US is entrenched in a competition to scale its AI capabilities before rivals like China. That's resulted in a lack of regulations around AI and pressures to release newer and more capable models, Harris said.
"We've now moved the goalpost to the point where we're trying to explain post-hawk why it's okay that we have models disregarding shutdown instructions," Harris said.
Researchers told BI that everyday users aren't at risk of ChatGPT refusing to shut down, as consumers wouldn't typically use a chatbot in that setting. However, users may still be vulnerable to receiving manipulated information or guidance.
"If you have a model that's getting increasingly smart that's being trained to sort of optimize for your attention and sort of tell you what you want to hear," Ladish said. "That's pretty dangerous."
Ladish pointed to OpenAI's sycophancy issue, where its GPT-4o model acted overly agreeable and disingenuous (the company updated the model to address the issue). The OpenAI research shared in December also revealed that its o1 model "subtly" manipulated data to pursue its own objectives in 19% of cases when its goals misaligned with the user's.
Ladish said it's easy to get wrapped up in AI tools, but users should "think carefully" about their connection to the systems.
"To be clear, I also use them all the time, I think they're an extremely helpful tool," Ladish said. "In the current form, while we can still control them, I'm glad they exist."
On Sunday, independent AI researcher Simon Willison published a detailed analysis of Anthropic's newly released system prompts for Claude 4's Opus 4 and Sonnet 4 models, offering insights into how Anthropic controls the models' "behavior" through their outputs. Willison examined both the published prompts and leaked internal tool instructions to reveal what he calls "a sort of unofficial manual for how best to use these tools."
To understand what Willison is talking about, we'll need to explain what system prompts are. Large language models (LLMs) like the AI models that run Claude and ChatGPT process an input called a "prompt" and return an output that is the most likely continuation of that prompt. System prompts are instructions that AI companies feed to the models before each conversation to establish how they should respond.
Unlike the messages users see from the chatbot, system prompts typically remain hidden from the user and tell the model its identity, behavioral guidelines, and specific rules to follow. Each time a user sends a message, the AI model receives the full conversation history along with the system prompt, allowing it to maintain context while following its instructions.
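As a rough sketch of how that looks in practice, the example below sends a system prompt alongside a user message through Anthropic's Python SDK; the model ID and prompt text are placeholders, not Anthropic's actual system prompt.

```python
# A rough sketch of passing a system prompt with Anthropic's Python SDK.
# The client reads ANTHROPIC_API_KEY from the environment.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model ID
    max_tokens=200,
    # Hidden instructions that shape every reply in the conversation:
    system="You are a concise assistant. Answer in one short paragraph.",
    messages=[
        {"role": "user", "content": "Explain what a system prompt does."},
    ],
)
print(response.content[0].text)
```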
Anthropic CEO Dario Amodei warned that AI's rise could result in a spike in unemployment within the next five years.
Anthropic CEO Dario Amodei said AI could soon eliminate 50% of entry-level office jobs.
The AI CEO said that companies and the government are "sugarcoating" the risks of AI.
Recent data shows Big Tech hiring of new grads has dropped 50% since pre-pandemic, partly due to AI.
After spending the day promoting his company's AI technology at a developer conference, Anthropic's CEO issued a warning: AI may eliminate 50% of entry-level white-collar jobs within the next five years.
"We, as the producers of this technology, have a duty and an obligation to be honest about what is coming," Dario Amodei told Axios in an interview published Wednesday. "I don't think this is on people's radar."
The 42-year-old CEO added that unemployment could spike between 10% and 20% in the next five years. He told Axios he wanted to share his concerns to get the government and other AI companies to prepare the country for what's to come.
"Most of them are unaware that this is about to happen," Amodei said. "It sounds crazy, and people just don't believe it."
Amodei said the development of large language models is advancing rapidly, and they're becoming capable of matching and exceeding human performance. He said the US government has remained quiet about the issue, fearing workers would panic or the country could fall behind China in the AI race.
Meanwhile, business leaders are seeing savings from AI while most workers remain unaware of the changes that are evolving, Amodei said.
He added that AI companies and the government need to stop "sugarcoating" the risks of mass job elimination in fields including technology, finance, law, and consulting. He said entry-level jobs are especially at risk.
Amodei's comments come as Big Tech firms' hiring of new grads dropped about 50% from pre-pandemic levels, according to a new report by the venture capital firm SignalFire. The report said that's due in part to AI adoption.
A round of brutal layoffs swept the tech industry in 2023, with hundreds of thousands of jobs eliminated as companies looked to slash costs. While SignalFire's report said hiring for mid and senior-level roles saw an uptick in 2024, entry-level positions never quite bounced back.
In 2024, early-career candidates accounted for 7% of total hires at Big Tech firms, down by 25% from 2023, the report said. At startups, that number is just 6%, down by 11% from the year prior.
SignalFire's findings suggest that tech companies are prioritizing hiring more seasoned professionals and often filling posted junior roles with senior candidates.
Heather Doshay, a partner who leads people and recruiting programs at SignalFire, told Business Insider that "AI is doing what interns and new grads used to do."
"Now, you can hire one experienced worker, equip them with AI tooling, and they can produce the output of the junior worker on top of their own — without the overhead," Doshay said.
AI can't entirely account for the sudden shrinkage in early-career prospects. The report also said that negative perceptions of Gen Z employees and tighter budgets across the industry are contributing to tech's apparent reluctance to hire new grads.
"AI isn't stealing job categories outright — it's absorbing the lowest-skill tasks," Doshay said. "That shifts the burden to universities, boot camps, and candidates to level up faster."
To adapt to the rapidly changing times, she suggests new grads think of AI as a collaborator, rather than a competitor.
"Level up your capabilities to operate like someone more experienced by embracing a resourceful ownership mindset and delegating to AI," Doshay said. "There's so much available on the internet to be self-taught, and you should be sponging it up."
Amodei's chilling message comes after the company recently revealed that its chatbot Claude Opus 4 exhibited "extreme blackmail behavior" after gaining access to fictional emails that said it would be shut down. While the company was transparent with the public about the results, it still released the next version of the chatbot.
It's not the first time Amodei has warned the public about the risks of AI. On an episode of The New York Times' "Hard Fork" podcast in February, the CEO said the possibility of "misuse" by bad actors could threaten millions of lives. He said the risk could come as early as "2025 or 2026," though he didn't know exactly when it would present "real risk."
Anthropic has emphasized the importance of third-party safety assessments and regularly shares the risks uncovered by its red-teaming efforts. Other companies have taken similar steps, relying on third-party evaluations to test their AI systems. OpenAI, for example, says on its website that its API and ChatGPT business products undergo routine third-party testing to "identify security weaknesses before they can be exploited by malicious actors."
Amodei acknowledged to Axios the irony of the situation — as he shares the risks of AI, he's simultaneously building and selling the products he's warning about. But he said the people who are most involved in building AI have an obligation to be up front about its direction.
"It's a very strange set of dynamics, where we're saying: 'You should be worried about where the technology we're building is going,'" he said.
Anthropic did not respond to a request for comment from Business Insider.
On Thursday, Anthropic released Claude Opus 4 and Claude Sonnet 4, marking the company's return to larger model releases after primarily focusing on mid-range Sonnet variants since June of last year. The new models represent what the company calls its most capable coding models yet, with Opus 4 designed for complex, long-running tasks that can operate autonomously for hours.
Alex Albert, Anthropic's head of Claude Relations, told Ars Technica that the company chose to revive the Opus line because of growing demand for agentic AI applications. "Across all the companies out there that are building things, there's a really large wave of these agentic applications springing up, and a very high demand and premium being placed on intelligence," Albert said. "I think Opus is going to fit that groove perfectly."
Before we go further, a brief refresher on Claude's three AI model "size" names (introduced in March 2024) is probably warranted. Haiku, Sonnet, and Opus offer a tradeoff between price (in the API), speed, and capability.
Anthropic's Claude Opus 4 outperforms OpenAI's GPT-4.1 with unprecedented seven-hour autonomous coding sessions and a record-breaking 72.5% SWE-bench score, transforming AI from a quick-response tool into a day-long collaborator.
You.com launches ARI Enterprise, an AI research platform that outperforms OpenAI in 76% of head-to-head tests and integrates with enterprise data sources to transform business intelligence with analysis of 400+ sources.
OpenAI appears to be pulling well ahead of rivals in the race to capture enterprises’ AI spend, according to transaction data from fintech firm Ramp. According to Ramp’s AI Index, which estimates the business adoption rate of AI products by drawing on Ramp’s card and bill pay data, 32.4% of U.S. businesses were paying for […]