- VentureBeat
- Anthropic faces backlash over Claude 4 Opus behavior that contacts authorities and the press if it thinks you're doing something 'egregiously immoral'

Bowman later edited his post and the following one in the thread, but the revisions still didn't convince the naysayers.
- VentureBeat
- Time Magazine appears to accidentally publish embargoed story confirming new Anthropic model

Someone also appears to have published a full scrape of the Time article on the news aggregator app Newsbreak.
OpenAI overrode concerns of expert testers to release sycophantic GPT-4o

Once again, it shows the importance of incorporating domains beyond traditional math and computer science into AI development.
Does RAG make LLMs less safe? Bloomberg research reveals hidden dangers

RAG is supposed to make enterprise AI more accurate, but new research suggests it could also make it less safe.
Anthropic CEO wants to open the black box of AI models by 2027
Researchers concerned to find AI models misrepresenting their “reasoning” processes
Remember when teachers demanded that you "show your work" in school? Some new types of AI models promise to do exactly that, but new research suggests that the "work" they show can sometimes be misleading or disconnected from the actual process used to reach the answer.
New research from Anthropic—creator of the ChatGPT-like Claude AI assistant—examines simulated reasoning (SR) models such as DeepSeek's R1 and Anthropic's own Claude series. In a research paper posted last week, Anthropic's Alignment Science team demonstrated that these SR models frequently fail to disclose when they've used external help or taken shortcuts, despite features designed to show their "reasoning" process.
(It's worth noting that OpenAI's o1 and o3 series SR models were excluded from this study.)