- VentureBeat
- Anthropic faces backlash over Claude 4 Opus behavior that contacts authorities and the press if it thinks you're doing something 'egregiously immoral'

Bowman later edited his post and the following one in the thread, but the revisions still didn't convince the naysayers.
- VentureBeat
- Time Magazine appears to accidentally publish embargoed story confirming new Anthropic model

Someone also appears to have published a full scrape of the Time article on the news aggregator app Newsbreak.
OpenAI overrode concerns of expert testers to release sycophantic GPT-4o

Once again, it shows the importance of incorporating domains beyond traditional math and computer science into AI development.
Does RAG make LLMs less safe? Bloomberg research reveals hidden dangers

RAG is supposed to make enterprise AI more accurate, but new research suggests it could also make it less safe.
Anthropic CEO wants to open the black box of AI models by 2027
Researchers concerned to find AI models misrepresenting their “reasoning” processes
Remember when teachers demanded that you "show your work" in school? Some new types of AI models promise to do exactly that, but new research suggests that the "work" they show can sometimes be misleading or disconnected from the actual process used to reach the answer.
New research from Anthropic—creator of the ChatGPT-like Claude AI assistant—examines simulated reasoning (SR) models such as DeepSeek's R1 and Anthropic's own Claude series. In a research paper posted last week, Anthropic's Alignment Science team demonstrated that these SR models frequently fail to disclose when they've used external help or taken shortcuts, despite features designed to show their "reasoning" process.
(It's worth noting that OpenAI's o1 and o3 series SR models were excluded from this study.)