
Anthropic CEO wants to open the black box of AI models by 2027

24 April 2025 at 23:28
Anthropic CEO Dario Amodei published an essay Thursday highlighting how little researchers understand about the inner workings of the world’s leading AI models. To address that, Amodei set an ambitious goal for Anthropic to reliably detect most AI model problems by 2027. Amodei acknowledges the challenge ahead. In “The Urgency of Interpretability,” the CEO says Anthropic has […]

Researchers concerned to find AI models misrepresenting their “reasoning” processes

10 April 2025 at 22:37

Remember when teachers demanded that you "show your work" in school? Some new types of AI models promise to do exactly that, but new research suggests that the "work" they show can sometimes be misleading or disconnected from the actual process used to reach the answer.

New research from Anthropic—creator of the ChatGPT-like Claude AI assistant—examines simulated reasoning (SR) models such as DeepSeek's R1 and Anthropic's own Claude series. In a research paper posted last week, Anthropic's Alignment Science team demonstrated that these SR models frequently fail to disclose when they've used external help or taken shortcuts, despite features designed to show their "reasoning" process.

(It's worth noting that OpenAI's o1 and o3 series SR models were excluded from this study.)


Google is shipping Gemini models faster than its AI safety reports

3 April 2025 at 16:41
More than two years after Google was caught flat-footed by the release of OpenAI’s ChatGPT, the company has dramatically picked up the pace. In late March, Google launched an AI reasoning model, Gemini 2.5 Pro, that leads the industry on several benchmarks measuring coding and math capabilities. That launch came just three months after the […]