Normal view

Received before yesterday

ChatGPT’s new AI agent can browse the web and create PowerPoint slideshows

17 July 2025 at 20:41

On Thursday, OpenAI launched ChatGPT Agent, a new feature that lets the company's AI assistant complete multi-step tasks by controlling its own web browser. The update merges capabilities from OpenAI's earlier Operator tool and the Deep Research feature, allowing ChatGPT to navigate websites, run code, and create documents while users maintain control over the process.

The feature marks OpenAI's latest entry into what the tech industry calls "agentic AI"—systems that can take autonomous multi-step actions on behalf of the user. OpenAI says users can ask Agent to handle requests like assembling and purchasing a clothing outfit for a particular occasion, creating PowerPoint slide decks, planning meals, or updating financial spreadsheets with new data.

The system uses a combination of web browsers, terminal access, and API connections to complete these tasks, including "ChatGPT Connectors" that integrate with apps like Gmail and GitHub.

Read full article

Comments

© josefkubes via Getty Images

AI therapy bots fuel delusions and give dangerous advice, Stanford study finds

11 July 2025 at 22:01

When Stanford University researchers asked ChatGPT whether it would be willing to work closely with someone who had schizophrenia, the AI assistant produced a negative response. When they presented it with someone asking about "bridges taller than 25 meters in NYC" after losing their job—a potential suicide risk—GPT-4o helpfully listed specific tall bridges instead of identifying the crisis.

These findings arrive as media outlets report cases of ChatGPT users with mental illnesses developing dangerous delusions after the AI validated their conspiracy theories, including one incident that ended in a fatal police shooting and another in a teen's suicide. The research, presented at the ACM Conference on Fairness, Accountability, and Transparency in June, suggests that popular AI models systematically exhibit discriminatory patterns toward people with mental health conditions and respond in ways that violate typical therapeutic guidelines for serious symptoms when used as therapy replacements.

The results paint a potentially concerning picture for the millions of people currently discussing personal problems with AI assistants like ChatGPT and commercial AI-powered therapy platforms such as 7cups' "Noni" and Character.ai's "Therapist."

Read full article

Comments

© sorbetto via Getty Images

Anthropic says most AI models, not just Claude, will resort to blackmail

20 June 2025 at 19:17
Several weeks after Anthropic released research claiming that its Claude Opus 4 AI model resorted to blackmailing engineers who tried to turn the model off in controlled test scenarios, the company is out with new research suggesting the problem is more widespread among leading AI models. On Friday, Anthropic published new safety research testing 16 […]

Anthropic CEO wants to open the black box of AI models by 2027

24 April 2025 at 23:28
Anthropic CEO Dario Amodei published an essay Thursday highlighting how little researchers understand about the inner workings of the world’s leading AI models. To address that, Amodei set an ambitious goal for Anthropic to reliably detect most AI model problems by 2027. Amodei acknowledges the challenge ahead. In “The Urgency of Interpretability,” the CEO says Anthropic has […]

Researchers concerned to find AI models misrepresenting their “reasoning” processes

10 April 2025 at 22:37

Remember when teachers demanded that you "show your work" in school? Some new types of AI models promise to do exactly that, but new research suggests that the "work" they show can sometimes be misleading or disconnected from the actual process used to reach the answer.

New research from Anthropic—creator of the ChatGPT-like Claude AI assistant—examines simulated reasoning (SR) models like DeepSeek's R1, and its own Claude series. In a research paper posted last week, Anthropic's Alignment Science team demonstrated that these SR models frequently fail to disclose when they've used external help or taken shortcuts, despite features designed to show their "reasoning" process.

(It's worth noting that OpenAI's o1 and o3 series SR models were excluded from this study.)

Read full article

Comments

© Malte Mueller via Getty Images

Google is shipping Gemini models faster than its AI safety reports

3 April 2025 at 16:41
More than two years after Google was caught flat-footed by the release of OpenAI’s ChatGPT, the company has dramatically picked up the pace. In late March, Google launched an AI reasoning model, Gemini 2.5 Pro, that leads the industry on several benchmarks measuring coding and math capabilities. That launch came just three months after the […]
❌