8 August 2025 at 21:17

The pressure is on for OpenAI to prove that GPT-5 isn’t just an incremental update, but a true step forward.

VentureBeat
OpenAI launches GPT-5, nano, mini and Pro — not AGI, but capable of generating ‘software-on-demand’ 7 August 2025 at 17:01

OpenAI CEO and co-founder Sam Altman wearing Fantastic Four style suit surrounded by similarly dressed men and women holding computer peripherals and cords in front of yellow sunburst

With safer design, more robust reasoning, expanded developer tooling, and broad user access, GPT-5 reflects a maturing AI ecosystem.

VentureBeat
Google releases Olympiad medal-winning Gemini 2.5 ‘Deep Think’ AI publicly — but there’s a catch… 1 August 2025 at 15:39

The Gemini 2.5 Deep Think released to users is not the same competition model but rather a lower-performing, apparently faster version.

VentureBeat
Deep Cogito goes big, releasing 4 new open source hybrid reasoning models with self-improving ‘intuition’ 31 July 2025 at 21:58

AI image comic book style illustration deep sea diver floats through green hued water above a treasure chest open to reveal glowing yellow circuit boards with more circuitry surrounding on the sea floor

Arora explains this as a difference between searching for a path versus already knowing roughly where the destination lies.

Ars Technica
OpenAI’s most capable AI model, GPT-5, may be coming in August 25 July 2025 at 19:59

On Thursday, The Verge reported that OpenAI is preparing to launch GPT-5 as early as August, according to sources familiar with the company's plans. The report comes five months after CEO Sam Altman first laid out a roadmap for the next-generation AI model that would unify the company's various AI capabilities. Altman also revealed in a post on X last week that the company plans to release GPT-5 "soon."

According to The Verge's Tom Warren, Microsoft engineers began preparing server capacity for GPT-5 as early as late May, but testing and development challenges pushed the timeline back. During an appearance on Theo Von's podcast this week, Altman demonstrated the model's capabilities by having it answer a question he couldn't. "I put it in the model, this is GPT-5, and it answered it perfectly," Altman said, adding that it gave him a "weird feeling" to see the AI model succeed where he had not.

GPT-5 has been a highly anticipated release since the launch of GPT-4 in March 2023. In fact, we first wrote about rumors of GPT-5's launch in March 2024, but it appears that GPT-5 did not materialize last year because the company saved the "GPT-5" name for a future release.

VentureBeat
New AI architecture delivers 100x faster reasoning than LLMs with just 1,000 training examples 25 July 2025 at 23:27

Credit: VentureBeat made with Midjourney

Hierarchical Reasoning Models (HRM) tackle complex reasoning tasks while being smaller, faster, and more data-efficient than large AI models.

VentureBeat
It’s Qwen’s summer: new open source Qwen3-235B-A22B-Thinking-2507 tops OpenAI, Gemini reasoning models on key benchmarks 25 July 2025 at 15:47

A humanoid cyborg of gunmetal shiny blue wearing pink sunglasses and matching shirt drinks a pink cocktail in a glass while holding white surfboard surrounded by mostly caucasian people in bathing suits against a blue sky

The new Qwen3-Thinking-2507, as we'll call it for short, now leads or closely trails top-performing models across several major benchmarks.

VentureBeat
Anthropic researchers discover the weird AI problem: Why thinking longer makes models dumber 22 July 2025 at 22:27

By: Michael Nuñez

Anthropic research reveals AI models perform worse with extended reasoning time, challenging industry assumptions about test-time compute scaling in enterprise deployments.

VentureBeat
Alibaba’s new open source Qwen3-235B-A22B-2507 beats Kimi-2 and offers low compute version 22 July 2025 at 20:56

Robot holding laptop between neoclassical temples in minimalist dark underlit AI artwork

Teams can scale Qwen3’s capabilities to single-node GPU instances or local development machines, avoiding the need for massive GPU clusters.

VentureBeat
Google DeepMind makes AI history with gold medal win at world’s toughest math competition 21 July 2025 at 22:33

By: Michael Nuñez

Google DeepMind's Gemini AI won a gold medal at the International Mathematical Olympiad by solving complex math problems using natural language, marking a breakthrough in AI reasoning and human-level performance.

TechCrunch
Mistral’s Le Chat chatbot gets a productivity push with new ‘deep research’ mode 17 July 2025 at 15:21

By: Rebecca Bellan

French AI lab Mistral introduced a range of new features to its Le Chat chatbot on Thursday that bring it closer to the capabilities of rivals like OpenAI and Google. The new update includes a “deep research” mode, native multilingual reasoning, and advanced image editing.

VentureBeat
OpenAI, Google DeepMind and Anthropic sound alarm: ‘We may be losing the ability to understand AI’ 15 July 2025 at 22:49

By: Michael Nuñez

Scientists unite to warn that a critical window for monitoring AI reasoning may close forever as models learn to hide their thoughts.

VentureBeat
A new paradigm for AI: How ‘thinking as optimization’ leads to better general-purpose models 11 July 2025 at 22:26

A new AI model learns to "think" longer on hard problems, achieving more robust reasoning and better generalization to novel, unseen tasks.

VentureBeat
Sakana AI’s TreeQuest: Deploy multi-model teams that outperform individual LLMs by 30% 3 July 2025 at 22:00

Sakana AI's new inference-time scaling technique uses Monte-Carlo Tree Search to orchestrate multiple LLMs to collaborate on complex tasks.

VentureBeat
The rise of prompt ops: Tackling hidden AI costs from bad inputs and context bloat 27 June 2025 at 20:00

By: Taryn Plumb

AI models can get fatigued; prompt ops can help manage, measure, monitor and tune prompts.

VentureBeat
Google’s Gemini transparency cut leaves enterprise developers ‘debugging blind’ 20 June 2025 at 12:00

Why is Google hiding Gemini's reasoning traces? The decision sparks a debate over black-box models versus the need for transparency.

Ars Technica
New study shows why simulated reasoning AI models don’t yet live up to their billing 25 April 2025 at 21:43

There's a curious contradiction at the heart of today's most capable AI models that purport to "reason": They can solve routine math problems with accuracy, yet when faced with formulating deeper mathematical proofs found in competition-level challenges, they often fail.

That's the finding of eye-opening preprint research into simulated reasoning (SR) models, initially listed in March and updated in April, that mostly fell under the news radar. The research serves as an instructive case study on the mathematical limitations of SR models, despite sometimes grandiose marketing claims from AI vendors.

What sets simulated reasoning models apart from traditional large language models (LLMs) is that they have been trained to output a step-by-step "thinking" process (often called "chain-of-thought") to solve problems. Note that "simulated" in this case doesn't mean that the models do not reason at all but rather that they do not necessarily reason using the same techniques as humans. That distinction is important because human reasoning itself is difficult to define.

Ars Technica
OpenAI releases new simulated reasoning models with full tool access 16 April 2025 at 22:21

On Wednesday, OpenAI announced the release of two new models—o3 and o4-mini—that combine simulated reasoning capabilities with access to functions like web browsing and coding. These models mark the first time OpenAI's reasoning-focused models can use every ChatGPT tool simultaneously, including visual analysis and image generation.

OpenAI announced o3 in December, and until now, only less capable derivative models named "o3-mini" and "o3-mini-high" have been available. The new models now replace their predecessors, o1 and o3-mini.

OpenAI is rolling out access today for ChatGPT Plus, Pro, and Team users, with Enterprise and Edu customers gaining access next week. Free users can try o4-mini by selecting the "Think" option before submitting queries. OpenAI CEO Sam Altman tweeted that "we expect to release o3-pro to the pro tier in a few weeks."

Ars Technica
Researchers concerned to find AI models misrepresenting their “reasoning” processes 10 April 2025 at 22:37

Remember when teachers demanded that you "show your work" in school? Some new types of AI models promise to do exactly that, but new research suggests that the "work" they show can sometimes be misleading or disconnected from the actual process used to reach the answer.

New research from Anthropic—creator of the ChatGPT-like Claude AI assistant—examines simulated reasoning (SR) models like DeepSeek's R1, and its own Claude series. In a research paper posted last week, Anthropic's Alignment Science team demonstrated that these SR models frequently fail to disclose when they've used external help or taken shortcuts, despite features designed to show their "reasoning" process.

(It's worth noting that OpenAI's o1 and o3 series SR models were excluded from this study.)