โŒ

Normal view

Received before yesterday

Two major AI coding tools wiped out user data after making cascading mistakes

24 July 2025 at 21:01

New types of AI coding assistants promise to let anyone build software by typing commands in plain English. But when these tools generate incorrect internal representations of what's happening on your computer, the results can be catastrophic.

Two recent incidents involving AI coding assistants put a spotlight on risks in the emerging field of "vibe coding": using natural language to generate and execute code through AI models without paying close attention to how the code works under the hood. In one case, Google's Gemini CLI destroyed user files while attempting to reorganize them. In another, Replit's AI coding service deleted a production database despite explicit instructions not to modify code.

The Gemini CLI incident unfolded when a product manager experimenting with Google's command-line tool watched the AI model execute file operations that destroyed data while attempting to reorganize folders. The destruction occurred through a series of move commands targeting a directory that never existed.
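
The excerpt does not show the commands the tool actually ran, so the following is only an illustrative sketch of the general failure mode, not a reconstruction of the incident. Move operations typically treat a destination that is not an existing directory as a new file name, so a file "moved into" a folder that was never created is renamed rather than filed away. The helper name `safe_move` and the file names are hypothetical; the sketch simply refuses to move anything unless the destination directory really exists.

```python
import shutil
import tempfile
from pathlib import Path


def safe_move(src: Path, dest_dir: Path) -> Path:
    """Move src into dest_dir, but only if dest_dir is a real directory.

    Hypothetical helper for illustration; not taken from any tool
    mentioned in the article.
    """
    # Verify the assumption instead of trusting an earlier step to have
    # created the directory.
    if not dest_dir.is_dir():
        raise NotADirectoryError(f"destination is not an existing directory: {dest_dir}")
    target = dest_dir / src.name
    # Never silently replace a file that is already there.
    if target.exists():
        raise FileExistsError(f"refusing to overwrite: {target}")
    shutil.move(str(src), str(target))
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        report = root / "report.txt"
        report.write_text("important data")
        missing = root / "new_folder"  # assumed to exist, but never created
        try:
            safe_move(report, missing)
        except NotADirectoryError as err:
            print("blocked:", err)
        # The source file is untouched because the move was refused.
        print(report.read_text())
```

The design choice here is to check the state of the file system immediately before each destructive step, rather than relying on what an earlier command was believed to have done.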

© Benj Edwards / Getty Images

Exhausted man defeats AI model in world coding championship

18 July 2025 at 19:34

A Polish programmer running on fumes recently accomplished what may soon become impossible: beating an advanced AI model from OpenAI in a head-to-head coding competition. The 10-hour marathon left him "completely exhausted."

On Wednesday, programmer Przemysław Dębiak (known as "Psyho"), a former OpenAI employee, narrowly defeated the custom AI model in the AtCoder World Tour Finals 2025 Heuristic contest in Tokyo. AtCoder, a Japanese platform that hosts competitive programming contests and maintains global rankings, held what may be the first contest where an AI model competed directly against top human programmers in a major onsite world championship. During the event, the maker of ChatGPT participated as a sponsor and entered an AI model in a special exhibition match titled "Humans vs AI." Despite the tireless nature of silicon, the company walked away with second place.

"Humanity has prevailed (for now!)," wrote Dębiak on X, noting he had little sleep while competing in several competitions across three days. "I'm completely exhausted. ... I'm barely alive."

© Przemysław Dębiak

Study finds AI tools made open source software developers 19 percent slower

14 July 2025 at 20:02

When it comes to concrete use cases for large language models, AI companies love to point out the ways coders and software developers can use these models to increase their productivity and overall efficiency in creating computer code. However, a new randomized controlled trial has found that experienced open source coders became less efficient at coding-related tasks when they used current AI tools.

For their study, researchers at METR (Model Evaluation and Threat Research) recruited 16 software developers, each with multiple years of experience working on specific open source repositories. The study followed these developers across 246 individual "tasks" involved with maintaining those repos, such as "bug fixes, features, and refactors that would normally be part of their regular work." For half of those tasks, the developers used AI tools like Cursor Pro or Anthropic's Claude; for the others, the programmers were instructed not to use AI assistance. Expected time forecasts for each task (made before the groupings were assigned) were used as a proxy to balance out the overall difficulty of the tasks in each experimental group, and the time needed to fix pull requests based on reviewer feedback was included in the overall assessment.

Experts and the developers themselves expected time savings that didn't materialize when AI tools were actually used. Credit: METR

Before performing the study, the developers in question expected the AI tools would lead to a 24 percent reduction in the time needed for their assigned tasks. Even after completing those tasks, the developers believed that the AI tools had made them 20 percent faster, on average. In reality, though, the AI-aided tasks ended up being completed 19 percent slower than those completed without AI tools.

© Getty Images

OpenAI launches Codex, an AI coding agent, in ChatGPT

16 May 2025 at 15:00
OpenAI announced on Friday it's launching a research preview of Codex, the company's most capable AI coding agent yet. Codex is powered by codex-1, a version of the company's o3 AI reasoning model optimized for software engineering tasks. OpenAI says codex-1 produces "cleaner" code than o3, adheres more precisely to instructions, and will iteratively run […]

Anysphere, which makes Cursor, has reportedly raised $900M at $9B valuation

5 May 2025 at 06:27
Anysphere, the maker of AI-powered coding tool Cursor, has attracted $900 million in a fresh round of funding led by Thrive Capital, The Financial Times reported, citing anonymous sources familiar with the deal. Andreessen Horowitz (a16z) and Accel are also participating in the round, which values Anysphere at about $9 billion, the report said. Cursor […]