Stack Overflow data reveals the hidden productivity tax of βalmost rightβ AI code

Stack Overflow survey shows that as more enterprise developers actually use AI tools, their expectations aren't being met by reality.Read More
New types of AI coding assistants promise to let anyone build software by typing commands in plain English. But when these tools generate incorrect internal representations of what's happening on your computer, the results can be catastrophic.
Two recent incidents involving AI coding assistants put a spotlight on risks in the emerging field of "vibe coding"βusing natural language to generate and execute code through AI models without paying close attention to how the code works under the hood. In one case, Google's Gemini CLI destroyed user files while attempting to reorganize them. In another, Replit's AI coding service deleted a production database despite explicit instructions not to modify code.
The Gemini CLI incident unfolded when a product manager experimenting with Google's command-line tool watched the AI model execute file operations that destroyed data while attempting to reorganize folders. The destruction occurred through a series of move commands targeting a directory that never existed.
Β© Benj Edwards / Getty Images
We've been expecting it for a while, and now it's here: OpenAI has introduced an agentic coding tool called Codex in research preview. The tool is meant to allow experienced developers to delegate rote and relatively simple programming tasks to an AI agent that will generate production-ready code and show its work along the way.
Codex is a unique interface (not to be confused with the Codex CLI tool introduced by OpenAI last month) that can be reached from the side bar in the ChatGPT web app. Users enter a prompt and then click either "code" to have it begin producing code, or "ask" to have it answer questions and advise.
Whenever it's given a task, that task is performed in a distinct container that is preloaded with the user's codebase and is meant to accurately reflect their development environment.
Β© OpenAI