I have been writing code for fifteen years. For the last month, I made myself use every major AI coding assistant — GitHub Copilot, Amazon CodeWhisperer, Cursor, and a handful of smaller ones — for real work. Not demos. Not toy projects. Real pull requests that shipped to production.
Here is what I learned: the flashy demos are mostly lies. Not maliciously. It is just that demo code is always a greenfield React component or a Python one-liner, and real code is a gnarly migration from an ORM that was deprecated three versions ago. The assistants handle the first kind okay and the second kind poorly.
That said, I am not going back. Let me explain why.
What the demos get right
Copilot has gotten noticeably better since last year. The inline completions feel faster, and they trigger less often on boilerplate that I would have typed anyway. The context window seems wider — it picks up on things I did five minutes ago in a different file, which is spooky but useful.
Cursor, on the other hand, lets you chat through a refactor in natural language. I used it to rename a deeply nested set of API routes and it did not miss a single reference. That is genuinely impressive. A human would have missed one.
But here is the thing I do not hear anyone say out loud: these tools are incredible for code you hate writing and terrible for code you enjoy writing. If you are happy to churn out test cases and config files, the assistant is your best friend. If you are designing an architecture or debugging a race condition, it is mostly noise.
Where they fall apart
I hit a nasty bug in production last week. A background job was silently failing because of a serialization edge case. I spent three hours trying to get an AI to help debug it. The tools suggested reasonable-looking fixes that did not address the root cause. In one case, Copilot suggested catching a broader exception, which would have hidden the failure entirely and made the bug harder to find.
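To make the broad-exception problem concrete, here is a minimal sketch in Python. It is not the actual production code: the function names, the payload, and the datetime field are all made up for illustration, and I am assuming a JSON serialization step stands in for whatever the real job did. The first handler has the shape of the suggested fix; the second only handles the failure it expects and lets everything else surface.

```python
import json
from datetime import datetime


def serialize_job(payload: dict) -> str:
    # json.dumps raises TypeError for values it cannot encode, such as datetime.
    return json.dumps(payload)


def run_job_broad(payload: dict) -> bool:
    """Hypothetical 'fix': catch everything and carry on."""
    try:
        serialize_job(payload)
        return True
    except Exception:
        # The TypeError is swallowed here, so the job fails without a trace.
        return False


def run_job_narrow(payload: dict) -> bool:
    """Handle only the failure we expect; let anything else propagate."""
    try:
        serialize_job(payload)
        return True
    except ValueError:
        # Assumed expected failure. A TypeError is not caught and reaches the caller.
        return False
```

With a payload like `{"created_at": datetime(2024, 1, 1)}`, the broad version quietly returns `False`, while the narrow version raises a `TypeError` that points straight at the field that cannot be serialized.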
That is the danger, right? The tools are confident and wrong. They do not know when they do not know. I have talked to a dozen engineers about this over the past month, and the ones with the most experience are the most skeptical. The junior engineers love it. I am not sure who is right.
Probably both. The juniors get a productivity boost because the tool fills in gaps they would have spent hours Googling. The seniors get frustrated because the tool fills in gaps they did not have, and now they have to clean up the mess.
The real productivity gain
The one thing nobody I talked to disputes is that AI assistants make you faster at getting started. A blank file is intimidating. A file with a few AI-generated lines is not. That alone is worth something.
I found myself spending less time on Stack Overflow and more time actually writing code. Whether the code was correct was a separate question, but I was definitely writing more of it. And some of it was even good.
My advice after a month: use the tools, but keep the tests. If you do not have tests, you are flying blind. If you have good tests, the AI can generate wrong code all day and your test suite will catch most of it. That is the setup that actually works.
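Here is a small sketch of what I mean, in Python. The helper and its buggy twin are hypothetical, invented for this example rather than taken from any assistant's real output. The generated version looks plausible and works whenever the list divides evenly, which is exactly the kind of mistake a quick read misses and a test catches.

```python
def chunk_generated(items: list, size: int) -> list:
    """Plausible-looking but wrong: silently drops a trailing partial chunk."""
    return [items[i:i + size] for i in range(0, len(items) - size + 1, size)]


def chunk(items: list, size: int) -> list:
    """Split items into consecutive chunks of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


def test_chunk_keeps_trailing_partial_chunk():
    # Five items in pairs must leave a final chunk holding the one leftover item.
    assert chunk([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]


def test_chunk_handles_empty_input():
    assert chunk([], 3) == []
```

Point the first test at `chunk_generated` instead and it fails, because the leftover item is dropped. The test does not care how confident the suggestion sounded.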
Final thought
I am keeping Copilot. I am turning it off for debugging sessions. That is my compromise. Your mileage will vary depending on what kind of code you write and how much you trust a machine that sounds confident and is often wrong.