2024 AI Coding: Pits I Fell Into and Lessons I Learned
Bottom Line First
At the start of 2024, I was skeptical about AI coding. By year's end, I was using it every day.
The biggest change wasn’t any single tool—it was AI coding going from “usable” to “actually good.”
Moments That Changed How I Work
1. Claude 3 Sonnet Released (March)
This was the turning point. Before it, GPT-4 was my coding assistant—roughly $30 per million input tokens, fine for chatting, but its actual code output was lacking.
Sonnet launched at about $3 per million input tokens—10x cheaper—yet stronger at coding.
From that day, my API bill dropped 70%.
2. Cursor Went Mainstream (Mid-year)
Cursor wasn’t the first AI IDE, but it was the first one people actually migrated to.
I’d tried early builds of Cursor—a mediocre experience. The 2024 Cursor was a completely different product.
Cmd+K for inline AI editing is way more efficient than chat interfaces.
3. Claude Code Released (Mid-year)
Anthropic’s CLI tool.
I thought it was a gimmick at first. After using it for a week, I'd call it the strongest codebase-analysis tool available.
A 200k-token context window means you can throw an entire codebase at it. No IDE needed—just the terminal.
```shell
# A scenario Copilot and Cursor can't handle
claude
> In this 50k-line codebase, which module most likely has issues?
# Claude Code actually analyzes it and gives you an insightful answer
```
4. The Devin Reality
Devin was announced as “the first AI software engineer,” at $500/month.
I tried it for two months. Verdict:
- Can do: small isolated tasks (write a script, automate a workflow)
- Can’t do: multi-file changes requiring business context, architectural decisions
Devin works as “outsourced labor,” not a daily dev tool.
Pits I Fell Into
Pit 1: Over-relying on AI-generated Tests
AI-generated tests show high coverage, but they verify what the AI inferred the logic to be, not the real business intent.
```python
# AI-written tests -- coverage looks complete
def test_calculate_discount():
    assert calculate_discount(100, 10) == 90   # 10% off
    assert calculate_discount(100, 0) == 100   # no discount
    assert calculate_discount(0, 50) == 0      # zero amount

# But they miss a business rule: the discount cannot exceed 50% of the
# order total. The AI didn't know the rule, the tests passed, and the
# bug shipped.
```
Lesson: review AI-written tests yourself, especially the edge cases.
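One way out of this pit is to encode the business rule in the code and test it explicitly. A minimal sketch, assuming a hypothetical `calculate_discount(order_total, percent)` where the 50% cap is written into the function:

```python
def calculate_discount(order_total, percent):
    """Apply a percentage discount, capped at 50% of the order total.
    The cap is the business rule the AI-generated tests missed."""
    percent = min(percent, 50)  # business rule: discount never exceeds 50%
    return order_total * (1 - percent / 100)

def test_discount_cap():
    # The edge case the AI never generated: an "80% off" request
    assert calculate_discount(100, 80) == 50   # capped at 50%, not 20
    assert calculate_discount(100, 10) == 90   # normal discount unchanged
    assert calculate_discount(0, 50) == 0      # zero amount still fine

test_discount_cap()
```

The AI can happily generate the three ordinary assertions; the capped case only shows up if a human who knows the rule writes it down.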
Pit 2: AI Doesn’t Understand “Why”
Give AI a task and it delivers. Ask “why did you do it this way?”—and it often gives a plausible-sounding but incorrect explanation.
```python
# AI wrote this
def process_data(data):
    # AI's explanation: "Using a dict instead of a list here
    # because it's faster for lookups."
    # Reality: there are only ~100 items; a list would be fine.
    # The AI is rationalizing after the fact.
    ...
```
Lesson: don’t ask AI to explain its code—ask “what are the implications of this design decision?”
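The claim is also cheap to check rather than debate. A quick `timeit` sketch (sizes and numbers illustrative) shows that while a dict lookup does beat a list scan, at ~100 items both finish in microseconds, so the "faster for lookups" rationale rarely justifies the choice:

```python
import timeit

# Sanity-check the AI's rationale: for ~100 items, does a dict
# lookup meaningfully beat a list membership scan?
items = list(range(100))
lookup = {k: True for k in items}

# Worst case for the list: the sought key is at the end
list_time = timeit.timeit(lambda: 99 in items, number=100_000)
dict_time = timeit.timeit(lambda: 99 in lookup, number=100_000)

# The dict wins asymptotically, but both totals are tiny --
# at this size the difference is noise, not a design driver.
print(f"list: {list_time:.4f}s  dict: {dict_time:.4f}s")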
Pit 3: Context Loss
After a full day of coding with AI, you realize the suggestions no longer match the project's overall design.
AI only sees code in your current conversation, not the full picture.
Lesson: for complex tasks, have AI read the entire codebase first, then start modifying.
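In Claude Code that "read first, modify later" workflow looks roughly like the transcript below (the prompts are illustrative, not exact commands):

```shell
# Sketch: load context before asking for changes; run from the repo root
cd my-project/
claude
> Read the codebase and summarize the architecture: main modules,
> how they depend on each other, and the conventions in use.
# Only once that summary looks right, ask for the actual change:
> Now add rate limiting to the API layer, following those conventions.
```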
Real Numbers
My AI coding usage stats this year (estimates):
| Task Type | AI Success Rate | Needed Human Fixes |
|---|---|---|
| Write tests | 85% | 30% |
| Simple functions | 95% | 10% |
| Code refactoring | 70% | 40% |
| Complex architecture | 30% | 80% |
| Bug fixes | 60% | 50% |
| Code explanation | 90% | 5% |
Conclusion: AI coding works best for “do more, modify less” tasks. Complex architecture decisions still belong to humans.
2025 Predictions
Will Happen
- MCP (Model Context Protocol) becomes standard: standardized way for AI Agents to connect to external tools
- Cursor’s moat faces challenges: Copilot and Claude Code are catching up
- AI coding diverges: lightweight tools (completion, simple tasks) vs heavy tools (codebase analysis, complex refactoring)
Won’t Happen
- AI won’t replace programmers: but programmers who use AI will replace those who don’t
- Devin-style full-flow Agents won’t go mainstream: expensive and slow, good for specific scenarios only
- AI coding won’t eliminate bugs: may introduce harder-to-find ones instead
Conclusion
The significance of AI coding in 2024: it freed programmers from about 50% of the mechanical work.
Writing tests, simple functions, code explanation, routine refactoring—these used to consume huge portions of a programmer’s day. AI does them well enough.
But AI coding has limits: it doesn’t understand business logic, architecture, or “why.”
In 2025, the productivity gap between programmers who can use AI coding and those who can’t will widen further.