One Year with AI Coding Assistants: What It Can and Can't Do
Context
September 2023. GPT-4 had been out for six months, Copilot had become my daily driver, and Claude 2 had just dropped. By then I'd been using AI coding assistants heavily for almost a year.
This isn’t a hype piece or a hit piece. Just honest engineering feedback after a year of real use.
Not asking “will AI replace programmers.” Just: in real daily development, does it help?
What It Can Do
1. Boilerplate Generation
Most reliable use case. Give AI a rough description, it generates standard templates quickly.
```python
# Me: Write a FastAPI CRUD endpoint that takes user_id and returns user info
# AI output:
@router.get("/users/{user_id}")
async def get_user(user_id: int):
    user = db.query(User).filter(User.id == user_id).first()
    if not user:
        raise HTTPException(status_code=404, detail="User not found")
    return {"id": user.id, "name": user.name, "email": user.email}
```

Copilot does this well and rarely needs edits.
2. Quick API Lookup for Unfamiliar Libraries
When you need to use a new library without reading all the docs:
```
Me: How do I send emails with Python's sendgrid library? How do I configure the API key and sender?
```

AI gives runnable example code, faster than reading the docs. ~80% success rate in this scenario.
3. Regular Expressions
Probably Copilot’s most underrated feature.
```
Me: Write a regex to validate email format, not the simple version, need basic domain suffix checking
```

```python
import re

email_pattern = re.compile(
    r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
)
```

AI-written regex is usually more comprehensive than what I'd write myself.
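The pattern above is easy to sanity-check directly; here's a quick pass with a few test inputs of my own (the examples are mine, not AI output):

```python
import re

email_pattern = re.compile(
    r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
)

# Accepts normal addresses, including dots and plus-tags in the local part
assert email_pattern.match("alice@example.com")
assert email_pattern.match("bob.smith+tag@mail.co.uk")

# Rejects addresses without a domain suffix or without an @ at all
assert not email_pattern.match("alice@localhost")
assert not email_pattern.match("no-at-sign.com")
```

The `\.[a-zA-Z]{2,}$` tail is what delivers the "basic domain suffix checking" from the prompt: it requires a dot followed by at least two letters at the end.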
4. SQL Query Building
```sql
-- Me: Write a SQL query to count orders by month, only counting orders with amount > 100
SELECT
    DATE_TRUNC('month', created_at) AS month,
    COUNT(*) AS order_count
FROM orders
WHERE amount > 100
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month DESC;
```

Very reliable.
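The query above is Postgres-flavored (`DATE_TRUNC`). A runnable sketch of the same aggregation against in-memory SQLite, which uses `strftime` instead; the sample rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(150, "2023-07-03"), (50, "2023-07-10"), (200, "2023-08-01")],
)

# SQLite has no DATE_TRUNC; strftime('%Y-%m', ...) gives the month bucket
rows = conn.execute(
    """
    SELECT strftime('%Y-%m', created_at) AS month, COUNT(*) AS order_count
    FROM orders
    WHERE amount > 100
    GROUP BY month
    ORDER BY month DESC
    """
).fetchall()
# rows == [('2023-08', 1), ('2023-07', 1)]  (the 50-amount order is excluded)
```

Dialect differences like this are exactly the kind of thing worth double-checking in AI-generated SQL before running it.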
What It Can’t Do
1. Complex Business Logic
Business logic has tons of implicit assumptions and context. AI doesn’t know your company’s specific rules—it can only infer from what’s in the prompt.
Typical failure:
```
Me: Write a function that calculates user levels and points
```

AI writes logic based on its "common sense" assumptions. But your company's point rules might be completely different. Code that looks correct will have bugs in production.
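A hypothetical sketch of how this goes wrong. Both functions, every threshold, and the tenure rule below are invented for illustration; the point is only that the AI's plausible guess and the real rule can silently disagree:

```python
def ai_guess_level(points: int) -> str:
    # Typical AI output: reasonable-looking point thresholds
    if points >= 1000:
        return "gold"
    if points >= 500:
        return "silver"
    return "bronze"

def actual_level(points: int, joined_days: int) -> str:
    # Invented company rule: accounts under 30 days old stay bronze,
    # and the point thresholds differ from the AI's guess
    if joined_days < 30:
        return "bronze"
    if points >= 800:
        return "gold"
    if points >= 300:
        return "silver"
    return "bronze"

# Same 600 points, different answers: the AI's code "looks correct"
# but is wrong for every new account
assert ai_guess_level(600) == "silver"
assert actual_level(600, joined_days=10) == "bronze"
```

Nothing in the prompt told the model about the tenure gate, so no amount of model quality fixes this; the context simply wasn't there.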
Lesson: business logic code—AI can assist, not drive.
2. Multi-file Coordinated Changes
Easy: “change this function’s logic.” Hard: “change this function, then make sure all callers are compatible.”
AI lacks awareness of the entire codebase context. (At least in September 2023.)
3. Performance Optimization
Ask AI to optimize slow code and it often gives a plausible-sounding but actually wrong solution.
AI reasoning about performance frequently fails because it doesn’t understand data characteristics and system behavior.
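A hypothetical example of the pattern (names and data invented): the classic "use a set instead of a list" advice sounds right, but if the code only does one membership test, building the set is itself an O(n) pass over the data, so nothing is saved and memory is wasted.

```python
data = list(range(10_000))

def one_lookup_list(x: int) -> bool:
    # Original code: a single O(n) scan, no extra memory
    return x in data

def one_lookup_set(x: int) -> bool:
    # Plausible AI "optimization": set lookup is O(1)... but the set is
    # rebuilt on every call, which is O(n) work plus an allocation
    return x in set(data)

# Both are correct; only repeated lookups against a *prebuilt* set
# would actually be faster
assert one_lookup_list(9_999)
assert one_lookup_set(9_999)
```

Whether the set pays off depends on call frequency and data size, which is exactly the context the model doesn't have.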
4. Debugging Bugs It Caused
Most surprising finding: AI is extremely bad at debugging code it generated itself.
When you bring an error and ask “how do I fix this,” it tends to guess based on the most common error patterns rather than actually analyzing your current context.
Worse, it sometimes “confidently refactors” working code into new bugs.
Real Efficiency Numbers
I tracked roughly two months of data:
| Task Type | Time Saved | Success Rate |
|---|---|---|
| Boilerplate generation | ~40% | 90% |
| API documentation lookup | ~60% | 80% |
| Regex | ~70% | 85% |
| SQL building | ~50% | 85% |
| Business logic code | ~10% | 40% |
| Bug debugging | ~5% | 30% |
Conclusion: boilerplate = extremely high ROI, business logic = extremely low ROI.
My Actual Workflow
September 2023, my actual workflow:
Use AI for:
- Unfamiliar library/framework APIs
- Generating test cases (especially edge cases)
- Regex
- SQL
- Documentation comments
- Code translation (TypeScript ↔ JavaScript)
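On the test-case bullet: here's the shape of what I'd hand off. The `slugify` helper below is invented for illustration, and the asserts are the kind of edge cases (whitespace, punctuation-only, empty input) an assistant reliably proposes:

```python
import re

def slugify(text: str) -> str:
    # Lowercase, then collapse runs of non-alphanumerics into single hyphens
    text = text.strip().lower()
    text = re.sub(r'[^a-z0-9]+', '-', text)
    return text.strip('-')

# AI-suggested edge cases
assert slugify("Hello, World!") == "hello-world"
assert slugify("  spaces  ") == "spaces"
assert slugify("---") == ""
assert slugify("") == ""
```

Generating edge cases is a good fit because it needs breadth, not business context: the model enumerates inputs I'd forget, and a failing assert is cheap to verify.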
Don’t use AI for:
- Business logic implementation
- Architecture decisions
- Debugging
- Performance-sensitive code
- Anything requiring understanding of existing company systems
Tool Mix
What I actually used:
- Copilot: primary, mainly for code completion and simple generation
- ChatGPT (GPT-4): complex tasks, multi-turn conversation, docs lookup and obscure problems
- Claude 2: long-text analysis and code review, for when I needed longer context
Conclusion
An AI coding assistant is an extremely powerful boilerplate generator plus an acceptable new kind of search engine.
Treat it like an advanced parrot: it can imitate, it can generate, it looks like it knows what it’s doing—but it doesn’t understand business logic and doesn’t truly “think.”
Used well, it saves 20-30% of development time (mainly in boilerplate scenarios). Used poorly, it wastes time (especially business logic scenarios).
Biggest risk isn’t AI replacing you—it’s over-relying on AI for business logic code without validating, then shipping bugs.