One Year with AI Coding Assistants: What It Can and Can't Do

Context

September 2023. GPT-4 had been out for six months, Copilot had become my daily driver, and Claude 2 had just dropped. By then I'd been using AI coding assistants heavily for almost a year.

This isn’t a hype piece or a hit piece. Just honest engineering feedback after a year of real use.

I'm not asking "will AI replace programmers?" Just: does it actually help in real daily development?

What It Can Do

1. Boilerplate Generation

Most reliable use case. Give AI a rough description, it generates standard templates quickly.

# Me: Write a FastAPI CRUD endpoint that takes user_id and returns user info
# AI output (imports and router added here for completeness):
from fastapi import APIRouter, HTTPException

router = APIRouter()

@router.get("/users/{user_id}")
async def get_user(user_id: int):
    # db: a module-level SQLAlchemy session; User: the ORM model
    user = db.query(User).filter(User.id == user_id).first()
    if not user:
        raise HTTPException(status_code=404, detail="User not found")
    return {"id": user.id, "name": user.name, "email": user.email}

Copilot does this well; the output rarely needs edits.

2. Quick API Lookup for Unfamiliar Libraries

When you need to use a new library without reading all the docs:

Me: How do I send emails with Python's sendgrid library? How do I configure API key and sender?

AI gives runnable example code, faster than docs. ~80% success rate in this scenario.

3. Regular Expressions

Probably Copilot’s most underrated feature.

Me: Write a regex to validate email format, not the simple version, need basic domain suffix checking
import re

# local part @ domain . suffix of at least two letters
email_pattern = re.compile(
    r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
)

AI-written regex is usually more comprehensive than what I’d write myself.
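A quick sanity check of how that pattern behaves (the sample addresses are mine, not from the original session):

```python
import re

email_pattern = re.compile(
    r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
)

# Accepts a normal address with a 2+ letter suffix
assert email_pattern.match("alice.smith+dev@mail.example.co")
# Rejects a domain with no suffix at all
assert email_pattern.match("bob@localhost") is None
# Rejects a one-letter suffix
assert email_pattern.match("carol@example.x") is None
```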

4. SQL Query Building

-- Me: Write a SQL query to count orders by month, counting only orders with amount > 100
SELECT 
    DATE_TRUNC('month', created_at) AS month,
    COUNT(*) AS order_count
FROM orders
WHERE amount > 100
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month DESC;

Very reliable.
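To sanity check the logic without a Postgres instance, here's the same aggregation in SQLite (which has no DATE_TRUNC, so strftime('%Y-%m', ...) stands in for month truncation; the sample orders are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (created_at TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("2023-08-03", 150), ("2023-08-20", 50),
     ("2023-09-01", 200), ("2023-09-15", 120)],
)
# Same shape as the Postgres query: filter, group by month, newest first
rows = conn.execute("""
    SELECT strftime('%Y-%m', created_at) AS month, COUNT(*) AS order_count
    FROM orders
    WHERE amount > 100
    GROUP BY month
    ORDER BY month DESC
""").fetchall()
print(rows)  # → [('2023-09', 2), ('2023-08', 1)]
```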

What It Can’t Do

1. Complex Business Logic

Business logic has tons of implicit assumptions and context. AI doesn’t know your company’s specific rules—it can only infer from what’s in the prompt.

Typical failure:

Me: Write a function that calculates user levels and points

AI writes logic based on its “common sense” assumptions. But your company’s point rules might be completely different. Code that looks correct will have bugs in production.

Lesson: business logic code—AI can assist, not drive.
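A made-up sketch of the failure mode (both point rules below are invented, neither comes from a real codebase):

```python
def calc_points_ai_guess(amount: float) -> int:
    # What an AI plausibly assumes: 1 point per dollar spent
    return int(amount)

def calc_points_company(amount: float) -> int:
    # What the (fictional) company actually does: points only kick in
    # above a $50 threshold, at 2 points per dollar over it
    return max(0, int((amount - 50) * 2))

# Both "look correct" in isolation, but disagree on real inputs:
assert calc_points_ai_guess(80) == 80
assert calc_points_company(80) == 60
assert calc_points_company(30) == 0  # below threshold: no points
```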

2. Multi-file Coordinated Changes

Easy: “change this function’s logic.” Hard: “change this function, then make sure all callers are compatible.”

AI lacks awareness of the entire codebase context. (At least in September 2023.)
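A toy example of why the second kind is hard, with invented file and function names:

```python
# billing.py, after an AI edit adds a required currency parameter
def charge(amount: float, currency: str) -> str:
    return f"charged {amount:.2f} {currency}"

# checkout.py, an untouched caller the AI never saw, now broken
def checkout(total: float) -> str:
    return charge(total)  # TypeError: missing 'currency'

try:
    checkout(19.99)
except TypeError as e:
    print("caller broke:", e)
```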

3. Performance Optimization

Ask AI to optimize slow code and it often gives a plausible-sounding but actually wrong solution.

AI reasoning about performance frequently fails because it doesn’t understand data characteristics and system behavior.
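A made-up example of the pattern I kept seeing: an "optimization" that sounds right but changes nothing (find_common is an invented function, not from a real codebase):

```python
def find_common_naive(a, b):
    # O(len(a) * len(b)): membership test scans the list every time
    return [x for x in a if x in b]

def find_common_ai_suggestion(a, b):
    # Plausible-sounding "use a set": but set(b) is rebuilt on every
    # iteration of the comprehension, so this is still O(len(a) * len(b))
    return [x for x in a if x in set(b)]

def find_common_fixed(a, b):
    # Convert once, outside the loop: O(len(a) + len(b))
    b_set = set(b)
    return [x for x in a if x in b_set]

a, b = list(range(1000)), list(range(500, 1500))
# All three agree on the answer; only the last one is actually faster
assert find_common_naive(a, b) == find_common_ai_suggestion(a, b) == find_common_fixed(a, b)
```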

4. Debugging Bugs It Caused

Most surprising finding: AI is extremely bad at debugging code it generated itself.

When you paste in an error and ask "how do I fix this," it tends to guess from the most common error patterns rather than actually analyzing your current context.

Worse, it sometimes “confidently refactors” working code into new bugs.
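An invented but representative example of a confident refactor gone wrong:

```python
def dedup_original(items):
    # Working code: removes duplicates, keeps first-seen order
    seen, out = set(), []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

def dedup_ai_refactor(items):
    # The "simplified" version: shorter, but order is no longer guaranteed
    return list(set(items))

data = ["b", "a", "b", "c", "a"]
assert dedup_original(data) == ["b", "a", "c"]
# Same elements, but the refactor returns them in arbitrary order
assert sorted(dedup_ai_refactor(data)) == ["a", "b", "c"]
```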

Real Efficiency Numbers

I tracked roughly two months of data:

Task Type                | Time Saved | Success Rate
-------------------------|------------|-------------
Boilerplate generation   | ~40%       | 90%
API documentation lookup | ~60%       | 80%
Regex                    | ~70%       | 85%
SQL building             | ~50%       | 85%
Business logic code      | ~10%       | 40%
Bug debugging            | ~5%        | 30%

Conclusion: boilerplate = extremely high ROI, business logic = extremely low ROI.

My Actual Workflow

September 2023, my actual workflow:

Use AI for:

  • Unfamiliar library/framework APIs
  • Generating test cases (especially edge cases)
  • Regex
  • SQL
  • Documentation comments
  • Code translation (TypeScript ↔ JavaScript)

Don’t use AI for:

  • Business logic implementation
  • Architecture decisions
  • Debugging
  • Performance-sensitive code
  • Anything requiring understanding of existing company systems

Tool Mix

What I actually used:

  • Copilot: primary, mainly for code completion and simple generation
  • ChatGPT (GPT-4): complex tasks, multi-turn conversation, docs lookup and obscure problems
  • Claude 2: long-text analysis and code review, for when I needed longer context

Conclusion

An AI coding assistant is an extremely powerful boilerplate generator plus an acceptable new-style search engine.

Treat it like an advanced parrot: it can imitate, it can generate, it looks like it knows what it’s doing—but it doesn’t understand business logic and doesn’t truly “think.”

Used well, it saves 20-30% of development time (mainly in boilerplate scenarios). Used poorly, it wastes time (especially business logic scenarios).

Biggest risk isn’t AI replacing you—it’s over-relying on AI for business logic code without validating, then shipping bugs.