RESTful API Best Practices

Most REST API advice is too shallow to help once an API has real traffic, multiple clients, and years of compatibility baggage. The hard part is not remembering what GET or POST means. The hard part is keeping the contract stable while product requirements keep changing.

This is the checklist I use when reviewing APIs that are expected to survive more than one quarter.

Start from resources, not controller actions

A lot of APIs look RESTful on the surface and still behave like RPC underneath.

Bad signs:

  • /createOrder
  • /getUserProfile
  • /updateUserStatus
  • /searchProductsByCategory

These names usually mirror backend handlers, not domain resources. They make the API drift quickly because every new workflow becomes a new verb-shaped endpoint.

A healthier starting point is to model stable nouns:

  • /users
  • /orders
  • /products
  • /invoices
  • /shipments

Then represent actions through HTTP semantics and sub-resources where it makes sense.

GET    /v1/orders/ord_123
POST   /v1/orders
PATCH  /v1/orders/ord_123
POST   /v1/orders/ord_123/cancel

That last endpoint is intentionally not “pure REST”. That is fine. Some domain actions are clearer as explicit commands. The mistake is pretending every operation maps cleanly to CRUD when it does not.

My rule: use resource-oriented design by default, but do not contort the API to avoid a small number of explicit command endpoints.

Keep URLs boring

URLs should be predictable enough that clients can guess them.

Good conventions:

  • use plural nouns for collections: /users, /orders
  • use lowercase
  • prefer hyphens if you need separators
  • put identifiers in the path, filters in the query string
  • avoid file extensions in URLs
  • do not encode business operations into query parameters

Examples:

GET /v1/users/usr_42
GET /v1/orders?customer_id=cus_9&status=paid&limit=50
GET /v1/audit-events?actor_id=usr_42&sort=-created_at

Avoid mixing representations of the same concept:

/users/42
/user/42
/getUser?id=42

Pick one shape and keep it.

Choose a versioning strategy before the first breaking change

Teams often postpone versioning until they need it. By then the API is already in production and every choice is painful.

For public APIs, path versioning is still the least surprising option:

/v1/orders
/v2/orders

It is not the most elegant, but it is easy to route, cache, document, log, and discuss with client teams.

Header-based versioning can work, but it makes debugging and traffic analysis harder. If you use it, have a strong reason.

More important than where the version sits is what counts as a breaking change.

Breaking changes usually include:

  • removing fields
  • renaming fields
  • changing field types
  • changing enum values incompatibly
  • changing pagination semantics
  • tightening validation for previously accepted input
  • changing error formats
  • reinterpreting status codes

Non-breaking changes usually include:

  • adding new optional response fields
  • adding new endpoints
  • adding new optional request fields
  • adding new enum values, if clients were told enums are open-ended

That last one matters. If clients hard-code a closed set of enum values, “just adding one more status” can still break them in practice.
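On the client side, that risk is cheap to neutralize: treat the enum as open and map anything unrecognized to a sentinel. A minimal Python sketch (the names here are illustrative, not from any particular SDK):

```python
# Sketch: client-side handling of an open-ended status enum.
# An unknown value maps to a sentinel instead of raising, so the
# server adding a new status does not crash existing clients.
KNOWN_STATUSES = {"pending", "paid", "shipped", "canceled"}

def parse_status(raw: str) -> str:
    """Return a known status, or "unknown" for values added later."""
    return raw if raw in KNOWN_STATUSES else "unknown"
```

Clients can then branch on "unknown" deliberately (hide the badge, log it, fall back to a generic label) instead of failing in whatever way the deserializer happens to fail.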

Compatibility is a product decision, not just an engineering one

APIs drift when the team treats compatibility as best effort.

Write down a compatibility policy. At minimum, answer these questions:

  • How long is a major API version supported?
  • What notice do clients get before a breaking change?
  • Are response objects forward-compatible by design?
  • Can clients rely on field presence, ordering, and defaults?
  • Which headers are stable contract, and which are internal?

If you do nothing else, make these two rules explicit:

  1. Existing clients must ignore unknown fields.
  2. Servers must not remove or repurpose existing fields inside a version.

That one discipline prevents a lot of accidental breakage.
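Rule 1 can be enforced mechanically at deserialization time instead of trusted to every call site. A Python sketch, assuming a hypothetical User resource:

```python
from dataclasses import dataclass, fields

@dataclass
class User:
    id: str
    email: str

def parse_user(payload: dict) -> User:
    # Keep only the fields this client version knows about; anything
    # else is silently ignored for forward compatibility (rule 1).
    known = {f.name for f in fields(User)}
    return User(**{k: v for k, v in payload.items() if k in known})
```

When the server later adds a field, old clients built this way keep working without a release.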

Standardize a small set of status codes well

Do not publish a giant HTTP status code encyclopedia. Most teams only need a small, consistently applied subset.

A sane baseline:

  • 200 OK for successful reads and updates with a body
  • 201 Created for successful creation
  • 202 Accepted for async work that has not finished yet
  • 204 No Content for successful operations with no body
  • 400 Bad Request for malformed requests
  • 401 Unauthorized when authentication is missing or invalid
  • 403 Forbidden when the caller is authenticated but not allowed
  • 404 Not Found when the resource does not exist or is intentionally hidden
  • 409 Conflict for state conflicts
  • 412 Precondition Failed for failed optimistic concurrency checks
  • 422 Unprocessable Entity for semantic validation failures
  • 429 Too Many Requests for rate limiting
  • 500 Internal Server Error for unexpected server failures
  • 503 Service Unavailable for overload or dependency outage

The key is not to use all of them. The key is to define exactly when your API uses each one.

A common trap is returning 400 for everything. That makes client behavior sloppy because malformed JSON, failed business validation, and stale write conflicts all look the same.
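One way to keep that discipline is to map distinct failure classes to distinct exception types once, centrally, instead of picking a status code inside every handler. A Python sketch (the exception names are illustrative):

```python
# Sketch: one failure class per exception type, mapped to a status
# code in exactly one place.
class MalformedRequest(Exception): ...   # broken JSON, wrong shape -> 400
class ValidationFailed(Exception): ...   # semantically invalid input -> 422
class StaleWrite(Exception): ...         # failed If-Match check -> 412

STATUS_BY_ERROR = {
    MalformedRequest: 400,
    ValidationFailed: 422,
    StaleWrite: 412,
}

def status_for(exc: Exception) -> int:
    # Anything unmapped is an unexpected server failure.
    return STATUS_BY_ERROR.get(type(exc), 500)
```

Handlers raise the most specific type they can; the mapping stays consistent across the whole API because there is only one of it.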

Make error responses machine-friendly

Error payloads need to serve two audiences:

  • client code deciding what to do next
  • humans trying to debug production issues

A practical shape:

{
  "error": {
    "code": "validation_failed",
    "message": "One or more fields are invalid.",
    "details": [
      {
        "field": "email",
        "reason": "invalid_format"
      },
      {
        "field": "age",
        "reason": "must_be_greater_than_or_equal",
        "value": 15,
        "min": 18
      }
    ],
    "request_id": "req_01ht9z6r8m2f"
  }
}

Guidelines:

  • code should be stable and documented
  • message can be human-readable and change slightly
  • details should point to actionable fields or constraints
  • include a request or trace ID for support and log correlation
  • do not leak stack traces, SQL, or internal hostnames to clients

If your clients need localization, keep code stable and localize the human message elsewhere. Do not make application logic depend on English strings.

Validate early, but separate syntax from business rules

Validation usually gets muddled because teams lump everything into one bucket.

Keep these layers distinct:

  • transport validation: malformed JSON, wrong content type, missing required fields
  • schema validation: type mismatches, length constraints, allowed formats
  • business validation: domain rules, permissions, current resource state

Example:

POST /v1/payouts

{
  "account_id": "acc_123",
  "amount": -50,
  "currency": "USD"
}

Possible outcomes:

  • malformed JSON -> 400
  • "amount" is a string instead of number -> 400 or 422, depending on your convention
  • amount must be positive -> 422
  • account exists but is frozen -> 409 or 422, depending on whether you treat it as state conflict or validation
  • caller cannot access this account -> 403

Different teams draw the line slightly differently. That is fine. Inconsistency inside one API is not.
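The three layers can be made explicit in code. This sketch walks the payout example through them, following one possible status convention (yours may differ, as noted above):

```python
import json

def validate_payout(raw_body: str, account_is_frozen: bool):
    """Return (status_code, error_code), or (201, None) on success."""
    # 1. transport validation: is it even JSON?
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, "malformed_json"
    # 2. schema validation: right types? (bool is excluded because
    #    isinstance(True, int) is True in Python)
    amount = body.get("amount") if isinstance(body, dict) else None
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        return 400, "invalid_type"
    # 3. business validation: domain rules, then current state
    if amount <= 0:
        return 422, "amount_must_be_positive"
    if account_is_frozen:
        return 409, "account_frozen"
    return 201, None
```

The point is not these exact codes; it is that each layer has a home, so "amount is a string" and "account is frozen" never get lumped into the same bucket.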

Be explicit about idempotency

Retries happen. Clients retry on network timeouts, load balancers retry upstream, job workers retry after crashes.

If the API cannot tolerate repeated requests, it will create duplicates in production.

GET, PUT, and DELETE are idempotent by HTTP semantics. POST usually is not, unless you make it so.

For create-like operations that may be retried, support an idempotency key:

POST /v1/payments
Idempotency-Key: 8b2f6f2a-4d60-4b42-9d4f-1a59a1b5d6aa

Server behavior:

  • if the request is new, process it and store the result by key
  • if the same key and same payload arrives again, return the original result
  • if the same key arrives with a different payload, reject it

Typical rejection:

{
  "error": {
    "code": "idempotency_key_reused",
    "message": "The idempotency key was already used with a different request.",
    "request_id": "req_01ht9zz8k1ad"
  }
}

Do not pretend clients can “just not retry”. They will.
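The server-side behavior above fits in a few lines. This sketch uses an in-memory dict for clarity; production would need a shared store with a TTL, and the function names are illustrative:

```python
import hashlib
import json

# key -> (payload fingerprint, stored result). In production this
# would live in a shared store with an expiry, not process memory.
_results: dict[str, tuple[str, dict]] = {}

def handle_payment(idempotency_key: str, payload: dict) -> dict:
    digest = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()
    ).hexdigest()
    if idempotency_key in _results:
        stored_digest, result = _results[idempotency_key]
        if stored_digest != digest:
            # Same key, different payload: reject, do not guess.
            return {"error": {"code": "idempotency_key_reused"}}
        # Same key, same payload: replay the original result.
        return result
    result = {"id": f"pay_{len(_results) + 1}", "status": "processed"}
    _results[idempotency_key] = (digest, result)
    return result
```

Hashing the canonicalized payload is what lets the server distinguish a legitimate retry from an accidental key reuse.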

Pagination, filtering, and sorting should be predictable

List endpoints are where APIs often become inconsistent fastest.

Pagination

For small internal APIs, offset pagination is often enough:

GET /v1/orders?limit=50&offset=100

It is simple and easy to explain, but it becomes unstable on large, frequently changing datasets.

For high-volume or append-heavy data, cursor pagination is usually safer:

GET /v1/orders?limit=50&after=ord_01ht...

Response example:

{
  "data": [
    {
      "id": "ord_1001",
      "status": "paid"
    }
  ],
  "page": {
    "next_cursor": "ord_1001",
    "has_more": true
  }
}

If you use cursor pagination:

  • define sort order clearly
  • make the cursor opaque
  • keep cursor lifetime and invalidation rules documented
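Opacity is cheap to get: serialize whatever the cursor needs and base64-encode it, so clients receive a token rather than a structure they might start depending on. A sketch:

```python
import base64
import json

def encode_cursor(last_id: str, sort_key: str) -> str:
    """Pack pagination state into an opaque, URL-safe token."""
    raw = json.dumps({"id": last_id, "sort": sort_key}).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor: str) -> dict:
    """Unpack the token server-side; clients never need to."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))
```

Embedding the sort key in the cursor also lets the server reject a cursor that is replayed against a different sort order, which is a common source of silently wrong pages.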

Filtering

Use query parameters for filtering, but keep the grammar restrained.

Good:

GET /v1/users?status=active&team_id=team_42
GET /v1/invoices?created_at_gte=2025-01-01T00:00:00Z

Bad:

GET /v1/search?q=status:active team:42 sort:created

Unless you are deliberately building a search API, hidden mini-languages become hard to document and validate.

Sorting

Make sort syntax consistent everywhere.

Example:

GET /v1/orders?sort=-created_at,total_amount

Document:

  • which fields are sortable
  • default order
  • how null values are treated
  • whether sort is stable
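A small shared parser is what keeps the grammar consistent across endpoints and rejects undocumented fields early. A sketch:

```python
def parse_sort(sort_param: str, sortable: set[str]) -> list[tuple[str, str]]:
    """Parse "-created_at,total_amount" into (field, direction) pairs,
    rejecting any field not documented as sortable."""
    result = []
    for part in sort_param.split(","):
        direction = "desc" if part.startswith("-") else "asc"
        field = part.lstrip("-")
        if field not in sortable:
            raise ValueError(f"unsortable field: {field}")
        result.append((field, direction))
    return result
```

The ValueError here would surface as a 400 with a stable error code, not as a silently ignored parameter.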

Draw a hard line between authentication and authorization

These two are still mixed up in many APIs.

  • authentication answers: who is calling?
  • authorization answers: what can they do?

The boundary should show up clearly in the API contract.

Examples:

  • invalid token -> 401
  • valid token, wrong tenant -> 403
  • valid token, resource does not exist in that tenant -> often 404 to avoid information leakage

Be careful with scopes and roles. Keep authorization decisions close to the resource boundary, not sprinkled across handlers in inconsistent ways.

A practical pattern:

  • middleware verifies identity and basic token validity
  • handlers or service layer enforce resource-level permissions
  • authorization failures emit one consistent error shape

Also: do not overstuff identity claims into JWTs just because it is convenient. Long-lived stale claims create weird authorization bugs.
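The three status examples above can live in one decision function, which is also a convenient place to keep the 404-for-leakage rule in exactly one spot. A sketch (the parameters are illustrative):

```python
def check_access(token_valid: bool, caller_tenants: set[str],
                 target_tenant: str, resource_exists: bool) -> int:
    """Map the auth examples above to status codes, in order."""
    if not token_valid:
        return 401  # authentication: we do not know who is calling
    if target_tenant not in caller_tenants:
        return 403  # authenticated, but not allowed into this tenant
    if not resource_exists:
        return 404  # nothing there, or intentionally hidden
    return 200
```

Ordering matters: authentication first, then permission, then existence, so the same request always fails the same way.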

Use cache semantics intentionally

Caching is not just for CDNs. It is how you keep repeated reads cheap and predictable.

For cacheable reads, set explicit headers:

Cache-Control: public, max-age=60
ETag: "usr_42:v17"

For user-specific or sensitive responses:

Cache-Control: private, no-store

If a response must be revalidated before reuse:

Cache-Control: no-cache

Common mistakes:

  • returning no cache headers and hoping intermediaries behave sensibly
  • marking personalized data as public
  • emitting weak or meaningless ETags that do not track the actual representation
  • forgetting that cache semantics are part of the API contract

If you do not want caches involved, say so explicitly.

Protect updates with optimistic concurrency

Lost updates are common in APIs that allow concurrent writes.

A simple pattern:

  1. client fetches resource
  2. server returns an ETag
  3. client updates with If-Match
  4. server rejects if resource changed since the read

Example:

GET /v1/users/usr_42
ETag: "user-42-v7"

Then:

PATCH /v1/users/usr_42
If-Match: "user-42-v7"
Content-Type: application/json

If another update already changed the resource:

HTTP/1.1 412 Precondition Failed

This is a better contract than silent last-write-wins for resources that users edit in dashboards or multiple services update asynchronously.

Use it where stale writes are costly: profile updates, order state changes, inventory records, config resources.

Async operations need a first-class contract

Some work should not block a request:

  • report generation
  • large imports
  • video transcoding
  • bulk backfills
  • external provisioning

Do not hide these behind long-running synchronous requests and hope timeouts are generous enough.

A cleaner pattern:

POST /v1/report-jobs

Response:

HTTP/1.1 202 Accepted
Location: /v1/report-jobs/job_123

{
  "id": "job_123",
  "status": "queued"
}

Then:

GET /v1/report-jobs/job_123

Possible job states:

  • queued
  • running
  • succeeded
  • failed
  • canceled

If the async result creates another resource, expose that link clearly once complete.

Do not make clients guess whether they should poll, retry, or wait longer.
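A well-behaved polling client is worth sketching in the docs too. Here is one with exponential backoff; fetch_status stands in for a call to GET /v1/report-jobs/{id}:

```python
import time

def poll_job(fetch_status, timeout_s: float = 60.0,
             base_delay: float = 0.5) -> str:
    """Poll until the job reaches a terminal state, backing off
    exponentially between attempts. Raises on overall timeout."""
    terminal = {"succeeded", "failed", "canceled"}
    delay, waited = base_delay, 0.0
    while waited < timeout_s:
        status = fetch_status()
        if status in terminal:
            return status
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 10.0)  # cap the backoff interval
    raise TimeoutError("job did not finish in time")
```

If the API returns a Retry-After hint on the job resource, the client should prefer that over its own backoff schedule.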

Design for observability from day one

An API is not production-ready if you cannot answer these questions during an incident:

  • which endpoint is failing?
  • for which tenant or client app?
  • with what latency distribution?
  • which dependency is causing the failure?
  • what request caused this specific error?

At minimum, include:

  • request ID in responses
  • structured logs
  • latency metrics by route
  • status code counts
  • dependency call metrics
  • distributed tracing if requests cross service boundaries

Useful response header:

X-Request-ID: req_01ht9z6r8m2f

Do not put raw PII into logs just because the payload is convenient to dump. Operationally useful logs are selective and structured, not verbose by default.

Rate limiting should be understandable

If you enforce rate limits, clients need to know what they hit.

A practical response:

HTTP/1.1 429 Too Many Requests
Retry-After: 30

{
  "error": {
    "code": "rate_limited",
    "message": "Too many requests.",
    "request_id": "req_01ht9zzzz321"
  }
}

Good policies are:

  • scoped clearly, such as per API key, per user, or per tenant
  • documented with real units
  • consistent enough that clients can back off sensibly

Bad policies are opaque sliding rules nobody can reason about.

If limits differ by endpoint, say so. “Some requests may be limited” is not useful documentation.

Deprecation needs policy, not just headers

Every API accumulates dead shapes and fields. The question is whether you retire them deliberately or let them rot forever.

A workable deprecation policy usually includes:

  • announce deprecation in changelog and docs
  • mark deprecated fields/endpoints in OpenAPI
  • return a deprecation header when practical
  • give a sunset date
  • provide a migration path
  • measure who still uses the deprecated contract before removal

Example headers:

Deprecation: true
Sunset: Wed, 31 Dec 2025 23:59:59 GMT
Link: </docs/migrations/orders-v2>; rel="deprecation"

Do not remove a field just because “nobody should be using it”. Verify that assumption with traffic data.

Anti-patterns worth rejecting in review

These show up often, and they almost always get worse later.

1. One endpoint, many behaviors

POST /v1/action

With a payload field like "type": "create_user" or "type": "cancel_order".

This kills discoverability and makes authorization, validation, metrics, and documentation harder.

2. Leaking database shape directly

If your response mirrors internal table names and join structure, every schema change becomes an API fight.

API resources should be stable representations, not ORM dumps.

3. Inconsistent nullability

If middle_name is sometimes missing, sometimes null, and sometimes "", clients will end up writing defensive junk everywhere.

Choose one convention.

4. Silent partial success

Bulk endpoints that partly fail need explicit per-item results. A bare 200 with vague text is not enough.

5. Mixing transport and domain errors

Clients should not need to parse "duplicate key value violates unique constraint" to know an email is already taken.

Translate infrastructure failures into domain-meaningful errors.

6. Over-designing for theoretical purity

If canceling an order is a meaningful business action with side effects, POST /v1/orders/{id}/cancel is often clearer than inventing unnatural state transitions just to avoid a verb.

A short review checklist

When I review a REST API, I usually ask:

  • Are the resources stable and named consistently?
  • Is the versioning story clear before breakage happens?
  • Are error shapes standardized and documented?
  • Can clients retry safely?
  • Are list endpoints consistent about pagination, filters, and sorting?
  • Is auth separated cleanly from permission checks?
  • Are cache and concurrency semantics deliberate?
  • Is async work modeled explicitly?
  • Can the team observe and support this API in production?
  • Is there a real deprecation path?

If the answer to several of these is “we’ll figure it out later”, the API is not done. It is just deployed.

A REST API ages well when the team treats the contract as a product with operational consequences, not a thin wrapper over handlers.