MongoDB Best Practices

2024-10-15

Quick Start

Installation & Version

Current version is MongoDB 8.2.6.

macOS

        
# Install via Homebrew
brew tap mongodb/brew
brew install mongodb-community@8.2

# Start service
brew services start mongodb-community@8.2

# Or start manually (Intel path shown; on Apple Silicon the config is /opt/homebrew/etc/mongod.conf)
mongod --config /usr/local/etc/mongod.conf

# Connect
mongosh

Windows

Download MongoDB Community Server for Windows
Run the installer
MongoDB is installed to C:\Program Files\MongoDB\Server\8.2\bin by default
Create data directory:

        
mkdir C:\data\db
mongod --dbpath C:\data\db

Linux (Ubuntu)

        
        
        
    
# Import MongoDB GPG key
curl -fsSL https://www.mongodb.org/static/pgp/server-8.2.asc | sudo gpg -o /usr/share/keyrings/mongodb-server-8.2.gpg --dearmor

# Add MongoDB repository (jammy is Ubuntu 22.04; substitute the codename of your release)
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-8.2.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/8.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-8.2.list

# Update and install
sudo apt update
sudo apt install -y mongodb-org

# Start service
sudo systemctl start mongod
sudo systemctl enable mongod

# Connect
mongosh

Docker Compose Deployment

        
        
        
    
services:
  mongodb:
    image: mongo:8.2.6
    container_name: mongodb
    restart: unless-stopped
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: password123  # example only; use a strong secret outside local development
    volumes:
      - mongodb_data:/data/db
    healthcheck:
      test: ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  mongodb_data:

Start: docker compose up -d

Connect: mongosh mongodb://admin:password123@localhost:27017
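
A password containing characters such as `@`, `:` or `/` has to be percent-encoded before it goes into a connection string. The sketch below shows one way to do that; `buildMongoUri` is a hypothetical helper written for this example, not part of mongosh or any driver, and the password is made up.

```javascript
// Hypothetical helper: builds a MongoDB connection string with
// percent-encoded credentials. Not part of mongosh or any driver.
function buildMongoUri({ user, password, host = "localhost", port = 27017, authSource = "admin" }) {
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `mongodb://${credentials}@${host}:${port}/?authSource=${encodeURIComponent(authSource)}`;
}

// Example password with reserved characters (illustrative only)
const uri = buildMongoUri({ user: "admin", password: "p@ss:word/1" });
console.log(uri);
// mongodb://admin:p%40ss%3Aword%2F1@localhost:27017/?authSource=admin
```

The resulting string can be passed to mongosh in quotes, for example `mongosh "<uri>"`.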

Basic Operations

        
        
        
    
// View databases
show dbs

// Use database
use myapp

// View collections
show collections

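// Basic CRUD
db.users.insertOne({ name: "Alice", email: "alice@example.com" })
db.users.find({ name: "Alice" })
// Filter on a field the document actually has; its _id is an auto-generated ObjectId, not 1
db.users.updateOne({ name: "Alice" }, { $set: { age: 30 } })
db.users.deleteOne({ name: "Alice" })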

Schema Design

Principles

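Embed or reference: embed data that is read together, reference data that grows or is shared
Design around query patterns rather than normalizing as in a relational schema
Avoid unbounded arrays: a single document cannot exceed 16 MB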

Embedded Documents

For 1:1 or 1:few relationships:

        
        
        
    
// Good: embedded
{
  _id: ObjectId("..."),
  name: "Alice",
  profile: {
    bio: "Engineer",
    location: "Beijing",
    website: "https://alice.dev"
  }
}

// Good: small arrays
{
  name: "Order",
  items: [
    { product: "Widget", quantity: 2 },
    { product: "Gadget", quantity: 1 }
  ]
}

Document References

For 1:many or many:many relationships:

        
        
        
    
// Users collection
{
  _id: ObjectId("..."),
  name: "Alice",
  email: "alice@example.com"
}

// Orders collection (references user)
{
  _id: ObjectId("..."),
  user_id: ObjectId("..."),  // reference
  items: [...],
  amount: 99.99  // same field name the aggregation examples sum over
}

Schema Validation

        
        
        
    
db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email"],
      properties: {
        name: {
          bsonType: "string",
          description: "must be a string"
        },
        email: {
          bsonType: "string",
          pattern: "@.+",
          description: "must be a string containing '@' followed by at least one character"
        },
        age: {
          bsonType: "int",
          minimum: 0,
          maximum: 150
        }
      }
    }
  }
})
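
It can help to run the same checks in application code so that bad input is rejected before it reaches the database. The sketch below mirrors the three rules of the validator; `validateUser` is a hypothetical helper that approximates the schema and does not replace server-side validation.

```javascript
// Hypothetical client-side check mirroring the $jsonSchema rules.
// It approximates, and does not replace, server-side validation.
function validateUser(doc) {
  const errors = [];
  if (typeof doc.name !== "string") {
    errors.push("name must be a string");
  }
  if (typeof doc.email !== "string" || !/@.+/.test(doc.email)) {
    errors.push("email must be a string containing '@'");
  }
  // age is optional, but when present it must be an integer in [0, 150]
  if (doc.age !== undefined) {
    if (!Number.isInteger(doc.age) || doc.age < 0 || doc.age > 150) {
      errors.push("age must be an integer between 0 and 150");
    }
  }
  return errors;
}

const okErrors = validateUser({ name: "Alice", email: "alice@example.com", age: 30 });
const badErrors = validateUser({ name: "Bob", email: "bob", age: 200 });
console.log(okErrors);  // []
console.log(badErrors); // two messages: one for email, one for age
```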

Indexing Strategies

Creating Indexes

        
// Single field index
db.users.createIndex({ email: 1 }, { unique: true })

// Compound index
db.orders.createIndex({ user_id: 1, created_at: -1 })

// Multikey index (array field)
db.products.createIndex({ tags: 1 })

// Text index
db.articles.createIndex({ title: "text", content: "text" })

// Geospatial index
db.locations.createIndex({ coordinates: "2dsphere" })

Index Principles

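Principle	Description
Selectivity	Prefer high-cardinality fields that narrow the result set
Query Pattern	Order compound-index fields to match the query: equality, then sort, then range (ESR)
Avoid Over-indexing	Drop unused indexes; small collections rarely benefit from extra ones
Write Overhead	Each index adds work to inserts, updates, and deletes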
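
One common way to apply the "Query Pattern" principle is MongoDB's ESR guideline: put equality fields first, then sort fields, then range fields. The sketch below builds an index specification in that order; `buildEsrIndex` and the `amount` field are assumptions made for illustration.

```javascript
// Hypothetical helper: orders compound-index keys by the ESR guideline
// (Equality fields first, then Sort fields, then Range fields).
function buildEsrIndex({ equality = [], sort = {}, range = [] }) {
  const keys = {};
  for (const field of equality) keys[field] = 1;
  for (const [field, direction] of Object.entries(sort)) keys[field] = direction;
  for (const field of range) {
    if (!(field in keys)) keys[field] = 1;
  }
  return keys;
}

// Query shape: find({ user_id: X, amount: { $gt: 100 } }).sort({ created_at: -1 })
const indexSpec = buildEsrIndex({
  equality: ["user_id"],
  sort: { created_at: -1 },
  range: ["amount"],
});
console.log(JSON.stringify(indexSpec));
// {"user_id":1,"created_at":-1,"amount":1}
```

The returned object can be passed directly to `db.orders.createIndex(...)`.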

Index Types

        
        
        
    
// Unique index
db.users.createIndex({ email: 1 }, { unique: true })

// Partial index
db.logs.createIndex(
  { timestamp: 1 },
  { partialFilterExpression: { level: "error" } }
)

// Sparse index
db.sessions.createIndex(
  { token: 1 },
  { sparse: true }
)

// TTL index (auto-delete; the indexed field must hold a Date)
db.logs.createIndex(
  { created_at: 1 },
  { expireAfterSeconds: 60 * 60 * 24 * 7 }  // 7 days
)

View and Delete Indexes

        
// View indexes
db.users.getIndexes()

// Delete index
db.users.dropIndex("email_1")

// Delete all non-_id indexes
db.users.dropIndexes()

Aggregation Pipeline

Basic Structure

        
        
        
    
db.orders.aggregate([
  { $match: { status: "completed" } },
  { $group: { _id: "$user_id", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } },
  { $limit: 10 }
])
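
To make the stages concrete, here is roughly what that pipeline computes, written as plain JavaScript over a small in-memory array. The sample documents are made up, and this is only an illustration of the semantics, not how the server executes the pipeline.

```javascript
// Plain-JavaScript emulation of $match -> $group -> $sort -> $limit
// on made-up sample data.
const orders = [
  { user_id: "u1", status: "completed", amount: 40 },
  { user_id: "u2", status: "completed", amount: 25 },
  { user_id: "u1", status: "completed", amount: 10 },
  { user_id: "u3", status: "pending", amount: 99 },
];

const totals = new Map();
for (const order of orders) {
  if (order.status !== "completed") continue; // $match
  // $group by user_id with $sum of amount
  totals.set(order.user_id, (totals.get(order.user_id) || 0) + order.amount);
}

const topSpenders = [...totals]
  .map(([_id, total]) => ({ _id, total }))
  .sort((a, b) => b.total - a.total) // $sort: { total: -1 }
  .slice(0, 10);                     // $limit: 10

console.log(JSON.stringify(topSpenders));
// [{"_id":"u1","total":50},{"_id":"u2","total":25}]
```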

Common Stages

        
        
        
    
// $match - filter
{ $match: { status: "active", age: { $gte: 18 } } }

// $group - group
{ $group: {
    _id: "$category",
    count: { $sum: 1 },
    avgPrice: { $avg: "$price" }
  }
}

// $project - select fields
{ $project: { name: 1, email: 1, _id: 0 } }

// $sort - sort
{ $sort: { created_at: -1 } }

// $limit - limit
{ $limit: 100 }

// $skip - skip
{ $skip: 50 }

// $lookup - left outer join (matches are returned as an array in the "as" field)
{ $lookup: {
    from: "users",
    localField: "user_id",
    foreignField: "_id",
    as: "user"
  }
}
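
The shape of a `$lookup` result is easy to misread: every input document is kept, and the matches arrive as an array that may be empty. The sketch below emulates that behaviour in plain JavaScript on made-up data; the `lookup` function is hypothetical.

```javascript
// Emulates $lookup (a left outer join) in plain JavaScript on made-up data.
const users = [{ _id: 1, name: "Alice" }, { _id: 2, name: "Bob" }];
const orders = [
  { _id: 101, user_id: 1 },
  { _id: 102, user_id: 3 }, // no matching user
];

function lookup(localDocs, foreignDocs, { localField, foreignField, as }) {
  return localDocs.map((doc) => ({
    ...doc,
    // Matches are always an array; it is empty when nothing matches
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const joined = lookup(orders, users, {
  localField: "user_id",
  foreignField: "_id",
  as: "user",
});
console.log(JSON.stringify(joined));
// [{"_id":101,"user_id":1,"user":[{"_id":1,"name":"Alice"}]},{"_id":102,"user_id":3,"user":[]}]
```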

Common Patterns

        
        
        
    
// User order statistics
db.orders.aggregate([
  { $match: { user_id: ObjectId("...") } },
  { $group: {
      _id: "$user_id",
      orderCount: { $sum: 1 },
      totalSpent: { $sum: "$amount" },
      avgOrderValue: { $avg: "$amount" }
    }
  }
])

// Date grouping
{ $group: {
    _id: {
      year: { $year: "$created_at" },
      month: { $month: "$created_at" }
    },
    count: { $sum: 1 }
  }
}
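
As an illustration of what the date grouping produces, the sketch below counts made-up events per year and month in plain JavaScript. `$year` and `$month` work in UTC unless a timezone is supplied, so the UTC getters are used here.

```javascript
// Emulates grouping by { year, month } in plain JavaScript on made-up data.
const events = [
  { created_at: new Date("2024-01-15T10:00:00Z") },
  { created_at: new Date("2024-01-31T23:59:59Z") },
  { created_at: new Date("2024-02-01T00:00:00Z") },
];

const counts = {};
for (const { created_at } of events) {
  const year = created_at.getUTCFullYear();
  const month = String(created_at.getUTCMonth() + 1).padStart(2, "0"); // getUTCMonth() is 0-based
  const key = `${year}-${month}`;
  counts[key] = (counts[key] || 0) + 1;
}
console.log(counts); // { '2024-01': 2, '2024-02': 1 }
```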

Performance Optimization

Explain

        
// Analyze query
db.orders.find({ user_id: ObjectId("...") }).explain("executionStats")

// Key metrics
// - executionTimeMillis: execution time
// - totalKeysExamined: index entries scanned
// - totalDocsExamined: documents examined
// - nReturned: documents returned
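
These numbers are easier to act on when reduced to a couple of signals: whether the plan contains a collection scan, and how many documents were examined per document returned. The sketch below does that for an explain-shaped object. `summarizeExplain` is a hypothetical helper, the sample objects are made up, and it assumes the classic `queryPlanner.winningPlan` layout with `inputStage`/`inputStages` children.

```javascript
// Hypothetical helper that summarizes an explain("executionStats") result.
function summarizeExplain(explain) {
  const { nReturned, totalDocsExamined, totalKeysExamined, executionTimeMillis } =
    explain.executionStats;

  // Walk the winning plan to see whether any stage is a full collection scan.
  let usesCollScan = false;
  const stack = [explain.queryPlanner.winningPlan];
  while (stack.length > 0) {
    const stage = stack.pop();
    if (stage.stage === "COLLSCAN") usesCollScan = true;
    if (stage.inputStage) stack.push(stage.inputStage);
    if (stage.inputStages) stack.push(...stage.inputStages);
  }

  return {
    usesCollScan,
    executionTimeMillis,
    totalKeysExamined,
    // Close to 1 is good; a large value means many documents were read and discarded
    docsExaminedPerReturned: nReturned > 0 ? totalDocsExamined / nReturned : totalDocsExamined,
  };
}

const indexed = {
  queryPlanner: { winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } } },
  executionStats: { nReturned: 5, totalDocsExamined: 5, totalKeysExamined: 5, executionTimeMillis: 1 },
};
const scanned = {
  queryPlanner: { winningPlan: { stage: "COLLSCAN" } },
  executionStats: { nReturned: 1, totalDocsExamined: 1000, totalKeysExamined: 0, executionTimeMillis: 12 },
};
console.log(summarizeExplain(indexed));
console.log(summarizeExplain(scanned));
```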

Covered Queries

        
// Good: index covers query
db.users.createIndex({ email: 1, name: 1 })
db.users.find({ email: "alice@example.com" }, { name: 1, _id: 0 })

// A covered query is answered from the index alone
// executionStats.totalDocsExamined === 0

Avoid Full Collection Scans

        
// Bad: unanchored, case-insensitive regex scans every document (or every index entry)
db.users.find({ name: /.*alice.*/i })

// Good: use index
db.users.find({ email: "alice@example.com" })

// Use a case-sensitive regex prefix
db.users.find({ name: /^alice/ })  // Can use an index on name
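
When the prefix comes from user input, regex metacharacters should be escaped so that the pattern stays a literal, anchored prefix. A minimal sketch, where `prefixRegex` is a hypothetical helper:

```javascript
// Hypothetical helper: builds a case-sensitive, anchored prefix regex from
// user input, escaping regex metacharacters so they are matched literally.
function prefixRegex(input) {
  const escaped = input.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped);
}

const re = prefixRegex("al.ce");
console.log(re.source);              // ^al\.ce
console.log(re.test("al.ce smith")); // true
console.log(re.test("alice"));       // false, the dot is literal
```

In mongosh the result can be used directly as a query value, for example `db.users.find({ name: prefixRegex(input) })`.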

Replica Set

Replication Principle

Primary: accepts all write operations and records them in its oplog
Secondary: replicates the Primary’s oplog and can be elected Primary
Arbiter: votes in elections but holds no data
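
Elections require a majority of voting members, which is why replica sets usually have an odd number of voters. The arithmetic can be sketched as follows; `electionMath` is a name chosen for this example.

```javascript
// Votes needed to elect a primary, and how many voting members can be lost
// while a majority is still reachable.
function electionMath(votingMembers) {
  const majority = Math.floor(votingMembers / 2) + 1;
  return { majority, faultTolerance: votingMembers - majority };
}

for (const n of [2, 3, 4, 5]) {
  console.log(n, electionMath(n));
}
// 2 { majority: 2, faultTolerance: 0 }
// 3 { majority: 2, faultTolerance: 1 }
// 4 { majority: 3, faultTolerance: 1 }
// 5 { majority: 3, faultTolerance: 2 }
```

Going from three to four voting members adds no fault tolerance, which is the usual argument for adding an arbiter or a third data-bearing node rather than a fourth.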

Configure Replica Set

        
        
        
    
// Initialize replica set (each mongod must be started with --replSet myapp)
rs.initiate({
  _id: "myapp",
  members: [
    { _id: 0, host: "localhost:27017" },
    { _id: 1, host: "localhost:27018" },
    { _id: 2, host: "localhost:27019", arbiterOnly: true }
  ]
})

// Check status
rs.status()

// Check primary
db.hello()

Sharding

Sharding Strategies

        
// Hash sharding (even distribution)
sh.shardCollection("myapp.orders", { _id: "hashed" })

// Range sharding (by range)
sh.shardCollection("myapp.users", { user_id: 1 })

Sharding Considerations

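Choose a shard key with high cardinality; avoid monotonically increasing keys with range sharding, since new writes all land on one shard
Connect applications through mongos, which routes queries to the right shards
Consider zone sharding to keep data near its users geographically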
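
The warning about monotonic shard keys can be illustrated with a toy simulation: the most recent 1,000 keys of an increasing sequence are placed on four shards, once by contiguous range and once by hash. Everything here is an assumption for illustration: the even range split, the key space size, and `fmix32`, which is a generic integer mixer standing in for MongoDB's real hashed-index function.

```javascript
// Toy simulation: where do 1,000 monotonically increasing keys land on 4 shards?
// fmix32 is a stand-in integer mixer, NOT MongoDB's hashed-index function.
function fmix32(h) {
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // force an unsigned 32-bit result
}

const SHARDS = 4;
const KEY_SPACE = 1000000; // assumed key range, split evenly across shards
// The latest inserts: keys 999000..999999
const newKeys = Array.from({ length: 1000 }, (_, i) => KEY_SPACE - 1000 + i);

const rangeCounts = new Array(SHARDS).fill(0);
const hashedCounts = new Array(SHARDS).fill(0);
for (const key of newKeys) {
  rangeCounts[Math.floor(key / (KEY_SPACE / SHARDS))] += 1; // range: contiguous key ranges
  hashedCounts[fmix32(key) % SHARDS] += 1;                  // hashed: placed by hash value
}

console.log("range :", rangeCounts); // every new key hits the last shard
console.log("hashed:", hashedCounts); // spread across all shards
```

With range sharding the last shard absorbs all new writes, while hashing spreads them out at the cost of efficient range queries on the key.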

Security

Authentication

        
        
        
    
// Create user administrator
use admin
db.createUser({
  user: "admin",
  pwd: passwordPrompt(),  // prompts for the password instead of leaving it in shell history
  roles: [
    { role: "userAdminAnyDatabase", db: "admin" },
    { role: "readWriteAnyDatabase", db: "admin" }
  ]
})

// Connect with auth (run from the system shell; -p without a value prompts for the password)
mongosh -u admin -p --authenticationDatabase admin

Authorization

        
        
        
    
// Create user for application
use myapp
db.createUser({
  user: "myapp_user",
  pwd: passwordPrompt(),  // prompts for the password instead of hard-coding it
  roles: [
    { role: "readWrite", db: "myapp" }
  ]
})
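
Application users should hold only database-scoped roles, so a quick audit for cluster-wide roles can be useful. The sketch below flags them in a user document shaped like the ones passed to `db.createUser`. `findBroadRoles` is a hypothetical helper, and its role list covers only some built-in roles.

```javascript
// Hypothetical helper: flags roles that grant access beyond a single database.
// The list covers some built-in cluster-wide roles and is not exhaustive.
const BROAD_ROLES = new Set([
  "root",
  "readAnyDatabase",
  "readWriteAnyDatabase",
  "userAdminAnyDatabase",
  "dbAdminAnyDatabase",
  "clusterAdmin",
]);

function findBroadRoles(user) {
  return user.roles.filter((r) => BROAD_ROLES.has(r.role)).map((r) => r.role);
}

const appUser = { user: "myapp_user", roles: [{ role: "readWrite", db: "myapp" }] };
const adminUser = {
  user: "admin",
  roles: [
    { role: "userAdminAnyDatabase", db: "admin" },
    { role: "readWriteAnyDatabase", db: "admin" },
  ],
};

console.log(findBroadRoles(appUser));   // []
console.log(findBroadRoles(adminUser)); // [ 'userAdminAnyDatabase', 'readWriteAnyDatabase' ]
```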