Aggregation Pipeline

Top 15 MongoDB aggregation pipeline interview questions. Master $group, $lookup, $unwind, $project, $facet, $graphLookup and window functions with real code examples.

MongoDB

15 Questions

The aggregation pipeline is MongoDB's most powerful data processing framework, consisting of a sequence of stages that transform documents step-by-step. Data flows through stages like water through a pipe — each stage receives documents from the previous stage, processes them, and passes results onward. Common stages include $match (filter), $group (aggregate), $project (reshape), $sort, $limit, $skip, $lookup (join), $unwind (flatten arrays), and $addFields. Unlike the find() API, aggregation can compute totals, averages, transformations, and joins — making it the backbone of analytics and reporting.

// Basic pipeline structure
db.orders.aggregate([
  // Stage 1: Filter
  { $match: { status: "completed", createdAt: { $gte: ISODate("2026-01-01") } } },

  // Stage 2: Group and compute totals
  { $group: {
    _id: "$customerId",
    totalSpent: { $sum: "$amount" },
    orderCount: { $sum: 1 },
    avgOrder:  { $avg: "$amount" }
  }},

  // Stage 3: Add computed field
  { $addFields: {
    tier: { $cond: { if: { $gte: ["$totalSpent", 10000] }, then: "VIP", else: "Regular" } }
  }},

  // Stage 4: Sort
  { $sort: { totalSpent: -1 } },

  // Stage 5: Limit top 10
  { $limit: 10 }
]);

Why it matters: The aggregation pipeline is the most-tested advanced MongoDB topic. Interviewers assess your ability to chain stages logically and use the right operator for each transformation.

Real applications: Sales dashboards (total revenue per customer), user analytics (active users per day), inventory reports (low-stock products by category).

Common mistakes: Not putting $match first — placing it early reduces data flowing through all subsequent stages. Late $match stages miss index optimization opportunities.

The $group stage aggregates documents by a specified _id expression (the grouping key) and applies accumulator operators to compute values. Each unique grouping key value produces one output document. Setting _id: null groups all documents together for a total. Key accumulators: $sum (total), $avg (average), $min/$max (extremes), $count (document count, MongoDB 5.0+), $push (array of values), $addToSet (unique array), $first/$last (first/last value in group), $stdDevPop/$stdDevSamp (standard deviation).

// Group by product category
db.products.aggregate([{
  $group: {
    _id: "$category",              // group by category field
    count:      { $sum: 1 },       // count documents
    totalStock: { $sum: "$stock" },
    avgPrice:   { $avg: "$price" },
    maxPrice:   { $max: "$price" },
    minPrice:   { $min: "$price" },
    allBrands:  { $addToSet: "$brand" }, // unique brands array
    firstAdded: { $first: "$createdAt" }
  }
}]);

// Group all documents (grand total)
db.orders.aggregate([{
  $group: {
    _id: null,                     // null = all documents
    grandTotal: { $sum: "$amount" },
    orderCount: { $count: {} }     // MongoDB 5.0+
  }
}]);

// Multi-field grouping key
db.sales.aggregate([{
  $group: {
    _id: { year: { $year: "$saleDate" }, category: "$category" },
    revenue: { $sum: "$amount" }
  }
}]);

Why it matters: $group is the core of all aggregation queries. Mastering accumulators and grouping key expressions is essential for any data analytics or reporting task.

Real applications: Monthly revenue reports, user activity summaries (login count per user per month), inventory valuation (total stock value per warehouse).

Common mistakes: Forgetting that $group does not guarantee output order — always add $sort after $group if order matters.

The $lookup stage performs a left outer join between collections, similar to SQL's LEFT JOIN. It matches documents from a foreign collection based on a field equality condition and adds the matching documents as an array field on the input documents. Basic syntax uses from (foreign collection), localField, foreignField, and as (output array field). MongoDB 3.6+ added expressive $lookup using a pipeline, enabling complex multi-condition joins, computed fields, and nested lookups. Non-matching documents get an empty array.

// Basic $lookup — join orders with users
db.orders.aggregate([{
  $lookup: {
    from: "users",          // foreign collection
    localField: "userId",   // field in orders
    foreignField: "_id",    // field in users
    as: "userDetails"       // output field name (array)
  }
}]);
// Result: each order document gets "userDetails": [{ ... user doc ... }]

// Unwind the array (if one-to-one)
db.orders.aggregate([
  { $lookup: { from: "users", localField: "userId", foreignField: "_id", as: "user" } },
  { $unwind: "$user" },     // flatten array to single object
  { $project: { "user.password": 0 } }  // exclude sensitive fields
]);

// Expressive $lookup (pipeline) — join with conditions
db.orders.aggregate([{
  $lookup: {
    from: "products",
    let: { orderItems: "$items" },
    pipeline: [
      { $match: { $expr: { $in: ["$_id", "$$orderItems"] } } },
      { $project: { name: 1, price: 1 } }
    ],
    as: "productDetails"
  }
}]);

Why it matters: $lookup is critical for working with normalized data models. Tests knowledge of MongoDB's join capabilities and when to denormalize vs join.

Real applications: Order detail views joining order items with product info, user activity feeds joining events with user profiles, comment threads joining comments with author details.

Common mistakes: Not indexing foreignField — $lookup performs a collection scan on the foreign collection per input document without an index, making it extremely slow.

The $unwind stage deconstructs an array field from input documents, outputting one document per array element. If a document has an array with 5 elements, $unwind produces 5 documents, each with one array element. This is essential before performing $group or $sort on array contents. By default, documents with missing or empty array fields are excluded — use preserveNullAndEmptyArrays: true to keep them. The includeArrayIndex option adds a field with the array element's original index position.

// Input: { _id: 1, tags: ["mongodb", "database", "nosql"] }
db.posts.aggregate([{ $unwind: "$tags" }]);
// Output:
// { _id: 1, tags: "mongodb" }
// { _id: 1, tags: "database" }
// { _id: 1, tags: "nosql" }

// Count tags across all posts
db.posts.aggregate([
  { $unwind: "$tags" },
  { $group: { _id: "$tags", count: { $sum: 1 } } },
  { $sort: { count: -1 } }
]);

// Preserve documents with missing/empty arrays
db.products.aggregate([{
  $unwind: {
    path: "$reviews",
    preserveNullAndEmptyArrays: true,  // keep products with no reviews
    includeArrayIndex: "reviewIndex"    // add index field
  }
}]);

// Unwind nested array
db.orders.aggregate([
  { $unwind: "$items" },
  { $group: { _id: "$items.productId", totalSold: { $sum: "$items.qty" } } }
]);

Why it matters: Understanding array deconstruction is crucial for analytics on array fields. $unwind enables per-element aggregations that would otherwise be impossible.

Real applications: Tag frequency analysis, sales per product SKU from order item arrays, skill distribution analysis from user skill arrays.

Common mistakes: Unwinding without preserveNullAndEmptyArrays silently drops documents with missing arrays, causing incorrect totals in reports.

The $project stage reshapes each document by including, excluding, renaming, or computing fields. Unlike find() projections, $project in aggregation supports computed fields using expressions, string operations, arithmetic, date functions, and conditional logic. Use 1 to include, 0 to exclude. Computed fields use aggregation expressions with the $ prefix for field references. $addFields is a variant that only adds new fields without removing existing ones.

db.employees.aggregate([{
  $project: {
    // Include fields
    name: 1,
    email: 1,

    // Exclude sensitive fields
    password: 0,
    ssn: 0,

    // Rename field
    fullName: "$name",

    // Computed: annual salary
    annualSalary: { $multiply: ["$monthlySalary", 12] },

    // String concatenation
    displayName: { $concat: ["$firstName", " ", "$lastName"] },

    // Date parts
    joinYear: { $year: "$joinedAt" },

    // Conditional
    status: {
      $cond: { if: { $gte: ["$experience", 5] }, then: "Senior", else: "Junior" }
    },

    // Nested field rename
    "contact.primary": "$phone"
  }
}]);

// $addFields — add without removing existing
db.users.aggregate([{
  $addFields: {
    fullName: { $concat: ["$firstName", " ", "$lastName"] },
    ageGroup: {
      $switch: {
        branches: [
          { case: { $lt: ["$age", 25] }, then: "Young" },
          { case: { $lt: ["$age", 40] }, then: "Middle" }
        ],
        default: "Senior"
      }
    }
  }
}]);

Why it matters: $project is used in virtually every pipeline. Tests ability to compute derived fields and reshape documents for downstream stages or API responses.

Real applications: Computing profit margins from revenue/cost fields, creating display names from firstName/lastName, formatting dates into readable strings.

Common mistakes: Using $project to add fields while also excluding others — mixing inclusion/exclusion in aggregation $project is allowed, unlike find() projections.

$facet allows running multiple independent aggregation pipelines within a single stage, each producing its own output. This is perfect for generating multiple summaries in one query (e.g., faceted search results with category counts, price histograms, and rating distributions simultaneously). $bucket categorizes documents into user-defined buckets based on a field's value range. $bucketAuto automatically determines bucket boundaries for even distribution. These are key for implementing faceted search (like e-commerce filters).

// $facet — multiple pipelines in one query
db.products.aggregate([{
  $facet: {
    // Pipeline 1: Category breakdown
    "byCategory": [
      { $group: { _id: "$category", count: { $sum: 1 } } },
      { $sort: { count: -1 } }
    ],

    // Pipeline 2: Price distribution
    "priceRanges": [
      { $bucket: {
        groupBy: "$price",
        boundaries: [0, 1000, 5000, 20000, 100000],
        default: "Other",
        output: { count: { $sum: 1 }, avgPrice: { $avg: "$price" } }
      }}
    ],

    // Pipeline 3: Rating stats
    "ratingStats": [
      { $group: { _id: null, avgRating: { $avg: "$rating" }, total: { $sum: 1 } } }
    ]
  }
}]);

// $bucketAuto — auto-generate 5 equal buckets
db.products.aggregate([{
  $bucketAuto: {
    groupBy: "$price",
    buckets: 5,
    output: { count: { $sum: 1 }, avgPrice: { $avg: "$price" } }
  }
}]);

Why it matters: $facet is the MongoDB way to build e-commerce-style faceted search. This is a high-value interview topic showing advanced aggregation knowledge.

Real applications: E-commerce product search with simultaneous category counts, price ranges, and brand filters — all computed in one database round-trip.

Common mistakes: Running separate aggregation queries for each facet — $facet does all of them in a single pass through data, which is significantly more efficient.

MongoDB 3.6 introduced expressive $lookup that uses a pipeline instead of simple field equality, enabling complex join conditions, sub-pipelines, and multi-condition matches. The let option exposes local document fields as variables (prefixed with $$) inside the pipeline. The pipeline runs against the foreign collection with access to both the local variables and pipeline operators. This enables filtered joins, computed joins, and nested lookups that are impossible with basic $lookup.

// Expressive lookup — join with multiple conditions
db.orders.aggregate([{
  $lookup: {
    from: "promotions",
    let: { orderTotal: "$total", orderDate: "$createdAt", custId: "$customerId" },
    pipeline: [
      {
        $match: {
          $expr: {
            $and: [
              { $lte: ["$minOrderValue", "$$orderTotal"] }, // promo threshold met
              { $lte: ["$startDate", "$$orderDate"] },       // promo active
              { $gte: ["$endDate",   "$$orderDate"] }
            ]
          }
        }
      },
      { $project: { code: 1, discount: 1 } }  // only return needed fields
    ],
    as: "eligiblePromos"
  }
}]);

// Self-join — employees with their managers
db.employees.aggregate([{
  $lookup: {
    from: "employees",
    let: { managerId: "$managerId" },
    pipeline: [
      { $match: { $expr: { $eq: ["$_id", "$$managerId"] } } },
      { $project: { name: 1, email: 1 } }
    ],
    as: "manager"
  }
}, {
  $unwind: { path: "$manager", preserveNullAndEmptyArrays: true }
}]);

Why it matters: Expressive $lookup is a senior-level feature. It demonstrates mastery of MongoDB's join capabilities and shows you can handle complex data relationships without application-level joins.

Real applications: Eligibility checks (which promotions apply to an order), hierarchical data (employee-manager trees), product recommendations based on order history and category.

Common mistakes: Not indexing the foreign collection's match fields within the pipeline — each input document triggers a pipeline scan on the foreign collection without proper indexes.

$expr allows using aggregation expressions inside query operators like $match and find(). It enables comparing two fields in the same document, applying aggregation expressions in queries, and using variables. Without $expr, MongoDB query operators only compare a field against a literal value. With $expr, you can write { $gt: ["$fieldA", "$fieldB"] } — comparing two document fields. It's especially powerful in $lookup pipelines for correlated queries.

// Find orders where actual amount exceeds budget
db.orders.find({
  $expr: { $gt: ["$actualCost", "$budgetedCost"] }
});

// Use in $match stage
db.products.aggregate([
  { $match: {
    $expr: {
      $and: [
        { $gt: ["$price", "$cost"] },        // profitable
        { $lt: ["$stock", "$reorderPoint"] ] // needs reorder
      ]
    }
  }}
]);

// Compute with $expr
db.employees.find({
  $expr: {
    $gte: [
      { $multiply: ["$yearsExp", 10000] },  // computed value
      "$salary"
    ]
  }
});

// Variables in $expr ($$ROOT, $$CURRENT, $$NOW)
db.events.aggregate([{
  $match: {
    $expr: { $lt: ["$eventDate", "$$NOW"] } // already occurred
  }
}]);

Why it matters: $expr bridges the gap between query language and aggregation expressions. It's essential for field-to-field comparisons, which are impossible with regular query operators.

Real applications: Finding overdue tasks (due date < current date), profitable products (price > cost), over-budget projects (actual > estimated).

Common mistakes: Trying to compare two document fields with regular operators: { $gt: ["$a", "$b"] } doesn't work outside $expr — MongoDB treats "$b" as a literal string.

Both $out and $merge write aggregation results to a collection. $out (MongoDB 2.6+) completely replaces the target collection — if it exists, all existing documents are deleted and replaced with pipeline results. $merge (MongoDB 4.2+) is more flexible: it can insert, replace, merge, fail, or keep existing documents based on matching criteria. $merge supports upsert-like behavior for incrementally updating materialized views. Both must be the last stage in the pipeline and cannot be used in transactions.

// $out — replace collection entirely
db.orders.aggregate([
  { $match: { status: "completed" } },
  { $group: { _id: "$customerId", totalSpent: { $sum: "$amount" } } },
  { $out: "customer_lifetime_value" }  // replaces entire collection
]);

// $merge — upsert into existing collection
db.daily_sales.aggregate([
  { $group: { _id: "$date", revenue: { $sum: "$amount" }, count: { $sum: 1 } } },
  {
    $merge: {
      into: "sales_summary",         // target collection
      on: "_id",                     // match key
      whenMatched: "merge",          // merge fields if doc exists
      whenNotMatched: "insert"       // insert if doc doesn't exist
    }
  }
]);

// whenMatched options:
// "replace"   — replace entire document
// "merge"     — merge incoming with existing
// "keepExisting" — keep existing, discard new
// "fail"      — throw error on conflict
// pipeline    — custom merge logic

Why it matters: $merge enables building materialized views and incremental aggregations — a critical pattern for pre-computing expensive analytics without full recalculation.

Real applications: Daily revenue summaries updated incrementally each night, product popularity scores updated as new orders arrive, user analytics tables for reporting dashboards.

Common mistakes: Using $out for incremental updates — it deletes all existing records first. Use $merge with whenMatched: "merge" for incremental views.

$addFields and its alias $set (MongoDB 4.2+) add new fields to documents or overwrite existing ones without removing other fields. Unlike $project, they only specify what to add — all existing fields are retained automatically. Multiple fields can be added in a single stage. They support all aggregation expressions for computing values. $unset (alias: $project with 0) removes fields. Using $set/$unset is the idiomatic MongoDB 4.2+ style for field manipulation.

// $addFields / $set — add computed fields
db.products.aggregate([{
  $set: {
    // Add profit margin
    margin: {
      $multiply: [
        { $divide: [{ $subtract: ["$price", "$cost"] }, "$price"] },
        100
      ]
    },

    // Add full name from parts
    fullName: { $concat: ["$brand", " - ", "$model"] },

    // Overwrite existing field with formatted version
    price: { $round: ["$price", 2] },

    // Conditional tag
    inStock: { $gt: ["$stock", 0] },

    // Current timestamp
    processedAt: "$$NOW"
  }
}]);

// $unset — remove fields
db.users.aggregate([
  { $unset: ["password", "ssn", "creditCard"] }
]);

// Chain $set stages for readability
db.employees.aggregate([
  { $set: { annualSalary: { $multiply: ["$monthlySalary", 12] } } },
  { $set: { taxBracket: { $switch: {
    branches: [
      { case: { $gte: ["$annualSalary", 1000000] }, then: "30%" },
      { case: { $gte: ["$annualSalary", 500000] }, then: "20%" }
    ],
    default: "10%"
  }}}},
  { $unset: "monthlySalary" }
]);

Why it matters: $set/$addFields is the most readable way to enrich documents in pipelines. Shows awareness of MongoDB 4.2+ modernized syntax and the preference for readable stage names.

Real applications: Computing derived metrics (profit margins, engagement rates), adding display fields for API responses, and normalizing data during migration pipelines.

Common mistakes: Using $project when only adding fields — this requires explicitly listing all fields you want to keep, making pipelines verbose and fragile when schema changes.

When using $lookup followed by $unwind, the combined behavior is like an INNER JOIN — documents without matches are dropped because $lookup produces an empty array and $unwind removes documents with empty arrays. To simulate a LEFT JOIN (keep unmatched documents), use $unwind with preserveNullAndEmptyArrays: true. This is one of the most common patterns in MongoDB aggregation and is critical for understanding when documents might be accidentally lost in pipelines.

// INNER JOIN behavior (drops unmatched orders)
db.orders.aggregate([
  { $lookup: { from: "users", localField: "userId", foreignField: "_id", as: "user" } },
  { $unwind: "$user" }  // drops orders with no matching user!
]);

// LEFT JOIN behavior (keeps orders without user)
db.orders.aggregate([
  { $lookup: { from: "users", localField: "userId", foreignField: "_id", as: "user" } },
  { $unwind: { path: "$user", preserveNullAndEmptyArrays: true } }
  // unmatched orders: user = null
]);

// Check for orphaned records
db.orders.aggregate([
  { $lookup: { from: "users", localField: "userId", foreignField: "_id", as: "user" } },
  { $match: { user: { $size: 0 } } }  // find orders with no user
]);

// Best practice: filter early, join late
db.orders.aggregate([
  { $match: { status: "pending", createdAt: { $gte: ISODate("2026-01-01") } } },
  { $lookup: { from: "users", localField: "userId", foreignField: "_id", as: "user" } },
  { $unwind: { path: "$user", preserveNullAndEmptyArrays: true } },
  { $project: { "user.password": 0 } }
]);

Why it matters: This is the most common data integrity issue in MongoDB aggregations. Accidentally dropping documents by not using preserveNullAndEmptyArrays causes silent incorrect report totals.

Real applications: Order reports must include all orders even if the user account was deleted. Product listings should show all products even those without reviews.

Common mistakes: In analytics, missing the preserveNullAndEmptyArrays flag causes totals to silently undercount — the most common cause of data discrepancies in MongoDB reports.

$graphLookup performs recursive lookups in a collection, enabling traversal of hierarchical and graph-like data structures like organizational charts, category trees, friend networks, and bill-of-materials. It starts from a set of seed documents, then recursively looks up connected documents based on matching fields, extending up to a maxDepth limit. The result is an array of all reachable documents in the traversal. Unlike application-level recursion with multiple queries, $graphLookup does this in a single aggregation pipeline stage.

// Employee hierarchy — find all reports under a manager
db.employees.aggregate([{
  $graphLookup: {
    from: "employees",          // same collection (self-join)
    startWith: "$_id",          // start from this employee's ID
    connectFromField: "_id",    // match outgoing field
    connectToField: "managerId",// field in other docs that points back
    as: "allReports",           // output array field
    maxDepth: 5,                // max recursion depth
    depthField: "level"         // add depth level to each result
  }
}]);

// Category tree — find all subcategories
db.categories.aggregate([
  { $match: { name: "Electronics" } },
  {
    $graphLookup: {
      from: "categories",
      startWith: "$_id",
      connectFromField: "_id",
      connectToField: "parentId",
      as: "allSubcategories",
      maxDepth: 4
    }
  }
]);

// Friend-of-friend network (social graph)
db.users.aggregate([
  { $match: { _id: userId } },
  { $graphLookup: {
    from: "users",
    startWith: "$friends",
    connectFromField: "friends",
    connectToField: "_id",
    as: "network",
    maxDepth: 2           // friends of friends
  }}
]);

Why it matters: Shows mastery of graph/tree data problems in MongoDB. This stage replaces multiple recursive application-level queries with a single efficient database operation.

Real applications: Org chart traversal, e-commerce category trees, bill of materials (product components), and social network friend graphs.

Common mistakes: Not setting maxDepth on circular graphs — this causes infinite loops. Also not indexing connectToField, making recursive lookups extremely slow on large collections.

$setWindowFields (MongoDB 5.0+) brings window function capabilities similar to SQL's OVER() clause, allowing computations across a defined window of documents without collapsing them into groups. Unlike $group which reduces N documents to 1, $setWindowFields adds computed fields to each document while considering surrounding documents. Key operators: $rank, $denseRank (ranking), $sum, $avg, $count (cumulative/windowed), $shift (access previous/next document), $derivative, $integral (rate of change), $first/$last (window boundaries).

// Running total (cumulative sum)
db.sales.aggregate([{
  $setWindowFields: {
    partitionBy: "$region",       // group by region
    sortBy: { saleDate: 1 },      // order within partition
    output: {
      runningRevenue: {
        $sum: "$revenue",
        window: { documents: ["unbounded", "current"] } // from start to current
      },
      rank: { $rank: {} },        // rank within region by saleDate
      prevDayRevenue: {
        $shift: { output: "$revenue", by: -1, default: 0 } // previous day
      }
    }
  }
}]);

// Moving average (last 7 days)
db.metrics.aggregate([{
  $setWindowFields: {
    sortBy: { date: 1 },
    output: {
      movingAvg: {
        $avg: "$value",
        window: { documents: [-6, 0] } // current + 6 previous docs
      }
    }
  }
}]);

// Dense rank for leaderboard
db.scores.aggregate([{
  $setWindowFields: {
    sortBy: { score: -1 },
    output: { leaderboardRank: { $denseRank: {} } }
  }
}]);

Why it matters: Window functions are a major MongoDB 5.0 feature that brings MongoDB on par with SQL analytics. Shows awareness of modern MongoDB capabilities for time-series and analytics workloads.

Real applications: Sales leaderboards with ranks, 7-day moving average metrics, cumulative revenue by region, day-over-day growth rates.

Common mistakes: Using $group + application-level ranking instead of $setWindowFields — this is far less efficient and requires multiple queries or complex application code.

MongoDB automatically optimizes some pipeline stages, but understanding manual optimization techniques is critical for production performance. Key rules: (1) Put $match and $limit as early as possible to reduce data volume flowing through stages. (2) $match + $sort at the start uses an index. (3) Use $project early to reduce document size. (4) Avoid $unwind before $group when possible (use $group with $push). (5) Use allowDiskUse: true for large aggregations exceeding 100 MB memory limit. (6) Use the explain() method to analyze pipeline performance.

// ❌ BAD: $match at end (processes all docs)
db.orders.aggregate([
  { $lookup: { ... } },
  { $unwind: "$items" },
  { $group: { ... } },
  { $match: { status: "completed" } }  // too late!
]);

// ✅ GOOD: $match first (reduces docs early)
db.orders.aggregate([
  { $match: { status: "completed", createdAt: { $gte: startDate } } }, // uses index
  { $lookup: { ... } },
  { $project: { "items.price": 1, customerId: 1 } }, // reduce doc size
  { $unwind: "$items" },
  { $group: { ... } }
]);

// Allow disk use for large aggregations (>100MB)
db.bigCollection.aggregate([...], { allowDiskUse: true });

// Explain aggregation pipeline
db.orders.aggregate([...], { explain: true });
// or
db.orders.explain("executionStats").aggregate([...]);

// Use $indexStats to find unused indexes
db.orders.aggregate([{ $indexStats: {} }]);

Why it matters: Pipeline optimization is a key topic for senior roles and system design. Poorly ordered pipelines on large collections can be 100× slower than optimized ones.

Real applications: Nightly analytics jobs processing millions of documents must use allowDiskUse and carefully ordered stages to complete in reasonable time.

Common mistakes: Forgetting allowDiskUse: true for aggregations on large collections — MongoDB throws a memory exceeded error at 100 MB without it.

MongoDB's aggregation framework includes comprehensive date operator expressions for extracting, formatting, and computing date values. Key operators: $year, $month, $dayOfMonth, $hour, $minute, $second, $dayOfWeek, $week, $isoWeek (extract date parts), $dateToString (format date as string), $dateFromString (parse string to date), $dateAdd/$dateSubtract (date arithmetic, MongoDB 5.0+), $dateDiff (difference between dates, MongoDB 5.0+). These enable powerful time-series aggregations and date-based reporting.

// Group sales by month and year
db.sales.aggregate([{
  $group: {
    _id: {
      year:  { $year:  "$saleDate" },
      month: { $month: "$saleDate" }
    },
    revenue: { $sum: "$amount" },
    count: { $sum: 1 }
  }
}, { $sort: { "_id.year": 1, "_id.month": 1 } }]);

// $dateToString — custom format
db.events.aggregate([{
  $project: {
    formattedDate: { $dateToString: { format: "%Y-%m-%d", date: "$createdAt" } },
    dayOfWeek: { $dayOfWeek: "$createdAt" }, // 1=Sun, 7=Sat
    hour: { $hour: "$createdAt" }
  }
}]);

// $dateDiff — time-since-signup in days
db.users.aggregate([{
  $addFields: {
    accountAgeDays: {
      $dateDiff: {
        startDate: "$createdAt",
        endDate: "$$NOW",
        unit: "day"
      }
    }
  }
}]);

// Group by week
db.orders.aggregate([{
  $group: {
    _id: { week: { $isoWeek: "$createdAt" }, year: { $isoWeekYear: "$createdAt" } },
    total: { $sum: "$amount" }
  }
}]);

Why it matters: Date aggregations are fundamental for business intelligence and analytics. Every production application needs date-based reporting — by day, week, month, hour.

Real applications: Daily active user charts, monthly revenue reports, peak hour analysis for capacity planning, and cohort analysis by signup week.

Common mistakes: Not accounting for timezone offsets — using $dateToString without a timezone option returns UTC dates, which may not match the user's local dates for business reporting.

Prev CRUD Operations Next Indexes & Performance

Aggregation Pipeline

1What is the MongoDB aggregation pipeline and how does it work?

2What does the $group stage do and what accumulators are available?

3How does the $lookup stage work for joining collections?

4What does $unwind do and when do you need it?

5How does the $project stage reshape documents?

6What are $facet and $bucket stages used for?

7How does the $lookup with pipeline (expressive join) work?

8What is $expr and how is it used in aggregation?

9What is $merge and $out and how do they differ?

10How do you use $addFields and $set in aggregation?

11What is the $lookup with $unwind vs preserveNullAndEmptyArrays pattern?

12How does $graphLookup enable recursive tree/graph traversals?

13How do you use window functions with $setWindowFields?

14What are aggregation pipeline optimization techniques?

15How do you use $dateToString and date operators in aggregation?