MongoDB Index Optimization and Query Performance Tuning: Advanced Database Performance Engineering
Modern enterprise applications demand exceptional database performance to support millions of users, complex queries, and real-time analytics workloads. Traditional approaches to database performance optimization often rely on rigid indexing strategies, manual query tuning, and reactive performance monitoring that fails to scale with growing data volumes and evolving access patterns.
MongoDB's flexible indexing system provides comprehensive performance optimization capabilities that combine intelligent index selection, advanced compound indexing strategies, and sophisticated query execution analysis. Unlike traditional database systems that require extensive manual tuning, MongoDB's index optimization features enable proactive performance management with automated recommendations, flexible indexing patterns, and detailed performance analytics.
The Traditional Database Performance Challenge
Relational database performance optimization has significant complexity and maintenance overhead:
-- Traditional PostgreSQL performance optimization - complex and manual
-- Customer orders table with performance challenges
CREATE TABLE customer_orders (
order_id BIGSERIAL PRIMARY KEY,
customer_id BIGINT NOT NULL,
order_date TIMESTAMP NOT NULL,
order_status VARCHAR(20) NOT NULL,
total_amount DECIMAL(12,2) NOT NULL,
shipping_address_id BIGINT,
billing_address_id BIGINT,
payment_method VARCHAR(50),
shipping_method VARCHAR(50),
order_priority VARCHAR(20) DEFAULT 'standard',
sales_rep_id BIGINT,
-- Additional fields for complex queries
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
completed_at TIMESTAMP,
cancelled_at TIMESTAMP,
-- Foreign key constraints
CONSTRAINT fk_customer FOREIGN KEY (customer_id) REFERENCES customers(customer_id),
CONSTRAINT fk_shipping_address FOREIGN KEY (shipping_address_id) REFERENCES addresses(address_id),
CONSTRAINT fk_billing_address FOREIGN KEY (billing_address_id) REFERENCES addresses(address_id),
CONSTRAINT fk_sales_rep FOREIGN KEY (sales_rep_id) REFERENCES employees(employee_id)
);
-- Order items table for line-level details
CREATE TABLE order_items (
item_id BIGSERIAL PRIMARY KEY,
order_id BIGINT NOT NULL,
product_id BIGINT NOT NULL,
quantity INTEGER NOT NULL,
unit_price DECIMAL(10,2) NOT NULL,
discount_amount DECIMAL(10,2) DEFAULT 0.00,
tax_amount DECIMAL(10,2) NOT NULL,
CONSTRAINT fk_order FOREIGN KEY (order_id) REFERENCES customer_orders(order_id),
CONSTRAINT fk_product FOREIGN KEY (product_id) REFERENCES products(product_id)
);
-- Manual index creation - requires extensive analysis and planning
-- Basic indexes for common queries
CREATE INDEX idx_orders_customer_id ON customer_orders(customer_id);
CREATE INDEX idx_orders_order_date ON customer_orders(order_date);
CREATE INDEX idx_orders_status ON customer_orders(order_status);
-- Compound indexes for complex query patterns
CREATE INDEX idx_orders_customer_date ON customer_orders(customer_id, order_date DESC);
CREATE INDEX idx_orders_status_date ON customer_orders(order_status, order_date DESC);
CREATE INDEX idx_orders_rep_status ON customer_orders(sales_rep_id, order_status, order_date DESC);
-- Partial indexes for selective filtering
CREATE INDEX idx_orders_completed_recent ON customer_orders(completed_at, total_amount)
WHERE order_status = 'completed' AND completed_at >= CURRENT_DATE - INTERVAL '90 days';
-- Covering indexes for query optimization (include columns)
CREATE INDEX idx_orders_customer_covering ON customer_orders(customer_id, order_date DESC)
INCLUDE (order_status, total_amount, payment_method);
-- Complex multi-table query requiring careful index planning
SELECT
o.order_id,
o.order_date,
o.total_amount,
o.order_status,
c.customer_name,
c.customer_email,
-- Aggregated order items (expensive without proper indexes)
COUNT(oi.item_id) as item_count,
SUM(oi.quantity * oi.unit_price) as items_subtotal,
SUM(oi.discount_amount) as total_discount,
SUM(oi.tax_amount) as total_tax,
-- Product information (requires additional joins)
array_agg(DISTINCT p.product_name) as product_names,
array_agg(DISTINCT p.category) as product_categories,
-- Address information (more joins)
sa.street_address as shipping_street,
sa.city as shipping_city,
sa.state as shipping_state,
-- Employee information
e.first_name || ' ' || e.last_name as sales_rep_name
FROM customer_orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
LEFT JOIN addresses sa ON o.shipping_address_id = sa.address_id
LEFT JOIN employees e ON o.sales_rep_id = e.employee_id
WHERE
o.order_date >= CURRENT_DATE - INTERVAL '30 days'
AND o.order_status IN ('processing', 'shipped', 'delivered')
AND o.total_amount >= 100.00
AND c.customer_tier IN ('premium', 'enterprise')
GROUP BY
o.order_id, o.order_date, o.total_amount, o.order_status,
c.customer_name, c.customer_email,
sa.street_address, sa.city, sa.state,
e.first_name, e.last_name
HAVING COUNT(oi.item_id) >= 2
ORDER BY o.order_date DESC, o.total_amount DESC
LIMIT 100;
-- Analyze query performance (complex interpretation required)
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT o.order_id, o.total_amount, c.customer_name
FROM customer_orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '7 days'
AND o.order_status = 'completed'
AND c.customer_tier = 'premium'
ORDER BY o.total_amount DESC
LIMIT 50;
-- Performance monitoring queries (complex and manual)
SELECT
schemaname,
tablename,
attname as column_name,
n_distinct,
correlation,
most_common_vals,
most_common_freqs
FROM pg_stats
WHERE schemaname = 'public'
AND tablename IN ('customer_orders', 'order_items')
ORDER BY tablename, attname;
-- Index usage statistics
SELECT
schemaname,
tablename,
indexname,
idx_tup_read,
idx_tup_fetch,
idx_scan,
-- Index efficiency calculation
CASE
WHEN idx_scan > 0 THEN ROUND((idx_tup_fetch::numeric / idx_scan), 2)
ELSE 0
END as avg_tuples_per_scan,
-- Index selectivity (estimated)
CASE
WHEN idx_tup_read > 0 THEN ROUND((idx_tup_fetch::numeric / idx_tup_read) * 100, 2)
ELSE 0
END as selectivity_percent
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan DESC;
-- Problems with traditional PostgreSQL performance optimization:
-- 1. Manual index design requires deep expertise and continuous maintenance
-- 2. Query plan analysis is complex and difficult to interpret
-- 3. Index maintenance overhead grows with data volume
-- 4. Limited support for dynamic query patterns and evolving schemas
-- 5. Difficult to optimize across multiple tables and complex joins
-- 6. Performance monitoring requires custom scripts and manual interpretation
-- 7. Index selection strategies are static and don't adapt to changing workloads
-- 8. Covering index management is complex and error-prone
-- 9. Partial index design requires detailed knowledge of data distribution
-- 10. Limited automated recommendations for performance improvements
MongoDB provides comprehensive performance optimization with intelligent indexing:
// MongoDB Index Optimization - intelligent and automated performance tuning
const { MongoClient } = require('mongodb');
const client = new MongoClient('mongodb://localhost:27017');
const db = client.db('ecommerce_platform');
// Advanced Index Management and Optimization
class MongoDBIndexOptimizer {
constructor(db) {
this.db = db;
this.performanceMetrics = new Map();
this.indexRecommendations = new Map();
}
async createOptimizedCollections() {
console.log('Creating optimized collections with intelligent indexing...');
// Orders collection with comprehensive document structure
const ordersCollection = db.collection('orders');
// Sample order document structure for index planning
const sampleOrder = {
_id: new ObjectId(),
orderNumber: "ORD-2025-001234",
// Customer information (embedded for performance)
customer: {
customerId: new ObjectId("64a1b2c3d4e5f6789012345a"),
name: "John Doe",
email: "[email protected]",
tier: "premium", // standard, premium, enterprise
accountType: "individual" // individual, business
},
// Order details
orderDate: new Date("2025-11-11T10:30:00Z"),
status: "processing", // pending, processing, shipped, delivered, cancelled
priority: "standard", // standard, expedited, overnight
// Financial information
totals: {
subtotal: 299.97,
tax: 24.00,
shipping: 12.99,
discount: 15.00,
grandTotal: 321.96,
currency: "USD"
},
// Items array (embedded for query performance)
items: [
{
productId: new ObjectId("64b2c3d4e5f6789012345b1a"),
sku: "WIDGET-001",
name: "Premium Widget",
category: "electronics",
subcategory: "gadgets",
quantity: 2,
unitPrice: 99.99,
totalPrice: 199.98,
// Product attributes for filtering
attributes: {
brand: "TechCorp",
model: "WG-2024",
color: "black",
size: null,
weight: 1.2
}
},
{
productId: new ObjectId("64b2c3d4e5f6789012345b1b"),
sku: "ACCESSORY-001",
name: "Widget Accessory",
category: "electronics",
subcategory: "accessories",
quantity: 1,
unitPrice: 99.99,
totalPrice: 99.99,
attributes: {
brand: "TechCorp",
model: "AC-2024",
color: "silver",
compatibility: ["WG-2024", "WG-2023"]
}
}
],
// Address information
addresses: {
shipping: {
name: "John Doe",
street: "123 Main Street",
city: "San Francisco",
state: "CA",
postalCode: "94105",
country: "US",
coordinates: {
latitude: 37.7749,
longitude: -122.4194
}
},
billing: {
name: "John Doe",
street: "123 Main Street",
city: "San Francisco",
state: "CA",
postalCode: "94105",
country: "US"
}
},
// Payment information
payment: {
method: "credit_card", // credit_card, debit_card, paypal, etc.
provider: "stripe",
transactionId: "txn_1234567890",
status: "captured" // pending, authorized, captured, failed
},
// Shipping information
shipping: {
method: "standard", // standard, expedited, overnight
carrier: "UPS",
trackingNumber: "1Z12345E1234567890",
estimatedDelivery: new Date("2025-11-15T18:00:00Z"),
actualDelivery: null
},
// Sales and marketing
salesInfo: {
salesRepId: new ObjectId("64c3d4e5f67890123456c2a"),
salesRepName: "Jane Smith",
channel: "online", // online, phone, in_store
source: "organic", // organic, paid_search, social, email
campaign: "holiday_2025"
},
// Operational metadata
fulfillment: {
warehouseId: "WH-SF-001",
pickingStarted: null,
pickingCompleted: null,
packingStarted: null,
packingCompleted: null,
shippedAt: null
},
// Analytics and business intelligence
analytics: {
customerLifetimeValue: 1250.00,
orderFrequency: "monthly",
seasonality: "Q4",
profitMargin: 0.35,
riskScore: 12 // fraud risk score 0-100
},
// Audit trail
audit: {
createdAt: new Date("2025-11-11T10:30:00Z"),
updatedAt: new Date("2025-11-11T14:45:00Z"),
createdBy: "system",
updatedBy: "user_12345",
version: 2,
// Change history for critical fields
statusHistory: [
{
status: "pending",
timestamp: new Date("2025-11-11T10:30:00Z"),
userId: "customer_67890"
},
{
status: "processing",
timestamp: new Date("2025-11-11T14:45:00Z"),
userId: "system"
}
]
}
};
// Insert sample data for index testing
await ordersCollection.insertOne(sampleOrder);
// Create comprehensive index strategy
await this.createIntelligentIndexes(ordersCollection);
return ordersCollection;
}
async createIntelligentIndexes(collection) {
console.log('Creating intelligent index strategy...');
try {
// 1. Primary query patterns - single field indexes
await collection.createIndexes([
// Customer-based queries (most common pattern)
{
key: { "customer.customerId": 1 },
name: "idx_customer_id",
background: true
},
// Date-based queries for reporting
{
key: { "orderDate": -1 },
name: "idx_order_date_desc",
background: true
},
// Status queries for operational workflows
{
key: { "status": 1 },
name: "idx_status",
background: true
},
// Order number lookups (unique)
{
key: { "orderNumber": 1 },
name: "idx_order_number",
unique: true,
background: true
}
]);
// 2. Compound indexes for complex query patterns
await collection.createIndexes([
// Customer order history (most frequent compound query)
{
key: {
"customer.customerId": 1,
"orderDate": -1,
"status": 1
},
name: "idx_customer_date_status",
background: true
},
// Order fulfillment workflow
{
key: {
"status": 1,
"priority": 1,
"orderDate": 1
},
name: "idx_fulfillment_workflow",
background: true
},
// Financial reporting and analytics
{
key: {
"orderDate": -1,
"totals.grandTotal": -1,
"customer.tier": 1
},
name: "idx_financial_reporting",
background: true
},
// Sales rep performance tracking
{
key: {
"salesInfo.salesRepId": 1,
"orderDate": -1,
"status": 1
},
name: "idx_sales_rep_performance",
background: true
},
// Geographic analysis
{
key: {
"addresses.shipping.state": 1,
"addresses.shipping.city": 1,
"orderDate": -1
},
name: "idx_geographic_analysis",
background: true
}
]);
// 3. Specialized indexes for advanced query patterns
await collection.createIndexes([
// Text search across multiple fields
{
key: {
"customer.name": "text",
"customer.email": "text",
"orderNumber": "text",
"items.name": "text",
"items.sku": "text"
},
name: "idx_text_search",
background: true
},
// Geospatial index for location-based queries
{
key: { "addresses.shipping.coordinates": "2dsphere" },
name: "idx_shipping_location",
background: true
},
// Sparse index for optional tracking numbers
{
key: { "shipping.trackingNumber": 1 },
name: "idx_tracking_number",
sparse: true,
background: true
},
// Partial index for recent high-value orders
{
key: {
"orderDate": -1,
"totals.grandTotal": -1
},
name: "idx_recent_high_value",
partialFilterExpression: {
"orderDate": { $gte: new Date(Date.now() - 90 * 24 * 60 * 60 * 1000) },
"totals.grandTotal": { $gte: 500 }
},
background: true
}
]);
// 4. Array indexing for embedded documents
await collection.createIndexes([
// Product-based queries on order items
{
key: { "items.productId": 1 },
name: "idx_product_id",
background: true
},
// SKU lookups
{
key: { "items.sku": 1 },
name: "idx_item_sku",
background: true
},
// Category-based analytics
{
key: {
"items.category": 1,
"items.subcategory": 1,
"orderDate": -1
},
name: "idx_category_analytics",
background: true
},
// Brand analysis
{
key: { "items.attributes.brand": 1 },
name: "idx_brand_analysis",
background: true
}
]);
// 5. TTL index for data lifecycle management
await collection.createIndex(
{ "audit.createdAt": 1 },
{
name: "idx_ttl_cleanup",
expireAfterSeconds: 60 * 60 * 24 * 365 * 7, // 7 years retention
background: true
}
);
console.log('Intelligent indexing strategy implemented successfully');
} catch (error) {
console.error('Error creating indexes:', error);
throw error;
}
}
async analyzeQueryPerformance(collection, queryPattern, options = {}) {
console.log('Analyzing query performance with advanced explain plans...');
try {
// Sample query patterns for analysis
const queryPatterns = {
customerOrders: {
filter: { "customer.customerId": new ObjectId("64a1b2c3d4e5f6789012345a") },
sort: { "orderDate": -1 },
limit: 20
},
recentHighValue: {
filter: {
"orderDate": { $gte: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) },
"totals.grandTotal": { $gte: 100 },
"status": { $in: ["processing", "shipped", "delivered"] }
},
sort: { "totals.grandTotal": -1 },
limit: 50
},
fulfillmentQueue: {
filter: {
"status": "processing",
"priority": { $in: ["expedited", "overnight"] }
},
sort: { "orderDate": 1 },
limit: 100
},
salesAnalytics: {
filter: {
"salesInfo.salesRepId": new ObjectId("64c3d4e5f67890123456c2a"),
"orderDate": {
$gte: new Date("2025-11-01"),
$lt: new Date("2025-12-01")
}
},
sort: { "orderDate": -1 }
}
};
const selectedQuery = queryPatterns[queryPattern] || queryPatterns.customerOrders;
// Execute explain plan with detailed analysis
const explainResult = await collection.find(selectedQuery.filter)
.sort(selectedQuery.sort || {})
.limit(selectedQuery.limit || 1000)
.explain("executionStats");
// Analyze execution statistics
const executionStats = explainResult.executionStats;
const winningPlan = explainResult.queryPlanner.winningPlan;
const performanceAnalysis = {
queryPattern: queryPattern,
executionTime: executionStats.executionTimeMillis,
documentsExamined: executionStats.totalDocsExamined,
documentsReturned: executionStats.totalDocsReturned,
indexesUsed: this.extractIndexesUsed(winningPlan),
// Performance efficiency metrics
selectivityRatio: executionStats.totalDocsReturned / Math.max(executionStats.totalDocsExamined, 1),
indexEfficiency: this.calculateIndexEfficiency(executionStats),
// Performance classification
performanceRating: this.classifyPerformance(executionStats),
// Optimization recommendations
recommendations: this.generateOptimizationRecommendations(explainResult),
// Detailed execution breakdown
executionBreakdown: this.analyzeExecutionStages(winningPlan),
queryDetails: {
filter: selectedQuery.filter,
sort: selectedQuery.sort,
limit: selectedQuery.limit
},
timestamp: new Date()
};
// Store performance metrics for trending
this.performanceMetrics.set(queryPattern, performanceAnalysis);
console.log(`Query Performance Analysis for ${queryPattern}:`);
console.log(` Execution Time: ${performanceAnalysis.executionTime}ms`);
console.log(` Documents Examined: ${performanceAnalysis.documentsExamined}`);
console.log(` Documents Returned: ${performanceAnalysis.documentsReturned}`);
console.log(` Selectivity Ratio: ${performanceAnalysis.selectivityRatio.toFixed(4)}`);
console.log(` Performance Rating: ${performanceAnalysis.performanceRating}`);
console.log(` Indexes Used: ${JSON.stringify(performanceAnalysis.indexesUsed)}`);
if (performanceAnalysis.recommendations.length > 0) {
console.log(' Optimization Recommendations:');
performanceAnalysis.recommendations.forEach(rec => {
console.log(` - ${rec}`);
});
}
return performanceAnalysis;
} catch (error) {
console.error('Error analyzing query performance:', error);
throw error;
}
}
extractIndexesUsed(winningPlan) {
const indexes = [];
const extractFromStage = (stage) => {
if (stage.indexName) {
indexes.push(stage.indexName);
}
if (stage.inputStage) {
extractFromStage(stage.inputStage);
}
if (stage.inputStages) {
stage.inputStages.forEach(inputStage => {
extractFromStage(inputStage);
});
}
};
extractFromStage(winningPlan);
return [...new Set(indexes)]; // Remove duplicates
}
calculateIndexEfficiency(executionStats) {
// Index efficiency = (docs returned / docs examined) * (1 / execution time factor)
const selectivity = executionStats.totalDocsReturned / Math.max(executionStats.totalDocsExamined, 1);
const timeFactor = Math.min(executionStats.executionTimeMillis / 100, 1); // Normalize execution time
return selectivity * (1 - timeFactor);
}
classifyPerformance(executionStats) {
const { executionTimeMillis, totalDocsExamined, totalDocsReturned } = executionStats;
const selectivity = totalDocsReturned / Math.max(totalDocsExamined, 1);
if (executionTimeMillis < 10 && selectivity > 0.1) return 'Excellent';
if (executionTimeMillis < 50 && selectivity > 0.01) return 'Good';
if (executionTimeMillis < 100 && selectivity > 0.001) return 'Fair';
return 'Poor';
}
generateOptimizationRecommendations(explainResult) {
const recommendations = [];
const executionStats = explainResult.executionStats;
const winningPlan = explainResult.queryPlanner.winningPlan;
// High execution time
if (executionStats.executionTimeMillis > 100) {
recommendations.push('Consider adding compound indexes for better query selectivity');
}
// Low selectivity (examining many documents vs returning few)
const selectivity = executionStats.totalDocsReturned / Math.max(executionStats.totalDocsExamined, 1);
if (selectivity < 0.01) {
recommendations.push('Improve query selectivity with more specific filtering criteria');
}
// Collection scan detected
if (winningPlan.stage === 'COLLSCAN') {
recommendations.push('Critical: Query is performing collection scan - add appropriate indexes');
}
// Sort not using index
if (this.findStageInPlan(winningPlan, 'SORT') && !this.findStageInPlan(winningPlan, 'IXSCAN')) {
recommendations.push('Sort operation not using index - consider compound index with sort fields');
}
// High key examination
if (executionStats.totalKeysExamined > executionStats.totalDocsReturned * 10) {
recommendations.push('High key examination ratio - consider more selective compound indexes');
}
return recommendations;
}
findStageInPlan(plan, stageName) {
if (plan.stage === stageName) return true;
if (plan.inputStage && this.findStageInPlan(plan.inputStage, stageName)) return true;
if (plan.inputStages) {
return plan.inputStages.some(stage => this.findStageInPlan(stage, stageName));
}
return false;
}
analyzeExecutionStages(winningPlan) {
const stages = [];
const extractStages = (stage) => {
stages.push({
stage: stage.stage,
indexName: stage.indexName || null,
direction: stage.direction || null,
keysExamined: stage.keysExamined || null,
docsExamined: stage.docsExamined || null,
executionTimeMillis: stage.executionTimeMillisEstimate || null
});
if (stage.inputStage) {
extractStages(stage.inputStage);
}
if (stage.inputStages) {
stage.inputStages.forEach(inputStage => {
extractStages(inputStage);
});
}
};
extractStages(winningPlan);
return stages;
}
async performComprehensiveIndexAnalysis(collection) {
console.log('Performing comprehensive index analysis...');
try {
// Get index statistics
const indexStats = await collection.aggregate([
{ $indexStats: {} }
]).toArray();
// Get collection statistics
const collectionStats = await db.runCommand({ collStats: collection.collectionName });
// Analyze index usage patterns
const indexAnalysis = indexStats.map(index => {
const usageStats = index.accesses;
const indexSize = index.size || 0;
const indexName = index.name;
return {
name: indexName,
// Usage metrics
accessCount: usageStats.ops || 0,
lastAccessed: usageStats.since || null,
// Size metrics
sizeBytes: indexSize,
sizeMB: (indexSize / 1024 / 1024).toFixed(2),
// Efficiency analysis
accessFrequency: this.calculateAccessFrequency(usageStats),
utilizationScore: this.calculateUtilizationScore(usageStats, indexSize),
// Recommendations
recommendation: this.analyzeIndexRecommendation(indexName, usageStats, indexSize)
};
});
// Collection-level analysis
const collectionAnalysis = {
totalDocuments: collectionStats.count,
totalSize: collectionStats.size,
averageDocumentSize: collectionStats.avgObjSize,
totalIndexSize: collectionStats.totalIndexSize,
indexToDataRatio: (collectionStats.totalIndexSize / collectionStats.size).toFixed(2),
// Index efficiency summary
totalIndexes: indexStats.length,
activeIndexes: indexStats.filter(idx => idx.accesses.ops > 0).length,
unusedIndexes: indexStats.filter(idx => idx.accesses.ops === 0).length,
// Performance indicators
indexOverhead: ((collectionStats.totalIndexSize / collectionStats.size) * 100).toFixed(1) + '%',
recommendations: this.generateCollectionRecommendations(indexAnalysis, collectionStats)
};
const analysis = {
collection: collection.collectionName,
analyzedAt: new Date(),
collectionMetrics: collectionAnalysis,
indexDetails: indexAnalysis,
// Summary classifications
performanceStatus: this.classifyCollectionPerformance(collectionAnalysis),
optimizationPriority: this.determineOptimizationPriority(indexAnalysis),
// Action items
actionItems: this.generateActionItems(indexAnalysis, collectionAnalysis)
};
console.log('Index Analysis Summary:');
console.log(` Total Indexes: ${collectionAnalysis.totalIndexes}`);
console.log(` Active Indexes: ${collectionAnalysis.activeIndexes}`);
console.log(` Unused Indexes: ${collectionAnalysis.unusedIndexes}`);
console.log(` Index Overhead: ${collectionAnalysis.indexOverhead}`);
console.log(` Performance Status: ${analysis.performanceStatus}`);
return analysis;
} catch (error) {
console.error('Error performing index analysis:', error);
throw error;
}
}
calculateAccessFrequency(usageStats) {
if (!usageStats.since || usageStats.ops === 0) return 'Never';
const daysSince = (Date.now() - usageStats.since.getTime()) / (1000 * 60 * 60 * 24);
const accessesPerDay = usageStats.ops / Math.max(daysSince, 1);
if (accessesPerDay > 1000) return 'Very High';
if (accessesPerDay > 100) return 'High';
if (accessesPerDay > 10) return 'Moderate';
if (accessesPerDay > 1) return 'Low';
return 'Very Low';
}
calculateUtilizationScore(usageStats, indexSize) {
// Score based on access frequency vs storage cost
const accessCount = usageStats.ops || 0;
const sizeCost = indexSize / (1024 * 1024); // Size in MB
if (accessCount === 0) return 0;
// Higher score for more accesses per MB of storage
return Math.min((accessCount / Math.max(sizeCost, 1)) / 1000, 10);
}
analyzeIndexRecommendation(indexName, usageStats, indexSize) {
if (indexName === '_id_') return 'System index - always keep';
if (usageStats.ops === 0) {
return 'Consider dropping - unused index consuming storage';
}
if (usageStats.ops < 10 && indexSize > 10 * 1024 * 1024) { // < 10 uses and > 10MB
return 'Low utilization - evaluate if index is necessary';
}
if (usageStats.ops > 10000) {
return 'High utilization - keep and monitor performance';
}
return 'Normal utilization - maintain current index';
}
generateCollectionRecommendations(indexAnalysis, collectionStats) {
const recommendations = [];
// Check for unused indexes
const unusedIndexes = indexAnalysis.filter(idx => idx.accessCount === 0 && idx.name !== '_id_');
if (unusedIndexes.length > 0) {
recommendations.push(`Drop ${unusedIndexes.length} unused indexes to reduce storage overhead`);
}
// Check index-to-data ratio
const indexRatio = collectionStats.totalIndexSize / collectionStats.size;
if (indexRatio > 1.5) {
recommendations.push('High index overhead - review index necessity and consider consolidation');
}
// Check for very large indexes with low utilization
const inefficientIndexes = indexAnalysis.filter(idx =>
idx.sizeBytes > 100 * 1024 * 1024 && idx.utilizationScore < 1
);
if (inefficientIndexes.length > 0) {
recommendations.push('Large indexes with low utilization detected - consider optimization');
}
return recommendations;
}
classifyCollectionPerformance(collectionAnalysis) {
const unusedRatio = collectionAnalysis.unusedIndexes / collectionAnalysis.totalIndexes;
const indexOverheadPercent = parseFloat(collectionAnalysis.indexOverhead);
if (unusedRatio > 0.3 || indexOverheadPercent > 200) return 'Poor';
if (unusedRatio > 0.2 || indexOverheadPercent > 150) return 'Fair';
if (unusedRatio > 0.1 || indexOverheadPercent > 100) return 'Good';
return 'Excellent';
}
determineOptimizationPriority(indexAnalysis) {
const unusedCount = indexAnalysis.filter(idx => idx.accessCount === 0).length;
const lowUtilizationCount = indexAnalysis.filter(idx => idx.utilizationScore < 1).length;
if (unusedCount > 3 || lowUtilizationCount > 5) return 'High';
if (unusedCount > 1 || lowUtilizationCount > 2) return 'Medium';
return 'Low';
}
generateActionItems(indexAnalysis, collectionAnalysis) {
const actions = [];
// Unused index cleanup
const unusedIndexes = indexAnalysis.filter(idx => idx.accessCount === 0 && idx.name !== '_id_');
unusedIndexes.forEach(idx => {
actions.push({
type: 'DROP_INDEX',
indexName: idx.name,
reason: 'Unused index consuming storage',
priority: 'Medium',
estimatedSavings: `${idx.sizeMB}MB storage`
});
});
// Low utilization optimization
const lowUtilizationIndexes = indexAnalysis.filter(idx =>
idx.utilizationScore < 1 && idx.accessCount > 0 && idx.sizeBytes > 10 * 1024 * 1024
);
lowUtilizationIndexes.forEach(idx => {
actions.push({
type: 'REVIEW_INDEX',
indexName: idx.name,
reason: 'Low utilization for large index',
priority: 'Low',
recommendation: 'Evaluate query patterns and consider consolidation'
});
});
return actions;
}
async demonstrateAdvancedQuerying(collection) {
console.log('Demonstrating advanced querying with performance optimization...');
const queryExamples = [
{
name: 'Customer Order History with Analytics',
query: async () => {
return await collection.find({
"customer.customerId": new ObjectId("64a1b2c3d4e5f6789012345a"),
"orderDate": { $gte: new Date("2025-01-01") }
})
.sort({ "orderDate": -1 })
.limit(20)
.explain("executionStats");
}
},
{
name: 'High-Value Recent Orders',
query: async () => {
return await collection.find({
"orderDate": { $gte: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) },
"totals.grandTotal": { $gte: 500 },
"status": { $in: ["processing", "shipped", "delivered"] }
})
.sort({ "totals.grandTotal": -1 })
.limit(50)
.explain("executionStats");
}
},
{
name: 'Geographic Sales Analysis',
query: async () => {
return await collection.find({
"addresses.shipping.state": "CA",
"orderDate": {
$gte: new Date("2025-11-01"),
$lt: new Date("2025-12-01")
}
})
.sort({ "orderDate": -1 })
.explain("executionStats");
}
},
{
name: 'Product Category Performance',
query: async () => {
return await collection.find({
"items.category": "electronics",
"orderDate": { $gte: new Date("2025-11-01") }
})
.sort({ "totals.grandTotal": -1 })
.explain("executionStats");
}
}
];
const results = {};
for (const example of queryExamples) {
try {
console.log(`\nTesting: ${example.name}`);
const result = await example.query();
const stats = result.executionStats;
const performance = {
executionTime: stats.executionTimeMillis,
documentsExamined: stats.totalDocsExamined,
documentsReturned: stats.totalDocsReturned,
indexesUsed: this.extractIndexesUsed(result.queryPlanner.winningPlan),
efficiency: (stats.totalDocsReturned / Math.max(stats.totalDocsExamined, 1)).toFixed(4)
};
console.log(` Execution Time: ${performance.executionTime}ms`);
console.log(` Efficiency Ratio: ${performance.efficiency}`);
console.log(` Indexes Used: ${JSON.stringify(performance.indexesUsed)}`);
results[example.name] = performance;
} catch (error) {
console.error(`Error testing ${example.name}:`, error);
results[example.name] = { error: error.message };
}
}
return results;
}
}
// Export optimization class
module.exports = { MongoDBIndexOptimizer };
// Benefits of MongoDB Index Optimization:
// - Intelligent compound indexing for complex query patterns
// - Automated performance analysis and recommendations
// - Flexible indexing strategies for evolving schemas
// - Advanced query execution analysis with detailed metrics
// - Comprehensive index utilization monitoring
// - Automated optimization suggestions based on usage patterns
// - Support for specialized indexes (geospatial, text, sparse, partial)
// - Integration with existing MongoDB ecosystem and tooling
// - Real-time performance monitoring and alerting capabilities
// - Cost-effective storage optimization through intelligent index management
Understanding MongoDB Index Architecture
Compound Index Design Patterns
MongoDB's compound indexing system supports sophisticated query optimization strategies:
// Advanced compound indexing patterns for enterprise applications
class CompoundIndexStrategist {
constructor(db) {
this.db = db;
this.indexStrategies = new Map();
this.queryPatterns = new Map();
}
async analyzeQueryPatternsAndCreateIndexes() {
console.log('Analyzing query patterns and creating optimized compound indexes...');
// Pattern 1: ESR (Equality, Sort, Range) Index Design
const esrPattern = {
description: "Equality-Sort-Range compound index optimization",
// Customer order queries: customer (equality) + date (sort) + status (range)
index: {
"customer.customerId": 1, // Equality first
"orderDate": -1, // Sort second
"status": 1 // Range/filter third
},
queryExamples: [
{
filter: {
"customer.customerId": "specific_customer_id",
"status": { $in: ["processing", "shipped"] }
},
sort: { "orderDate": -1 },
description: "Customer order history with status filtering"
}
],
performance: "Optimal - follows ESR pattern for maximum efficiency"
};
// Pattern 2: Multi-dimensional Analytics Index
const analyticsPattern = {
description: "Multi-dimensional analytics with hierarchical grouping",
index: {
"orderDate": -1, // Time dimension (most selective)
"customer.tier": 1, // Customer segment
"items.category": 1, // Product category
"totals.grandTotal": -1 // Value dimension
},
queryExamples: [
{
pipeline: [
{
$match: {
"orderDate": { $gte: new Date("2025-01-01") },
"customer.tier": "premium"
}
},
{
$group: {
_id: {
month: { $dateToString: { format: "%Y-%m", date: "$orderDate" } },
category: "$items.category"
},
totalRevenue: { $sum: "$totals.grandTotal" },
orderCount: { $sum: 1 }
}
}
],
description: "Monthly revenue by customer tier and product category"
}
]
};
// Pattern 3: Geospatial + Business Logic Index
const geospatialPattern = {
description: "Geospatial queries combined with business filters",
index: {
"addresses.shipping.coordinates": "2dsphere", // Geospatial first
"status": 1, // Business filter
"orderDate": -1 // Time component
},
queryExamples: [
{
filter: {
"addresses.shipping.coordinates": {
$near: {
$geometry: { type: "Point", coordinates: [-122.4194, 37.7749] },
$maxDistance: 10000 // 10km radius
}
},
"status": "processing",
"orderDate": { $gte: new Date("2025-11-01") }
},
description: "Recent processing orders within geographic radius"
}
]
};
// Pattern 4: Text Search + Faceted Filtering
const textSearchPattern = {
description: "Full-text search with multiple filter dimensions",
textIndex: {
"customer.name": "text",
"items.name": "text",
"items.sku": "text",
"orderNumber": "text"
},
supportingIndexes: [
{
"customer.tier": 1,
"orderDate": -1
},
{
"items.category": 1,
"totals.grandTotal": -1
}
],
queryExamples: [
{
filter: {
$text: { $search: "premium widget" },
"customer.tier": "enterprise",
"orderDate": { $gte: new Date("2025-10-01") }
},
sort: { score: { $meta: "textScore" } },
description: "Text search with customer tier and date filtering"
}
]
};
// Create indexes based on patterns
const ordersCollection = this.db.collection('orders');
await this.implementIndexStrategy(ordersCollection, 'ESR_Pattern', esrPattern.index);
await this.implementIndexStrategy(ordersCollection, 'Analytics_Pattern', analyticsPattern.index);
await this.implementIndexStrategy(ordersCollection, 'Geospatial_Pattern', geospatialPattern.index);
await this.implementTextSearchStrategy(ordersCollection, textSearchPattern);
// Store strategies for analysis
this.indexStrategies.set('esr', esrPattern);
this.indexStrategies.set('analytics', analyticsPattern);
this.indexStrategies.set('geospatial', geospatialPattern);
this.indexStrategies.set('textSearch', textSearchPattern);
console.log('Advanced compound index strategies implemented');
return this.indexStrategies;
}
async implementIndexStrategy(collection, strategyName, indexSpec) {
try {
await collection.createIndex(indexSpec, {
name: `idx_${strategyName.toLowerCase()}`,
background: true
});
console.log(`✅ Created index strategy: ${strategyName}`);
} catch (error) {
console.error(`❌ Failed to create ${strategyName}:`, error.message);
}
}
async implementTextSearchStrategy(collection, textPattern) {
try {
// Create text index
await collection.createIndex(textPattern.textIndex, {
name: "idx_text_search_comprehensive",
background: true
});
// Create supporting indexes for faceted filtering
for (let i = 0; i < textPattern.supportingIndexes.length; i++) {
await collection.createIndex(textPattern.supportingIndexes[i], {
name: `idx_text_support_${i + 1}`,
background: true
});
}
console.log('✅ Created text search strategy with supporting indexes');
} catch (error) {
console.error('❌ Failed to create text search strategy:', error.message);
}
}
async optimizeExistingIndexes(collection) {
console.log('Optimizing existing indexes based on query patterns...');
try {
// Get current indexes
const currentIndexes = await collection.listIndexes().toArray();
// Analyze index effectiveness
const indexAnalysis = await this.analyzeIndexEffectiveness(collection, currentIndexes);
// Generate optimization plan
const optimizationPlan = this.createOptimizationPlan(indexAnalysis);
// Execute optimization (with safety checks)
await this.executeOptimizationPlan(collection, optimizationPlan);
return optimizationPlan;
} catch (error) {
console.error('Error optimizing indexes:', error);
throw error;
}
}
async analyzeIndexEffectiveness(collection, indexes) {
const analysis = [];
for (const index of indexes) {
if (index.name === '_id_') continue; // Skip default index
try {
// Get index statistics
const stats = await collection.aggregate([
{ $indexStats: {} },
{ $match: { name: index.name } }
]).toArray();
const indexStat = stats[0];
if (!indexStat) continue;
// Analyze index composition
const indexComposition = this.analyzeIndexComposition(index.key);
// Calculate efficiency metrics
const efficiency = {
usageCount: indexStat.accesses?.ops || 0,
lastUsed: indexStat.accesses?.since || null,
sizeBytes: indexStat.size || 0,
// Index pattern analysis
composition: indexComposition,
followsESRPattern: this.checkESRPattern(index.key),
hasRedundancy: await this.checkIndexRedundancy(collection, index),
// Performance classification
utilizationScore: this.calculateUtilizationScore(indexStat),
efficiencyRating: this.rateIndexEfficiency(indexStat, indexComposition)
};
analysis.push({
name: index.name,
keyPattern: index.key,
...efficiency
});
} catch (error) {
console.warn(`Could not analyze index ${index.name}:`, error.message);
}
}
return analysis;
}
analyzeIndexComposition(keyPattern) {
const keys = Object.keys(keyPattern);
const composition = {
fieldCount: keys.length,
hasEquality: false,
hasSort: false,
hasRange: false,
hasGeospatial: false,
hasText: false
};
keys.forEach((key, index) => {
const value = keyPattern[key];
// Detect index type based on value and position
if (value === 1 || value === -1) {
if (index === 0) composition.hasEquality = true;
if (index === 1) composition.hasSort = true;
if (index > 1) composition.hasRange = true;
}
if (value === '2dsphere' || value === '2d') composition.hasGeospatial = true;
if (value === 'text') composition.hasText = true;
});
return composition;
}
checkESRPattern(keyPattern) {
const keys = Object.keys(keyPattern);
if (keys.length < 3) return false;
// ESR: First field equality, second sort, third range
const values = Object.values(keyPattern);
return (values[0] === 1 || values[0] === -1) &&
(values[1] === 1 || values[1] === -1) &&
(values[2] === 1 || values[2] === -1);
}
async checkIndexRedundancy(collection, targetIndex) {
// Check if this index is redundant with other indexes
const allIndexes = await collection.listIndexes().toArray();
const targetKeys = Object.keys(targetIndex.key);
for (const otherIndex of allIndexes) {
if (otherIndex.name === targetIndex.name || otherIndex.name === '_id_') continue;
const otherKeys = Object.keys(otherIndex.key);
// Check if targetIndex is a prefix of otherIndex (redundant)
if (targetKeys.length <= otherKeys.length) {
const isPrefix = targetKeys.every((key, index) =>
otherKeys[index] === key &&
targetIndex.key[key] === otherIndex.key[key]
);
if (isPrefix) return otherIndex.name;
}
}
return false;
}
calculateUtilizationScore(indexStat) {
const usage = indexStat.accesses?.ops || 0;
const size = indexStat.size || 0;
if (usage === 0) return 0;
if (size === 0) return 10; // System indexes
// Score based on usage per MB
const sizeMB = size / (1024 * 1024);
return Math.min((usage / sizeMB) / 100, 10);
}
rateIndexEfficiency(indexStat, composition) {
let score = 5; // Base score
// Usage factor
const usage = indexStat.accesses?.ops || 0;
if (usage > 10000) score += 2;
else if (usage > 1000) score += 1;
else if (usage === 0) score -= 3;
// Composition factor
if (composition.followsESRPattern) score += 2;
if (composition.hasGeospatial || composition.hasText) score += 1;
if (composition.fieldCount > 5) score -= 1; // Too many fields
// Size factor (prefer smaller indexes for same functionality)
const sizeMB = (indexStat.size || 0) / (1024 * 1024);
if (sizeMB > 100) score -= 1;
return Math.max(Math.min(score, 10), 0);
}
createOptimizationPlan(indexAnalysis) {
const plan = {
actions: [],
expectedBenefits: [],
risks: [],
estimatedImpact: {}
};
// Identify unused indexes
const unusedIndexes = indexAnalysis.filter(idx => idx.usageCount === 0);
unusedIndexes.forEach(idx => {
plan.actions.push({
type: 'DROP',
indexName: idx.name,
reason: 'Unused index consuming storage',
impact: `Save ${(idx.sizeBytes / 1024 / 1024).toFixed(2)}MB storage`,
priority: 'HIGH'
});
});
// Identify redundant indexes
const redundantIndexes = indexAnalysis.filter(idx => idx.hasRedundancy);
redundantIndexes.forEach(idx => {
plan.actions.push({
type: 'DROP',
indexName: idx.name,
reason: `Redundant with ${idx.hasRedundancy}`,
impact: 'Reduce index maintenance overhead',
priority: 'MEDIUM'
});
});
// Suggest compound index improvements
const inefficientIndexes = indexAnalysis.filter(idx =>
idx.efficiencyRating < 5 && idx.usageCount > 0
);
inefficientIndexes.forEach(idx => {
if (!idx.composition.followsESRPattern) {
plan.actions.push({
type: 'REBUILD',
indexName: idx.name,
reason: 'Does not follow ESR pattern',
suggestion: 'Reorder fields: Equality, Sort, Range',
impact: 'Improve query performance',
priority: 'MEDIUM'
});
}
});
// Calculate expected benefits
const storageSavings = unusedIndexes.reduce((sum, idx) => sum + idx.sizeBytes, 0);
plan.estimatedImpact.storageSavings = `${(storageSavings / 1024 / 1024).toFixed(2)}MB`;
plan.estimatedImpact.maintenanceReduction = `${unusedIndexes.length + redundantIndexes.length} fewer indexes`;
return plan;
}
async executeOptimizationPlan(collection, plan) {
console.log('Executing index optimization plan...');
for (const action of plan.actions) {
try {
if (action.type === 'DROP' && action.priority === 'HIGH') {
// Only auto-execute high-priority drops (unused indexes)
console.log(`Dropping unused index: ${action.indexName}`);
await collection.dropIndex(action.indexName);
console.log(`✅ Successfully dropped index: ${action.indexName}`);
} else {
console.log(`📋 Recommended action: ${action.type} ${action.indexName} - ${action.reason}`);
}
} catch (error) {
console.error(`❌ Failed to execute action on ${action.indexName}:`, error.message);
}
}
console.log('Index optimization plan execution completed');
}
async generatePerformanceReport(collection) {
console.log('Generating comprehensive performance report...');
try {
// Get collection statistics
const stats = await this.db.runCommand({ collStats: collection.collectionName });
// Get index usage statistics
const indexStats = await collection.aggregate([
{ $indexStats: {} }
]).toArray();
// Analyze recent query performance
const performanceMetrics = Array.from(this.performanceMetrics.values());
// Generate comprehensive report
const report = {
collectionName: collection.collectionName,
generatedAt: new Date(),
// Collection overview
overview: {
totalDocuments: stats.count,
totalSizeGB: (stats.size / 1024 / 1024 / 1024).toFixed(2),
averageDocumentSizeKB: (stats.avgObjSize / 1024).toFixed(2),
totalIndexes: indexStats.length,
totalIndexSizeGB: (stats.totalIndexSize / 1024 / 1024 / 1024).toFixed(2),
indexToDataRatio: (stats.totalIndexSize / stats.size).toFixed(2)
},
// Index performance summary
indexPerformance: {
activeIndexes: indexStats.filter(idx => idx.accesses?.ops > 0).length,
unusedIndexes: indexStats.filter(idx => idx.accesses?.ops === 0).length - 1, // Exclude _id_
highUtilizationIndexes: indexStats.filter(idx => idx.accesses?.ops > 10000).length,
// Top performing indexes
topIndexes: indexStats
.filter(idx => idx.name !== '_id_' && idx.accesses?.ops > 0)
.sort((a, b) => (b.accesses?.ops || 0) - (a.accesses?.ops || 0))
.slice(0, 5)
.map(idx => ({
name: idx.name,
accessCount: idx.accesses?.ops || 0,
sizeMB: ((idx.size || 0) / 1024 / 1024).toFixed(2)
}))
},
// Query performance analysis
queryPerformance: {
totalQueriesAnalyzed: performanceMetrics.length,
averageExecutionTime: performanceMetrics.length > 0 ?
(performanceMetrics.reduce((sum, metric) => sum + metric.executionTime, 0) / performanceMetrics.length).toFixed(2) : 0,
excellentQueries: performanceMetrics.filter(m => m.performanceRating === 'Excellent').length,
poorQueries: performanceMetrics.filter(m => m.performanceRating === 'Poor').length,
// Query patterns
commonPatterns: this.identifyCommonQueryPatterns(performanceMetrics)
},
// Recommendations
recommendations: this.generatePerformanceRecommendations(stats, indexStats, performanceMetrics),
// Health score
healthScore: this.calculateHealthScore(stats, indexStats, performanceMetrics)
};
// Display report summary
console.log('\n📊 Performance Report Summary:');
console.log(`Collection: ${report.collectionName}`);
console.log(`Documents: ${report.overview.totalDocuments.toLocaleString()}`);
console.log(`Data Size: ${report.overview.totalSizeGB}GB`);
console.log(`Index Size: ${report.overview.totalIndexSizeGB}GB`);
console.log(`Active Indexes: ${report.indexPerformance.activeIndexes}/${report.overview.totalIndexes}`);
console.log(`Health Score: ${report.healthScore}/100`);
if (report.recommendations.length > 0) {
console.log('\n💡 Top Recommendations:');
report.recommendations.slice(0, 3).forEach(rec => {
console.log(` • ${rec}`);
});
}
return report;
} catch (error) {
console.error('Error generating performance report:', error);
throw error;
}
}
identifyCommonQueryPatterns(performanceMetrics) {
// Analyze query patterns to identify common access patterns
const patterns = new Map();
performanceMetrics.forEach(metric => {
const pattern = metric.queryPattern || 'unknown';
if (patterns.has(pattern)) {
patterns.set(pattern, patterns.get(pattern) + 1);
} else {
patterns.set(pattern, 1);
}
});
return Array.from(patterns.entries())
.sort(([,a], [,b]) => b - a)
.slice(0, 5)
.map(([pattern, count]) => ({ pattern, count }));
}
generatePerformanceRecommendations(collectionStats, indexStats, queryMetrics) {
const recommendations = [];
// Index optimization recommendations
const unusedCount = indexStats.filter(idx => idx.name !== '_id_' && idx.accesses?.ops === 0).length;
if (unusedCount > 0) {
recommendations.push(`Remove ${unusedCount} unused indexes to reduce storage and maintenance overhead`);
}
// Size recommendations
const indexRatio = collectionStats.totalIndexSize / collectionStats.size;
if (indexRatio > 1.5) {
recommendations.push('High index-to-data ratio detected - review index necessity');
}
// Query performance recommendations
const poorQueries = queryMetrics.filter(m => m.performanceRating === 'Poor').length;
if (poorQueries > 0) {
recommendations.push(`Optimize ${poorQueries} poorly performing query patterns`);
}
// Compound index recommendations
const singleFieldIndexes = indexStats.filter(idx =>
Object.keys(idx.key || {}).length === 1 && idx.name !== '_id_'
).length;
if (singleFieldIndexes > 5) {
recommendations.push('Consider consolidating single-field indexes into compound indexes');
}
return recommendations;
}
calculateHealthScore(collectionStats, indexStats, queryMetrics) {
let score = 100;
// Index efficiency penalty
const unusedIndexes = indexStats.filter(idx => idx.name !== '_id_' && idx.accesses?.ops === 0).length;
const totalIndexes = indexStats.length - 1; // Exclude _id_
const unusedRatio = unusedIndexes / Math.max(totalIndexes, 1);
score -= unusedRatio * 30; // Up to 30 points penalty
// Size efficiency penalty
const indexRatio = collectionStats.totalIndexSize / collectionStats.size;
if (indexRatio > 2) score -= 20;
else if (indexRatio > 1.5) score -= 10;
// Query performance penalty
const poorQueryRatio = queryMetrics.filter(m => m.performanceRating === 'Poor').length / Math.max(queryMetrics.length, 1);
score -= poorQueryRatio * 25; // Up to 25 points penalty
// Average execution time penalty
const avgExecutionTime = queryMetrics.length > 0 ?
queryMetrics.reduce((sum, metric) => sum + metric.executionTime, 0) / queryMetrics.length : 0;
if (avgExecutionTime > 100) score -= 15;
else if (avgExecutionTime > 50) score -= 8;
return Math.max(Math.round(score), 0);
}
}
// Export the compound index strategist
module.exports = { CompoundIndexStrategist };
SQL-Style Index Optimization with QueryLeaf
QueryLeaf provides familiar SQL syntax for MongoDB index optimization and performance tuning:
-- QueryLeaf index optimization with SQL-familiar syntax
-- Create optimized indexes using SQL DDL syntax
CREATE INDEX idx_customer_order_history ON orders (
customer.customer_id ASC,
order_date DESC,
status ASC
) WITH (
background = true,
name = 'idx_customer_order_history'
);
-- Create compound indexes following ESR (Equality, Sort, Range) pattern
CREATE INDEX idx_sales_analytics ON orders (
sales_info.sales_rep_id ASC, -- Equality filter (most selective)
order_date DESC, -- Sort operation
totals.grand_total DESC -- Range filter
) WITH (
background = true,
partial_filter = 'status IN (''completed'', ''delivered'')'
);
-- Create geospatial index for location-based queries
CREATE INDEX idx_shipping_location ON orders
USING GEOSPHERE (addresses.shipping.coordinates)
WITH (background = true);
-- Create text index for search functionality
CREATE INDEX idx_full_text_search ON orders
USING TEXT (
customer.name,
customer.email,
order_number,
items.name,
items.sku
) WITH (
default_language = 'english',
background = true
);
-- Analyze query performance with SQL EXPLAIN
EXPLAIN (ANALYZE true, BUFFERS true)
SELECT
order_number,
customer.name,
order_date,
totals.grand_total,
status
FROM orders
WHERE customer.customer_id = ObjectId('64a1b2c3d4e5f6789012345a')
AND order_date >= CURRENT_DATE - INTERVAL '90 days'
AND status IN ('processing', 'shipped', 'delivered')
ORDER BY order_date DESC
LIMIT 20;
-- Index usage analysis and optimization recommendations
WITH index_usage_stats AS (
SELECT
index_name,
access_count,
last_accessed,
size_bytes,
size_mb,
-- Calculate utilization metrics
CASE
WHEN access_count = 0 THEN 'Unused'
WHEN access_count < 100 THEN 'Low'
WHEN access_count < 10000 THEN 'Moderate'
ELSE 'High'
END as usage_level,
-- Calculate efficiency score
CASE
WHEN access_count = 0 THEN 0
ELSE ROUND((access_count::numeric / (size_mb + 1)) * 100, 2)
END as efficiency_score
FROM mongodb_index_statistics('orders')
WHERE index_name != '_id_'
),
index_recommendations AS (
SELECT
index_name,
usage_level,
efficiency_score,
size_mb,
-- Generate recommendations based on usage patterns
CASE
WHEN usage_level = 'Unused' THEN 'DROP - Unused index consuming storage'
WHEN usage_level = 'Low' AND size_mb > 10 THEN 'REVIEW - Low usage for large index'
WHEN efficiency_score > 1000 THEN 'MAINTAIN - High efficiency index'
WHEN efficiency_score < 50 THEN 'OPTIMIZE - Poor efficiency ratio'
ELSE 'MONITOR - Normal usage pattern'
END as recommendation,
-- Priority for action
CASE
WHEN usage_level = 'Unused' THEN 'HIGH'
WHEN usage_level = 'Low' AND size_mb > 50 THEN 'MEDIUM'
WHEN efficiency_score < 25 THEN 'MEDIUM'
ELSE 'LOW'
END as priority
FROM index_usage_stats
)
SELECT
index_name,
usage_level,
ROUND(efficiency_score, 2) as efficiency_score,
ROUND(size_mb, 2) as size_mb,
recommendation,
priority,
-- Estimated impact
CASE
WHEN recommendation LIKE 'DROP%' THEN CONCAT('Save ', ROUND(size_mb, 1), 'MB storage')
WHEN recommendation LIKE 'OPTIMIZE%' THEN 'Improve query performance'
ELSE 'Monitor performance'
END as estimated_impact
FROM index_recommendations
ORDER BY
CASE priority
WHEN 'HIGH' THEN 1
WHEN 'MEDIUM' THEN 2
ELSE 3
END,
efficiency_score ASC;
-- Compound index optimization analysis
WITH query_pattern_analysis AS (
SELECT
collection_name,
query_pattern,
avg_execution_time_ms,
avg_docs_examined,
avg_docs_returned,
-- Calculate selectivity ratio
CASE
WHEN avg_docs_examined > 0 THEN
ROUND((avg_docs_returned::numeric / avg_docs_examined) * 100, 2)
ELSE 0
END as selectivity_percent,
-- Identify query pattern type
CASE
WHEN query_pattern LIKE '%customer_id%' AND query_pattern LIKE '%order_date%' THEN 'customer_history'
WHEN query_pattern LIKE '%status%' AND query_pattern LIKE '%priority%' THEN 'fulfillment'
WHEN query_pattern LIKE '%sales_rep%' THEN 'sales_analytics'
WHEN query_pattern LIKE '%location%' THEN 'geographic'
ELSE 'other'
END as pattern_type
FROM mongodb_query_performance_log
WHERE timestamp >= CURRENT_TIMESTAMP - INTERVAL '7 days'
AND collection_name = 'orders'
),
index_optimization_opportunities AS (
SELECT
pattern_type,
COUNT(*) as query_count,
AVG(avg_execution_time_ms) as avg_execution_time,
AVG(selectivity_percent) as avg_selectivity,
-- Performance classification
CASE
WHEN AVG(avg_execution_time_ms) > 100 THEN 'Poor'
WHEN AVG(avg_execution_time_ms) > 50 THEN 'Fair'
WHEN AVG(avg_execution_time_ms) > 10 THEN 'Good'
ELSE 'Excellent'
END as performance_rating,
-- Optimization recommendations
CASE pattern_type
WHEN 'customer_history' THEN 'Compound index: customer_id + order_date + status'
WHEN 'fulfillment' THEN 'Compound index: status + priority + order_date'
WHEN 'sales_analytics' THEN 'Compound index: sales_rep_id + order_date + total_amount'
WHEN 'geographic' THEN 'Geospatial index: shipping_coordinates + status + date'
ELSE 'Analyze query patterns for custom compound index'
END as index_recommendation
FROM query_pattern_analysis
GROUP BY pattern_type
HAVING COUNT(*) >= 10 -- Only analyze patterns with sufficient volume
)
SELECT
pattern_type,
query_count,
ROUND(avg_execution_time, 2) as avg_execution_time_ms,
ROUND(avg_selectivity, 2) as avg_selectivity_percent,
performance_rating,
index_recommendation,
-- Optimization priority
CASE
WHEN performance_rating = 'Poor' AND query_count > 1000 THEN 'CRITICAL'
WHEN performance_rating IN ('Poor', 'Fair') AND query_count > 100 THEN 'HIGH'
WHEN performance_rating = 'Fair' THEN 'MEDIUM'
ELSE 'LOW'
END as optimization_priority
FROM index_optimization_opportunities
ORDER BY
CASE optimization_priority
WHEN 'CRITICAL' THEN 1
WHEN 'HIGH' THEN 2
WHEN 'MEDIUM' THEN 3
ELSE 4
END,
query_count DESC;
-- Performance monitoring dashboard
WITH performance_metrics AS (
SELECT
DATE_TRUNC('hour', timestamp) as hour_bucket,
collection_name,
-- Query performance metrics
COUNT(*) as total_queries,
AVG(execution_time_ms) as avg_execution_time,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY execution_time_ms) as p95_execution_time,
MAX(execution_time_ms) as max_execution_time,
-- Index usage metrics
AVG(docs_examined::numeric / GREATEST(docs_returned, 1)) as avg_docs_per_result,
AVG(CASE WHEN index_used THEN 1.0 ELSE 0.0 END) as index_usage_ratio,
-- Query efficiency
AVG(CASE WHEN docs_examined > 0 THEN docs_returned::numeric / docs_examined ELSE 1 END) as avg_selectivity,
-- Performance classification
COUNT(*) FILTER (WHERE execution_time_ms <= 10) as excellent_queries,
COUNT(*) FILTER (WHERE execution_time_ms <= 50) as good_queries,
COUNT(*) FILTER (WHERE execution_time_ms <= 100) as fair_queries,
COUNT(*) FILTER (WHERE execution_time_ms > 100) as poor_queries
FROM mongodb_query_performance_log
WHERE timestamp >= CURRENT_TIMESTAMP - INTERVAL '24 hours'
AND collection_name = 'orders'
GROUP BY DATE_TRUNC('hour', timestamp), collection_name
),
performance_trends AS (
SELECT *,
-- Calculate performance trends
LAG(avg_execution_time) OVER (ORDER BY hour_bucket) as prev_hour_avg_time,
LAG(index_usage_ratio) OVER (ORDER BY hour_bucket) as prev_hour_index_usage,
-- Performance health score (0-100)
ROUND(
(excellent_queries::numeric / total_queries * 40) +
(good_queries::numeric / total_queries * 30) +
(fair_queries::numeric / total_queries * 20) +
(index_usage_ratio * 10),
0
) as performance_health_score
FROM performance_metrics
)
SELECT
TO_CHAR(hour_bucket, 'YYYY-MM-DD HH24:00') as monitoring_hour,
total_queries,
ROUND(avg_execution_time::numeric, 2) as avg_execution_time_ms,
ROUND(p95_execution_time::numeric, 2) as p95_execution_time_ms,
ROUND((index_usage_ratio * 100)::numeric, 1) as index_usage_percent,
ROUND((avg_selectivity * 100)::numeric, 2) as avg_selectivity_percent,
performance_health_score,
-- Performance distribution
CONCAT(
excellent_queries, ' excellent, ',
good_queries, ' good, ',
fair_queries, ' fair, ',
poor_queries, ' poor'
) as query_distribution,
-- Trend indicators
CASE
WHEN avg_execution_time > prev_hour_avg_time * 1.2 THEN '📈 Degrading'
WHEN avg_execution_time < prev_hour_avg_time * 0.8 THEN '📉 Improving'
ELSE '➡️ Stable'
END as performance_trend,
-- Health status
CASE
WHEN performance_health_score >= 90 THEN '🟢 Excellent'
WHEN performance_health_score >= 75 THEN '🟡 Good'
WHEN performance_health_score >= 60 THEN '🟠 Fair'
ELSE '🔴 Poor'
END as health_status,
-- Recommendations
CASE
WHEN performance_health_score < 60 THEN 'Immediate optimization required'
WHEN index_usage_ratio < 0.8 THEN 'Review query patterns and add missing indexes'
WHEN avg_selectivity < 0.1 THEN 'Improve query selectivity with better filtering'
WHEN poor_queries > total_queries * 0.1 THEN 'Optimize slow query patterns'
ELSE 'Performance within acceptable range'
END as recommendation
FROM performance_trends
WHERE hour_bucket >= CURRENT_TIMESTAMP - INTERVAL '24 hours'
ORDER BY hour_bucket DESC;
-- Index maintenance automation
CREATE PROCEDURE optimize_collection_indexes(
collection_name VARCHAR(100),
maintenance_mode VARCHAR(20) DEFAULT 'conservative'
) AS
BEGIN
-- Analyze current index usage
WITH index_analysis AS (
SELECT
index_name,
access_count,
size_bytes,
last_accessed,
CASE
WHEN access_count = 0 THEN 'unused'
WHEN access_count < 10 AND size_bytes > 10 * 1024 * 1024 THEN 'underutilized'
WHEN access_count > 50000 THEN 'high_usage'
ELSE 'normal'
END as usage_category
FROM mongodb_index_statistics(collection_name)
WHERE index_name != '_id_'
)
SELECT
COUNT(*) FILTER (WHERE usage_category = 'unused') as unused_count,
COUNT(*) FILTER (WHERE usage_category = 'underutilized') as underutilized_count,
SUM(size_bytes) FILTER (WHERE usage_category = 'unused') as unused_size_bytes
INTO TEMPORARY TABLE maintenance_summary;
-- Execute maintenance based on mode
CASE maintenance_mode
WHEN 'aggressive' THEN
-- Drop unused and underutilized indexes
CALL mongodb_drop_unused_indexes(collection_name);
CALL mongodb_review_underutilized_indexes(collection_name);
WHEN 'conservative' THEN
-- Only drop clearly unused indexes (0 access, older than 30 days)
CALL mongodb_drop_unused_indexes(collection_name, min_age_days => 30);
WHEN 'analyze_only' THEN
-- Generate report without making changes
CALL mongodb_generate_index_report(collection_name);
END CASE;
-- Log maintenance activity
INSERT INTO index_maintenance_log (
collection_name,
maintenance_mode,
maintenance_timestamp,
unused_indexes_dropped,
storage_saved_bytes
)
SELECT
collection_name,
maintenance_mode,
CURRENT_TIMESTAMP,
(SELECT unused_count FROM maintenance_summary),
(SELECT unused_size_bytes FROM maintenance_summary);
COMMIT;
END;
-- QueryLeaf provides comprehensive index optimization capabilities:
-- 1. SQL-familiar index creation and management syntax
-- 2. Advanced compound index strategies with ESR pattern optimization
-- 3. Automated query performance analysis and explain plan interpretation
-- 4. Index usage monitoring and utilization tracking
-- 5. Performance trend analysis and health scoring
-- 6. Automated optimization recommendations based on usage patterns
-- 7. Maintenance procedures for index lifecycle management
-- 8. Integration with MongoDB's native indexing and performance features
-- 9. Real-time performance monitoring with alerting capabilities
-- 10. Familiar SQL patterns for complex index optimization requirements
Best Practices for Index Optimization
Index Design Principles
Essential practices for effective MongoDB index optimization:
- ESR Pattern: Design compound indexes following Equality-Sort-Range order
- Query-First Design: Create indexes based on actual query patterns, not theoretical needs
- Selectivity Optimization: Place most selective fields first in compound indexes
- Index Intersection: Leverage MongoDB's ability to use multiple indexes for complex queries
- Covering Indexes: Include frequently accessed fields to avoid document lookups
- Maintenance Balance: Balance query performance with write performance and storage costs
Performance Monitoring
Implement comprehensive performance monitoring for production environments:
- Continuous Analysis: Monitor query performance patterns and execution statistics
- Usage Tracking: Track index utilization to identify unused or underutilized indexes
- Trend Analysis: Identify performance degradation trends before they impact users
- Automated Alerting: Set up alerts for slow queries and index efficiency metrics
- Regular Optimization: Schedule periodic index analysis and optimization cycles
- Capacity Planning: Monitor index growth and plan for scaling requirements
Conclusion
MongoDB Index Optimization provides comprehensive query performance tuning capabilities that eliminate the complexity and manual overhead of traditional database optimization approaches. The combination of intelligent compound indexing, automated performance analysis, and sophisticated query execution monitoring enables proactive performance management that scales with growing data volumes and evolving access patterns.
Key Index Optimization benefits include:
- Intelligent Compound Indexing: Advanced ESR pattern optimization for maximum query efficiency
- Automated Performance Analysis: Comprehensive query execution analysis with actionable recommendations
- Usage-Based Optimization: Index recommendations based on actual utilization patterns
- Comprehensive Monitoring: Real-time performance tracking with trend analysis and alerting
- Maintenance Automation: Automated cleanup of unused indexes and optimization suggestions
- Developer Familiarity: SQL-style optimization patterns with MongoDB's flexible indexing system
Whether you're building high-traffic web applications, analytics platforms, real-time systems, or any application requiring exceptional database performance, MongoDB Index Optimization with QueryLeaf's familiar SQL interface provides the foundation for enterprise-grade performance engineering. This combination enables you to implement sophisticated optimization strategies while preserving familiar database interaction patterns.
QueryLeaf Integration: QueryLeaf automatically translates SQL index operations into MongoDB index management, providing SQL-familiar CREATE INDEX syntax, EXPLAIN plan analysis, and performance monitoring queries. Advanced optimization strategies, compound index design, and automated maintenance are seamlessly handled through familiar SQL patterns, making enterprise performance optimization both powerful and accessible.
The integration of comprehensive optimization capabilities with SQL-style operations makes MongoDB an ideal platform for applications requiring both exceptional performance and familiar database optimization patterns, ensuring your performance solutions remain both effective and maintainable as they scale and evolve.