Skip to content

MongoDB Geospatial Queries and Location-Based Services: SQL-Style Spatial Operations for Modern Applications

Location-aware applications have become fundamental to modern software experiences - from ride-sharing platforms and delivery services to social networks and retail applications. These applications require sophisticated spatial data processing capabilities including proximity searches, route optimization, geofencing, and real-time location tracking that traditional relational databases struggle to handle efficiently.

MongoDB provides comprehensive geospatial functionality with support for 2D and 3D coordinates, multiple coordinate reference systems, and advanced spatial operations. Unlike traditional databases that require complex extensions for spatial data, MongoDB natively supports geospatial indexes, queries, and aggregation operations that can handle billions of location data points with sub-second query performance.

The Traditional Spatial Data Challenge

Relational databases face significant limitations when handling geospatial data and location-based queries:

-- Traditional PostgreSQL/PostGIS approach - complex setup and limited performance
-- Location-based application with spatial data

CREATE EXTENSION IF NOT EXISTS postgis;

-- Store locations with geometry data
CREATE TABLE locations (
    location_id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    category VARCHAR(100),
    address TEXT,
    city VARCHAR(100),
    state VARCHAR(50),
    country VARCHAR(100),

    -- PostGIS geometry column (complex setup required)
    coordinates GEOMETRY(POINT, 4326), -- WGS84 coordinate system

    -- Additional spatial data
    service_area GEOMETRY(POLYGON, 4326), -- Service coverage area
    delivery_zones GEOMETRY(MULTIPOLYGON, 4326), -- Multiple delivery zones

    -- Business data
    rating DECIMAL(3,2),
    total_reviews INTEGER DEFAULT 0,
    is_active BOOLEAN DEFAULT true,
    hours_of_operation JSONB,

    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create spatial indexes (requires PostGIS extension)
CREATE INDEX idx_locations_coordinates ON locations USING GIST (coordinates);
CREATE INDEX idx_locations_service_area ON locations USING GIST (service_area);

-- Store user locations and activities
CREATE TABLE user_locations (
    user_location_id SERIAL PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    coordinates GEOMETRY(POINT, 4326),
    accuracy_meters DECIMAL(8,2),
    recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    activity_type VARCHAR(50), -- 'check-in', 'delivery', 'movement'
    device_info JSONB
);

CREATE INDEX idx_user_locations_coordinates ON user_locations USING GIST (coordinates);
CREATE INDEX idx_user_locations_user_time ON user_locations (user_id, recorded_at);

-- Complex proximity search query
WITH nearby_locations AS (
    SELECT 
        l.location_id,
        l.name,
        l.category,
        l.rating,

        -- Distance calculation in meters
        ST_Distance(
            l.coordinates,
            ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326) -- San Francisco coordinates
        ) as distance_meters,

        -- Check if point is within service area
        ST_Contains(
            l.service_area,
            ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)
        ) as is_in_service_area,

        -- Convert coordinates back to lat/lng for application
        ST_Y(l.coordinates) as latitude,
        ST_X(l.coordinates) as longitude

    FROM locations l
    WHERE 
        l.is_active = true
        AND ST_DWithin(
            l.coordinates,
            ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326),
            5000 -- 5km radius in meters
        )
),
location_analytics AS (
    -- Add user activity data for locations
    SELECT 
        nl.*,
        COUNT(DISTINCT ul.user_id) as unique_visitors_last_30_days,
        COUNT(ul.user_location_id) as total_activities_last_30_days,
        AVG(ul.accuracy_meters) as avg_location_accuracy
    FROM nearby_locations nl
    LEFT JOIN user_locations ul ON ST_DWithin(
        ST_SetSRID(ST_MakePoint(nl.longitude, nl.latitude), 4326),
        ul.coordinates,
        100 -- Within 100 meters of location
    )
    AND ul.recorded_at >= CURRENT_DATE - INTERVAL '30 days'
    GROUP BY nl.location_id, nl.name, nl.category, nl.rating, 
             nl.distance_meters, nl.is_in_service_area, 
             nl.latitude, nl.longitude
)
SELECT 
    location_id,
    name,
    category,
    rating,
    ROUND(distance_meters::numeric, 0) as distance_meters,
    is_in_service_area,
    latitude,
    longitude,
    unique_visitors_last_30_days,
    total_activities_last_30_days,
    ROUND(avg_location_accuracy::numeric, 1) as avg_accuracy_meters,

    -- Relevance scoring based on distance, rating, and activity
    (
        (1000 - LEAST(distance_meters, 1000)) / 1000 * 0.4 + -- Distance factor (40%)
        (rating / 5.0) * 0.3 + -- Rating factor (30%)
        (LEAST(unique_visitors_last_30_days, 50) / 50.0) * 0.3 -- Activity factor (30%)
    ) as relevance_score

FROM location_analytics
ORDER BY relevance_score DESC, distance_meters ASC
LIMIT 20;

-- Problems with traditional spatial approach:
-- 1. Complex PostGIS extension setup and maintenance
-- 2. Requires specialized spatial database knowledge
-- 3. Limited coordinate system support without additional configuration
-- 4. Performance degrades with large datasets and complex queries
-- 5. Difficult integration with application object models
-- 6. Complex geometry data types and manipulation functions
-- 7. Limited aggregation capabilities for spatial analytics
-- 8. Challenging horizontal scaling for global applications
-- 9. Memory-intensive spatial operations
-- 10. Complex backup and restore procedures for spatial data

-- MySQL spatial limitations (even more restrictive):
CREATE TABLE locations_mysql (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255),
    -- MySQL spatial support limited and less capable
    coordinates POINT NOT NULL,
    SPATIAL INDEX(coordinates)
);

-- Basic proximity query in MySQL (limited functionality)
SELECT 
    id, name,
    ST_Distance_Sphere(
        coordinates, 
        POINT(-122.4194, 37.7749)
    ) as distance_meters
FROM locations_mysql
WHERE ST_Distance_Sphere(
    coordinates, 
    POINT(-122.4194, 37.7749)
) < 5000
ORDER BY distance_meters
LIMIT 10;

-- MySQL limitations:
-- - Limited spatial functions compared to PostGIS
-- - Poor performance with large spatial datasets
-- - No advanced spatial analytics capabilities
-- - Limited coordinate system support
-- - Basic geometry types only
-- - No spatial aggregation functions
-- - Difficult to implement complex spatial business logic

MongoDB provides comprehensive geospatial capabilities with simple, intuitive syntax:

// MongoDB native geospatial support - powerful and intuitive
const { MongoClient } = require('mongodb');

const client = new MongoClient('mongodb://localhost:27017');
const db = client.db('location_services');

// MongoDB geospatial document structure - native and flexible
const createLocationServiceDataModel = async () => {
  // Create locations collection with rich geospatial data
  const locations = db.collection('locations');

  // Example location document with geospatial data
  const locationDocument = {
    _id: new ObjectId(),

    // Basic business information
    name: "Blue Bottle Coffee - Ferry Building",
    category: "cafe",
    subcategory: "specialty_coffee",
    chain: "Blue Bottle Coffee",

    // Address information
    address: {
      street: "1 Ferry Building",
      unit: "Shop 7",
      city: "San Francisco",
      state: "CA",
      country: "USA",
      postalCode: "94111",
      formattedAddress: "1 Ferry Building, Shop 7, San Francisco, CA 94111"
    },

    // Primary location - GeoJSON Point format
    location: {
      type: "Point",
      coordinates: [-122.3937, 37.7955] // [longitude, latitude] - NOTE: MongoDB uses [lng, lat]
    },

    // Service area - GeoJSON Polygon format
    serviceArea: {
      type: "Polygon",
      coordinates: [[
        [-122.4050, 37.7850], // Southwest corner
        [-122.3850, 37.7850], // Southeast corner  
        [-122.3850, 37.8050], // Northeast corner
        [-122.4050, 37.8050], // Northwest corner
        [-122.4050, 37.7850]  // Close polygon
      ]]
    },

    // Multiple delivery zones - GeoJSON MultiPolygon
    deliveryZones: {
      type: "MultiPolygon", 
      coordinates: [
        [[ // First delivery zone
          [-122.4000, 37.7900],
          [-122.3900, 37.7900],
          [-122.3900, 37.8000],
          [-122.4000, 37.8000],
          [-122.4000, 37.7900]
        ]],
        [[ // Second delivery zone
          [-122.4100, 37.7800],
          [-122.3950, 37.7800],
          [-122.3950, 37.7900],
          [-122.4100, 37.7900],
          [-122.4100, 37.7800]
        ]]
      ]
    },

    // Business information
    business: {
      rating: 4.6,
      totalReviews: 1247,
      priceRange: "$$",
      phoneNumber: "+1-415-555-0123",
      website: "https://bluebottlecoffee.com",
      isActive: true,
      isChain: true,

      // Hours of operation with geospatial considerations
      hours: {
        monday: { open: "06:00", close: "19:00", timezone: "America/Los_Angeles" },
        tuesday: { open: "06:00", close: "19:00", timezone: "America/Los_Angeles" },
        wednesday: { open: "06:00", close: "19:00", timezone: "America/Los_Angeles" },
        thursday: { open: "06:00", close: "19:00", timezone: "America/Los_Angeles" },
        friday: { open: "06:00", close: "20:00", timezone: "America/Los_Angeles" },
        saturday: { open: "07:00", close: "20:00", timezone: "America/Los_Angeles" },
        sunday: { open: "07:00", close: "19:00", timezone: "America/Los_Angeles" }
      },

      // Services and amenities
      amenities: ["wifi", "outdoor_seating", "takeout", "delivery", "mobile_payment"],
      specialties: ["single_origin", "cold_brew", "espresso", "pour_over"]
    },

    // Geospatial metadata
    geoMetadata: {
      coordinateSystem: "WGS84",
      accuracyMeters: 5,
      elevationMeters: 15,
      dataSource: "GPS_verified",
      lastVerified: new Date("2024-09-01"),

      // Nearby landmarks for context
      nearbyLandmarks: [
        {
          name: "Ferry Building Marketplace",
          distance: 50,
          bearing: "north"
        },
        {
          name: "Embarcadero BART Station", 
          distance: 200,
          bearing: "west"
        }
      ]
    },

    // Analytics and performance data
    analytics: {
      monthlyVisitors: 12500,
      averageVisitDuration: 25, // minutes
      peakHours: ["08:00-09:00", "12:00-13:00", "15:00-16:00"],
      popularDays: ["monday", "tuesday", "wednesday", "friday"],

      // Location-specific metrics
      locationMetrics: {
        averageWalkingTime: 3.5, // minutes from nearest transit
        parkingAvailability: "limited",
        accessibilityRating: 4.2,
        noiseLevel: "moderate",
        crowdLevel: "busy"
      }
    },

    // SEO and discovery
    searchTerms: [
      "coffee shop ferry building", 
      "blue bottle san francisco",
      "specialty coffee embarcadero",
      "third wave coffee downtown sf"
    ],

    tags: ["coffee", "cafe", "specialty", "artisan", "downtown", "waterfront"],

    createdAt: new Date("2024-01-15"),
    updatedAt: new Date("2024-09-14")
  };

  // Insert the location document
  await locations.insertOne(locationDocument);

  // Create geospatial index - 2dsphere for spherical geometry (Earth)
  await locations.createIndex({ location: "2dsphere" });
  await locations.createIndex({ serviceArea: "2dsphere" });
  await locations.createIndex({ deliveryZones: "2dsphere" });

  // Additional indexes for common queries
  await locations.createIndex({ category: 1, "business.rating": -1 });
  await locations.createIndex({ "business.isActive": 1, "location": "2dsphere" });
  await locations.createIndex({ tags: 1, "location": "2dsphere" });

  console.log("Location document and indexes created successfully");
  return locations;
};

// Advanced geospatial queries and operations
const performGeospatialOperations = async () => {
  const locations = db.collection('locations');

  // 1. Proximity Search - Find nearby locations
  console.log("=== Proximity Search ===");
  const userLocation = [-122.4194, 37.7749]; // San Francisco coordinates [lng, lat]

  const nearbyLocations = await locations.find({
    location: {
      $near: {
        $geometry: {
          type: "Point",
          coordinates: userLocation
        },
        $maxDistance: 5000, // 5km in meters
        $minDistance: 0
      }
    },
    "business.isActive": true
  }).limit(10).toArray();

  console.log(`Found ${nearbyLocations.length} locations within 5km`);

  // 2. Geo Within - Find locations within a specific area
  console.log("\n=== Geo Within Search ===");
  const searchPolygon = {
    type: "Polygon", 
    coordinates: [[
      [-122.4270, 37.7609], // Southwest corner
      [-122.3968, 37.7609], // Southeast corner
      [-122.3968, 37.7908], // Northeast corner  
      [-122.4270, 37.7908], // Northwest corner
      [-122.4270, 37.7609]  // Close polygon
    ]]
  };

  const locationsInArea = await locations.find({
    location: {
      $geoWithin: {
        $geometry: searchPolygon
      }
    },
    category: "restaurant"
  }).toArray();

  console.log(`Found ${locationsInArea.length} restaurants in specified area`);

  // 3. Geospatial Aggregation - Complex analytics
  console.log("\n=== Geospatial Analytics ===");
  const geospatialAnalytics = await locations.aggregate([
    // Match active locations
    {
      $match: {
        "business.isActive": true,
        location: {
          $geoWithin: {
            $centerSphere: [userLocation, 10 / 3963.2] // 10 miles radius
          }
        }
      }
    },

    // Calculate distance from user location
    {
      $addFields: {
        distanceFromUser: {
          $divide: [
            {
              $sqrt: {
                $add: [
                  {
                    $pow: [
                      { $subtract: [{ $arrayElemAt: ["$location.coordinates", 0] }, userLocation[0]] },
                      2
                    ]
                  },
                  {
                    $pow: [
                      { $subtract: [{ $arrayElemAt: ["$location.coordinates", 1] }, userLocation[1]] },
                      2
                    ]
                  }
                ]
              }
            },
            0.000009 // Approximate degrees to meters conversion
          ]
        }
      }
    },

    // Group by category and analyze
    {
      $group: {
        _id: "$category",
        totalLocations: { $sum: 1 },
        averageRating: { $avg: "$business.rating" },
        averageDistance: { $avg: "$distanceFromUser" },
        closestLocation: {
          $min: {
            name: "$name",
            distance: "$distanceFromUser",
            coordinates: "$location.coordinates"
          }
        },

        // Collect all locations in category
        locations: {
          $push: {
            name: "$name",
            rating: "$business.rating",
            distance: "$distanceFromUser",
            coordinates: "$location.coordinates"
          }
        },

        // Rating distribution
        highRatedCount: {
          $sum: { $cond: [{ $gte: ["$business.rating", 4.5] }, 1, 0] }
        },
        mediumRatedCount: {
          $sum: { $cond: [{ $and: [{ $gte: ["$business.rating", 3.5] }, { $lt: ["$business.rating", 4.5] }] }, 1, 0] }
        },
        lowRatedCount: {
          $sum: { $cond: [{ $lt: ["$business.rating", 3.5] }, 1, 0] }
        }
      }
    },

    // Calculate additional metrics
    {
      $addFields: {
        categoryDensity: { $divide: ["$totalLocations", 314] }, // per square km (10 mile radius ≈ 314 sq km)
        highRatedPercentage: { $multiply: [{ $divide: ["$highRatedCount", "$totalLocations"] }, 100] },
        averageDistanceKm: { $multiply: ["$averageDistance", 111] } // Rough conversion to km
      }
    },

    // Sort by total locations and rating
    {
      $sort: {
        totalLocations: -1,
        averageRating: -1
      }
    },

    // Format output
    {
      $project: {
        category: "$_id",
        totalLocations: 1,
        averageRating: { $round: ["$averageRating", 2] },
        averageDistanceKm: { $round: ["$averageDistanceKm", 2] },
        categoryDensity: { $round: ["$categoryDensity", 2] },
        highRatedPercentage: { $round: ["$highRatedPercentage", 1] },
        closestLocation: 1,
        ratingDistribution: {
          high: "$highRatedCount",
          medium: "$mediumRatedCount", 
          low: "$lowRatedCount"
        }
      }
    }
  ]).toArray();

  console.log("Geospatial Analytics Results:");
  console.log(JSON.stringify(geospatialAnalytics, null, 2));

  // 4. Route optimization - Find optimal path through multiple locations
  console.log("\n=== Route Optimization ===");
  const waypointLocations = [
    [-122.4194, 37.7749], // Start: San Francisco
    [-122.4094, 37.7849], // Waypoint 1
    [-122.3994, 37.7949], // Waypoint 2
    [-122.4194, 37.7749]  // End: Back to start
  ];

  // Find locations near each waypoint
  const routeAnalysis = await Promise.all(
    waypointLocations.map(async (waypoint, index) => {
      const nearbyOnRoute = await locations.find({
        location: {
          $near: {
            $geometry: {
              type: "Point",
              coordinates: waypoint
            },
            $maxDistance: 500 // 500m radius
          }
        },
        "business.isActive": true
      }).limit(5).toArray();

      return {
        waypointIndex: index,
        coordinates: waypoint,
        nearbyLocations: nearbyOnRoute.map(loc => ({
          name: loc.name,
          category: loc.category,
          rating: loc.business.rating,
          coordinates: loc.location.coordinates
        }))
      };
    })
  );

  console.log("Route Analysis:");
  console.log(JSON.stringify(routeAnalysis, null, 2));

  return {
    nearbyLocations: nearbyLocations.length,
    locationsInArea: locationsInArea.length,
    analyticsResults: geospatialAnalytics.length,
    routeWaypoints: routeAnalysis.length
  };
};

// Real-time location tracking and geofencing
const setupLocationTracking = async () => {
  const userLocations = db.collection('user_locations');
  const geofences = db.collection('geofences');

  // Create user location tracking document
  const userLocationDocument = {
    _id: new ObjectId(),
    userId: new ObjectId("64a1b2c3d4e5f6789012347a"),

    // Current location
    currentLocation: {
      type: "Point",
      coordinates: [-122.4194, 37.7749]
    },

    // Location metadata
    locationMetadata: {
      accuracy: 10, // meters
      altitude: 15, // meters above sea level
      heading: 45, // degrees from north
      speed: 1.5, // meters per second
      timestamp: new Date(),
      source: "GPS", // GPS, WiFi, Cellular, Manual
      batteryLevel: 85,

      // Device context
      device: {
        platform: "iOS",
        version: "17.1",
        model: "iPhone 15 Pro",
        appVersion: "2.1.0"
      }
    },

    // Location history (recent positions)
    locationHistory: [
      {
        location: {
          type: "Point", 
          coordinates: [-122.4204, 37.7739]
        },
        timestamp: new Date(Date.now() - 300000), // 5 minutes ago
        accuracy: 15,
        source: "GPS"
      },
      {
        location: {
          type: "Point",
          coordinates: [-122.4214, 37.7729] 
        },
        timestamp: new Date(Date.now() - 600000), // 10 minutes ago
        accuracy: 12,
        source: "GPS"
      }
    ],

    // Privacy and permissions
    privacy: {
      shareLocation: true,
      accuracyLevel: "precise", // precise, approximate, city
      shareWithFriends: true,
      shareWithBusiness: false,
      trackingEnabled: true
    },

    // Activity context
    activity: {
      type: "walking", // walking, driving, cycling, stationary
      confidence: 0.85,
      detectedTransition: null,
      lastActivity: "stationary"
    },

    createdAt: new Date(),
    updatedAt: new Date()
  };

  // Create indexes for location tracking
  await userLocations.createIndex({ currentLocation: "2dsphere" });
  await userLocations.createIndex({ userId: 1, "locationMetadata.timestamp": -1 });
  await userLocations.createIndex({ "locationHistory.location": "2dsphere" });

  await userLocations.insertOne(userLocationDocument);

  // Create geofence system
  const geofenceDocument = {
    _id: new ObjectId(),
    name: "Downtown Coffee Shop Promo Zone",
    description: "Special promotions for coffee shops in downtown area",

    // Geofence area
    area: {
      type: "Polygon",
      coordinates: [[
        [-122.4200, 37.7700],
        [-122.4100, 37.7700], 
        [-122.4100, 37.7800],
        [-122.4200, 37.7800],
        [-122.4200, 37.7700]
      ]]
    },

    // Geofence configuration
    config: {
      type: "promotional", // promotional, security, analytics, notification
      radius: null, // For circular geofences
      isActive: true,

      // Trigger conditions
      triggers: {
        onEnter: true,
        onExit: true,
        onDwell: true,
        dwellTimeMinutes: 5,

        // Rate limiting
        minTimeBetweenTriggers: 300, // seconds
        maxTriggersPerDay: 10
      },

      // Actions to take
      actions: {
        notification: {
          enabled: true,
          title: "Coffee Deals Nearby!",
          message: "Check out special offers at local coffee shops",
          deepLink: "app://offers/coffee"
        },
        analytics: {
          trackEntry: true,
          trackExit: true,
          trackDwellTime: true
        },
        webhook: {
          enabled: false,
          url: "https://api.example.com/geofence-trigger",
          method: "POST"
        }
      }
    },

    // Analytics
    analytics: {
      totalEnters: 1456,
      totalExits: 1423,
      avgDwellTimeMinutes: 12.5,
      uniqueUsers: 342,

      // Time-based patterns
      hourlyActivity: {
        "08": 45, "09": 78, "10": 23, "11": 34,
        "12": 89, "13": 67, "14": 45, "15": 56,
        "16": 78, "17": 123, "18": 89, "19": 34
      },

      dailyActivity: {
        "monday": 234, "tuesday": 189, "wednesday": 267,
        "thursday": 201, "friday": 298, "saturday": 156, "sunday": 111
      }
    },

    createdAt: new Date("2024-09-01"),
    updatedAt: new Date("2024-09-14")
  };

  await geofences.createIndex({ area: "2dsphere" });
  await geofences.createIndex({ "config.isActive": 1, "config.type": 1 });

  await geofences.insertOne(geofenceDocument);

  // Real-time geofence checking function
  const checkGeofences = async (userId, currentLocation) => {
    console.log("Checking geofences for user location...");

    // Find all active geofences that contain the user's location
    const triggeredGeofences = await geofences.find({
      "config.isActive": true,
      area: {
        $geoIntersects: {
          $geometry: {
            type: "Point",
            coordinates: currentLocation
          }
        }
      }
    }).toArray();

    console.log(`Found ${triggeredGeofences.length} triggered geofences`);

    // Process each triggered geofence
    for (const geofence of triggeredGeofences) {
      console.log(`Processing geofence: ${geofence.name}`);

      // Update analytics
      await geofences.updateOne(
        { _id: geofence._id },
        {
          $inc: { 
            "analytics.totalEnters": 1,
            [`analytics.hourlyActivity.${new Date().getHours().toString().padStart(2, '0')}`]: 1,
            [`analytics.dailyActivity.${new Date().toLocaleDateString('en-US', { weekday: 'long' }).toLowerCase()}`]: 1
          },
          $set: { updatedAt: new Date() }
        }
      );

      // Trigger actions (notifications, webhooks, etc.)
      if (geofence.config.actions.notification.enabled) {
        console.log(`Sending notification: ${geofence.config.actions.notification.title}`);
        // Implementation would send actual notification
      }
    }

    return triggeredGeofences;
  };

  // Test geofence checking
  const testLocation = [-122.4150, 37.7750]; // Point within the geofence
  const triggeredFences = await checkGeofences(userLocationDocument.userId, testLocation);

  return {
    userLocationDocument,
    geofenceDocument,
    triggeredGeofences: triggeredFences.length
  };
};

// Advanced spatial analytics and heatmap generation
const generateSpatialAnalytics = async () => {
  const locations = db.collection('locations');
  const userLocations = db.collection('user_locations');

  console.log("=== Generating Spatial Analytics ===");

  // 1. Location Density Analysis
  const locationDensityAnalysis = await locations.aggregate([
    {
      $match: {
        "business.isActive": true
      }
    },

    // Create grid cells for density analysis
    {
      $addFields: {
        gridCell: {
          lat: {
            $floor: {
              $multiply: [
                { $arrayElemAt: ["$location.coordinates", 1] }, // latitude
                1000 // Create 0.001 degree grid cells (~100m)
              ]
            }
          },
          lng: {
            $floor: {
              $multiply: [
                { $arrayElemAt: ["$location.coordinates", 0] }, // longitude  
                1000
              ]
            }
          }
        }
      }
    },

    // Group by grid cell
    {
      $group: {
        _id: "$gridCell",
        locationCount: { $sum: 1 },
        avgRating: { $avg: "$business.rating" },
        categories: { $push: "$category" },

        // Calculate center point of grid cell
        centerCoordinates: {
          $first: {
            type: "Point",
            coordinates: [
              { $divide: ["$gridCell.lng", 1000] },
              { $divide: ["$gridCell.lat", 1000] }
            ]
          }
        },

        // Business metrics
        totalReviews: { $sum: "$business.totalReviews" },
        uniqueCategories: { $addToSet: "$category" }
      }
    },

    // Calculate density metrics
    {
      $addFields: {
        densityScore: {
          $multiply: [
            "$locationCount",
            { $divide: ["$avgRating", 5] } // Weight by average rating
          ]
        },
        categoryDiversity: { $size: "$uniqueCategories" }
      }
    },

    // Sort by density
    {
      $sort: { densityScore: -1 }
    },

    {
      $limit: 20 // Top 20 densest areas
    },

    {
      $project: {
        gridId: "$_id",
        locationCount: 1,
        densityScore: { $round: ["$densityScore", 2] },
        avgRating: { $round: ["$avgRating", 2] },
        categoryDiversity: 1,
        totalReviews: 1,
        centerCoordinates: 1
      }
    }
  ]).toArray();

  console.log(`Location Density Analysis - Found ${locationDensityAnalysis.length} high-density areas`);

  // 2. User Movement Patterns
  const userMovementAnalysis = await userLocations.aggregate([
    {
      $match: {
        "locationMetadata.timestamp": {
          $gte: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000) // Last 7 days
        }
      }
    },

    // Unwind location history
    { $unwind: "$locationHistory" },

    // Calculate movement vectors
    {
      $addFields: {
        movement: {
          fromLat: { $arrayElemAt: ["$locationHistory.location.coordinates", 1] },
          fromLng: { $arrayElemAt: ["$locationHistory.location.coordinates", 0] },
          toLat: { $arrayElemAt: ["$currentLocation.coordinates", 1] },
          toLng: { $arrayElemAt: ["$currentLocation.coordinates", 0] },
          timestamp: "$locationHistory.timestamp"
        }
      }
    },

    // Calculate distance and bearing
    {
      $addFields: {
        "movement.distance": {
          // Haversine formula approximation
          $multiply: [
            6371000, // Earth radius in meters
            {
              $acos: {
                $add: [
                  {
                    $multiply: [
                      { $sin: { $multiply: [{ $degreesToRadians: "$movement.fromLat" }, 1] } },
                      { $sin: { $multiply: [{ $degreesToRadians: "$movement.toLat" }, 1] } }
                    ]
                  },
                  {
                    $multiply: [
                      { $cos: { $multiply: [{ $degreesToRadians: "$movement.fromLat" }, 1] } },
                      { $cos: { $multiply: [{ $degreesToRadians: "$movement.toLat" }, 1] } },
                      { $cos: {
                        $multiply: [
                          { $degreesToRadians: { $subtract: ["$movement.toLng", "$movement.fromLng"] } },
                          1
                        ]
                      } }
                    ]
                  }
                ]
              }
            }
          ]
        }
      }
    },

    // Group movement patterns
    {
      $group: {
        _id: {
          hour: { $hour: "$movement.timestamp" },
          dayOfWeek: { $dayOfWeek: "$movement.timestamp" }
        },

        totalMovements: { $sum: 1 },
        avgDistance: { $avg: "$movement.distance" },
        totalDistance: { $sum: "$movement.distance" },
        uniqueUsers: { $addToSet: "$userId" },

        // Movement characteristics
        shortMovements: {
          $sum: { $cond: [{ $lt: ["$movement.distance", 100] }, 1, 0] } // < 100m
        },
        mediumMovements: {
          $sum: { $cond: [
            { $and: [
              { $gte: ["$movement.distance", 100] },
              { $lt: ["$movement.distance", 1000] }
            ]}, 1, 0
          ] } // 100m - 1km
        },
        longMovements: {
          $sum: { $cond: [{ $gte: ["$movement.distance", 1000] }, 1, 0] } // > 1km
        }
      }
    },

    // Calculate additional metrics
    {
      $addFields: {
        uniqueUserCount: { $size: "$uniqueUsers" },
        avgMovementsPerUser: { $divide: ["$totalMovements", { $size: "$uniqueUsers" }] },
        movementDistribution: {
          short: { $divide: ["$shortMovements", "$totalMovements"] },
          medium: { $divide: ["$mediumMovements", "$totalMovements"] },
          long: { $divide: ["$longMovements", "$totalMovements"] }
        }
      }
    },

    {
      $sort: { totalMovements: -1 }
    },

    {
      $project: {
        hour: "$_id.hour",
        dayOfWeek: "$_id.dayOfWeek", 
        totalMovements: 1,
        uniqueUserCount: 1,
        avgDistance: { $round: ["$avgDistance", 1] },
        avgMovementsPerUser: { $round: ["$avgMovementsPerUser", 1] },
        movementDistribution: {
          short: { $round: ["$movementDistribution.short", 3] },
          medium: { $round: ["$movementDistribution.medium", 3] },
          long: { $round: ["$movementDistribution.long", 3] }
        }
      }
    }
  ]).toArray();

  console.log(`User Movement Analysis - Analyzed ${userMovementAnalysis.length} time periods`);

  // 3. Geographic Performance Analysis
  const geoPerformanceAnalysis = await locations.aggregate([
    {
      $match: {
        "business.isActive": true,
        "analytics.monthlyVisitors": { $exists: true }
      }
    },

    // Create geographic regions
    {
      $addFields: {
        region: {
          $switch: {
            branches: [
              {
                case: {
                  $and: [
                    { $gte: [{ $arrayElemAt: ["$location.coordinates", 1] }, 37.77] }, // North of 37.77°N
                    { $lte: [{ $arrayElemAt: ["$location.coordinates", 0] }, -122.41] } // West of -122.41°W
                  ]
                },
                then: "Northwest"
              },
              {
                case: {
                  $and: [
                    { $gte: [{ $arrayElemAt: ["$location.coordinates", 1] }, 37.77] },
                    { $gt: [{ $arrayElemAt: ["$location.coordinates", 0] }, -122.41] }
                  ]
                },
                then: "Northeast"
              },
              {
                case: {
                  $and: [
                    { $lt: [{ $arrayElemAt: ["$location.coordinates", 1] }, 37.77] },
                    { $lte: [{ $arrayElemAt: ["$location.coordinates", 0] }, -122.41] }
                  ]
                },
                then: "Southwest"
              },
              {
                case: {
                  $and: [
                    { $lt: [{ $arrayElemAt: ["$location.coordinates", 1] }, 37.77] },
                    { $gt: [{ $arrayElemAt: ["$location.coordinates", 0] }, -122.41] }
                  ]
                },
                then: "Southeast"
              }
            ],
            default: "Other"
          }
        }
      }
    },

    // Group by region and category
    {
      $group: {
        _id: {
          region: "$region",
          category: "$category"
        },

        locationCount: { $sum: 1 },
        avgRating: { $avg: "$business.rating" },
        avgMonthlyVisitors: { $avg: "$analytics.monthlyVisitors" },
        totalMonthlyVisitors: { $sum: "$analytics.monthlyVisitors" },

        // Performance metrics
        highPerformers: {
          $sum: {
            $cond: [
              {
                $and: [
                  { $gte: ["$business.rating", 4.5] },
                  { $gte: ["$analytics.monthlyVisitors", 10000] }
                ]
              }, 1, 0
            ]
          }
        },

        topLocation: {
          $max: {
            name: "$name",
            visitors: "$analytics.monthlyVisitors",
            rating: "$business.rating"
          }
        }
      }
    },

    // Calculate regional metrics
    {
      $group: {
        _id: "$_id.region",

        categories: {
          $push: {
            category: "$_id.category",
            locationCount: "$locationCount",
            avgRating: "$avgRating",
            avgMonthlyVisitors: "$avgMonthlyVisitors",
            totalMonthlyVisitors: "$totalMonthlyVisitors",
            highPerformers: "$highPerformers",
            topLocation: "$topLocation"
          }
        },

        regionalTotals: {
          totalLocations: { $sum: "$locationCount" },
          totalMonthlyVisitors: { $sum: "$totalMonthlyVisitors" },
          totalHighPerformers: { $sum: "$highPerformers" }
        }
      }
    },

    // Sort by total visitors
    {
      $sort: { "regionalTotals.totalMonthlyVisitors": -1 }
    },

    {
      $project: {
        region: "$_id",
        categories: 1,
        regionalTotals: 1,

        // Calculate regional performance metrics
        performanceMetrics: {
          avgVisitorsPerLocation: {
            $divide: ["$regionalTotals.totalMonthlyVisitors", "$regionalTotals.totalLocations"]
          },
          highPerformerRatio: {
            $divide: ["$regionalTotals.totalHighPerformers", "$regionalTotals.totalLocations"]
          }
        }
      }
    }
  ]).toArray();

  console.log(`Geographic Performance Analysis - Analyzed ${geoPerformanceAnalysis.length} regions`);

  return {
    densityAnalysis: locationDensityAnalysis,
    movementAnalysis: userMovementAnalysis,
    performanceAnalysis: geoPerformanceAnalysis,

    summary: {
      densityHotspots: locationDensityAnalysis.length,
      movementPatterns: userMovementAnalysis.length,
      regionalInsights: geoPerformanceAnalysis.length
    }
  };
};

// Benefits of MongoDB Geospatial Features:
// - Native GeoJSON support with automatic validation
// - Multiple coordinate reference systems (2D, 2dsphere)
// - Built-in spatial operators and aggregation functions
// - Automatic spatial indexing with B-tree and R-tree structures
// - Spherical geometry calculations for Earth-based applications
// - Integration with aggregation framework for complex analytics
// - Real-time geofencing and location tracking capabilities
// - Scalable to billions of location data points
// - Simple query syntax compared to PostGIS extensions
// - No additional setup required - works out of the box

module.exports = {
  createLocationServiceDataModel,
  performGeospatialOperations,
  setupLocationTracking,
  generateSpatialAnalytics
};

Understanding MongoDB Geospatial Architecture

Coordinate Systems and Indexing Strategies

MongoDB supports multiple geospatial indexing approaches optimized for different use cases:

// Advanced geospatial indexing and coordinate system management
class GeospatialIndexManager {
  constructor(db) {
    this.db = db;
    this.collections = new Map();
  }

  async setupGeospatialIndexing() {
    // 1. 2dsphere Index - For spherical geometry (Earth-based coordinates)
    const locations = this.db.collection('locations');

    // Create 2dsphere index for GeoJSON objects
    await locations.createIndex({ location: "2dsphere" });

    // Compound index for filtered geospatial queries
    await locations.createIndex({ 
      category: 1, 
      "business.isActive": 1, 
      location: "2dsphere" 
    });

    // Text and geospatial compound index
    await locations.createIndex({
      "$**": "text",
      location: "2dsphere"
    });

    console.log("2dsphere indexes created for global location queries");

    // 2. 2d Index - For flat geometry (game maps, floor plans)
    const gameLocations = this.db.collection('game_locations');

    // 2d index for flat coordinate system (e.g., game world coordinates)
    await gameLocations.createIndex({ position: "2d" });

    // Example game location document
    const gameLocationDoc = {
      _id: new ObjectId(),
      playerId: new ObjectId(),
      characterName: "DragonSlayer42",

      // Flat 2D coordinates for game world
      position: [1250.5, 875.2], // [x, y] coordinates in game units

      // Game-specific data
      level: 45,
      zone: "Enchanted Forest",
      server: "US-East-1",

      // Bounding box for area of influence
      areaOfInfluence: {
        bottomLeft: [1200, 825],
        topRight: [1300, 925]
      },

      lastUpdated: new Date()
    };

    await gameLocations.insertOne(gameLocationDoc);
    console.log("2d index created for flat coordinate system");

    // 3. Specialized indexing for different data patterns
    const trajectories = this.db.collection('vehicle_trajectories');

    // Index for trajectory lines and paths
    await trajectories.createIndex({ route: "2dsphere" });
    await trajectories.createIndex({ vehicleId: 1, timestamp: 1 });

    // Example trajectory document
    const trajectoryDoc = {
      _id: new ObjectId(),
      vehicleId: "TRUCK_001",
      driverId: new ObjectId(),

      // LineString geometry for route
      route: {
        type: "LineString",
        coordinates: [
          [-122.4194, 37.7749], // Start point
          [-122.4184, 37.7759], // Waypoint 1
          [-122.4174, 37.7769], // Waypoint 2
          [-122.4164, 37.7779]  // End point
        ]
      },

      // Route metadata
      routeMetadata: {
        totalDistance: 2.3, // km
        estimatedTime: 8, // minutes
        actualTime: 9.5, // minutes
        fuelUsed: 0.45, // liters
        trafficConditions: "moderate"
      },

      // Time-based tracking
      startTime: new Date("2024-09-18T14:30:00Z"),
      endTime: new Date("2024-09-18T14:39:30Z"),

      // Performance metrics
      metrics: {
        averageSpeed: 14.5, // km/h
        maxSpeed: 25.0,
        idleTime: 45, // seconds
        hardBrakingEvents: 1,
        hardAccelerationEvents: 0
      }
    };

    await trajectories.insertOne(trajectoryDoc);
    console.log("Trajectory tracking setup completed");

    return {
      sphericalIndexes: ["locations.location", "locations.compound"],
      flatIndexes: ["game_locations.position"],
      trajectoryIndexes: ["trajectories.route"]
    };
  }

  async performAdvancedSpatialQueries() {
    const locations = this.db.collection('locations');

    // 1. Multi-stage geospatial aggregation
    console.log("=== Advanced Spatial Aggregation ===");

    const complexSpatialAnalysis = await locations.aggregate([
      // Stage 1: Geospatial filtering
      {
        $geoNear: {
          near: {
            type: "Point",
            coordinates: [-122.4194, 37.7749]
          },
          distanceField: "calculatedDistance",
          maxDistance: 10000, // 10km
          spherical: true,
          query: { "business.isActive": true }
        }
      },

      // Stage 2: Spatial relationship analysis
      {
        $addFields: {
          // Distance categories
          distanceCategory: {
            $switch: {
              branches: [
                { case: { $lte: ["$calculatedDistance", 1000] }, then: "nearby" },
                { case: { $lte: ["$calculatedDistance", 5000] }, then: "moderate" },
                { case: { $lte: ["$calculatedDistance", 10000] }, then: "distant" }
              ],
              default: "very_distant"
            }
          },

          // Spatial density calculation
          spatialDensity: {
            $divide: ["$analytics.monthlyVisitors", { $add: ["$calculatedDistance", 1] }]
          }
        }
      },

      // Stage 3: Complex geospatial grouping
      {
        $group: {
          _id: {
            category: "$category",
            distanceCategory: "$distanceCategory"
          },

          locations: { $push: "$$ROOT" },
          avgDistance: { $avg: "$calculatedDistance" },
          avgRating: { $avg: "$business.rating" },
          avgDensity: { $avg: "$spatialDensity" },
          count: { $sum: 1 },

          // Geospatial aggregations
          centroid: {
            $avg: {
              coordinates: "$location.coordinates"
            }
          },

          // Bounding box calculation
          minLat: { $min: { $arrayElemAt: ["$location.coordinates", 1] } },
          maxLat: { $max: { $arrayElemAt: ["$location.coordinates", 1] } },
          minLng: { $min: { $arrayElemAt: ["$location.coordinates", 0] } },
          maxLng: { $max: { $arrayElemAt: ["$location.coordinates", 0] } }
        }
      },

      // Stage 4: Spatial statistics
      {
        $addFields: {
          boundingBox: {
            type: "Polygon",
            coordinates: [[
              ["$minLng", "$minLat"],
              ["$maxLng", "$minLat"], 
              ["$maxLng", "$maxLat"],
              ["$minLng", "$maxLat"],
              ["$minLng", "$minLat"]
            ]]
          },

          // Geographic spread calculation
          geographicSpread: {
            $sqrt: {
              $add: [
                { $pow: [{ $subtract: ["$maxLat", "$minLat"] }, 2] },
                { $pow: [{ $subtract: ["$maxLng", "$minLng"] }, 2] }
              ]
            }
          }
        }
      },

      {
        $sort: { count: -1, avgDensity: -1 }
      }
    ]).toArray();

    console.log(`Complex Spatial Analysis - ${complexSpatialAnalysis.length} category/distance combinations`);

    // 2. Intersection and overlay queries
    console.log("\n=== Spatial Intersection Analysis ===");

    const intersectionAnalysis = await locations.aggregate([
      {
        $match: {
          "business.isActive": true,
          deliveryZones: { $exists: true }
        }
      },

      // Find intersections between delivery zones
      {
        $lookup: {
          from: "locations",
          let: { currentZones: "$deliveryZones" },
          pipeline: [
            {
              $match: {
                $expr: {
                  $and: [
                    { $ne: ["$_id", "$$ROOT._id"] }, // Different location
                    { $ne: ["$$currentZones", null] },
                    {
                      $gt: [{
                        $size: {
                          $filter: {
                            input: "$deliveryZones.coordinates",
                            cond: {
                              // Simplified intersection check
                              $anyElementTrue: {
                                $map: {
                                  input: "$$currentZones.coordinates",
                                  in: { $ne: ["$$this", null] }
                                }
                              }
                            }
                          }
                        }
                      }, 0]
                    }
                  ]
                }
              }
            },
            {
              $project: {
                name: 1,
                category: 1,
                "business.rating": 1
              }
            }
          ],
          as: "overlappingLocations"
        }
      },

      // Calculate overlap metrics
      {
        $addFields: {
          overlapCount: { $size: "$overlappingLocations" },
          hasOverlap: { $gt: [{ $size: "$overlappingLocations" }, 0] },
          competitionLevel: {
            $switch: {
              branches: [
                { case: { $gte: [{ $size: "$overlappingLocations" }, 5] }, then: "high" },
                { case: { $gte: [{ $size: "$overlappingLocations" }, 2] }, then: "medium" },
                { case: { $gt: [{ $size: "$overlappingLocations" }, 0] }, then: "low" }
              ],
              default: "none"
            }
          }
        }
      },

      {
        $match: { hasOverlap: true }
      },

      {
        $group: {
          _id: "$category",
          avgOverlapCount: { $avg: "$overlapCount" },
          locationsWithOverlap: { $sum: 1 },
          highCompetitionAreas: {
            $sum: { $cond: [{ $eq: ["$competitionLevel", "high"] }, 1, 0] }
          }
        }
      },

      { $sort: { avgOverlapCount: -1 } }
    ]).toArray();

    console.log(`Intersection Analysis - ${intersectionAnalysis.length} categories with delivery zone overlaps`);

    // 3. Temporal-spatial analysis
    console.log("\n=== Temporal-Spatial Analysis ===");

    const temporalSpatialAnalysis = await this.db.collection('user_locations').aggregate([
      {
        $match: {
          "locationMetadata.timestamp": {
            $gte: new Date(Date.now() - 24 * 60 * 60 * 1000) // Last 24 hours
          }
        }
      },

      // Unwind location history for temporal analysis
      { $unwind: "$locationHistory" },

      // Create time buckets
      {
        $addFields: {
          timeBucket: {
            $dateTrunc: {
              date: "$locationHistory.timestamp",
              unit: "hour"
            }
          },

          // Grid cell for spatial grouping
          spatialGrid: {
            lat: {
              $floor: {
                $multiply: [
                  { $arrayElemAt: ["$locationHistory.location.coordinates", 1] },
                  1000 // 0.001 degree precision
                ]
              }
            },
            lng: {
              $floor: {
                $multiply: [
                  { $arrayElemAt: ["$locationHistory.location.coordinates", 0] },
                  1000
                ]
              }
            }
          }
        }
      },

      // Group by time and space
      {
        $group: {
          _id: {
            timeBucket: "$timeBucket",
            spatialGrid: "$spatialGrid"
          },

          uniqueUsers: { $addToSet: "$userId" },
          totalEvents: { $sum: 1 },
          avgAccuracy: { $avg: "$locationHistory.accuracy" },

          // Location cluster center
          centerLat: { $avg: { $arrayElemAt: ["$locationHistory.location.coordinates", 1] } },
          centerLng: { $avg: { $arrayElemAt: ["$locationHistory.location.coordinates", 0] } }
        }
      },

      // Calculate density metrics
      {
        $addFields: {
          userDensity: { $size: "$uniqueUsers" },
          eventDensity: "$totalEvents",
          densityScore: { $multiply: [{ $size: "$uniqueUsers" }, { $log: { $add: ["$totalEvents", 1] } }] }
        }
      },

      // Temporal pattern analysis
      {
        $group: {
          _id: { $hour: "$_id.timeBucket" },

          totalGridCells: { $sum: 1 },
          avgUserDensity: { $avg: "$userDensity" },
          maxUserDensity: { $max: "$userDensity" },
          totalUniqueUsers: { $sum: "$userDensity" },

          // Hotspot identification
          hotspots: {
            $push: {
              $cond: [
                { $gte: ["$densityScore", 10] },
                {
                  center: { type: "Point", coordinates: ["$centerLng", "$centerLat"] },
                  userDensity: "$userDensity",
                  densityScore: "$densityScore"
                },
                null
              ]
            }
          }
        }
      },

      // Clean up hotspots array
      {
        $addFields: {
          hotspots: {
            $filter: {
              input: "$hotspots",
              cond: { $ne: ["$$this", null] }
            }
          }
        }
      },

      { $sort: { "_id": 1 } },

      {
        $project: {
          hour: "$_id",
          totalGridCells: 1,
          avgUserDensity: { $round: ["$avgUserDensity", 2] },
          maxUserDensity: 1,
          totalUniqueUsers: 1,
          hotspotCount: { $size: "$hotspots" },
          topHotspots: { $slice: ["$hotspots", 5] }
        }
      }
    ]).toArray();

    console.log(`Temporal-Spatial Analysis - ${temporalSpatialAnalysis.length} hourly patterns`);

    return {
      complexSpatialResults: complexSpatialAnalysis.length,
      intersectionResults: intersectionAnalysis.length,  
      temporalSpatialResults: temporalSpatialAnalysis.length,

      insights: {
        spatialComplexity: complexSpatialAnalysis,
        deliveryOverlaps: intersectionAnalysis,
        hourlyPatterns: temporalSpatialAnalysis
      }
    };
  }

  async optimizeGeospatialPerformance() {
    console.log("=== Geospatial Performance Optimization ===");

    // 1. Index performance analysis
    const locations = this.db.collection('locations');

    // Test different query patterns
    const performanceTests = [
      {
        name: "Simple Proximity Query",
        query: {
          location: {
            $near: {
              $geometry: { type: "Point", coordinates: [-122.4194, 37.7749] },
              $maxDistance: 5000
            }
          }
        }
      },
      {
        name: "Filtered Proximity Query", 
        query: {
          location: {
            $near: {
              $geometry: { type: "Point", coordinates: [-122.4194, 37.7749] },
              $maxDistance: 5000
            }
          },
          category: "restaurant",
          "business.isActive": true
        }
      },
      {
        name: "Geo Within Query",
        query: {
          location: {
            $geoWithin: {
              $centerSphere: [[-122.4194, 37.7749], 5 / 3963.2] // 5 miles
            }
          }
        }
      }
    ];

    const performanceResults = [];

    for (const test of performanceTests) {
      const startTime = Date.now();

      const results = await locations.find(test.query)
        .limit(20)
        .explain("executionStats");

      const executionTime = Date.now() - startTime;

      performanceResults.push({
        testName: test.name,
        executionTimeMs: executionTime,
        documentsExamined: results.executionStats.totalDocsExamined,
        documentsReturned: results.executionStats.totalDocsReturned,
        indexUsed: results.executionStats.executionStages?.indexName || "none",
        efficiency: results.executionStats.totalDocsReturned / Math.max(results.executionStats.totalDocsExamined, 1)
      });
    }

    console.log("Performance Test Results:");
    performanceResults.forEach(result => {
      console.log(`${result.testName}: ${result.executionTimeMs}ms, Efficiency: ${(result.efficiency * 100).toFixed(1)}%`);
    });

    // 2. Index recommendations
    const indexRecommendations = await this.analyzeIndexUsage(locations);

    // 3. Memory usage optimization
    const memoryOptimization = await this.optimizeMemoryUsage(locations);

    return {
      performanceResults,
      indexRecommendations,
      memoryOptimization,

      recommendations: [
        "Use 2dsphere indexes for Earth-based coordinates",
        "Include commonly filtered fields in compound indexes",
        "Limit result sets with appropriate $maxDistance values", 
        "Use $geoNear aggregation for complex distance-based analytics",
        "Monitor index usage and query patterns regularly"
      ]
    };
  }

  async analyzeIndexUsage(collection) {
    // Get index usage statistics
    const indexStats = await collection.aggregate([
      { $indexStats: {} }
    ]).toArray();

    const recommendations = [];

    indexStats.forEach(stat => {
      const usageRatio = stat.accesses.ops / (stat.accesses.since?.getTime() || 1);

      if (usageRatio < 0.001) {
        recommendations.push({
          type: "remove",
          index: stat.name,
          reason: "Low usage index - consider removing",
          usage: usageRatio
        });
      } else if (usageRatio > 10) {
        recommendations.push({
          type: "optimize",
          index: stat.name, 
          reason: "High usage index - ensure optimal configuration",
          usage: usageRatio
        });
      }
    });

    return {
      totalIndexes: indexStats.length,
      recommendations: recommendations,
      indexStats: indexStats
    };
  }

  async optimizeMemoryUsage(collection) {
    // Analyze document sizes and memory patterns
    const sizeAnalysis = await collection.aggregate([
      {
        $project: {
          documentSize: { $bsonSize: "$$ROOT" },
          hasLocationHistory: { $ne: ["$locationHistory", null] },
          locationHistorySize: { $size: { $ifNull: ["$locationHistory", []] } },
          hasDeliveryZones: { $ne: ["$deliveryZones", null] }
        }
      },
      {
        $group: {
          _id: null,

          avgDocumentSize: { $avg: "$documentSize" },
          maxDocumentSize: { $max: "$documentSize" },
          minDocumentSize: { $min: "$documentSize" },

          largeDocuments: { $sum: { $cond: [{ $gt: ["$documentSize", 16384] }, 1, 0] } }, // > 16KB
          documentsWithHistory: { $sum: { $cond: ["$hasLocationHistory", 1, 0] } },
          avgHistorySize: { $avg: "$locationHistorySize" },

          totalDocuments: { $sum: 1 }
        }
      }
    ]).toArray();

    const analysis = sizeAnalysis[0] || {};

    const optimizationTips = [];

    if (analysis.avgDocumentSize > 8192) {
      optimizationTips.push("Consider splitting large documents or using references");
    }

    if (analysis.avgHistorySize > 100) {
      optimizationTips.push("Limit location history array size or archive old data");
    }

    if (analysis.largeDocuments > analysis.totalDocuments * 0.1) {
      optimizationTips.push("High number of large documents - review document structure");
    }

    return {
      sizeAnalysis: analysis,
      optimizationTips: optimizationTips,

      recommendations: {
        documentSize: "Keep documents under 16MB, optimal under 1MB",
        arrays: "Limit embedded arrays to prevent unbounded growth", 
        indexing: "Use partial indexes for sparse geospatial data",
        sharding: "Consider sharding key that includes geospatial distribution"
      }
    };
  }
}

SQL-Style Geospatial Operations with QueryLeaf

QueryLeaf provides familiar SQL syntax for MongoDB's powerful geospatial capabilities:

-- QueryLeaf geospatial operations with SQL-familiar syntax

-- Create geospatial-enabled table/collection
CREATE TABLE locations (
  id OBJECTID PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  category VARCHAR(100),

  -- Geospatial columns with native GeoJSON support
  location POINT NOT NULL, -- GeoJSON Point
  service_area POLYGON,    -- GeoJSON Polygon
  delivery_zones MULTIPOLYGON, -- GeoJSON MultiPolygon

  -- Business data
  rating DECIMAL(3,2),
  total_reviews INTEGER DEFAULT 0,
  is_active BOOLEAN DEFAULT true,

  -- Address information
  address DOCUMENT {
    street VARCHAR(255),
    city VARCHAR(100),
    state VARCHAR(50),
    country VARCHAR(100),
    postal_code VARCHAR(20)
  },

  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create geospatial indexes
CREATE SPATIAL INDEX idx_locations_location ON locations (location);
CREATE SPATIAL INDEX idx_locations_service_area ON locations (service_area);
CREATE COMPOUND INDEX idx_locations_category_geo ON locations (category, location);

-- Insert location data with geospatial coordinates
INSERT INTO locations (name, category, location, service_area, address, rating, total_reviews)
VALUES (
  'Blue Bottle Coffee',
  'cafe', 
  ST_POINT(-122.3937, 37.7955), -- Longitude, Latitude
  ST_POLYGON(ARRAY[
    ARRAY[-122.4050, 37.7850], -- Southwest
    ARRAY[-122.3850, 37.7850], -- Southeast  
    ARRAY[-122.3850, 37.8050], -- Northeast
    ARRAY[-122.4050, 37.8050], -- Northwest
    ARRAY[-122.4050, 37.7850]  -- Close polygon
  ]),
  {
    street: '1 Ferry Building',
    city: 'San Francisco',
    state: 'CA',
    country: 'USA',
    postal_code: '94111'
  },
  4.6,
  1247
);

-- Proximity search - find nearby locations
SELECT 
  id,
  name,
  category,
  rating,

  -- Calculate distance in meters
  ST_DISTANCE(location, ST_POINT(-122.4194, 37.7749)) as distance_meters,

  -- Extract coordinates for display
  ST_X(location) as longitude,
  ST_Y(location) as latitude,

  -- Address information
  address.street,
  address.city,
  address.state

FROM locations
WHERE 
  is_active = true
  AND ST_DISTANCE(location, ST_POINT(-122.4194, 37.7749)) <= 5000 -- Within 5km
  AND category IN ('cafe', 'restaurant', 'retail')
ORDER BY ST_DISTANCE(location, ST_POINT(-122.4194, 37.7749))
LIMIT 20;

-- Advanced proximity search with relevance scoring
WITH nearby_locations AS (
  SELECT 
    *,
    ST_DISTANCE(location, ST_POINT(-122.4194, 37.7749)) as distance_meters
  FROM locations
  WHERE 
    is_active = true
    AND ST_DWITHIN(location, ST_POINT(-122.4194, 37.7749), 10000) -- 10km radius
),
scored_locations AS (
  SELECT *,
    -- Relevance scoring: distance (40%) + rating (30%) + reviews (30%)
    (
      (1000 - LEAST(distance_meters, 1000)) / 1000 * 0.4 +
      (rating / 5.0) * 0.3 +
      (LEAST(total_reviews, 1000) / 1000.0) * 0.3
    ) as relevance_score,

    -- Distance categories
    CASE 
      WHEN distance_meters <= 1000 THEN 'nearby'
      WHEN distance_meters <= 5000 THEN 'moderate'
      ELSE 'distant'
    END as distance_category

  FROM nearby_locations
)
SELECT 
  name,
  category, 
  rating,
  total_reviews,
  ROUND(distance_meters) as distance_m,
  distance_category,
  ROUND(relevance_score, 3) as relevance,

  -- Format coordinates for maps
  CONCAT(
    ROUND(ST_Y(location), 6), ',', 
    ROUND(ST_X(location), 6)
  ) as lat_lng

FROM scored_locations
ORDER BY relevance_score DESC, distance_meters ASC
LIMIT 25;

-- Geospatial area queries
SELECT 
  l.name,
  l.category,
  l.rating,

  -- Check if location is within specific area
  ST_CONTAINS(
    ST_POLYGON(ARRAY[
      ARRAY[-122.4270, 37.7609], -- Downtown SF polygon
      ARRAY[-122.3968, 37.7609],
      ARRAY[-122.3968, 37.7908], 
      ARRAY[-122.4270, 37.7908],
      ARRAY[-122.4270, 37.7609]
    ]),
    l.location
  ) as is_in_downtown,

  -- Check service area coverage
  ST_CONTAINS(l.service_area, ST_POINT(-122.4194, 37.7749)) as serves_user_location

FROM locations l
WHERE 
  l.is_active = true
  AND ST_INTERSECTS(
    l.location,
    ST_POLYGON(ARRAY[
      ARRAY[-122.4270, 37.7609],
      ARRAY[-122.3968, 37.7609], 
      ARRAY[-122.3968, 37.7908],
      ARRAY[-122.4270, 37.7908],
      ARRAY[-122.4270, 37.7609]
    ])
  );

-- Complex geospatial analytics with aggregation
WITH location_analytics AS (
  SELECT 
    category,

    -- Spatial clustering analysis
    ST_CLUSTERKMEANS(location, 5) OVER () as cluster_id,

    -- Distance from city center
    ST_DISTANCE(location, ST_POINT(-122.4194, 37.7749)) as distance_from_center,

    -- Geospatial grid for density analysis
    ST_SNAPGRID(location, 0.001, 0.001) as grid_cell,

    name,
    rating,
    total_reviews,
    location

  FROM locations
  WHERE is_active = true
),
cluster_analysis AS (
  SELECT 
    cluster_id,
    category,
    COUNT(*) as location_count,
    AVG(rating) as avg_rating,
    AVG(distance_from_center) as avg_distance_from_center,

    -- Calculate cluster centroid
    ST_CENTROID(ST_COLLECT(location)) as cluster_center,

    -- Calculate cluster bounds
    ST_ENVELOPE(ST_COLLECT(location)) as cluster_bounds,

    -- Business metrics
    SUM(total_reviews) as total_reviews,
    AVG(total_reviews) as avg_reviews_per_location

  FROM location_analytics
  GROUP BY cluster_id, category
),
grid_density AS (
  SELECT 
    grid_cell,
    COUNT(DISTINCT category) as category_diversity,
    COUNT(*) as location_density,
    AVG(rating) as avg_rating,

    -- Calculate grid cell center
    ST_CENTROID(grid_cell) as grid_center

  FROM location_analytics
  GROUP BY grid_cell
  HAVING COUNT(*) >= 3 -- Only dense grid cells
)
SELECT 
  ca.cluster_id,
  ca.category,
  ca.location_count,
  ROUND(ca.avg_rating, 2) as avg_rating,
  ROUND(ca.avg_distance_from_center) as avg_distance_m,

  -- Cluster geographic data
  ST_X(ca.cluster_center) as cluster_lng,
  ST_Y(ca.cluster_center) as cluster_lat,

  -- Calculate cluster area in square meters
  ST_AREA(ca.cluster_bounds, true) as cluster_area_sqm,

  -- Density metrics
  ROUND(ca.location_count / ST_AREA(ca.cluster_bounds, true) * 1000000, 2) as density_per_sqkm,

  -- Business performance
  ca.total_reviews,
  ROUND(ca.avg_reviews_per_location) as avg_reviews,

  -- Nearby high-density areas
  (
    SELECT COUNT(*)
    FROM grid_density gd
    WHERE ST_DISTANCE(ca.cluster_center, gd.grid_center) <= 1000
  ) as nearby_dense_areas

FROM cluster_analysis ca
WHERE ca.location_count >= 2
ORDER BY ca.location_count DESC, ca.avg_rating DESC;

-- Geofencing and real-time location queries
CREATE TABLE geofences (
  id OBJECTID PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  geofence_area POLYGON NOT NULL,
  geofence_type VARCHAR(50) DEFAULT 'notification',
  is_active BOOLEAN DEFAULT true,

  -- Trigger configuration
  config DOCUMENT {
    on_enter BOOLEAN DEFAULT true,
    on_exit BOOLEAN DEFAULT true,
    on_dwell BOOLEAN DEFAULT false,
    dwell_time_minutes INTEGER DEFAULT 5,
    max_triggers_per_day INTEGER DEFAULT 10
  },

  -- Analytics tracking
  analytics DOCUMENT {
    total_enters INTEGER DEFAULT 0,
    total_exits INTEGER DEFAULT 0,
    unique_users INTEGER DEFAULT 0,
    avg_dwell_minutes DECIMAL(8,2) DEFAULT 0
  },

  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE SPATIAL INDEX idx_geofences_area ON geofences (geofence_area);

-- Check geofence triggers for user location
SELECT 
  gf.id,
  gf.name,
  gf.geofence_type,

  -- Check if user location triggers geofence
  ST_CONTAINS(gf.geofence_area, ST_POINT(-122.4150, 37.7750)) as is_triggered,

  -- Calculate distance to geofence edge
  ST_DISTANCE(
    ST_POINT(-122.4150, 37.7750),
    ST_BOUNDARY(gf.geofence_area)
  ) as distance_to_edge_m,

  -- Geofence area and perimeter
  ST_AREA(gf.geofence_area, true) as area_sqm,
  ST_PERIMETER(gf.geofence_area, true) as perimeter_m,

  -- Configuration and analytics
  gf.config,
  gf.analytics

FROM geofences gf
WHERE 
  gf.is_active = true
  AND (
    ST_CONTAINS(gf.geofence_area, ST_POINT(-122.4150, 37.7750)) -- Inside geofence
    OR ST_DISTANCE(
      ST_POINT(-122.4150, 37.7750), 
      gf.geofence_area
    ) <= 100 -- Within 100m of geofence
  );

-- Time-based geospatial analysis
CREATE TABLE user_location_history (
  id OBJECTID PRIMARY KEY,
  user_id OBJECTID NOT NULL,
  location POINT NOT NULL,
  recorded_at TIMESTAMP NOT NULL,
  accuracy_meters DECIMAL(8,2),
  activity_type VARCHAR(50),

  -- Movement data
  speed_mps DECIMAL(8,2), -- meters per second
  heading_degrees INTEGER, -- 0-360 degrees from north

  -- Context information
  context DOCUMENT {
    battery_level INTEGER,
    connection_type VARCHAR(50),
    app_state VARCHAR(50)
  }
);

CREATE COMPOUND INDEX idx_user_location_time_geo ON user_location_history (
  user_id, recorded_at, location
);

-- Movement pattern analysis
WITH user_movements AS (
  SELECT 
    user_id,
    location,
    recorded_at,

    -- Calculate distance from previous location
    ST_DISTANCE(
      location,
      LAG(location) OVER (
        PARTITION BY user_id 
        ORDER BY recorded_at
      )
    ) as movement_distance,

    -- Time since previous location
    EXTRACT(EPOCH FROM (
      recorded_at - LAG(recorded_at) OVER (
        PARTITION BY user_id 
        ORDER BY recorded_at
      )
    )) as time_elapsed_seconds,

    -- Previous location for trajectory analysis
    LAG(location) OVER (
      PARTITION BY user_id 
      ORDER BY recorded_at
    ) as previous_location

  FROM user_location_history
  WHERE recorded_at >= CURRENT_TIMESTAMP - INTERVAL '24 hours'
),
movement_metrics AS (
  SELECT 
    user_id,
    COUNT(*) as location_points,
    SUM(movement_distance) as total_distance_m,
    AVG(movement_distance / NULLIF(time_elapsed_seconds, 0)) as avg_speed_mps,
    MAX(movement_distance / NULLIF(time_elapsed_seconds, 0)) as max_speed_mps,

    -- Create trajectory line
    ST_MAKELINE(ARRAY_AGG(location ORDER BY recorded_at)) as trajectory,

    -- Calculate bounding box of movement
    ST_ENVELOPE(ST_COLLECT(location)) as movement_bounds,

    -- Time-based metrics
    MIN(recorded_at) as journey_start,
    MAX(recorded_at) as journey_end,
    EXTRACT(EPOCH FROM (MAX(recorded_at) - MIN(recorded_at))) as journey_duration_seconds,

    -- Movement patterns
    COUNT(DISTINCT ST_SNAPGRID(location, 0.001, 0.001)) as unique_areas_visited

  FROM user_movements
  WHERE movement_distance IS NOT NULL
    AND time_elapsed_seconds > 0
    AND movement_distance < 10000 -- Filter out GPS errors
  GROUP BY user_id
)
SELECT 
  user_id,
  location_points,
  ROUND(total_distance_m) as total_distance_m,
  ROUND(total_distance_m / 1000.0, 2) as total_distance_km,
  ROUND(avg_speed_mps * 3.6, 1) as avg_speed_kmh, -- Convert to km/h
  ROUND(max_speed_mps * 3.6, 1) as max_speed_kmh,

  -- Journey characteristics
  journey_start,
  journey_end,
  ROUND(journey_duration_seconds / 3600.0, 1) as journey_hours,
  unique_areas_visited,

  -- Trajectory analysis
  ST_LENGTH(trajectory, true) as trajectory_length_m,
  ST_AREA(movement_bounds, true) as coverage_area_sqm,

  -- Movement efficiency (straight-line vs actual distance)
  ROUND(
    ST_DISTANCE(
      ST_STARTPOINT(trajectory),
      ST_ENDPOINT(trajectory)
    ) / NULLIF(ST_LENGTH(trajectory, true), 0) * 100, 1
  ) as movement_efficiency_pct,

  -- Geographic extent
  ST_XMIN(movement_bounds) as min_longitude,
  ST_XMAX(movement_bounds) as max_longitude, 
  ST_YMIN(movement_bounds) as min_latitude,
  ST_YMAX(movement_bounds) as max_latitude

FROM movement_metrics
WHERE total_distance_m > 100 -- Minimum movement threshold
ORDER BY total_distance_m DESC
LIMIT 50;

-- Location-based recommendations engine
WITH user_preferences AS (
  SELECT 
    u.user_id,
    u.location as current_location,

    -- User preference analysis based on visit history
    up.preferred_categories,
    up.avg_rating_threshold,
    up.max_distance_preference,
    up.price_range_preference

  FROM user_profiles u
  JOIN user_preferences up ON u.user_id = up.user_id
  WHERE u.is_active = true
),
location_scoring AS (
  SELECT 
    l.*,
    up.user_id,

    -- Distance scoring
    ST_DISTANCE(l.location, up.current_location) as distance_m,
    EXP(-ST_DISTANCE(l.location, up.current_location) / 2000.0) as distance_score,

    -- Category preference scoring
    CASE 
      WHEN l.category = ANY(up.preferred_categories) THEN 1.0
      WHEN ARRAY_LENGTH(up.preferred_categories, 1) = 0 THEN 0.5
      ELSE 0.2
    END as category_score,

    -- Rating scoring
    l.rating / 5.0 as rating_score,

    -- Popularity scoring based on reviews
    LN(l.total_reviews + 1) / LN(1000) as popularity_score,

    -- Time-based scoring (open/closed)
    CASE 
      WHEN EXTRACT(DOW FROM CURRENT_TIMESTAMP) = 0 THEN -- Sunday
        CASE WHEN l.hours.sunday.is_open THEN 1.0 ELSE 0.3 END
      WHEN EXTRACT(DOW FROM CURRENT_TIMESTAMP) = 1 THEN -- Monday
        CASE WHEN l.hours.monday.is_open THEN 1.0 ELSE 0.3 END
      -- ... other days
      ELSE 0.8
    END as availability_score

  FROM locations l
  CROSS JOIN user_preferences up
  WHERE 
    l.is_active = true
    AND ST_DISTANCE(l.location, up.current_location) <= up.max_distance_preference
    AND l.rating >= up.avg_rating_threshold
),
final_recommendations AS (
  SELECT *,
    -- Combined relevance score
    (
      distance_score * 0.25 +
      category_score * 0.30 +
      rating_score * 0.20 +
      popularity_score * 0.15 +
      availability_score * 0.10
    ) as relevance_score

  FROM location_scoring
)
SELECT 
  user_id,
  name as location_name,
  category,
  rating,
  total_reviews,
  ROUND(distance_m) as distance_meters,
  ROUND(relevance_score, 3) as relevance,

  -- Location details for display
  ST_X(location) as longitude,
  ST_Y(location) as latitude,
  address.street || ', ' || address.city as display_address,

  -- Recommendation reasoning
  CASE 
    WHEN category_score = 1.0 THEN 'Matches your preferences'
    WHEN distance_score > 0.8 THEN 'Very close to you'
    WHEN rating_score >= 0.9 THEN 'Highly rated'
    WHEN popularity_score > 0.5 THEN 'Popular destination'
    ELSE 'Good option nearby'
  END as recommendation_reason

FROM final_recommendations
WHERE relevance_score > 0.3
ORDER BY user_id, relevance_score DESC
LIMIT 10 PER user_id;

-- QueryLeaf geospatial features provide:
-- 1. Native GeoJSON support with SQL-familiar geometry functions
-- 2. Spatial indexing with automatic optimization for Earth-based coordinates
-- 3. Distance calculations and proximity queries with intuitive syntax
-- 4. Complex geospatial aggregations and analytics using familiar SQL patterns
-- 5. Geofencing capabilities with real-time trigger detection
-- 6. Movement pattern analysis and trajectory tracking
-- 7. Location-based recommendation engines with multi-factor scoring
-- 8. Integration with MongoDB's native geospatial operators and functions
-- 9. Performance optimization through intelligent query planning
-- 10. Seamless scaling from simple proximity queries to complex spatial analytics

Best Practices for Geospatial Implementation

Coordinate System Selection

Choose the appropriate coordinate system and indexing strategy:

  1. 2dsphere Index: Use for Earth-based coordinates with spherical geometry calculations
  2. 2d Index: Use for flat coordinate systems like game maps or floor plans
  3. Coordinate Format: MongoDB uses [longitude, latitude] format (opposite of many mapping APIs)
  4. Precision Considerations: Balance coordinate precision with storage and performance requirements
  5. Projection Selection: Choose appropriate coordinate reference system for your geographic region
  6. Distance Units: Ensure consistent distance units throughout your application

Performance Optimization

Optimize geospatial queries for high performance and scalability:

  1. Index Strategy: Create compound indexes that support your most common query patterns
  2. Query Limits: Use $maxDistance and $minDistance to limit search scope
  3. Result Pagination: Implement proper pagination for large result sets
  4. Memory Management: Monitor working set size and optimize document structure
  5. Aggregation Optimization: Use $geoNear for distance-based aggregations when possible
  6. Sharding Strategy: Consider geospatial distribution when designing sharding keys

Conclusion

MongoDB geospatial capabilities provide comprehensive location-aware functionality that eliminates the complexity of traditional spatial database extensions while delivering superior performance and scalability. The native support for GeoJSON, multiple coordinate systems, and sophisticated spatial operations makes building location-based applications both powerful and intuitive.

Key geospatial benefits include:

  • Native Spatial Support: Built-in GeoJSON support without additional extensions or setup
  • High Performance: Optimized spatial indexing and query execution for billions of documents
  • Rich Query Capabilities: Comprehensive spatial operators for proximity, intersection, and containment
  • Flexible Data Models: Store complex location data with business context in single documents
  • Real-time Processing: Efficient geofencing and location tracking for live applications
  • Scalable Architecture: Horizontal scaling across distributed clusters with location-aware sharding

Whether you're building ride-sharing platforms, delivery applications, location-based social networks, or IoT sensor networks, MongoDB's geospatial features with QueryLeaf's familiar SQL interface provides the foundation for sophisticated location-aware applications. This combination enables you to implement complex spatial functionality while preserving familiar database interaction patterns.

QueryLeaf Integration: QueryLeaf automatically manages MongoDB geospatial operations while providing SQL-familiar spatial query syntax, coordinate system handling, and geographic analysis functions. Advanced geospatial indexing, proximity calculations, and spatial analytics are seamlessly handled through familiar SQL patterns, making location-based application development both powerful and accessible.

The integration of native geospatial capabilities with SQL-style spatial operations makes MongoDB an ideal platform for applications requiring both sophisticated location functionality and familiar database interaction patterns, ensuring your geospatial solutions remain both effective and maintainable as they scale and evolve.