MongoDB Geospatial Queries and Location-Based Services: SQL-Style Spatial Operations for Modern Applications
Location-aware applications have become fundamental to modern software experiences - from ride-sharing platforms and delivery services to social networks and retail applications. These applications require sophisticated spatial data processing capabilities including proximity searches, route optimization, geofencing, and real-time location tracking that traditional relational databases struggle to handle efficiently.
MongoDB provides comprehensive geospatial functionality with support for 2D and 3D coordinates, multiple coordinate reference systems, and advanced spatial operations. Unlike traditional databases that require complex extensions for spatial data, MongoDB natively supports geospatial indexes, queries, and aggregation operations that can handle billions of location data points with sub-second query performance.
The Traditional Spatial Data Challenge
Relational databases face significant limitations when handling geospatial data and location-based queries:
-- Traditional PostgreSQL/PostGIS approach - complex setup and limited performance
-- Location-based application with spatial data
CREATE EXTENSION IF NOT EXISTS postgis;
-- Store locations with geometry data
CREATE TABLE locations (
location_id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
category VARCHAR(100),
address TEXT,
city VARCHAR(100),
state VARCHAR(50),
country VARCHAR(100),
-- PostGIS geometry column (complex setup required)
coordinates GEOMETRY(POINT, 4326), -- WGS84 coordinate system
-- Additional spatial data
service_area GEOMETRY(POLYGON, 4326), -- Service coverage area
delivery_zones GEOMETRY(MULTIPOLYGON, 4326), -- Multiple delivery zones
-- Business data
rating DECIMAL(3,2),
total_reviews INTEGER DEFAULT 0,
is_active BOOLEAN DEFAULT true,
hours_of_operation JSONB,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Create spatial indexes (requires PostGIS extension)
CREATE INDEX idx_locations_coordinates ON locations USING GIST (coordinates);
CREATE INDEX idx_locations_service_area ON locations USING GIST (service_area);
-- Store user locations and activities
CREATE TABLE user_locations (
user_location_id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL REFERENCES users(user_id),
coordinates GEOMETRY(POINT, 4326),
accuracy_meters DECIMAL(8,2),
recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
activity_type VARCHAR(50), -- 'check-in', 'delivery', 'movement'
device_info JSONB
);
CREATE INDEX idx_user_locations_coordinates ON user_locations USING GIST (coordinates);
CREATE INDEX idx_user_locations_user_time ON user_locations (user_id, recorded_at);
-- Complex proximity search query
WITH nearby_locations AS (
SELECT
l.location_id,
l.name,
l.category,
l.rating,
-- Distance calculation in meters
ST_Distance(
l.coordinates,
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326) -- San Francisco coordinates
) as distance_meters,
-- Check if point is within service area
ST_Contains(
l.service_area,
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)
) as is_in_service_area,
-- Convert coordinates back to lat/lng for application
ST_Y(l.coordinates) as latitude,
ST_X(l.coordinates) as longitude
FROM locations l
WHERE
l.is_active = true
AND ST_DWithin(
l.coordinates,
ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326),
5000 -- 5km radius in meters
)
),
location_analytics AS (
-- Add user activity data for locations
SELECT
nl.*,
COUNT(DISTINCT ul.user_id) as unique_visitors_last_30_days,
COUNT(ul.user_location_id) as total_activities_last_30_days,
AVG(ul.accuracy_meters) as avg_location_accuracy
FROM nearby_locations nl
LEFT JOIN user_locations ul ON ST_DWithin(
ST_SetSRID(ST_MakePoint(nl.longitude, nl.latitude), 4326),
ul.coordinates,
100 -- Within 100 meters of location
)
AND ul.recorded_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY nl.location_id, nl.name, nl.category, nl.rating,
nl.distance_meters, nl.is_in_service_area,
nl.latitude, nl.longitude
)
SELECT
location_id,
name,
category,
rating,
ROUND(distance_meters::numeric, 0) as distance_meters,
is_in_service_area,
latitude,
longitude,
unique_visitors_last_30_days,
total_activities_last_30_days,
ROUND(avg_location_accuracy::numeric, 1) as avg_accuracy_meters,
-- Relevance scoring based on distance, rating, and activity
(
(1000 - LEAST(distance_meters, 1000)) / 1000 * 0.4 + -- Distance factor (40%)
(rating / 5.0) * 0.3 + -- Rating factor (30%)
(LEAST(unique_visitors_last_30_days, 50) / 50.0) * 0.3 -- Activity factor (30%)
) as relevance_score
FROM location_analytics
ORDER BY relevance_score DESC, distance_meters ASC
LIMIT 20;
-- Problems with traditional spatial approach:
-- 1. Complex PostGIS extension setup and maintenance
-- 2. Requires specialized spatial database knowledge
-- 3. Limited coordinate system support without additional configuration
-- 4. Performance degrades with large datasets and complex queries
-- 5. Difficult integration with application object models
-- 6. Complex geometry data types and manipulation functions
-- 7. Limited aggregation capabilities for spatial analytics
-- 8. Challenging horizontal scaling for global applications
-- 9. Memory-intensive spatial operations
-- 10. Complex backup and restore procedures for spatial data
-- MySQL spatial limitations (even more restrictive):
CREATE TABLE locations_mysql (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255),
-- MySQL spatial support limited and less capable
coordinates POINT NOT NULL,
SPATIAL INDEX(coordinates)
);
-- Basic proximity query in MySQL (limited functionality)
SELECT
id, name,
ST_Distance_Sphere(
coordinates,
POINT(-122.4194, 37.7749)
) as distance_meters
FROM locations_mysql
WHERE ST_Distance_Sphere(
coordinates,
POINT(-122.4194, 37.7749)
) < 5000
ORDER BY distance_meters
LIMIT 10;
-- MySQL limitations:
-- - Limited spatial functions compared to PostGIS
-- - Poor performance with large spatial datasets
-- - No advanced spatial analytics capabilities
-- - Limited coordinate system support
-- - Basic geometry types only
-- - No spatial aggregation functions
-- - Difficult to implement complex spatial business logic
MongoDB provides comprehensive geospatial capabilities with simple, intuitive syntax:
// MongoDB native geospatial support - powerful and intuitive
const { MongoClient } = require('mongodb');
const client = new MongoClient('mongodb://localhost:27017');
const db = client.db('location_services');
// MongoDB geospatial document structure - native and flexible
const createLocationServiceDataModel = async () => {
// Create locations collection with rich geospatial data
const locations = db.collection('locations');
// Example location document with geospatial data
const locationDocument = {
_id: new ObjectId(),
// Basic business information
name: "Blue Bottle Coffee - Ferry Building",
category: "cafe",
subcategory: "specialty_coffee",
chain: "Blue Bottle Coffee",
// Address information
address: {
street: "1 Ferry Building",
unit: "Shop 7",
city: "San Francisco",
state: "CA",
country: "USA",
postalCode: "94111",
formattedAddress: "1 Ferry Building, Shop 7, San Francisco, CA 94111"
},
// Primary location - GeoJSON Point format
location: {
type: "Point",
coordinates: [-122.3937, 37.7955] // [longitude, latitude] - NOTE: MongoDB uses [lng, lat]
},
// Service area - GeoJSON Polygon format
serviceArea: {
type: "Polygon",
coordinates: [[
[-122.4050, 37.7850], // Southwest corner
[-122.3850, 37.7850], // Southeast corner
[-122.3850, 37.8050], // Northeast corner
[-122.4050, 37.8050], // Northwest corner
[-122.4050, 37.7850] // Close polygon
]]
},
// Multiple delivery zones - GeoJSON MultiPolygon
deliveryZones: {
type: "MultiPolygon",
coordinates: [
[[ // First delivery zone
[-122.4000, 37.7900],
[-122.3900, 37.7900],
[-122.3900, 37.8000],
[-122.4000, 37.8000],
[-122.4000, 37.7900]
]],
[[ // Second delivery zone
[-122.4100, 37.7800],
[-122.3950, 37.7800],
[-122.3950, 37.7900],
[-122.4100, 37.7900],
[-122.4100, 37.7800]
]]
]
},
// Business information
business: {
rating: 4.6,
totalReviews: 1247,
priceRange: "$$",
phoneNumber: "+1-415-555-0123",
website: "https://bluebottlecoffee.com",
isActive: true,
isChain: true,
// Hours of operation with geospatial considerations
hours: {
monday: { open: "06:00", close: "19:00", timezone: "America/Los_Angeles" },
tuesday: { open: "06:00", close: "19:00", timezone: "America/Los_Angeles" },
wednesday: { open: "06:00", close: "19:00", timezone: "America/Los_Angeles" },
thursday: { open: "06:00", close: "19:00", timezone: "America/Los_Angeles" },
friday: { open: "06:00", close: "20:00", timezone: "America/Los_Angeles" },
saturday: { open: "07:00", close: "20:00", timezone: "America/Los_Angeles" },
sunday: { open: "07:00", close: "19:00", timezone: "America/Los_Angeles" }
},
// Services and amenities
amenities: ["wifi", "outdoor_seating", "takeout", "delivery", "mobile_payment"],
specialties: ["single_origin", "cold_brew", "espresso", "pour_over"]
},
// Geospatial metadata
geoMetadata: {
coordinateSystem: "WGS84",
accuracyMeters: 5,
elevationMeters: 15,
dataSource: "GPS_verified",
lastVerified: new Date("2024-09-01"),
// Nearby landmarks for context
nearbyLandmarks: [
{
name: "Ferry Building Marketplace",
distance: 50,
bearing: "north"
},
{
name: "Embarcadero BART Station",
distance: 200,
bearing: "west"
}
]
},
// Analytics and performance data
analytics: {
monthlyVisitors: 12500,
averageVisitDuration: 25, // minutes
peakHours: ["08:00-09:00", "12:00-13:00", "15:00-16:00"],
popularDays: ["monday", "tuesday", "wednesday", "friday"],
// Location-specific metrics
locationMetrics: {
averageWalkingTime: 3.5, // minutes from nearest transit
parkingAvailability: "limited",
accessibilityRating: 4.2,
noiseLevel: "moderate",
crowdLevel: "busy"
}
},
// SEO and discovery
searchTerms: [
"coffee shop ferry building",
"blue bottle san francisco",
"specialty coffee embarcadero",
"third wave coffee downtown sf"
],
tags: ["coffee", "cafe", "specialty", "artisan", "downtown", "waterfront"],
createdAt: new Date("2024-01-15"),
updatedAt: new Date("2024-09-14")
};
// Insert the location document
await locations.insertOne(locationDocument);
// Create geospatial index - 2dsphere for spherical geometry (Earth)
await locations.createIndex({ location: "2dsphere" });
await locations.createIndex({ serviceArea: "2dsphere" });
await locations.createIndex({ deliveryZones: "2dsphere" });
// Additional indexes for common queries
await locations.createIndex({ category: 1, "business.rating": -1 });
await locations.createIndex({ "business.isActive": 1, "location": "2dsphere" });
await locations.createIndex({ tags: 1, "location": "2dsphere" });
console.log("Location document and indexes created successfully");
return locations;
};
// Advanced geospatial queries and operations
const performGeospatialOperations = async () => {
const locations = db.collection('locations');
// 1. Proximity Search - Find nearby locations
console.log("=== Proximity Search ===");
const userLocation = [-122.4194, 37.7749]; // San Francisco coordinates [lng, lat]
const nearbyLocations = await locations.find({
location: {
$near: {
$geometry: {
type: "Point",
coordinates: userLocation
},
$maxDistance: 5000, // 5km in meters
$minDistance: 0
}
},
"business.isActive": true
}).limit(10).toArray();
console.log(`Found ${nearbyLocations.length} locations within 5km`);
// 2. Geo Within - Find locations within a specific area
console.log("\n=== Geo Within Search ===");
const searchPolygon = {
type: "Polygon",
coordinates: [[
[-122.4270, 37.7609], // Southwest corner
[-122.3968, 37.7609], // Southeast corner
[-122.3968, 37.7908], // Northeast corner
[-122.4270, 37.7908], // Northwest corner
[-122.4270, 37.7609] // Close polygon
]]
};
const locationsInArea = await locations.find({
location: {
$geoWithin: {
$geometry: searchPolygon
}
},
category: "restaurant"
}).toArray();
console.log(`Found ${locationsInArea.length} restaurants in specified area`);
// 3. Geospatial Aggregation - Complex analytics
console.log("\n=== Geospatial Analytics ===");
const geospatialAnalytics = await locations.aggregate([
// Match active locations
{
$match: {
"business.isActive": true,
location: {
$geoWithin: {
$centerSphere: [userLocation, 10 / 3963.2] // 10 miles radius
}
}
}
},
// Calculate distance from user location
{
$addFields: {
distanceFromUser: {
$divide: [
{
$sqrt: {
$add: [
{
$pow: [
{ $subtract: [{ $arrayElemAt: ["$location.coordinates", 0] }, userLocation[0]] },
2
]
},
{
$pow: [
{ $subtract: [{ $arrayElemAt: ["$location.coordinates", 1] }, userLocation[1]] },
2
]
}
]
}
},
0.000009 // Approximate degrees to meters conversion
]
}
}
},
// Group by category and analyze
{
$group: {
_id: "$category",
totalLocations: { $sum: 1 },
averageRating: { $avg: "$business.rating" },
averageDistance: { $avg: "$distanceFromUser" },
closestLocation: {
$min: {
name: "$name",
distance: "$distanceFromUser",
coordinates: "$location.coordinates"
}
},
// Collect all locations in category
locations: {
$push: {
name: "$name",
rating: "$business.rating",
distance: "$distanceFromUser",
coordinates: "$location.coordinates"
}
},
// Rating distribution
highRatedCount: {
$sum: { $cond: [{ $gte: ["$business.rating", 4.5] }, 1, 0] }
},
mediumRatedCount: {
$sum: { $cond: [{ $and: [{ $gte: ["$business.rating", 3.5] }, { $lt: ["$business.rating", 4.5] }] }, 1, 0] }
},
lowRatedCount: {
$sum: { $cond: [{ $lt: ["$business.rating", 3.5] }, 1, 0] }
}
}
},
// Calculate additional metrics
{
$addFields: {
categoryDensity: { $divide: ["$totalLocations", 314] }, // per square km (10 mile radius ≈ 314 sq km)
highRatedPercentage: { $multiply: [{ $divide: ["$highRatedCount", "$totalLocations"] }, 100] },
averageDistanceKm: { $multiply: ["$averageDistance", 111] } // Rough conversion to km
}
},
// Sort by total locations and rating
{
$sort: {
totalLocations: -1,
averageRating: -1
}
},
// Format output
{
$project: {
category: "$_id",
totalLocations: 1,
averageRating: { $round: ["$averageRating", 2] },
averageDistanceKm: { $round: ["$averageDistanceKm", 2] },
categoryDensity: { $round: ["$categoryDensity", 2] },
highRatedPercentage: { $round: ["$highRatedPercentage", 1] },
closestLocation: 1,
ratingDistribution: {
high: "$highRatedCount",
medium: "$mediumRatedCount",
low: "$lowRatedCount"
}
}
}
]).toArray();
console.log("Geospatial Analytics Results:");
console.log(JSON.stringify(geospatialAnalytics, null, 2));
// 4. Route optimization - Find optimal path through multiple locations
console.log("\n=== Route Optimization ===");
const waypointLocations = [
[-122.4194, 37.7749], // Start: San Francisco
[-122.4094, 37.7849], // Waypoint 1
[-122.3994, 37.7949], // Waypoint 2
[-122.4194, 37.7749] // End: Back to start
];
// Find locations near each waypoint
const routeAnalysis = await Promise.all(
waypointLocations.map(async (waypoint, index) => {
const nearbyOnRoute = await locations.find({
location: {
$near: {
$geometry: {
type: "Point",
coordinates: waypoint
},
$maxDistance: 500 // 500m radius
}
},
"business.isActive": true
}).limit(5).toArray();
return {
waypointIndex: index,
coordinates: waypoint,
nearbyLocations: nearbyOnRoute.map(loc => ({
name: loc.name,
category: loc.category,
rating: loc.business.rating,
coordinates: loc.location.coordinates
}))
};
})
);
console.log("Route Analysis:");
console.log(JSON.stringify(routeAnalysis, null, 2));
return {
nearbyLocations: nearbyLocations.length,
locationsInArea: locationsInArea.length,
analyticsResults: geospatialAnalytics.length,
routeWaypoints: routeAnalysis.length
};
};
// Real-time location tracking and geofencing
const setupLocationTracking = async () => {
const userLocations = db.collection('user_locations');
const geofences = db.collection('geofences');
// Create user location tracking document
const userLocationDocument = {
_id: new ObjectId(),
userId: new ObjectId("64a1b2c3d4e5f6789012347a"),
// Current location
currentLocation: {
type: "Point",
coordinates: [-122.4194, 37.7749]
},
// Location metadata
locationMetadata: {
accuracy: 10, // meters
altitude: 15, // meters above sea level
heading: 45, // degrees from north
speed: 1.5, // meters per second
timestamp: new Date(),
source: "GPS", // GPS, WiFi, Cellular, Manual
batteryLevel: 85,
// Device context
device: {
platform: "iOS",
version: "17.1",
model: "iPhone 15 Pro",
appVersion: "2.1.0"
}
},
// Location history (recent positions)
locationHistory: [
{
location: {
type: "Point",
coordinates: [-122.4204, 37.7739]
},
timestamp: new Date(Date.now() - 300000), // 5 minutes ago
accuracy: 15,
source: "GPS"
},
{
location: {
type: "Point",
coordinates: [-122.4214, 37.7729]
},
timestamp: new Date(Date.now() - 600000), // 10 minutes ago
accuracy: 12,
source: "GPS"
}
],
// Privacy and permissions
privacy: {
shareLocation: true,
accuracyLevel: "precise", // precise, approximate, city
shareWithFriends: true,
shareWithBusiness: false,
trackingEnabled: true
},
// Activity context
activity: {
type: "walking", // walking, driving, cycling, stationary
confidence: 0.85,
detectedTransition: null,
lastActivity: "stationary"
},
createdAt: new Date(),
updatedAt: new Date()
};
// Create indexes for location tracking
await userLocations.createIndex({ currentLocation: "2dsphere" });
await userLocations.createIndex({ userId: 1, "locationMetadata.timestamp": -1 });
await userLocations.createIndex({ "locationHistory.location": "2dsphere" });
await userLocations.insertOne(userLocationDocument);
// Create geofence system
const geofenceDocument = {
_id: new ObjectId(),
name: "Downtown Coffee Shop Promo Zone",
description: "Special promotions for coffee shops in downtown area",
// Geofence area
area: {
type: "Polygon",
coordinates: [[
[-122.4200, 37.7700],
[-122.4100, 37.7700],
[-122.4100, 37.7800],
[-122.4200, 37.7800],
[-122.4200, 37.7700]
]]
},
// Geofence configuration
config: {
type: "promotional", // promotional, security, analytics, notification
radius: null, // For circular geofences
isActive: true,
// Trigger conditions
triggers: {
onEnter: true,
onExit: true,
onDwell: true,
dwellTimeMinutes: 5,
// Rate limiting
minTimeBetweenTriggers: 300, // seconds
maxTriggersPerDay: 10
},
// Actions to take
actions: {
notification: {
enabled: true,
title: "Coffee Deals Nearby!",
message: "Check out special offers at local coffee shops",
deepLink: "app://offers/coffee"
},
analytics: {
trackEntry: true,
trackExit: true,
trackDwellTime: true
},
webhook: {
enabled: false,
url: "https://api.example.com/geofence-trigger",
method: "POST"
}
}
},
// Analytics
analytics: {
totalEnters: 1456,
totalExits: 1423,
avgDwellTimeMinutes: 12.5,
uniqueUsers: 342,
// Time-based patterns
hourlyActivity: {
"08": 45, "09": 78, "10": 23, "11": 34,
"12": 89, "13": 67, "14": 45, "15": 56,
"16": 78, "17": 123, "18": 89, "19": 34
},
dailyActivity: {
"monday": 234, "tuesday": 189, "wednesday": 267,
"thursday": 201, "friday": 298, "saturday": 156, "sunday": 111
}
},
createdAt: new Date("2024-09-01"),
updatedAt: new Date("2024-09-14")
};
await geofences.createIndex({ area: "2dsphere" });
await geofences.createIndex({ "config.isActive": 1, "config.type": 1 });
await geofences.insertOne(geofenceDocument);
// Real-time geofence checking function
const checkGeofences = async (userId, currentLocation) => {
console.log("Checking geofences for user location...");
// Find all active geofences that contain the user's location
const triggeredGeofences = await geofences.find({
"config.isActive": true,
area: {
$geoIntersects: {
$geometry: {
type: "Point",
coordinates: currentLocation
}
}
}
}).toArray();
console.log(`Found ${triggeredGeofences.length} triggered geofences`);
// Process each triggered geofence
for (const geofence of triggeredGeofences) {
console.log(`Processing geofence: ${geofence.name}`);
// Update analytics
await geofences.updateOne(
{ _id: geofence._id },
{
$inc: {
"analytics.totalEnters": 1,
[`analytics.hourlyActivity.${new Date().getHours().toString().padStart(2, '0')}`]: 1,
[`analytics.dailyActivity.${new Date().toLocaleDateString('en-US', { weekday: 'long' }).toLowerCase()}`]: 1
},
$set: { updatedAt: new Date() }
}
);
// Trigger actions (notifications, webhooks, etc.)
if (geofence.config.actions.notification.enabled) {
console.log(`Sending notification: ${geofence.config.actions.notification.title}`);
// Implementation would send actual notification
}
}
return triggeredGeofences;
};
// Test geofence checking
const testLocation = [-122.4150, 37.7750]; // Point within the geofence
const triggeredFences = await checkGeofences(userLocationDocument.userId, testLocation);
return {
userLocationDocument,
geofenceDocument,
triggeredGeofences: triggeredFences.length
};
};
// Advanced spatial analytics and heatmap generation
const generateSpatialAnalytics = async () => {
const locations = db.collection('locations');
const userLocations = db.collection('user_locations');
console.log("=== Generating Spatial Analytics ===");
// 1. Location Density Analysis
const locationDensityAnalysis = await locations.aggregate([
{
$match: {
"business.isActive": true
}
},
// Create grid cells for density analysis
{
$addFields: {
gridCell: {
lat: {
$floor: {
$multiply: [
{ $arrayElemAt: ["$location.coordinates", 1] }, // latitude
1000 // Create 0.001 degree grid cells (~100m)
]
}
},
lng: {
$floor: {
$multiply: [
{ $arrayElemAt: ["$location.coordinates", 0] }, // longitude
1000
]
}
}
}
}
},
// Group by grid cell
{
$group: {
_id: "$gridCell",
locationCount: { $sum: 1 },
avgRating: { $avg: "$business.rating" },
categories: { $push: "$category" },
// Calculate center point of grid cell
centerCoordinates: {
$first: {
type: "Point",
coordinates: [
{ $divide: ["$gridCell.lng", 1000] },
{ $divide: ["$gridCell.lat", 1000] }
]
}
},
// Business metrics
totalReviews: { $sum: "$business.totalReviews" },
uniqueCategories: { $addToSet: "$category" }
}
},
// Calculate density metrics
{
$addFields: {
densityScore: {
$multiply: [
"$locationCount",
{ $divide: ["$avgRating", 5] } // Weight by average rating
]
},
categoryDiversity: { $size: "$uniqueCategories" }
}
},
// Sort by density
{
$sort: { densityScore: -1 }
},
{
$limit: 20 // Top 20 densest areas
},
{
$project: {
gridId: "$_id",
locationCount: 1,
densityScore: { $round: ["$densityScore", 2] },
avgRating: { $round: ["$avgRating", 2] },
categoryDiversity: 1,
totalReviews: 1,
centerCoordinates: 1
}
}
]).toArray();
console.log(`Location Density Analysis - Found ${locationDensityAnalysis.length} high-density areas`);
// 2. User Movement Patterns
const userMovementAnalysis = await userLocations.aggregate([
{
$match: {
"locationMetadata.timestamp": {
$gte: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000) // Last 7 days
}
}
},
// Unwind location history
{ $unwind: "$locationHistory" },
// Calculate movement vectors
{
$addFields: {
movement: {
fromLat: { $arrayElemAt: ["$locationHistory.location.coordinates", 1] },
fromLng: { $arrayElemAt: ["$locationHistory.location.coordinates", 0] },
toLat: { $arrayElemAt: ["$currentLocation.coordinates", 1] },
toLng: { $arrayElemAt: ["$currentLocation.coordinates", 0] },
timestamp: "$locationHistory.timestamp"
}
}
},
// Calculate distance and bearing
{
$addFields: {
"movement.distance": {
// Haversine formula approximation
$multiply: [
6371000, // Earth radius in meters
{
$acos: {
$add: [
{
$multiply: [
{ $sin: { $multiply: [{ $degreesToRadians: "$movement.fromLat" }, 1] } },
{ $sin: { $multiply: [{ $degreesToRadians: "$movement.toLat" }, 1] } }
]
},
{
$multiply: [
{ $cos: { $multiply: [{ $degreesToRadians: "$movement.fromLat" }, 1] } },
{ $cos: { $multiply: [{ $degreesToRadians: "$movement.toLat" }, 1] } },
{ $cos: {
$multiply: [
{ $degreesToRadians: { $subtract: ["$movement.toLng", "$movement.fromLng"] } },
1
]
} }
]
}
]
}
}
]
}
}
},
// Group movement patterns
{
$group: {
_id: {
hour: { $hour: "$movement.timestamp" },
dayOfWeek: { $dayOfWeek: "$movement.timestamp" }
},
totalMovements: { $sum: 1 },
avgDistance: { $avg: "$movement.distance" },
totalDistance: { $sum: "$movement.distance" },
uniqueUsers: { $addToSet: "$userId" },
// Movement characteristics
shortMovements: {
$sum: { $cond: [{ $lt: ["$movement.distance", 100] }, 1, 0] } // < 100m
},
mediumMovements: {
$sum: { $cond: [
{ $and: [
{ $gte: ["$movement.distance", 100] },
{ $lt: ["$movement.distance", 1000] }
]}, 1, 0
] } // 100m - 1km
},
longMovements: {
$sum: { $cond: [{ $gte: ["$movement.distance", 1000] }, 1, 0] } // > 1km
}
}
},
// Calculate additional metrics
{
$addFields: {
uniqueUserCount: { $size: "$uniqueUsers" },
avgMovementsPerUser: { $divide: ["$totalMovements", { $size: "$uniqueUsers" }] },
movementDistribution: {
short: { $divide: ["$shortMovements", "$totalMovements"] },
medium: { $divide: ["$mediumMovements", "$totalMovements"] },
long: { $divide: ["$longMovements", "$totalMovements"] }
}
}
},
{
$sort: { totalMovements: -1 }
},
{
$project: {
hour: "$_id.hour",
dayOfWeek: "$_id.dayOfWeek",
totalMovements: 1,
uniqueUserCount: 1,
avgDistance: { $round: ["$avgDistance", 1] },
avgMovementsPerUser: { $round: ["$avgMovementsPerUser", 1] },
movementDistribution: {
short: { $round: ["$movementDistribution.short", 3] },
medium: { $round: ["$movementDistribution.medium", 3] },
long: { $round: ["$movementDistribution.long", 3] }
}
}
}
]).toArray();
console.log(`User Movement Analysis - Analyzed ${userMovementAnalysis.length} time periods`);
// 3. Geographic Performance Analysis
const geoPerformanceAnalysis = await locations.aggregate([
{
$match: {
"business.isActive": true,
"analytics.monthlyVisitors": { $exists: true }
}
},
// Create geographic regions
{
$addFields: {
region: {
$switch: {
branches: [
{
case: {
$and: [
{ $gte: [{ $arrayElemAt: ["$location.coordinates", 1] }, 37.77] }, // North of 37.77°N
{ $lte: [{ $arrayElemAt: ["$location.coordinates", 0] }, -122.41] } // West of -122.41°W
]
},
then: "Northwest"
},
{
case: {
$and: [
{ $gte: [{ $arrayElemAt: ["$location.coordinates", 1] }, 37.77] },
{ $gt: [{ $arrayElemAt: ["$location.coordinates", 0] }, -122.41] }
]
},
then: "Northeast"
},
{
case: {
$and: [
{ $lt: [{ $arrayElemAt: ["$location.coordinates", 1] }, 37.77] },
{ $lte: [{ $arrayElemAt: ["$location.coordinates", 0] }, -122.41] }
]
},
then: "Southwest"
},
{
case: {
$and: [
{ $lt: [{ $arrayElemAt: ["$location.coordinates", 1] }, 37.77] },
{ $gt: [{ $arrayElemAt: ["$location.coordinates", 0] }, -122.41] }
]
},
then: "Southeast"
}
],
default: "Other"
}
}
}
},
// Group by region and category
{
$group: {
_id: {
region: "$region",
category: "$category"
},
locationCount: { $sum: 1 },
avgRating: { $avg: "$business.rating" },
avgMonthlyVisitors: { $avg: "$analytics.monthlyVisitors" },
totalMonthlyVisitors: { $sum: "$analytics.monthlyVisitors" },
// Performance metrics
highPerformers: {
$sum: {
$cond: [
{
$and: [
{ $gte: ["$business.rating", 4.5] },
{ $gte: ["$analytics.monthlyVisitors", 10000] }
]
}, 1, 0
]
}
},
topLocation: {
$max: {
name: "$name",
visitors: "$analytics.monthlyVisitors",
rating: "$business.rating"
}
}
}
},
// Calculate regional metrics
{
$group: {
_id: "$_id.region",
categories: {
$push: {
category: "$_id.category",
locationCount: "$locationCount",
avgRating: "$avgRating",
avgMonthlyVisitors: "$avgMonthlyVisitors",
totalMonthlyVisitors: "$totalMonthlyVisitors",
highPerformers: "$highPerformers",
topLocation: "$topLocation"
}
},
regionalTotals: {
totalLocations: { $sum: "$locationCount" },
totalMonthlyVisitors: { $sum: "$totalMonthlyVisitors" },
totalHighPerformers: { $sum: "$highPerformers" }
}
}
},
// Sort by total visitors
{
$sort: { "regionalTotals.totalMonthlyVisitors": -1 }
},
{
$project: {
region: "$_id",
categories: 1,
regionalTotals: 1,
// Calculate regional performance metrics
performanceMetrics: {
avgVisitorsPerLocation: {
$divide: ["$regionalTotals.totalMonthlyVisitors", "$regionalTotals.totalLocations"]
},
highPerformerRatio: {
$divide: ["$regionalTotals.totalHighPerformers", "$regionalTotals.totalLocations"]
}
}
}
}
]).toArray();
console.log(`Geographic Performance Analysis - Analyzed ${geoPerformanceAnalysis.length} regions`);
return {
densityAnalysis: locationDensityAnalysis,
movementAnalysis: userMovementAnalysis,
performanceAnalysis: geoPerformanceAnalysis,
summary: {
densityHotspots: locationDensityAnalysis.length,
movementPatterns: userMovementAnalysis.length,
regionalInsights: geoPerformanceAnalysis.length
}
};
};
// Benefits of MongoDB Geospatial Features:
// - Native GeoJSON support with automatic validation
// - Multiple coordinate reference systems (2D, 2dsphere)
// - Built-in spatial operators and aggregation functions
// - Automatic spatial indexing with B-tree and R-tree structures
// - Spherical geometry calculations for Earth-based applications
// - Integration with aggregation framework for complex analytics
// - Real-time geofencing and location tracking capabilities
// - Scalable to billions of location data points
// - Simple query syntax compared to PostGIS extensions
// - No additional setup required - works out of the box
module.exports = {
createLocationServiceDataModel,
performGeospatialOperations,
setupLocationTracking,
generateSpatialAnalytics
};
Understanding MongoDB Geospatial Architecture
Coordinate Systems and Indexing Strategies
MongoDB supports multiple geospatial indexing approaches optimized for different use cases:
// Advanced geospatial indexing and coordinate system management
class GeospatialIndexManager {
constructor(db) {
this.db = db;
this.collections = new Map();
}
async setupGeospatialIndexing() {
// 1. 2dsphere Index - For spherical geometry (Earth-based coordinates)
const locations = this.db.collection('locations');
// Create 2dsphere index for GeoJSON objects
await locations.createIndex({ location: "2dsphere" });
// Compound index for filtered geospatial queries
await locations.createIndex({
category: 1,
"business.isActive": 1,
location: "2dsphere"
});
// Text and geospatial compound index
await locations.createIndex({
"$**": "text",
location: "2dsphere"
});
console.log("2dsphere indexes created for global location queries");
// 2. 2d Index - For flat geometry (game maps, floor plans)
const gameLocations = this.db.collection('game_locations');
// 2d index for flat coordinate system (e.g., game world coordinates)
await gameLocations.createIndex({ position: "2d" });
// Example game location document
const gameLocationDoc = {
_id: new ObjectId(),
playerId: new ObjectId(),
characterName: "DragonSlayer42",
// Flat 2D coordinates for game world
position: [1250.5, 875.2], // [x, y] coordinates in game units
// Game-specific data
level: 45,
zone: "Enchanted Forest",
server: "US-East-1",
// Bounding box for area of influence
areaOfInfluence: {
bottomLeft: [1200, 825],
topRight: [1300, 925]
},
lastUpdated: new Date()
};
await gameLocations.insertOne(gameLocationDoc);
console.log("2d index created for flat coordinate system");
// 3. Specialized indexing for different data patterns
const trajectories = this.db.collection('vehicle_trajectories');
// Index for trajectory lines and paths
await trajectories.createIndex({ route: "2dsphere" });
await trajectories.createIndex({ vehicleId: 1, timestamp: 1 });
// Example trajectory document
const trajectoryDoc = {
_id: new ObjectId(),
vehicleId: "TRUCK_001",
driverId: new ObjectId(),
// LineString geometry for route
route: {
type: "LineString",
coordinates: [
[-122.4194, 37.7749], // Start point
[-122.4184, 37.7759], // Waypoint 1
[-122.4174, 37.7769], // Waypoint 2
[-122.4164, 37.7779] // End point
]
},
// Route metadata
routeMetadata: {
totalDistance: 2.3, // km
estimatedTime: 8, // minutes
actualTime: 9.5, // minutes
fuelUsed: 0.45, // liters
trafficConditions: "moderate"
},
// Time-based tracking
startTime: new Date("2024-09-18T14:30:00Z"),
endTime: new Date("2024-09-18T14:39:30Z"),
// Performance metrics
metrics: {
averageSpeed: 14.5, // km/h
maxSpeed: 25.0,
idleTime: 45, // seconds
hardBrakingEvents: 1,
hardAccelerationEvents: 0
}
};
await trajectories.insertOne(trajectoryDoc);
console.log("Trajectory tracking setup completed");
return {
sphericalIndexes: ["locations.location", "locations.compound"],
flatIndexes: ["game_locations.position"],
trajectoryIndexes: ["trajectories.route"]
};
}
async performAdvancedSpatialQueries() {
const locations = this.db.collection('locations');
// 1. Multi-stage geospatial aggregation
console.log("=== Advanced Spatial Aggregation ===");
const complexSpatialAnalysis = await locations.aggregate([
// Stage 1: Geospatial filtering
{
$geoNear: {
near: {
type: "Point",
coordinates: [-122.4194, 37.7749]
},
distanceField: "calculatedDistance",
maxDistance: 10000, // 10km
spherical: true,
query: { "business.isActive": true }
}
},
// Stage 2: Spatial relationship analysis
{
$addFields: {
// Distance categories
distanceCategory: {
$switch: {
branches: [
{ case: { $lte: ["$calculatedDistance", 1000] }, then: "nearby" },
{ case: { $lte: ["$calculatedDistance", 5000] }, then: "moderate" },
{ case: { $lte: ["$calculatedDistance", 10000] }, then: "distant" }
],
default: "very_distant"
}
},
// Spatial density calculation
spatialDensity: {
$divide: ["$analytics.monthlyVisitors", { $add: ["$calculatedDistance", 1] }]
}
}
},
// Stage 3: Complex geospatial grouping
{
$group: {
_id: {
category: "$category",
distanceCategory: "$distanceCategory"
},
locations: { $push: "$$ROOT" },
avgDistance: { $avg: "$calculatedDistance" },
avgRating: { $avg: "$business.rating" },
avgDensity: { $avg: "$spatialDensity" },
count: { $sum: 1 },
// Geospatial aggregations
centroid: {
$avg: {
coordinates: "$location.coordinates"
}
},
// Bounding box calculation
minLat: { $min: { $arrayElemAt: ["$location.coordinates", 1] } },
maxLat: { $max: { $arrayElemAt: ["$location.coordinates", 1] } },
minLng: { $min: { $arrayElemAt: ["$location.coordinates", 0] } },
maxLng: { $max: { $arrayElemAt: ["$location.coordinates", 0] } }
}
},
// Stage 4: Spatial statistics
{
$addFields: {
boundingBox: {
type: "Polygon",
coordinates: [[
["$minLng", "$minLat"],
["$maxLng", "$minLat"],
["$maxLng", "$maxLat"],
["$minLng", "$maxLat"],
["$minLng", "$minLat"]
]]
},
// Geographic spread calculation
geographicSpread: {
$sqrt: {
$add: [
{ $pow: [{ $subtract: ["$maxLat", "$minLat"] }, 2] },
{ $pow: [{ $subtract: ["$maxLng", "$minLng"] }, 2] }
]
}
}
}
},
{
$sort: { count: -1, avgDensity: -1 }
}
]).toArray();
console.log(`Complex Spatial Analysis - ${complexSpatialAnalysis.length} category/distance combinations`);
// 2. Intersection and overlay queries
console.log("\n=== Spatial Intersection Analysis ===");
const intersectionAnalysis = await locations.aggregate([
{
$match: {
"business.isActive": true,
deliveryZones: { $exists: true }
}
},
// Find intersections between delivery zones
{
$lookup: {
from: "locations",
let: { currentZones: "$deliveryZones" },
pipeline: [
{
$match: {
$expr: {
$and: [
{ $ne: ["$_id", "$$ROOT._id"] }, // Different location
{ $ne: ["$$currentZones", null] },
{
$gt: [{
$size: {
$filter: {
input: "$deliveryZones.coordinates",
cond: {
// Simplified intersection check
$anyElementTrue: {
$map: {
input: "$$currentZones.coordinates",
in: { $ne: ["$$this", null] }
}
}
}
}
}
}, 0]
}
]
}
}
},
{
$project: {
name: 1,
category: 1,
"business.rating": 1
}
}
],
as: "overlappingLocations"
}
},
// Calculate overlap metrics
{
$addFields: {
overlapCount: { $size: "$overlappingLocations" },
hasOverlap: { $gt: [{ $size: "$overlappingLocations" }, 0] },
competitionLevel: {
$switch: {
branches: [
{ case: { $gte: [{ $size: "$overlappingLocations" }, 5] }, then: "high" },
{ case: { $gte: [{ $size: "$overlappingLocations" }, 2] }, then: "medium" },
{ case: { $gt: [{ $size: "$overlappingLocations" }, 0] }, then: "low" }
],
default: "none"
}
}
}
},
{
$match: { hasOverlap: true }
},
{
$group: {
_id: "$category",
avgOverlapCount: { $avg: "$overlapCount" },
locationsWithOverlap: { $sum: 1 },
highCompetitionAreas: {
$sum: { $cond: [{ $eq: ["$competitionLevel", "high"] }, 1, 0] }
}
}
},
{ $sort: { avgOverlapCount: -1 } }
]).toArray();
console.log(`Intersection Analysis - ${intersectionAnalysis.length} categories with delivery zone overlaps`);
// 3. Temporal-spatial analysis
console.log("\n=== Temporal-Spatial Analysis ===");
const temporalSpatialAnalysis = await this.db.collection('user_locations').aggregate([
{
$match: {
"locationMetadata.timestamp": {
$gte: new Date(Date.now() - 24 * 60 * 60 * 1000) // Last 24 hours
}
}
},
// Unwind location history for temporal analysis
{ $unwind: "$locationHistory" },
// Create time buckets
{
$addFields: {
timeBucket: {
$dateTrunc: {
date: "$locationHistory.timestamp",
unit: "hour"
}
},
// Grid cell for spatial grouping
spatialGrid: {
lat: {
$floor: {
$multiply: [
{ $arrayElemAt: ["$locationHistory.location.coordinates", 1] },
1000 // 0.001 degree precision
]
}
},
lng: {
$floor: {
$multiply: [
{ $arrayElemAt: ["$locationHistory.location.coordinates", 0] },
1000
]
}
}
}
}
},
// Group by time and space
{
$group: {
_id: {
timeBucket: "$timeBucket",
spatialGrid: "$spatialGrid"
},
uniqueUsers: { $addToSet: "$userId" },
totalEvents: { $sum: 1 },
avgAccuracy: { $avg: "$locationHistory.accuracy" },
// Location cluster center
centerLat: { $avg: { $arrayElemAt: ["$locationHistory.location.coordinates", 1] } },
centerLng: { $avg: { $arrayElemAt: ["$locationHistory.location.coordinates", 0] } }
}
},
// Calculate density metrics
{
$addFields: {
userDensity: { $size: "$uniqueUsers" },
eventDensity: "$totalEvents",
densityScore: { $multiply: [{ $size: "$uniqueUsers" }, { $log: { $add: ["$totalEvents", 1] } }] }
}
},
// Temporal pattern analysis
{
$group: {
_id: { $hour: "$_id.timeBucket" },
totalGridCells: { $sum: 1 },
avgUserDensity: { $avg: "$userDensity" },
maxUserDensity: { $max: "$userDensity" },
totalUniqueUsers: { $sum: "$userDensity" },
// Hotspot identification
hotspots: {
$push: {
$cond: [
{ $gte: ["$densityScore", 10] },
{
center: { type: "Point", coordinates: ["$centerLng", "$centerLat"] },
userDensity: "$userDensity",
densityScore: "$densityScore"
},
null
]
}
}
}
},
// Clean up hotspots array
{
$addFields: {
hotspots: {
$filter: {
input: "$hotspots",
cond: { $ne: ["$$this", null] }
}
}
}
},
{ $sort: { "_id": 1 } },
{
$project: {
hour: "$_id",
totalGridCells: 1,
avgUserDensity: { $round: ["$avgUserDensity", 2] },
maxUserDensity: 1,
totalUniqueUsers: 1,
hotspotCount: { $size: "$hotspots" },
topHotspots: { $slice: ["$hotspots", 5] }
}
}
]).toArray();
console.log(`Temporal-Spatial Analysis - ${temporalSpatialAnalysis.length} hourly patterns`);
return {
complexSpatialResults: complexSpatialAnalysis.length,
intersectionResults: intersectionAnalysis.length,
temporalSpatialResults: temporalSpatialAnalysis.length,
insights: {
spatialComplexity: complexSpatialAnalysis,
deliveryOverlaps: intersectionAnalysis,
hourlyPatterns: temporalSpatialAnalysis
}
};
}
async optimizeGeospatialPerformance() {
console.log("=== Geospatial Performance Optimization ===");
// 1. Index performance analysis
const locations = this.db.collection('locations');
// Test different query patterns
const performanceTests = [
{
name: "Simple Proximity Query",
query: {
location: {
$near: {
$geometry: { type: "Point", coordinates: [-122.4194, 37.7749] },
$maxDistance: 5000
}
}
}
},
{
name: "Filtered Proximity Query",
query: {
location: {
$near: {
$geometry: { type: "Point", coordinates: [-122.4194, 37.7749] },
$maxDistance: 5000
}
},
category: "restaurant",
"business.isActive": true
}
},
{
name: "Geo Within Query",
query: {
location: {
$geoWithin: {
$centerSphere: [[-122.4194, 37.7749], 5 / 3963.2] // 5 miles
}
}
}
}
];
const performanceResults = [];
for (const test of performanceTests) {
const startTime = Date.now();
const results = await locations.find(test.query)
.limit(20)
.explain("executionStats");
const executionTime = Date.now() - startTime;
performanceResults.push({
testName: test.name,
executionTimeMs: executionTime,
documentsExamined: results.executionStats.totalDocsExamined,
documentsReturned: results.executionStats.totalDocsReturned,
indexUsed: results.executionStats.executionStages?.indexName || "none",
efficiency: results.executionStats.totalDocsReturned / Math.max(results.executionStats.totalDocsExamined, 1)
});
}
console.log("Performance Test Results:");
performanceResults.forEach(result => {
console.log(`${result.testName}: ${result.executionTimeMs}ms, Efficiency: ${(result.efficiency * 100).toFixed(1)}%`);
});
// 2. Index recommendations
const indexRecommendations = await this.analyzeIndexUsage(locations);
// 3. Memory usage optimization
const memoryOptimization = await this.optimizeMemoryUsage(locations);
return {
performanceResults,
indexRecommendations,
memoryOptimization,
recommendations: [
"Use 2dsphere indexes for Earth-based coordinates",
"Include commonly filtered fields in compound indexes",
"Limit result sets with appropriate $maxDistance values",
"Use $geoNear aggregation for complex distance-based analytics",
"Monitor index usage and query patterns regularly"
]
};
}
async analyzeIndexUsage(collection) {
// Get index usage statistics
const indexStats = await collection.aggregate([
{ $indexStats: {} }
]).toArray();
const recommendations = [];
indexStats.forEach(stat => {
const usageRatio = stat.accesses.ops / (stat.accesses.since?.getTime() || 1);
if (usageRatio < 0.001) {
recommendations.push({
type: "remove",
index: stat.name,
reason: "Low usage index - consider removing",
usage: usageRatio
});
} else if (usageRatio > 10) {
recommendations.push({
type: "optimize",
index: stat.name,
reason: "High usage index - ensure optimal configuration",
usage: usageRatio
});
}
});
return {
totalIndexes: indexStats.length,
recommendations: recommendations,
indexStats: indexStats
};
}
async optimizeMemoryUsage(collection) {
// Analyze document sizes and memory patterns
const sizeAnalysis = await collection.aggregate([
{
$project: {
documentSize: { $bsonSize: "$$ROOT" },
hasLocationHistory: { $ne: ["$locationHistory", null] },
locationHistorySize: { $size: { $ifNull: ["$locationHistory", []] } },
hasDeliveryZones: { $ne: ["$deliveryZones", null] }
}
},
{
$group: {
_id: null,
avgDocumentSize: { $avg: "$documentSize" },
maxDocumentSize: { $max: "$documentSize" },
minDocumentSize: { $min: "$documentSize" },
largeDocuments: { $sum: { $cond: [{ $gt: ["$documentSize", 16384] }, 1, 0] } }, // > 16KB
documentsWithHistory: { $sum: { $cond: ["$hasLocationHistory", 1, 0] } },
avgHistorySize: { $avg: "$locationHistorySize" },
totalDocuments: { $sum: 1 }
}
}
]).toArray();
const analysis = sizeAnalysis[0] || {};
const optimizationTips = [];
if (analysis.avgDocumentSize > 8192) {
optimizationTips.push("Consider splitting large documents or using references");
}
if (analysis.avgHistorySize > 100) {
optimizationTips.push("Limit location history array size or archive old data");
}
if (analysis.largeDocuments > analysis.totalDocuments * 0.1) {
optimizationTips.push("High number of large documents - review document structure");
}
return {
sizeAnalysis: analysis,
optimizationTips: optimizationTips,
recommendations: {
documentSize: "Keep documents under 16MB, optimal under 1MB",
arrays: "Limit embedded arrays to prevent unbounded growth",
indexing: "Use partial indexes for sparse geospatial data",
sharding: "Consider sharding key that includes geospatial distribution"
}
};
}
}
SQL-Style Geospatial Operations with QueryLeaf
QueryLeaf provides familiar SQL syntax for MongoDB's powerful geospatial capabilities:
-- QueryLeaf geospatial operations with SQL-familiar syntax
-- Create geospatial-enabled table/collection
CREATE TABLE locations (
id OBJECTID PRIMARY KEY,
name VARCHAR(255) NOT NULL,
category VARCHAR(100),
-- Geospatial columns with native GeoJSON support
location POINT NOT NULL, -- GeoJSON Point
service_area POLYGON, -- GeoJSON Polygon
delivery_zones MULTIPOLYGON, -- GeoJSON MultiPolygon
-- Business data
rating DECIMAL(3,2),
total_reviews INTEGER DEFAULT 0,
is_active BOOLEAN DEFAULT true,
-- Address information
address DOCUMENT {
street VARCHAR(255),
city VARCHAR(100),
state VARCHAR(50),
country VARCHAR(100),
postal_code VARCHAR(20)
},
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Create geospatial indexes
CREATE SPATIAL INDEX idx_locations_location ON locations (location);
CREATE SPATIAL INDEX idx_locations_service_area ON locations (service_area);
CREATE COMPOUND INDEX idx_locations_category_geo ON locations (category, location);
-- Insert location data with geospatial coordinates
INSERT INTO locations (name, category, location, service_area, address, rating, total_reviews)
VALUES (
'Blue Bottle Coffee',
'cafe',
ST_POINT(-122.3937, 37.7955), -- Longitude, Latitude
ST_POLYGON(ARRAY[
ARRAY[-122.4050, 37.7850], -- Southwest
ARRAY[-122.3850, 37.7850], -- Southeast
ARRAY[-122.3850, 37.8050], -- Northeast
ARRAY[-122.4050, 37.8050], -- Northwest
ARRAY[-122.4050, 37.7850] -- Close polygon
]),
{
street: '1 Ferry Building',
city: 'San Francisco',
state: 'CA',
country: 'USA',
postal_code: '94111'
},
4.6,
1247
);
-- Proximity search - find nearby locations
SELECT
id,
name,
category,
rating,
-- Calculate distance in meters
ST_DISTANCE(location, ST_POINT(-122.4194, 37.7749)) as distance_meters,
-- Extract coordinates for display
ST_X(location) as longitude,
ST_Y(location) as latitude,
-- Address information
address.street,
address.city,
address.state
FROM locations
WHERE
is_active = true
AND ST_DISTANCE(location, ST_POINT(-122.4194, 37.7749)) <= 5000 -- Within 5km
AND category IN ('cafe', 'restaurant', 'retail')
ORDER BY ST_DISTANCE(location, ST_POINT(-122.4194, 37.7749))
LIMIT 20;
-- Advanced proximity search with relevance scoring
WITH nearby_locations AS (
SELECT
*,
ST_DISTANCE(location, ST_POINT(-122.4194, 37.7749)) as distance_meters
FROM locations
WHERE
is_active = true
AND ST_DWITHIN(location, ST_POINT(-122.4194, 37.7749), 10000) -- 10km radius
),
scored_locations AS (
SELECT *,
-- Relevance scoring: distance (40%) + rating (30%) + reviews (30%)
(
(1000 - LEAST(distance_meters, 1000)) / 1000 * 0.4 +
(rating / 5.0) * 0.3 +
(LEAST(total_reviews, 1000) / 1000.0) * 0.3
) as relevance_score,
-- Distance categories
CASE
WHEN distance_meters <= 1000 THEN 'nearby'
WHEN distance_meters <= 5000 THEN 'moderate'
ELSE 'distant'
END as distance_category
FROM nearby_locations
)
SELECT
name,
category,
rating,
total_reviews,
ROUND(distance_meters) as distance_m,
distance_category,
ROUND(relevance_score, 3) as relevance,
-- Format coordinates for maps
CONCAT(
ROUND(ST_Y(location), 6), ',',
ROUND(ST_X(location), 6)
) as lat_lng
FROM scored_locations
ORDER BY relevance_score DESC, distance_meters ASC
LIMIT 25;
-- Geospatial area queries
SELECT
l.name,
l.category,
l.rating,
-- Check if location is within specific area
ST_CONTAINS(
ST_POLYGON(ARRAY[
ARRAY[-122.4270, 37.7609], -- Downtown SF polygon
ARRAY[-122.3968, 37.7609],
ARRAY[-122.3968, 37.7908],
ARRAY[-122.4270, 37.7908],
ARRAY[-122.4270, 37.7609]
]),
l.location
) as is_in_downtown,
-- Check service area coverage
ST_CONTAINS(l.service_area, ST_POINT(-122.4194, 37.7749)) as serves_user_location
FROM locations l
WHERE
l.is_active = true
AND ST_INTERSECTS(
l.location,
ST_POLYGON(ARRAY[
ARRAY[-122.4270, 37.7609],
ARRAY[-122.3968, 37.7609],
ARRAY[-122.3968, 37.7908],
ARRAY[-122.4270, 37.7908],
ARRAY[-122.4270, 37.7609]
])
);
-- Complex geospatial analytics with aggregation
WITH location_analytics AS (
SELECT
category,
-- Spatial clustering analysis
ST_CLUSTERKMEANS(location, 5) OVER () as cluster_id,
-- Distance from city center
ST_DISTANCE(location, ST_POINT(-122.4194, 37.7749)) as distance_from_center,
-- Geospatial grid for density analysis
ST_SNAPGRID(location, 0.001, 0.001) as grid_cell,
name,
rating,
total_reviews,
location
FROM locations
WHERE is_active = true
),
cluster_analysis AS (
SELECT
cluster_id,
category,
COUNT(*) as location_count,
AVG(rating) as avg_rating,
AVG(distance_from_center) as avg_distance_from_center,
-- Calculate cluster centroid
ST_CENTROID(ST_COLLECT(location)) as cluster_center,
-- Calculate cluster bounds
ST_ENVELOPE(ST_COLLECT(location)) as cluster_bounds,
-- Business metrics
SUM(total_reviews) as total_reviews,
AVG(total_reviews) as avg_reviews_per_location
FROM location_analytics
GROUP BY cluster_id, category
),
grid_density AS (
SELECT
grid_cell,
COUNT(DISTINCT category) as category_diversity,
COUNT(*) as location_density,
AVG(rating) as avg_rating,
-- Calculate grid cell center
ST_CENTROID(grid_cell) as grid_center
FROM location_analytics
GROUP BY grid_cell
HAVING COUNT(*) >= 3 -- Only dense grid cells
)
SELECT
ca.cluster_id,
ca.category,
ca.location_count,
ROUND(ca.avg_rating, 2) as avg_rating,
ROUND(ca.avg_distance_from_center) as avg_distance_m,
-- Cluster geographic data
ST_X(ca.cluster_center) as cluster_lng,
ST_Y(ca.cluster_center) as cluster_lat,
-- Calculate cluster area in square meters
ST_AREA(ca.cluster_bounds, true) as cluster_area_sqm,
-- Density metrics
ROUND(ca.location_count / ST_AREA(ca.cluster_bounds, true) * 1000000, 2) as density_per_sqkm,
-- Business performance
ca.total_reviews,
ROUND(ca.avg_reviews_per_location) as avg_reviews,
-- Nearby high-density areas
(
SELECT COUNT(*)
FROM grid_density gd
WHERE ST_DISTANCE(ca.cluster_center, gd.grid_center) <= 1000
) as nearby_dense_areas
FROM cluster_analysis ca
WHERE ca.location_count >= 2
ORDER BY ca.location_count DESC, ca.avg_rating DESC;
-- Geofencing and real-time location queries
CREATE TABLE geofences (
id OBJECTID PRIMARY KEY,
name VARCHAR(255) NOT NULL,
geofence_area POLYGON NOT NULL,
geofence_type VARCHAR(50) DEFAULT 'notification',
is_active BOOLEAN DEFAULT true,
-- Trigger configuration
config DOCUMENT {
on_enter BOOLEAN DEFAULT true,
on_exit BOOLEAN DEFAULT true,
on_dwell BOOLEAN DEFAULT false,
dwell_time_minutes INTEGER DEFAULT 5,
max_triggers_per_day INTEGER DEFAULT 10
},
-- Analytics tracking
analytics DOCUMENT {
total_enters INTEGER DEFAULT 0,
total_exits INTEGER DEFAULT 0,
unique_users INTEGER DEFAULT 0,
avg_dwell_minutes DECIMAL(8,2) DEFAULT 0
},
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE SPATIAL INDEX idx_geofences_area ON geofences (geofence_area);
-- Check geofence triggers for user location
SELECT
gf.id,
gf.name,
gf.geofence_type,
-- Check if user location triggers geofence
ST_CONTAINS(gf.geofence_area, ST_POINT(-122.4150, 37.7750)) as is_triggered,
-- Calculate distance to geofence edge
ST_DISTANCE(
ST_POINT(-122.4150, 37.7750),
ST_BOUNDARY(gf.geofence_area)
) as distance_to_edge_m,
-- Geofence area and perimeter
ST_AREA(gf.geofence_area, true) as area_sqm,
ST_PERIMETER(gf.geofence_area, true) as perimeter_m,
-- Configuration and analytics
gf.config,
gf.analytics
FROM geofences gf
WHERE
gf.is_active = true
AND (
ST_CONTAINS(gf.geofence_area, ST_POINT(-122.4150, 37.7750)) -- Inside geofence
OR ST_DISTANCE(
ST_POINT(-122.4150, 37.7750),
gf.geofence_area
) <= 100 -- Within 100m of geofence
);
-- Time-based geospatial analysis
CREATE TABLE user_location_history (
id OBJECTID PRIMARY KEY,
user_id OBJECTID NOT NULL,
location POINT NOT NULL,
recorded_at TIMESTAMP NOT NULL,
accuracy_meters DECIMAL(8,2),
activity_type VARCHAR(50),
-- Movement data
speed_mps DECIMAL(8,2), -- meters per second
heading_degrees INTEGER, -- 0-360 degrees from north
-- Context information
context DOCUMENT {
battery_level INTEGER,
connection_type VARCHAR(50),
app_state VARCHAR(50)
}
);
CREATE COMPOUND INDEX idx_user_location_time_geo ON user_location_history (
user_id, recorded_at, location
);
-- Movement pattern analysis
WITH user_movements AS (
SELECT
user_id,
location,
recorded_at,
-- Calculate distance from previous location
ST_DISTANCE(
location,
LAG(location) OVER (
PARTITION BY user_id
ORDER BY recorded_at
)
) as movement_distance,
-- Time since previous location
EXTRACT(EPOCH FROM (
recorded_at - LAG(recorded_at) OVER (
PARTITION BY user_id
ORDER BY recorded_at
)
)) as time_elapsed_seconds,
-- Previous location for trajectory analysis
LAG(location) OVER (
PARTITION BY user_id
ORDER BY recorded_at
) as previous_location
FROM user_location_history
WHERE recorded_at >= CURRENT_TIMESTAMP - INTERVAL '24 hours'
),
movement_metrics AS (
SELECT
user_id,
COUNT(*) as location_points,
SUM(movement_distance) as total_distance_m,
AVG(movement_distance / NULLIF(time_elapsed_seconds, 0)) as avg_speed_mps,
MAX(movement_distance / NULLIF(time_elapsed_seconds, 0)) as max_speed_mps,
-- Create trajectory line
ST_MAKELINE(ARRAY_AGG(location ORDER BY recorded_at)) as trajectory,
-- Calculate bounding box of movement
ST_ENVELOPE(ST_COLLECT(location)) as movement_bounds,
-- Time-based metrics
MIN(recorded_at) as journey_start,
MAX(recorded_at) as journey_end,
EXTRACT(EPOCH FROM (MAX(recorded_at) - MIN(recorded_at))) as journey_duration_seconds,
-- Movement patterns
COUNT(DISTINCT ST_SNAPGRID(location, 0.001, 0.001)) as unique_areas_visited
FROM user_movements
WHERE movement_distance IS NOT NULL
AND time_elapsed_seconds > 0
AND movement_distance < 10000 -- Filter out GPS errors
GROUP BY user_id
)
SELECT
user_id,
location_points,
ROUND(total_distance_m) as total_distance_m,
ROUND(total_distance_m / 1000.0, 2) as total_distance_km,
ROUND(avg_speed_mps * 3.6, 1) as avg_speed_kmh, -- Convert to km/h
ROUND(max_speed_mps * 3.6, 1) as max_speed_kmh,
-- Journey characteristics
journey_start,
journey_end,
ROUND(journey_duration_seconds / 3600.0, 1) as journey_hours,
unique_areas_visited,
-- Trajectory analysis
ST_LENGTH(trajectory, true) as trajectory_length_m,
ST_AREA(movement_bounds, true) as coverage_area_sqm,
-- Movement efficiency (straight-line vs actual distance)
ROUND(
ST_DISTANCE(
ST_STARTPOINT(trajectory),
ST_ENDPOINT(trajectory)
) / NULLIF(ST_LENGTH(trajectory, true), 0) * 100, 1
) as movement_efficiency_pct,
-- Geographic extent
ST_XMIN(movement_bounds) as min_longitude,
ST_XMAX(movement_bounds) as max_longitude,
ST_YMIN(movement_bounds) as min_latitude,
ST_YMAX(movement_bounds) as max_latitude
FROM movement_metrics
WHERE total_distance_m > 100 -- Minimum movement threshold
ORDER BY total_distance_m DESC
LIMIT 50;
-- Location-based recommendations engine
WITH user_preferences AS (
SELECT
u.user_id,
u.location as current_location,
-- User preference analysis based on visit history
up.preferred_categories,
up.avg_rating_threshold,
up.max_distance_preference,
up.price_range_preference
FROM user_profiles u
JOIN user_preferences up ON u.user_id = up.user_id
WHERE u.is_active = true
),
location_scoring AS (
SELECT
l.*,
up.user_id,
-- Distance scoring
ST_DISTANCE(l.location, up.current_location) as distance_m,
EXP(-ST_DISTANCE(l.location, up.current_location) / 2000.0) as distance_score,
-- Category preference scoring
CASE
WHEN l.category = ANY(up.preferred_categories) THEN 1.0
WHEN ARRAY_LENGTH(up.preferred_categories, 1) = 0 THEN 0.5
ELSE 0.2
END as category_score,
-- Rating scoring
l.rating / 5.0 as rating_score,
-- Popularity scoring based on reviews
LN(l.total_reviews + 1) / LN(1000) as popularity_score,
-- Time-based scoring (open/closed)
CASE
WHEN EXTRACT(DOW FROM CURRENT_TIMESTAMP) = 0 THEN -- Sunday
CASE WHEN l.hours.sunday.is_open THEN 1.0 ELSE 0.3 END
WHEN EXTRACT(DOW FROM CURRENT_TIMESTAMP) = 1 THEN -- Monday
CASE WHEN l.hours.monday.is_open THEN 1.0 ELSE 0.3 END
-- ... other days
ELSE 0.8
END as availability_score
FROM locations l
CROSS JOIN user_preferences up
WHERE
l.is_active = true
AND ST_DISTANCE(l.location, up.current_location) <= up.max_distance_preference
AND l.rating >= up.avg_rating_threshold
),
final_recommendations AS (
SELECT *,
-- Combined relevance score
(
distance_score * 0.25 +
category_score * 0.30 +
rating_score * 0.20 +
popularity_score * 0.15 +
availability_score * 0.10
) as relevance_score
FROM location_scoring
)
SELECT
user_id,
name as location_name,
category,
rating,
total_reviews,
ROUND(distance_m) as distance_meters,
ROUND(relevance_score, 3) as relevance,
-- Location details for display
ST_X(location) as longitude,
ST_Y(location) as latitude,
address.street || ', ' || address.city as display_address,
-- Recommendation reasoning
CASE
WHEN category_score = 1.0 THEN 'Matches your preferences'
WHEN distance_score > 0.8 THEN 'Very close to you'
WHEN rating_score >= 0.9 THEN 'Highly rated'
WHEN popularity_score > 0.5 THEN 'Popular destination'
ELSE 'Good option nearby'
END as recommendation_reason
FROM final_recommendations
WHERE relevance_score > 0.3
ORDER BY user_id, relevance_score DESC
LIMIT 10 PER user_id;
-- QueryLeaf geospatial features provide:
-- 1. Native GeoJSON support with SQL-familiar geometry functions
-- 2. Spatial indexing with automatic optimization for Earth-based coordinates
-- 3. Distance calculations and proximity queries with intuitive syntax
-- 4. Complex geospatial aggregations and analytics using familiar SQL patterns
-- 5. Geofencing capabilities with real-time trigger detection
-- 6. Movement pattern analysis and trajectory tracking
-- 7. Location-based recommendation engines with multi-factor scoring
-- 8. Integration with MongoDB's native geospatial operators and functions
-- 9. Performance optimization through intelligent query planning
-- 10. Seamless scaling from simple proximity queries to complex spatial analytics
Best Practices for Geospatial Implementation
Coordinate System Selection
Choose the appropriate coordinate system and indexing strategy:
- 2dsphere Index: Use for Earth-based coordinates with spherical geometry calculations
- 2d Index: Use for flat coordinate systems like game maps or floor plans
- Coordinate Format: MongoDB uses [longitude, latitude] format (opposite of many mapping APIs)
- Precision Considerations: Balance coordinate precision with storage and performance requirements
- Projection Selection: Choose appropriate coordinate reference system for your geographic region
- Distance Units: Ensure consistent distance units throughout your application
Performance Optimization
Optimize geospatial queries for high performance and scalability:
- Index Strategy: Create compound indexes that support your most common query patterns
- Query Limits: Use $maxDistance and $minDistance to limit search scope
- Result Pagination: Implement proper pagination for large result sets
- Memory Management: Monitor working set size and optimize document structure
- Aggregation Optimization: Use $geoNear for distance-based aggregations when possible
- Sharding Strategy: Consider geospatial distribution when designing sharding keys
Conclusion
MongoDB geospatial capabilities provide comprehensive location-aware functionality that eliminates the complexity of traditional spatial database extensions while delivering superior performance and scalability. The native support for GeoJSON, multiple coordinate systems, and sophisticated spatial operations makes building location-based applications both powerful and intuitive.
Key geospatial benefits include:
- Native Spatial Support: Built-in GeoJSON support without additional extensions or setup
- High Performance: Optimized spatial indexing and query execution for billions of documents
- Rich Query Capabilities: Comprehensive spatial operators for proximity, intersection, and containment
- Flexible Data Models: Store complex location data with business context in single documents
- Real-time Processing: Efficient geofencing and location tracking for live applications
- Scalable Architecture: Horizontal scaling across distributed clusters with location-aware sharding
Whether you're building ride-sharing platforms, delivery applications, location-based social networks, or IoT sensor networks, MongoDB's geospatial features with QueryLeaf's familiar SQL interface provides the foundation for sophisticated location-aware applications. This combination enables you to implement complex spatial functionality while preserving familiar database interaction patterns.
QueryLeaf Integration: QueryLeaf automatically manages MongoDB geospatial operations while providing SQL-familiar spatial query syntax, coordinate system handling, and geographic analysis functions. Advanced geospatial indexing, proximity calculations, and spatial analytics are seamlessly handled through familiar SQL patterns, making location-based application development both powerful and accessible.
The integration of native geospatial capabilities with SQL-style spatial operations makes MongoDB an ideal platform for applications requiring both sophisticated location functionality and familiar database interaction patterns, ensuring your geospatial solutions remain both effective and maintainable as they scale and evolve.