Map-Reduce in MongoDB

Akanksha Chhattri
Jul 16, 2021

Introduction to MongoDB

MongoDB is a document database designed for ease of development and scaling. The MongoDB Manual introduces the key concepts, presents the query language, and covers operational and administrative considerations and procedures, along with a comprehensive reference section.

Introduction to MapReduce

MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster.

It is divided into two parts:

1. Mapper: performs filtering and sorting.

2. Reducer: performs a summary operation (such as counting or aggregation).
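To make the two phases concrete, here is a minimal sketch in plain JavaScript, independent of MongoDB. The helper name runMapReduce and the sample words are made up for illustration; a real framework would spread this work across a cluster rather than run it in a single loop.

```javascript
// Hypothetical helper: a single-process imitation of the map-reduce model.
function runMapReduce(records, mapFn, reduceFn) {
  const groups = new Map();

  // Map phase: each record may emit any number of (key, value) pairs.
  for (const record of records) {
    mapFn(record, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }

  // Reduce phase: summarise the values collected under each key.
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduceFn(key, values);
  }
  return result;
}

// Illustrative input: count how often each word occurs.
const words = ["apple", "pear", "apple", "orange", "pear", "apple"];

const counts = runMapReduce(
  words,
  (word, emit) => emit(word, 1),                      // mapper
  (key, values) => values.reduce((a, b) => a + b, 0)  // reducer
);

console.log(counts); // { apple: 3, pear: 2, orange: 1 }
```

The mapper decides which key each value belongs to, and the reducer only ever sees the values collected under one key at a time.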

Map-Reduce in MongoDB

Map-reduce is a data processing pattern for condensing large volumes of data into useful aggregated results. To perform map-reduce operations, MongoDB provides the mapReduce database command.

Map-reduce operations use custom JavaScript functions to map, or associate, values to a key.

DATASET:

db.orders.insertMany([
{ _id: 1, cust_id: "Ant O. Knee", ord_date: new Date("2020-03-01"), price: 25, items: [ { sku: "oranges", qty: 5, price: 2.5 }, { sku: "apples", qty: 5, price: 2.5 } ], status: "A" },
{ _id: 2, cust_id: "Ant O. Knee", ord_date: new Date("2020-03-08"), price: 70, items: [ { sku: "oranges", qty: 8, price: 2.5 }, { sku: "chocolates", qty: 5, price: 10 } ], status: "A" },
{ _id: 3, cust_id: "Busby Bee", ord_date: new Date("2020-03-08"), price: 50, items: [ { sku: "oranges", qty: 10, price: 2.5 }, { sku: "pears", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 4, cust_id: "Busby Bee", ord_date: new Date("2020-03-18"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 5, cust_id: "Busby Bee", ord_date: new Date("2020-03-19"), price: 50, items: [ { sku: "chocolates", qty: 5, price: 10 } ], status: "A" },
{ _id: 6, cust_id: "Cam Elot", ord_date: new Date("2020-03-19"), price: 35, items: [ { sku: "carrots", qty: 10, price: 1.0 }, { sku: "apples", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 7, cust_id: "Cam Elot", ord_date: new Date("2020-03-20"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 8, cust_id: "Don Quis", ord_date: new Date("2020-03-20"), price: 75, items: [ { sku: "chocolates", qty: 5, price: 10 }, { sku: "apples", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 9, cust_id: "Don Quis", ord_date: new Date("2020-03-20"), price: 55, items: [ { sku: "carrots", qty: 5, price: 1.0 }, { sku: "apples", qty: 10, price: 2.5 }, { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 10, cust_id: "Don Quis", ord_date: new Date("2020-03-23"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" }
])

Example Statement:

Find the average order price per day: perform a map-reduce operation on the orders collection that groups the documents by ord_date and calculates the average price for each day.

1. Define the map function to process each input document:

  • Create a JavaScript function that maps the price to the ord_date for each document. Inside the function, this refers to the document being processed.
  • The emit function passes each key-value pair on to the reduce function.

var mapFun = function() {
  // key: the order date, value: the order price
  emit(this.ord_date, this.price);
};

2. Define the corresponding reduce function with two arguments, key and value:

  • value is an array holding the prices emitted by the map function, grouped by ord_date.
  • The reduce function averages the elements of that array.

var redFun = function(key, value) {
  // value is the array of prices emitted for this ord_date
  return Array.avg(value);
};
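One caveat with averaging inside reduce: MongoDB may call the reduce function more than once for the same key, feeding earlier reduce results back in as inputs, so an average of averages can drift from the true mean. A common workaround is to have map emit a count and a total, sum both in reduce, and divide in a finalize function. The sketch below illustrates this in plain JavaScript outside MongoDB; avg, naiveReduce, safeReduce and finalizeFun are illustrative names, and the two-step calls imitate a re-reduce by hand.

```javascript
// Plain-JavaScript stand-in for the shell's Array.avg helper.
const avg = (arr) => arr.reduce((a, b) => a + b, 0) / arr.length;

// Prices of the three orders placed on 2020-03-20 in the dataset above.
const prices = [25, 75, 55];

// Naive reducer: averages whatever it is given.
const naiveReduce = (key, values) => avg(values);

// Imitate a re-reduce: the first two prices are reduced, then the
// partial result is reduced again together with the third price.
const partial = naiveReduce("2020-03-20", [25, 75]);          // 50
const naiveResult = naiveReduce("2020-03-20", [partial, 55]); // 52.5, not the true mean

// Safer shape: carry a count and a total through every reduce call.
const safeReduce = (key, values) => {
  const out = { count: 0, total: 0 };
  for (const v of values) {
    out.count += v.count;
    out.total += v.total;
  }
  return out;
};

// Divide only once, after all reducing is finished.
const finalizeFun = (key, reduced) => reduced.total / reduced.count;

const emitted = prices.map((p) => ({ count: 1, total: p }));
const partialSafe = safeReduce("2020-03-20", emitted.slice(0, 2));
const merged = safeReduce("2020-03-20", [partialSafe, emitted[2]]);
const safeResult = finalizeFun("2020-03-20", merged);

console.log(naiveResult);                // 52.5
console.log(safeResult === avg(prices)); // true
```

In the mongo shell, the same idea would mean emitting { count: 1, total: this.price } from the map function and passing the dividing function through the finalize option of mapReduce. On a dataset as small as this one the simple Array.avg version will most likely give the same numbers, so this matters mainly for larger collections.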

3. Perform map-reduce on the documents in the orders collection that match the query:

  • mapFun and redFun are the mapper and reducer functions.
  • query specifies the optional selection criteria for selecting documents; here, only orders dated after 2020-03-01 are processed.
  • out specifies the location of the map-reduce query result.

db.orders.mapReduce(
  mapFun,
  redFun,
  { query: { ord_date: { $gt: ISODate("2020-03-01") } }, out: "output" }
)

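The result is stored in the output collection. Each document there has the order date as its _id and the average price for that day as its value.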
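To get a feel for what the output collection should contain, the whole operation can be imitated in plain JavaScript without a database. This is only a sketch: the orders array repeats just the ord_date and price fields from the dataset, the filter stands in for the $gt query, and the grouping loop is an assumption about how the result is assembled, not MongoDB's actual implementation.

```javascript
// ord_date and price of the ten orders from the dataset.
const orders = [
  ["2020-03-01", 25], ["2020-03-08", 70], ["2020-03-08", 50],
  ["2020-03-18", 25], ["2020-03-19", 50], ["2020-03-19", 35],
  ["2020-03-20", 25], ["2020-03-20", 75], ["2020-03-20", 55],
  ["2020-03-23", 25],
].map(([d, price]) => ({ ord_date: new Date(d), price }));

// Stand-in for query: { ord_date: { $gt: ISODate("2020-03-01") } }
const cutoff = new Date("2020-03-01");
const matched = orders.filter((o) => o.ord_date > cutoff);

// Map phase: group prices by day (dates are keyed by their ISO string,
// because two distinct Date objects are never equal as Map keys).
const groups = new Map();
for (const o of matched) {
  const key = o.ord_date.toISOString().slice(0, 10);
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(o.price);
}

// Reduce phase: average the prices of each day.
const output = [...groups].map(([day, prices]) => ({
  _id: day,
  value: prices.reduce((a, b) => a + b, 0) / prices.length,
}));

console.log(output);
```

With the sample data this gives 60 for 2020-03-08, 25 for 2020-03-18, 42.5 for 2020-03-19, about 51.67 for 2020-03-20 and 25 for 2020-03-23; the order dated 2020-03-01 is excluded by the filter. In MongoDB itself the _id would be a date value rather than a string.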

***************************THANK YOU **************************
