MapReduce in MongoDB

Introduction to MongoDB

MongoDB is a document database designed for ease of development and scaling. The MongoDB Manual introduces its key concepts, presents the query language, and covers operational and administrative considerations and procedures, along with a comprehensive reference section.

Introduction to MapReduce

MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster.

It is divided into two parts:

1. Mapper: performs filtering and sorting.

2. Reducer: performs a summary operation (such as counting or aggregation).
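The two phases can be sketched in a few lines of plain JavaScript. This is only an in-memory illustration of the model, not MongoDB's implementation; mapReduceSketch, mapper, reducer and sampleOrders are hypothetical names made up for this example.

```javascript
// Plain JavaScript sketch of the map-reduce model (not a MongoDB API).
// mapReduceSketch, mapper and reducer are hypothetical names for illustration.
function mapReduceSketch(records, mapper, reducer) {
  const groups = new Map();
  // Map phase: each record yields zero or more [key, value] pairs,
  // which are grouped by key.
  for (const record of records) {
    for (const [key, value] of mapper(record)) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    }
  }
  // Reduce phase: summarize all values collected under each key.
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reducer(key, values);
  }
  return result;
}

// Example: count how many orders each customer placed.
const sampleOrders = [
  { cust_id: "Busby Bee" },
  { cust_id: "Cam Elot" },
  { cust_id: "Busby Bee" },
];
const orderCounts = mapReduceSketch(
  sampleOrders,
  (order) => [[order.cust_id, 1]],
  (key, values) => values.reduce((a, b) => a + b, 0)
);
console.log(orderCounts);
```

Here the mapper emits a 1 for every order and the reducer adds them up per customer, which is the counting case mentioned above.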

Map-Reduce in MongoDB

Map-reduce is a data processing pattern for condensing large volumes of data into useful aggregated results. To perform map-reduce operations, MongoDB provides the mapReduce database command.

Map-reduce operations use custom JavaScript functions to map, or associate, values to a key.

DATASET:

db.orders.insertMany([
{ _id: 1, cust_id: "Ant O. Knee", ord_date: new Date("2020-03-01"), price: 25, items: [ { sku: "oranges", qty: 5, price: 2.5 }, { sku: "apples", qty: 5, price: 2.5 } ], status: "A" },
{ _id: 2, cust_id: "Ant O. Knee", ord_date: new Date("2020-03-08"), price: 70, items: [ { sku: "oranges", qty: 8, price: 2.5 }, { sku: "chocolates", qty: 5, price: 10 } ], status: "A" },
{ _id: 3, cust_id: "Busby Bee", ord_date: new Date("2020-03-08"), price: 50, items: [ { sku: "oranges", qty: 10, price: 2.5 }, { sku: "pears", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 4, cust_id: "Busby Bee", ord_date: new Date("2020-03-18"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 5, cust_id: "Busby Bee", ord_date: new Date("2020-03-19"), price: 50, items: [ { sku: "chocolates", qty: 5, price: 10 } ], status: "A" },
{ _id: 6, cust_id: "Cam Elot", ord_date: new Date("2020-03-19"), price: 35, items: [ { sku: "carrots", qty: 10, price: 1.0 }, { sku: "apples", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 7, cust_id: "Cam Elot", ord_date: new Date("2020-03-20"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 8, cust_id: "Don Quis", ord_date: new Date("2020-03-20"), price: 75, items: [ { sku: "chocolates", qty: 5, price: 10 }, { sku: "apples", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 9, cust_id: "Don Quis", ord_date: new Date("2020-03-20"), price: 55, items: [ { sku: "carrots", qty: 5, price: 1.0 }, { sku: "apples", qty: 10, price: 2.5 }, { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" },
{ _id: 10, cust_id: "Don Quis", ord_date: new Date("2020-03-23"), price: 25, items: [ { sku: "oranges", qty: 10, price: 2.5 } ], status: "A" }
])

Example statement:

Find the average order value per day: perform a map-reduce operation on the orders collection to group by ord_date and calculate the average of the price for each day:

1. Define the map function to process each input document:

  • Create a JavaScript function that maps the price to the ord_date for each document.
  • The emit function passes each key-value pair on to the reduce function.

var mapFun = function() {
  emit(this.ord_date, this.price);
};

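2. Define the corresponding reduce function with two arguments: the key and the array of values emitted for that key.

  • The second argument is an array of the prices emitted by the map function and grouped by ord_date.
  • The reduce function averages the elements of that array. MongoDB may call reduce more than once for the same key, so an average computed directly in reduce is exact only when all values for a key are reduced in a single call.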

var redFun = function(key, values) {
  return Array.avg(values);
};
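Because the server is allowed to feed the output of one reduce call back into another reduce call for the same key, averaging inside reduce can drift on larger collections. One common way around this, sketched here under assumptions, is to emit a count and a total, add them up in reduce, and divide only in a finalize function. The names mapFun2, redFun2 and finFun are hypothetical, and the small emit stand-in at the bottom exists only so the functions can be exercised outside MongoDB.

```javascript
// Re-reduce-safe average: carry count and total, divide only at the end.
// mapFun2, redFun2 and finFun are hypothetical names for this sketch.
var mapFun2 = function () {
  emit(this.ord_date, { count: 1, total: this.price });
};

var redFun2 = function (key, values) {
  var out = { count: 0, total: 0 };
  values.forEach(function (v) {
    out.count += v.count;
    out.total += v.total;
  });
  return out; // same shape as the emitted values, so it can be reduced again
};

var finFun = function (key, reduced) {
  return reduced.total / reduced.count;
};

// Stand-in for the server, only to exercise the functions outside MongoDB.
var emitted = [];
function emit(key, value) { emitted.push(value); }
[{ ord_date: "2020-03-20", price: 25 },
 { ord_date: "2020-03-20", price: 75 },
 { ord_date: "2020-03-20", price: 55 }].forEach(function (doc) {
  mapFun2.call(doc);
});

// Reduce in two passes, as the server is allowed to do.
var partial = redFun2("2020-03-20", emitted.slice(0, 2));
var reduced = redFun2("2020-03-20", [partial, emitted[2]]);
var safeAverage = finFun("2020-03-20", reduced);

// Averaging the averages over the same two passes gives a different number.
var naiveAverage = ((25 + 75) / 2 + 55) / 2;

console.log(safeAverage, naiveAverage);
```

In the shell, these three functions would be passed as db.orders.mapReduce(mapFun2, redFun2, { out: "output", finalize: finFun }), using the finalize option of mapReduce.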

3. Run map-reduce on the orders collection:

  • mapFun and redFun are the mapper and reducer functions.
  • query specifies the optional selection criteria for the input documents; here only orders dated after 2020-03-01 are processed.
  • out specifies where the map-reduce result is stored.

db.orders.mapReduce(
  mapFun,
  redFun,
  { query: { ord_date: { $gt: ISODate("2020-03-01") } }, out: "output" }
)

The result is stored in the output collection.
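In the shell, db.output.find() lists the stored documents; each one has an _id field holding the key (the order date) and a value field holding the reduced result. As a sanity check, the expected averages for the sample data can be recomputed in plain JavaScript. The sketch below is an assumption-laden stand-in: it copies only ord_date and price from the dataset, keeps dates as strings for readability (the real _id values are dates), and uses hypothetical names such as orderRows and expectedOutput.

```javascript
// Recompute the expected per-day averages in plain JavaScript (no MongoDB needed).
// Only ord_date and price are copied from the sample orders.
const orderRows = [
  { ord_date: "2020-03-01", price: 25 },
  { ord_date: "2020-03-08", price: 70 },
  { ord_date: "2020-03-08", price: 50 },
  { ord_date: "2020-03-18", price: 25 },
  { ord_date: "2020-03-19", price: 50 },
  { ord_date: "2020-03-19", price: 35 },
  { ord_date: "2020-03-20", price: 25 },
  { ord_date: "2020-03-20", price: 75 },
  { ord_date: "2020-03-20", price: 55 },
  { ord_date: "2020-03-23", price: 25 },
];

// Same filter as the query option: strictly after 2020-03-01.
const cutoff = new Date("2020-03-01");
const pricesByDay = new Map();
for (const row of orderRows) {
  if (new Date(row.ord_date) > cutoff) {
    if (!pricesByDay.has(row.ord_date)) pricesByDay.set(row.ord_date, []);
    pricesByDay.get(row.ord_date).push(row.price);
  }
}

// Shape the rows like map-reduce output documents: { _id: key, value: result }.
const expectedOutput = [...pricesByDay].map(([day, prices]) => ({
  _id: day,
  value: prices.reduce((a, b) => a + b, 0) / prices.length,
}));
console.log(expectedOutput);
```

The order dated 2020-03-01 is excluded by the $gt filter, so five days remain; 2020-03-08, for example, averages the two orders priced 70 and 50.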

***************************THANK YOU **************************

Akanksha Chhattri