Map/Reduce Script | NetSuite Blog

The Map/Reduce script is used to process large volumes of data. It works well in instances when the data can be broken down into smaller parts. When you run this script, a structured framework creates enough jobs to process all of these smaller parts. This technique does not require user management, it manages automatically.
This script also has the benefit of allowing these jobs to be processed in parallel. While deploying the script, the user can select the level of parallelism.
A map/reduce script, like a scheduled script, can be run manually or on a set schedule. Compared to scheduled scripts, this script has a few advantages. One is that if a map/reduce task breaches certain features of NetSuite governance, the map/reduce framework will automatically force the job to yield and its work to be rescheduled for a later time without disrupting the script.
Map/reduce should be used in any circumstance where you need to handle multiple records and your logic may be broken down into little chunks. Map/reduce, on the other hand, is not suitable for instances in which you need to execute sophisticated functions for each part of your data collection. The loading and saving of various records are part of a lengthy process.

Map/Reduce Key Concepts

The essential principle of a Map/Reduce script is as follows:

Your script finds data that needs to be processed. This data is broken down into key/value pairs, and your script specifies a function that the system calls once for each pair. Your script can optionally perform a second round of processing. The system can create numerous jobs for each cycle of processing automatically, depending on script deployment, and then process the data in parallel. Please consider the following things before beginning with writing a map/reduce script:

These scripts are executed in stages
Your logic is supplemented by the system.
The system offers dependable context objects.
Multiple jobs are executed by one script
Map/Reduce scripts allows yielding of job and other interruptions

Map/Reduce Scripts Execution

For example:

For the getInputData stage, Users must construct a method that returns an object that can be turned into a key/value pair. For example, if your function returns a NetSuite record search, the system will perform the search and then deliver the results. The key/value pairs would be the search results: Each value would be a JSON containing the record’s field IDs and values, with each key being the record’s internal ID.

For the map stage, the user can optionally write a function that the system invokes one time for each key/value pair. If appropriate, also map function can write output data, in the form of new key/value pairs. If the script also includes a reduce function, the output data is given to the shuffle stage as input before being delivered to the reduce stage. If not, it will be forwarded straight to the summarize stage.

For the shuffle stage, the user does not write a function. If a reduce function is present, the system sorts through any key/value pairs that were sent to the reduce stage. If a map is utilized, the map function may have provided these pairings; if a map function is not present, the shuffle stage takes data from the getInputData stage. The shuffle stage divides the data into key/value pairs, with each key being unique and each value being an array of values.

In the summarize stage, the function can retrieve logs about the script’s work. It can also take appropriate actions for data sent by the reduce stage.

The system supplements the logic written in the script

The functionality of most script kinds is determined by the code contained within the script. The logic in your file is important with map/reduce, but the system also supplements your logic written with its own standardized logic. The system, for example, transports data from one step to the next.

In addition, the system uses your map and reduces functions several times. As a result, the map and reduce functions have logic that is comparable to what a user might use in a loop. These functions should only do a little bit of work.

The system provides robust context objects in Map/Reduce

For each entry point function that the user creates, the system gives a context object. The map/reduce entry point functions are given with resilient objects. These objects include the data and properties necessary for writing a good script. These objects can be used to access data from earlier stages and to write output data that is passed to the next step, for example. Context objects can also store information about faults and occupied usage units.

Multiple jobs are used to execute one script

SuiteCloud Processors, which handle work through a sequence of jobs, power Map/Reduce scripts. A SuiteCloud processor, which is a virtual unit of processing power, executes each work. Processors like these are also used to run scheduled scripts.

When processing a scheduled script, the system always creates only one job. However, to perform a single map/reduce script, the system produces numerous jobs. To execute phases, the system creates at least one job automatically. Furthermore, numerous tasks can be formed to perform the work of the map function, as well as many jobs for the reduce stage. When the system creates multiple maps and reduces jobs, these jobs work independently of each other and may work in parallel across multiple processors. For this reason, the map and reduced stages are called parallel stages.

The get Input Data and summarize stages are each executed by one job. In each case, this job invokes your function only one time. These stages are called serial stages. The shuffle stage is also called a serial stage.

Map/reduce scripts permit yielding jobs & other interruptions

Since the map and reduce stages consist of multiple independent function invocations, the work of these stages can be divided among multiple jobs. The structure of the stage is flexible. This enables parallel processing, as well as allowing map and reduce jobs to handle some elements of their own resource usage.

If a job keeps busy a processor for too long, the system can naturally finish the job after the current map or reduce function has been completed. In this case, the system creates a new job to continue executing the remaining key/value pairs. Based on priority and submission timestamp, the new job will start right after the previous/original job has finished processing, or it will start later, to allow higher-priority jobs processing other scripts to execute.

There are five phases in a map/reduce script. The shuffle stage, for example, does not correspond to an entry point. The other processes, such as a map and reduce, are completed. Their points of entry are listed in the table below.

Map/Reduce Script Samples

/**

* @NApiVersion 2.x

* @NScriptType MapReduceScript

define([‘N/file’], function (file) {

// Define characters that should not be counted when the script performs its analysis of the text.

Const PUNCTUATION_REGEXP = /[\u2000-\u206F\u2E00-\u2E7F\\’!”#\$%&\*\+,\-\.\/:;<=>\?@\[\]\^_`\{\|\}~]/g;

// Use the getInputData function to return two strings.

function getInputData() {

return “the quick brown fox \njumps over the lazy dog.”.split(‘\n’);

}

// Once after the getInputData function is executed, the system creates the below key/value pairs:

// key: 0, value: ‘the quick brown fox’

// key: 1, value: ‘jumps over the lazy dog.’

// The map function is invoked one time for each key/value pair. Each time the function is invoked, the relevant key/value pair is made available through the context.key and context.value properties.

function map(context) {

// Create a loop that will examines each character in string. Exclude space and punctuation marks.

for (var i = 0; context.value && i < context.value.length; i++) {

if (context.value[i] !== ‘ ‘ && !PUNCTUATION_REGEXP.test(context.value[i])) {

// For each character, invoke the context.write() method. This method saves

// a new key/value pair. For the new key, save the character currently being

// examined by the loop. For each value, save the number 1.

context.write({

key: context.value[i],

value: 1

});

}

// After the map function has been invoked for the last time, the shuffle stage

// begins. In this stage, the system sorts the 35 key/value pairs that were saved

// by the map function during its two invocations. From those pairs, the shuffle

// stage creates a new set of key/value pairs, where each key is unique. In

// this way, the number of key/value pairs is reduced to 25. For example, the map

// stage saved three instances of {‘e’,’1′}. In place of those pairs, the shuffle

// stage creates one pair: {‘e’, [‘1′,’1′,’1’]}. These pairs are made available to

// the reduce stage through the context.key and context.values properties.

// The reduce function is invoked one time for each of the 25 key/value pairs

// provided.

function reduce(context) {

// Use the context.write() method to save a new key/value pair, where the new key

// equals the key currently being processed by the function. This key is a letter

// in the alphabet. Make the value equal to the length of the context.values array.

// This number represents the number of times the letter occurred in the original

// string.

context.write({

key: context.key,

value: context.values.length

});

}

// The summarize stage is a serial stage, so this function is invoked only one time.

function summarize(context) {

// Log details about the script’s execution.

log.audit({

title: ‘Usage units consumed’,

details: context.usage

});

log.audit({

title: ‘Concurrency’,

details: context.concurrency

});

log.audit({

title: ‘Number of yields’,

details: context.yields

});

// Use the context object’s output iterator to gather the key/value pairs saved at the end of the reduce stage. Also, tabulate the number of key/value pairs

// that were saved. This number represents the total number of unique letters used in the original string.

var text = ”;

var totalKeysSaved = 0;

context.output.iterator().each(function (key, value) {

text += (key + ‘ ‘ + value + ‘\n’);

totalKeysSaved++;

return true;

});

// Log details about the total number of pairs saved.

log.audit({

title: ‘Unique number of letters used in string’,

details: totalKeysSaved