


Geospatial Services Framework

GSF Tutorial: Custom Metatasks

The gsf-javascript-engine allows JavaScript code to be executed as tasks in GSF. With this engine, it is possible to write a JavaScript task that submits additional jobs to the job manager's queue. A task that submits other tasks is known as a metatask. A metatask can map the output of one task to the input of another, which makes it useful for chaining tasks while taking advantage of concurrent processing on multiple workers or nodes.

For this tutorial, we will focus on running a metatask with the help of the gsf-dag-utilities module. This module facilitates the mapping of input and output task parameters as well as controlling the sequence of execution and possible parallelism. Instructions regarding the composition of the metatask will need to be in the form of a Directed Acyclic Graph, or DAG.

What is a Directed Acyclic Graph (DAG)?

A Directed Acyclic Graph is a graph whose edges have a direction and which contains no cycles, so no path can return to the node where it started. For our purposes, a DAG describes a chain of tasks and how their parameters are connected. It is essentially the blueprint for the metatask.
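To make the "acyclic" property concrete, here is a minimal JavaScript sketch (the graph shape and function names are illustrative, not part of GSF) that checks whether a set of task dependencies contains a cycle:

```javascript
// Detect a cycle in a dependency map: each key lists the nodes it depends on.
// A valid DAG has no cycle, so every task chain eventually terminates.
function hasCycle(graph) {
    const visiting = new Set(); // nodes on the current walk
    const done = new Set();     // nodes fully explored

    function walk(node) {
        if (visiting.has(node)) return true;  // returned to our start: cycle
        if (done.has(node)) return false;
        visiting.add(node);
        for (const dep of graph[node] || []) {
            if (walk(dep)) return true;
        }
        visiting.delete(node);
        done.add(node);
        return false;
    }

    return Object.keys(graph).some(node => walk(node));
}

// Three tasks where C depends on A and B: a valid DAG.
console.log(hasCycle({ A: [], B: [], C: ['A', 'B'] })); // false
// A depends on C and C depends on A: not a DAG.
console.log(hasCycle({ A: ['C'], C: ['A'] })); // true
```

A metatask description with a cycle could never finish, since some task would be waiting on its own output.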

The illustration below shows an example of how metatasks can be used to chain together individual tasks. The gray rectangle represents the metatask itself.

The DAG is passed to the gsf-dag-utilities as a JavaScript object. Each task in the metatask is a key in the DAG object, and each entry defines that task's name, service, inputs, and outputs. Input parameters fall into three distinct types:

external_input: A parameter that will be set on the metatask externally. This value corresponds to an input parameter on the metatask. A common example of this would be input data or anything that should be controlled at the metatask level.

internal_input: A parameter that is derived from a parameter on another task. This is the key to chaining tasks together. For example, a hypothetical task named task1 may create an output that is required by task2. In the DAG, task2 would set this parameter to the output of task1. Examples of this are shown below.

static_input: Static input may be used to define a default value for a task; something that cannot be changed using the metatask's interface. For example, if you always want the KERNEL_SIZE for a classification smoothing task to be 3, you would set this parameter as a static input parameter.

An example DAG JSON excerpt using fictional tasks:

{
    "taskName": "ExampleMetatask",
    "description": "A metatask that uses a DAG to chain processing.",
    "DAG": {
        "PreprocessStep1": {
            "name": "Task_A",
            "service": "ENVI",
            "external_input": {
                "INPUT_PARAMETER1": "INPUT_A"
            },
            "static_input": {
                "INPUT_PARAMETER2": 42
            }
        },
        "PreprocessStep2": {
            "name": "Task_B",
            "service": "ENVI",
            "external_input": {
                "INPUT_PARAMETER1": "INPUT_A"
            }
        },
        "AnalysisStep": {
            "name": "Task_C",
            "service": "ENVI",
            "external_input": {
                "INPUT_PARAMETER1": "INPUT_B"
            },
            "internal_input": {
                "INPUT_PARAMETER2": "PreprocessStep1.OUTPUT",
                "INPUT_PARAMETER3": "PreprocessStep2.OUTPUT"
            },
            "output": {
                "OUTPUT_PARAMETER1": "OUTPUT_PARAMETER"
            }
        }
    },
    "inputParameters": {
        "INPUT_A": {
            "type": "ENVIRASTER",
            "required": true
        },
        "INPUT_B": {
            "type": "ENVIRASTER",
            "required": true
        }
    },
    "outputParameters": {
        "OUTPUT_PARAMETER1": {
            "type": "ENVIRASTER",
            "required": true
        }
    }
}

First, take a look at the DAG key of the JSON structure. In the example above, the DAG contains three tasks. There are two external parameters: INPUT_A and INPUT_B. PreprocessStep1 has a static_input parameter set to 42, which remains constant for every metatask execution; it can be thought of as a default value provided by the metatask. PreprocessStep1 and PreprocessStep2 have no dependency on each other and can therefore run concurrently if enough workers are available. AnalysisStep depends on both PreprocessStep1 and PreprocessStep2, so the analysis job will not be queued until both preprocessing steps are complete.
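The scheduling behavior described above can be sketched in a few lines of JavaScript. The snippet below is an illustration, not the gsf-dag-utilities implementation: it reads each task's internal_input entries to find its dependencies, then groups tasks into levels that could run concurrently.

```javascript
// Group DAG tasks into execution levels: a task's level is one more than the
// deepest task it depends on. Tasks in the same level share no dependencies
// and could run on separate workers at the same time.
function executionLevels(dag) {
    const levels = {};
    function levelOf(task) {
        if (task in levels) return levels[task];
        // Dependencies are named on the left of the '.' in internal_input values.
        const deps = Object.values(dag[task].internal_input || {})
            .map(ref => ref.split('.')[0]);
        levels[task] = deps.length === 0 ? 0 : 1 + Math.max(...deps.map(levelOf));
        return levels[task];
    }
    Object.keys(dag).forEach(levelOf);
    return levels;
}

// The fictional DAG from the example above, reduced to its dependencies.
const exampleDag = {
    PreprocessStep1: {},
    PreprocessStep2: {},
    AnalysisStep: {
        internal_input: {
            INPUT_PARAMETER2: 'PreprocessStep1.OUTPUT',
            INPUT_PARAMETER3: 'PreprocessStep2.OUTPUT'
        }
    }
};
console.log(executionLevels(exampleDag));
// { PreprocessStep1: 0, PreprocessStep2: 0, AnalysisStep: 1 }
```

Both preprocessing steps land in level 0, so they can be queued immediately, while AnalysisStep must wait for level 0 to finish.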

The metatask is self-describing: it contains keys such as taskName, description, inputParameters, and outputParameters, which tell a user how to use it. Parameters in the "inputParameters" object are linked to "external_input" entries in the DAG, and each item in "outputParameters" is linked to an "output" entry in the DAG.
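As a sanity check on that linkage, a short JavaScript sketch (illustrative only; GSF performs its own validation) can confirm that every external_input value in the DAG names a declared metatask input parameter:

```javascript
// Return the external_input references in a DAG that are not declared in the
// metatask's inputParameters block. An empty array means the linkage is good.
function undeclaredExternalInputs(metatask) {
    const declared = new Set(Object.keys(metatask.inputParameters || {}));
    const missing = [];
    for (const [taskKey, task] of Object.entries(metatask.DAG)) {
        for (const ref of Object.values(task.external_input || {})) {
            if (!declared.has(ref)) missing.push(`${taskKey}: ${ref}`);
        }
    }
    return missing;
}

// A metatask whose DAG references INPUT_B without declaring it.
const badMetatask = {
    inputParameters: { INPUT_A: { type: 'ENVIRASTER', required: true } },
    DAG: {
        Step1: { external_input: { INPUT_PARAMETER1: 'INPUT_A' } },
        Step2: { external_input: { INPUT_PARAMETER1: 'INPUT_B' } }
    }
};
console.log(undeclaredExternalInputs(badMetatask)); // [ 'Step2: INPUT_B' ]
```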

Creating a Custom Metatask Using the DAG Utilities


This tutorial illustrates how to create a simple metatask that connects two ENVITasks. The first task in the metatask is the QuerySpectralIndices task. This task takes a raster as input and outputs a string array of all the spectral indices that can be run on that raster based on its wavelength metadata. Next, we will pass that array into the SpectralIndices task, which generates a raster in which each band corresponds to one of the listed spectral indices.

Important: Metatasks are tasks and thus require a worker of their own. Before executing metatasks, ensure that at least two workers are configured in the config.json file.

First, create a new folder under the GSFxx directory named 'metatasks'. This folder will contain all the files created during this tutorial.

Create the following file within the 'metatasks' folder:

Note: As in other tutorials, the full example code blocks are provided at the bottom of the page.

ComputeAllSpectralIndices.task

Open the newly created ComputeAllSpectralIndices.task file in a text editor and add the following JSON-formatted text:

{
    "taskName": "ComputeAllSpectralIndices",
    "description": "A custom DAG metatask that runs all possible spectral indices on an input raster.",
    "DAG": {
        "QueryIndices": {
            "name": "QuerySpectralIndices",
            "service": "ENVI",
            "external_input": {
                "input_raster": "INPUT_RASTER"
            }
        },
        "ComputeIndices": {
            "name": "SpectralIndices",
            "service": "ENVI",
            "external_input": {
                "input_raster": "INPUT_RASTER"
            },
            "internal_input": {
                "index": "QueryIndices.AVAILABLE_INDICES"
            },
            "output": {
                "OUTPUT_INDEX_RASTER": "OUTPUT_RASTER"
            }
        }
    },
    "inputParameters": {
        "input_raster": {
            "type": "ENVIRASTER",
            "required": true
        }
    },
    "outputParameters": {
        "OUTPUT_INDEX_RASTER": {
            "type": "ENVIRASTER",
            "required": true
        }
    }
}

Configure the Server

To enable the gsf-javascript-engine, add it to the list of engines in the config.json file.

To do this from a command line, start a command prompt in the GSFxx directory and execute the following command:

node updateConfig.js config.json --arrayadd engines={\"type\":\"gsf-javascript-engine\"}

Note: Each time you execute this command, another engine is added to the list, so only run it once.

Next, configure the location of the task directory to be searched by the gsf-javascript-engine.

node updateConfig.js config.json --set engines[type:gsf-javascript-engine].taskDir=metatasks

The updateConfig.js script will automatically back up the original config.json file for you.

You may also manually enable the gsf-javascript-engine by editing the config.json as shown below. It is highly recommended that you save a backup copy of the config.json file before making any changes.

  "engines": [
    ...,
    {
      "type": "gsf-javascript-engine",
      "taskDir": "metatasks"
    }
  ],

Restart the server whenever this file changes so that the new configuration takes effect.

Submitting the Metatask to the Server

With the server running and at least two workers enabled, submit the following request by opening this URL in a browser:

http://localhost:9191/ese/services/javascript/ComputeAllSpectralIndices/submitJob?INPUT_RASTER={"url":"http://localhost:9191/ese/data/qb_boulder_msi","factory":"URLRaster"}

Refresh the page or check the job console to see the status of the jobs.
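The INPUT_RASTER value in the URL above is a JSON object, so when submitting from code rather than a browser it should be URL-encoded. A small Node.js sketch (the host, port, and data path are the defaults used in this tutorial) that builds the encoded submitJob URL:

```javascript
// Build the submitJob URL with the INPUT_RASTER JSON properly URL-encoded.
const base =
    'http://localhost:9191/ese/services/javascript/ComputeAllSpectralIndices/submitJob';

const params = new URLSearchParams({
    INPUT_RASTER: JSON.stringify({
        url: 'http://localhost:9191/ese/data/qb_boulder_msi',
        factory: 'URLRaster'
    })
});

const submitUrl = `${base}?${params}`;
console.log(submitUrl);
// The braces, quotes, and slashes in the JSON are percent-encoded,
// e.g. INPUT_RASTER=%7B%22url%22%3A...
```

This URL can then be requested with any HTTP client to submit the job.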

Full Example Code Blocks

ComputeAllSpectralIndices.task

{
    "taskName": "ComputeAllSpectralIndices",
    "description": "A custom DAG metatask that runs all possible spectral indices on an input raster.",
    "DAG": {
        "QueryIndices": {
            "name": "QuerySpectralIndices",
            "service": "ENVI",
            "external_input": {
                "input_raster": "INPUT_RASTER"
            }
        },
        "ComputeIndices": {
            "name": "SpectralIndices",
            "service": "ENVI",
            "external_input": {
                "input_raster": "INPUT_RASTER"
            },
            "internal_input": {
                "index": "QueryIndices.AVAILABLE_INDICES"
            },
            "output": {
                "OUTPUT_INDEX_RASTER": "OUTPUT_RASTER"
            }
        }
    },
    "inputParameters": {
        "input_raster": {
            "type": "ENVIRASTER",
            "required": true
        }
    },
    "outputParameters": {
        "OUTPUT_INDEX_RASTER": {
            "type": "ENVIRASTER",
            "required": true
        }
    }
}



© 2017 Exelis Visual Information Solutions, Inc. |  Legal