Advanced Scripting

Initializing Variables

The INIT() function initializes variables in the script, including those of included scripts, to default values. Variables declared in INIT() can also serve as input arguments of a view.

INIT'(' param_name(':'value | '::'type) [, ...] ')'

where:
    param_name - name of the parameter. Must begin with a letter or an underscore (names beginning with an underscore are reserved for built-in system variables);
    subsequent characters can be letters, numbers or underscores.

    value - the initial value of the parameter.

    type - the type of the parameter.

The order of initialization:
  1. evaluate local and included variables
  2. evaluate query configuration parameters

NB! When setting the value of a query configuration parameter, prefix the parameter name with an underscore and enclose it in single or double quotes.

Example 1:

init(
    '_query.now':T('2016-03-13 15:00:00')   //set current time to date of interest
);

@[/user/examples/views/my_webserver_access_logs.sx]
    .filter(timestamp > now()[-2 week])     //no need to change time related code
    .select(timestamp[1 min], count(response = 200) as okResp, count(response != 200) as failResp, count(*) as total)
    .group(@1)

The default values can be overridden when selecting from a view. Pass the view a comma-separated list of param_name:value pairs enclosed in parentheses:

@[view_name]'(' param_name':'value [ ,... ] ')'

When a parameter is declared with a type instead of a value, its initial value must be supplied by the calling script as an input argument of the view. That is, the parameter becomes a mandatory input argument.
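As an illustrative sketch of both cases, consider a hypothetical view saved as /user/examples/views/failed_requests.sx. The path, the parameter names minStatus and since, and all values are assumptions made for this example only. It is also assumed that variables declared in init() are referenced with the $ prefix, in the same way as the included variables in Example 3. The view declares one parameter with a default value and one by type only:

```
// hypothetical view: /user/examples/views/failed_requests.sx
init(
    minStatus:400,          // default value, may be overridden by the caller
    since::TIMESTAMP        // declared by type only: mandatory input argument
);

@[/user/examples/views/my_webserver_access_logs.sx]
    .filter(timestamp > $since)
    .filter(response >= $minStatus)
```

A calling script would then have to supply since and could optionally override minStatus:

```
@[/user/examples/views/failed_requests.sx](since:T('2016-03-01 00:00:00'), minStatus:500)
    .select(timestamp[1 min], count(*) as total)
    .group(@1)
```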

Query Configuration

Query configuration parameters control certain aspects of query script execution. They are configuration values that take effect only within the scope of the current query.

Configuration parameter names always begin with _query, optionally followed by the name of a tuple stream creation command, and end with the parameter name. The parts are separated by dots:

'_query' [ '.' tuple_stream_create_cmd_name ] '.' param_name

Query configuration parameters can be set in different places, each with its own scope:

  1. System configuration applies parameter values system-wide.
  2. The init() block applies parameter values in the scope of a script. Overrides system configuration.
  3. LIST and PARSE commands: a parameter supplied in the argument list applies to that command only. Overrides the init() block.

When setting a configuration parameter, enclose its name in single or double quotes and prefix it with an underscore. Within the query the parameters are visible under names without the leading underscore, as listed below:

query.now
    Type: TIMESTAMP or LONG
    Default value: now()
    Description: TIMESTAMP, or LONG value of Unix time, setting the current time for a query. This allows you to run a script retrospectively without changing the related query.

query.max_pus
    Type: INTEGER
    Default value: number of licensed PUs
    Description: Controls the processing capacity by setting the number of Processing Units (PUs) that will process the query. The number must not exceed the physical and licensed number of PUs.

query.list.ignoreErrors
    Type: BOOLEAN
    Default value: false
    Description: If set to FALSE, query processing stops at errors. If set to TRUE, processing continues and an error message is written to the Query log tab.

query.list.ignoreFullyDefinedUriNotFoundErrors
    Type: BOOLEAN
    Default value: false
    Description: If set to FALSE, query processing stops at file-not-found errors for listings of fully defined URIs (those having no patterns). If set to TRUE, such errors are silently ignored.

query.parse.ignoreErrors
    Type: BOOLEAN
    Default value: false
    Description: If set to TRUE, script execution is not interrupted at input data related errors, such as file content not available, decompression failures, etc.

query.parse.maxRecordSize
    Type: INTEGER
    Default value: max length of pattern
    Description: Specifies the maximum record length.

query.parse.chunkSize
    Type: INTEGER
    Default value: 64000000
    Description: Size of data split between simultaneous distributed query tasks processing plaintext blobs.

query.parse.chunkSizeCompressed
    Type: INTEGER
    Default value: 16000000
    Description: Size of data split between simultaneous distributed query tasks processing compressed blobs.

query.batch_mode
    Type: BOOLEAN
    Default value: false
    Description: Controls the query execution mode: false means interactive mode, true means batch mode.
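Several parameters can be set together in one init() block. The fragment below is a sketch only: the values are arbitrary assumptions, and query.max_pus must still respect the physical and licensed number of PUs of the system it runs on. Note that the names are quoted and carry the underscore prefix when they are set:

```
init(
    '_query.max_pus':2,                 // assumed value: process the query on two Processing Units
    '_query.parse.ignoreErrors':true    // do not interrupt the script at input data related errors
);
```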

Example 2. Set a long query to run in batch_mode:

INIT('_query.batch_mode':true);

/* Increase second argument to enlarge search space and cause query to run longer.
   With search space of entire ipv4 space (0xffffffffL) query execution time
   on 8 CPU Cores is approx 4 minutes
*/

dual(0xfffffff, 0xffff)
 .select(cc(ip), min(ip), max(ip), count(*))
 .group(@1 nosort)
 .save('/user/ip_ranges.sxt', true);

Query configuration parameters can be read using the CONFIG() function.

Including Libraries

Complex and long scripts can be hard to read and understand. Capturing reusable parts of code benefits both readability and productivity. SpectX supports this with the INCLUDE directive:

INCLUDE { ' | " } path_to_scriptfile { ' | " }

where path_to_scriptfile is a string pointing to a script file in the resource tree. It must be enclosed in single or double quotes and can be either absolute or relative to the location of the current script.

Declarations from included scripts become accessible in the current script. The names of variables must remain unique.

Example 3:

INCLUDE 'localincludes.sx';                                             // localincludes.sx script will be searched in
                                                                        // current script directory
INCLUDE '/user/examples/doc/query_lang/script_structure/example1.sx';   // example1.sx is pointed to by absolute path

dual.select(
        $localDefinedTitle,
        $myConstant as remoteConstant
);
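Example 3 does not show the included files. As a sketch of what localincludes.sx might contain, assuming that variables are declared in the $name = expression; form (the value below is invented for illustration):

```
// localincludes.sx (hypothetical content)
$localDefinedTitle = 'title defined in localincludes.sx';
```

example1.sx would presumably define $myConstant in the same way. Because variable names must remain unique, neither file should declare a name that the other file or the including script already uses.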