User Defined Functions

SpectX Query Language allows user defined functions (UDF). A UDF can be defined using SX Query Language or JavaScript. There are three types of UDFs that differ by language and return value: Expression UDF, Pipetask UDF and JavaScript UDF.

Expression UDF

UDFs using SXQL expressions and returning a single value of SpectX data types are called expression UDF’s. At the declaration, the identifying function name must be prepended with ‘$’:

'$'function_name '(' decl_args, ... ')' '=' STRING::expression ';'
where:

Example 1: /user/examples/doc/query_lang/udf/example1.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
// compute the number of requests in a day per US regions from apache access logs

// let's define a function computing the region string from two-letter country codes.

$region(state) =
  CASE
    WHEN $state IN ('WA', 'OR', 'CA', 'AK', 'HI', 'ID','MT', 'WY', 'NV', 'UT', 'CO', 'AZ', 'NM')
      THEN 'West'
    WHEN $state IN ('OK', 'TX', 'AR', 'LA', 'TN', 'MS', 'AL', 'KY', 'GA',
                     'FL', 'SC', 'NC', 'VA', 'WV', 'MD', 'DC', 'DE')
      THEN 'South'
    WHEN $state IN ('ND', 'SD', 'NE', 'KS', 'MN', 'IA', 'MO', 'WI', 'IL', 'IN', 'MI', 'OH')
      THEN 'Midwest'
    WHEN $state IN ('NY', 'PA', 'NJ', 'CT', 'RI', 'MA', 'VT', 'NH', 'ME')
      THEN 'Northeast'
    ELSE 'Other'
  END
;

// next, let's write the query in pipe-style.
@[/user/examples/views/my_webserver_access_logs.sx]     //a view captures the pattern and apache access logs location:
 .select(timestamp[day] as day,                         //truncate the apache_access timestamp field to day:
         $region(cc(clientIp)) as region,             //compute the country code using function cc,
                                                        //then pass the result to $region() function to compute the region:
         count(*) as ipCnt)                             //compute count
 .group(day, region)                                    //aggregate counts by day and reqion fields
 .sort(day, ipCnt DESC);                                //sort ascending by day and descending by ipCnt

Pipetask UDF

A pipetask UDF is a function that uses SX Query Language and has a tuple stream as input and output. It operates in quite a similar way as any of the query pipe commands with optionally accepting a custom set of input arguments:

'@@'function_name [ '(' decl_args, ... ')' ] '=' STRING::sxql_statement ';'
where:

Example 2: /user/examples/doc/query_lang/udf/example2.sx

1
2
3
4
//define a pipetask performing top function:
@@top(field, cnt:5)=sort($field desc).limit($cnt);

dual(10).@@top(i);

JavaScript UDF

UDF using JavaScript and returning a single value of SpectX data types is called JavaScript UDF. At declaration, the identifying function name must be prepended with the type identifier ‘$’:

'$'function_name '(' decl_args, ... ')' '=' 'CREATE FUNCTION'
'('
    lang: [ 'js' | 'javascript' ],
    returns: [ :sx_primitive_data_type | sx_composite_data_prototype ],
    include::STRING
    code::STRING
')' ';'

sx_primitive_data_type:
    [
        INTEGER | LONG | FLOAT | DOUBLE | BOOLEAN | IPV4ADDR | IPV4SOCKET | IPV4NET |
        IPV6ADDR | IPV6SOCKET | MACADDR | GEOPOINT | TIMESTAMP | TIMESTAMP_NANO | STRING
    ]
where:
  • decl_args is one or more function input parameters of any SpectX data types. See more on declaring input parameters.
  • lang is a STRING-type named argument specifying the language. Currently only js and javascript are supported.
  • returns is a named parameter specifying the function return value type. More precisely, it is the SpectX data type that the return value will be converted to. See the details here: Returning SpectX Data Types from JavaScript.
  • include is a named STRING parameter specifying the path to the included source file
  • code is a named STRING parameter containing the body of a function. The function body must end with the return statement.

Example 3: /user/examples/doc/query_lang/udf/example3.sx

1
2
3
4
5
6
7
8
9
$multiplyInputs(x::long, y::long) = CREATE FUNCTION(
  lang:'js',
  returns::long,
  code:'return x*y;'
);

dual(10)
 .select(long(i) as x, long(i+4) as y)
 .select(*, $multiplyInputs(x, y));

JavaScript functions can be also included from source files.

Example 4: /user/examples/doc/query_lang/udf/example4.sx

/user/examples/doc/query_lang/udf/example4.js
1
2
3
4
5
6
function isYoung(age) {
  ret = true;
  if(age > 18)
    ret = false;
  return ret;
}
1
2
3
4
5
6
7
8
$person(nameStr::STRING, ageInt::INTEGER) = CREATE FUNCTION(
  lang:'js',
  library:'/user/examples/doc/query_lang/udf/example4.js',
  code:'return {name:nameStr, age:ageInt, underAge:isYoung(ageInt)};',
  returns:{name:"", age:0, underAge:true}
);

dual.select($person('John', 8), $person('Anna', 22));

SpectX is quite relaxed in handling return value conversions. If conversion from the returned value to a declared SpectX data type is possible, it will do so. In case the conversion is impossible or function does not return the declared value, NULL return value is assumed.

Example 5: /user/examples/doc/query_lang/udf/example5.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
$myCode="return x+10;";
$f_int(x)=CREATE FUNCTION(lang:"js", code:$myCode, returns::INTEGER);
$f_string(x)=CREATE FUNCTION(lang:"js", code:$myCode, returns::STRING);
$f_float(x)=CREATE FUNCTION(lang:"js", code:$myCode, returns::FLOAT);
$f_ipaddr(x)=CREATE FUNCTION(lang:"js", code:$myCode, returns::IPADDR);
$f_mac(x)=CREATE FUNCTION(lang:"js", code:$myCode, returns::MACADDR);

dual(10)
 .select($f_int(i), $f_string(i), $f_float(i), $f_ipaddr(i), $f_mac(i), $f_float(i))
;

Returning SpectX Data Types from JavaScript

INTEGER, LONG, FLOAT, DOUBLE, BOOLEAN and STRING

JavaScript Number, Boolean and String types are directly converted to SpectX numeric types.

returns declaration must be specified accordingly.

Example 6: /user/examples/doc/query_lang/udf/example6.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
$ret_int()=CREATE FUNCTION(
    lang:"js",
    code:"return 1",
    returns::INTEGER);

$ret_long()=CREATE FUNCTION(
    lang:"js",
    code:"return 1",
    returns::LONG);

$ret_float()=CREATE FUNCTION(
    lang:"js",
    code:"return 1.1",
    returns::FLOAT);

$ret_double()=CREATE FUNCTION(
    lang:"js",
    code:"return 1.1",
    returns::DOUBLE);

$ret_boolean()=CREATE FUNCTION(
    lang:"js",
    code:"return true",
    returns::BOOLEAN);

$ret_string()=CREATE FUNCTION(
    lang:"js",
    code:"return 'Hello world!'; ",
    returns::STRING);

dual.select($ret_int(), $ret_long(), $ret_float(), $ret_double(),
$ret_boolean(), $ret_string());

TIMESTAMP

Use the Timestamp object to return the SpectX TIMESTAMP type with a millisecond precision. The timestamp constructor takes the long integer parameter representing milliseconds from the beginning of Unix time.

public class Timestamp {
    public Timestamp(long millis);
}

Alternatively, you can return the millisecond value directly and it will be converted to the SpectX TIMESTAMP value.

Declaration of returns:

returns::TIMESTAMP

Example 7: /user/examples/doc/query_lang/udf/example7.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
$ret_timestamp_ts()=CREATE FUNCTION(
    lang:"js",
    code:"return new Timestamp(1510569553403)",     //return Timestamp object
    returns::timestamp);

$ret_timestamp_long()=CREATE FUNCTION(
    lang:"js",
    code:"return 1510569553403",                    //return milliseconds from Unix time beginning
    returns::timestamp);

dual.select($ret_timestamp_long, $ret_timestamp_ts);

Use the TimestampNano object to return SpectX TIMESTAMP type with a nanosecond precision. TimestampNano constructor takes the long integer parameter representing nanoseconds from the beginning of Unix time.

public class TimestampNano {
    public TimestampNano(long nanos);
}

Alternatively, you can return the value of nanoseconds directly and it will be converted to SpectX TIMESTAMP value.

Declaration of returns:

returns::TIMESTAMP_NANO

Example 8: /user/examples/doc/query_lang/udf/example8.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
$ret_timestamp_nano_tsnano()=CREATE FUNCTION(
    lang:"js",
    code:"return new TimestampNano(1510569553403000064)",   //return TimestampNano object
    returns::timestamp_nano);

$ret_timestamp_nano_long()=CREATE FUNCTION(
    lang:"js",
    code:"return 1510569553403000064",                      //return nanoseconds from Unix time beginning
    returns::timestamp_nano);

dual.select($ret_timestamp_nano_tsnano, $ret_timestamp_nano_long);

IPV4ADDR

Use the Ipv4Addr object and its static parse method to return IPV4ADDR from JavaScript:

public class Ipv4Addr {
    public Ipv4Addr(int addr);
    public static Ipv4Addr parse(String ipv4Str);
}

Alternatively, you can return an integer value representing an IPv4 address directly from JavaScript.

Declaration of returns:

returns::IPV4ADDR

Example 9: /user/examples/doc/query_lang/udf/example9.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
$ret_ipaddr_from_int(addr::INTEGER)=CREATE FUNCTION(
    lang:"js",
    code:"return new IpAddr(addr);",
    returns::IPADDR);

$ret_ipaddr_from_str(addr::STRING)=CREATE FUNCTION(
    lang:"js",
    code:"return IpAddr.parse(addr);",
    returns::IPADDR);

$ret_ipaddr()=CREATE FUNCTION(
    lang:"js",
    code:"return -1392442369;",
    returns::IPADDR);

dual
 .select($ret_ipaddr_from_int(-1392442369),
         $ret_ipaddr_from_str('173.1.3.255'),
         $ret_ipaddr)
;

IPV4NET

You can use the Ipv4Network class and its static parse method to return IPV4NET from JavaScript:

public class Ipv4Network {
    public Ipv4Network(int addr, int mask)
    public static Ipv4Network parse(String ipv4Net);
}

Alternatively, you can return a string value representing the IPv4 network from JavaScript (SpectX will convert the string to IPV4NET).

Declaration of returns:

returns::IPV4NET

Example 10: /user/examples/doc/query_lang/udf/example10.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
$ret_ipnet_from_int(addr::INTEGER, mask::INTEGER)=CREATE FUNCTION(
    lang:"js",
    code:"return new IpNet(addr, mask);",
    returns::IPNET);

$ret_ipnet_from_str(netStr::STRING)=CREATE FUNCTION(
    lang:"js",
    code:"return IpNet.parse(netStr);",
    returns::IPNET);

$ret_ipnet()=CREATE FUNCTION(
    lang:"js",
    code:"return '173.1.3.255/24'; ",
    returns::IPNET);

dual
 .select($ret_ipnet_from_int(-1392442369, 24),
         $ret_ipnet_from_str('173.1.3.255/24'),
         $ret_ipnet)
;

IPV4SOCKET

Use the Ipv4Socket object and its static parse method to return IPV4SOCKET from JavaScript:

public class Ipv4Socket{
    public Ipv4Socket(int addr, int port);
    public static Ipv4Socket parse(String ipv4Socket);
}

Alternatively, you can return a string value representing the IPv4 address and port from JavaScript (SpectX will convert the string to IPV4SOCKET).

Declaration of returns:

returns::IPV4SOCKET

Example 11: /user/examples/doc/query_lang/udf/example11.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
$ret_ipsocket_from_int(addr::INTEGER, port::INTEGER)=CREATE FUNCTION(
    lang:"js",
    code:"return new IpSocket(addr, port);",
    returns::IPSOCKET);

$ret_ipsocket_from_str(socketStr::STRING)=CREATE FUNCTION(
    lang:"js",
    code:"return IpSocket.parse(socketStr);",
    returns::IPSOCKET);

$ret_ipsocket()=CREATE FUNCTION(
    lang:"js",
    code:"return '173.1.3.255:22'; ",
    returns::IPSOCKET);

dual
 .select($ret_ipsocket_from_int(-1392442369, 80),
         $ret_ipsocket_from_str('173.1.3.255:443'),
         $ret_ipsocket)
;

IPV6ADDR

Use the Ipv6Addr class and its static parse method to return IPV6ADDR from JavaScript:

public class Ipv6Addr {
    public Ipv6Addr(long hiAddr, long loAddr);
    public static Ipv6Addr parse(String ipv6Str);
}

Alternatively, you can return a string representing the IPv6 address directly from JavaScript (SpectX will convert the string to IPV6ADDR).

Declaration of returns:

returns::IPV6ADDR

Example 12: /user/examples/doc/query_lang/udf/example12.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
$ret_ipv6addr_from_longs(highAddr::LONG, lowAddr::LONG)=CREATE FUNCTION(
    lang:"js",
    code:"return new IpAddr(highAddr, lowAddr);",
    returns::IPADDR);

$ret_ipv6addr_from_str(ipv6addrStr::STRING)=CREATE FUNCTION(
    lang:"js",
    code:"return IpAddr.parse(ipv6addrStr);",
    returns::IPADDR);

$ret_ipv6addr()=CREATE FUNCTION(
    lang:"js",
    code:"return '0102:0304::f1f2'; ",
    returns::IPADDR);

dual
 .select($ret_ipv6addr_from_longs(72623859706101760,61938L),
         $ret_ipv6addr_from_str('0102:0304::f1f2'),
         $ret_ipv6addr)
;

IPV6SOCKET

Use Ipv6Socket class and its static parse method to return IPV6SOCKET from JavaScript:

public class Ipv6Socket {
    public Ipv6Socket(long hiAddr, long loAddr, int port);
    public static Ipv6Socket parse(String ipv6SocketStr);
}

Alternatively, you can return a string representing IPv6 address and port directly from JavaScript (SpectX will convert string to IPV6SOCKET).

Declaration of returns:

returns::IPV6SOCKET

Example 13: /user/examples/doc/query_lang/udf/example13.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
$ret_ipv6socket_from_long(highAddr::LONG, lowAddr::LONG, port::INTEGER)=CREATE FUNCTION(
    lang:"js",
    code:"return new IpSocket(highAddr, lowAddr, port);",
    returns::IPSOCKET);

$ret_ipv6socket_from_str(ipv6SocketStr::STRING)=CREATE FUNCTION(
    lang:"js",
    code:"return IpSocket.parse(ipv6SocketStr);",
    returns::IPSOCKET);

$ret_ipv6socket()=CREATE FUNCTION(
    lang:"js",
    code:"return '[0102:0304::f1f2]:8080'; ",
    returns::IPSOCKET);

dual.select(
    $ret_ipv6socket_from_long(72623859706101760l, 61938l, 443),
    $ret_ipv6socket_from_str('[0102:0304::f1f2]:80'),
    $ret_ipv6socket());

MACADDR

Use MacAddr class and its static parse method to return MACADDR from JavaScript:

public class MacAddr {
    public MacAddr(long macAddr);
    public static MacAddr parse(String macAddrStr);
}

Alternatively, you can return a long value representing the MAC-address directly from JavaScript.

Declaration of returns:

returns::MACADDR

Example 14: /user/examples/doc/query_lang/udf/example14.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
$ret_macaddr_from_long(addr::LONG)=CREATE FUNCTION(
    lang:"js",
    code:"return new MacAddr(addr);",
    returns::macaddr);

$ret_macaddr_from_str(addr::STRING)=CREATE FUNCTION(
    lang:"js",
    code:"return MacAddr.parse(addr);",
    returns::macaddr);

$ret_macaddr()=CREATE FUNCTION(
                    lang:"js",
                    code:"return 1234;",
                    returns::macaddr);

dual.select($ret_macaddr_from_long(12345l),
$ret_macaddr_from_str('00-02-12-23-22-05'), $ret_macaddr);

GEOPOINT

Use the Geopoint object to return GEOPOINT from JavaScript:

public class GeoPoint {
    public GeoPoint(float latitude, float longitude);
}

Declaration of returns:

returns::GEOPOINT

Example 15: /user/examples/doc/query_lang/udf/example15.sx

1
2
3
4
5
6
$ret_geopoint(longitude::FLOAT, latitude::FLOAT)=CREATE FUNCTION(
    lang:"js",
    code:"return new GeoPoint(longitude, latitude);",
    returns::geopoint);

dual.select($ret_geopoint(40.2f, -70.35f));

ARRAY

Construct a string with explicit array values to return ARRAY from JavaScript.

Declaration of returns: string value of array prototype.

Example 16: /user/examples/doc/query_lang/udf/example16.sx

1
2
3
4
5
6
7
$ret_array()=CREATE FUNCTION (
  lang:'js',
  returns:[ipsocket()],
  code:"return ['127.0.0.1:22',null,IpSocket.parse('127.0.0.1:80')];"
);

dual.select($ret_array);

TUPLE

Construct a string with explicit tuple values to return TUPLE from JavaScript.

Declaration of returns: string value of tuple prototype.

Example 4: /user/examples/doc/query_lang/udf/example4.sx

1
2
3
4
5
6
7
8
$person(nameStr::STRING, ageInt::INTEGER) = CREATE FUNCTION(
  lang:'js',
  library:'/user/examples/doc/query_lang/udf/example4.js',
  code:'return {name:nameStr, age:ageInt, underAge:isYoung(ageInt)};',
  returns:{name:"", age:0, underAge:true}
);

dual.select($person('John', 8), $person('Anna', 22));

Logging

You can use console methods for logging in Query log> tab for debugging purposes. It has following methods:

console.log(String arg)
console.debug(String arg)
console.error(String arg)
console.info(String arg)
console.trace(String arg)
console.warn(String arg)

to log with different log levels.

Parameter Declaration and Calling

Each function input parameter may be declared by type, with the default value or leaving it indefinite.

When referring an argument in the body of the UDF, the argument must be prepended with its type identifier.

1. Declaring explicit SpectX data type parameter:

param_id '::' type

Assigning type to a parameter will enforce a runtime argument type check at compile.

Example 17: /user/examples/doc/query_lang/udf/example17.sx

1
2
3
4
5
6
//declare udf with long integer type parameter
$withType(x::long) = 'Element ' || TYPE($x) || ' value: ' || $x;

dual().select($withType(5l));           //calling with long succeeds
//dual().select($withType(5));          //calling with integer results in compile error
//dual().select($withType('five'));     //calling with string results in compile error

2. Declaring parameters with default value:

// parameter with the default value of expression, returning SpectX data type
param_id ':' expr

// parameter with the default value of reference to tuple stream
param_id ':' '@'t_stream_ref_id

Declaring parameter with the default value allows omitting that parameter at calling.

Example 18: /user/examples/doc/query_lang/udf/example18.sx

1
2
3
4
5
6
7
8
//declare UDF with default string value assigned parameter:
$withDefault(x:'default')='Element ' || TYPE($x) || ' value: ' || $x;

dual()
    .select(
      $withDefault() as defaultArgVal,
      $withDefault("a") as argVal
    );
defaultArgVal argVal
‘Element STRING value: default’ ‘Element STRING value: a’

3. Declaring a parameter with indefinite type:

param_id

Leaving parameter indefinite allows specifying different types of arguments at calling.

Example 19: /user/examples/doc/query_lang/udf/example19.sx

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
$indefinite(x)=         //an indefinite parameter declaration
    'Element '
    || TYPE($x)
    || ' value: '
    || $x;

dual()
    .select(
        $indefinite('suss') as strArgVal,
        $indefinite(50) as intArgVal
    );
strArgVal intArgVal
‘Element STRING value: suss’ ‘Element STRING value: 50’