SpectX Data Types

SpectX query language operates with strongly typed data, i.e the functions and operators accept only declared types of data. The type is assigned to data during parsing or using casting functions. SpectX also recognizes value types expressed in literal notation (for example using constant values in functions or input arguments of views) except of VARIANT and VARIANT_ARRAY types.

Primitive Types

INTEGER

32-bit signed two’s complement integer that has a minimum value of -2^31 and a maximum value of 2^31-1.

LITERAL NOTATION: Integers can be expressed either in decimal or hexadecimal notation:

decimal: -2147483648 to 2147483647
hexadecimal: 0x0 to 0xFFFFFFFF

LONG

64-bit two’s complement integer. The signed long has a minimum value of -2^63 and a maximum value of 2^63-1

LITERAL NOTATION: Long can be expressed in decimal or hexadecimal notation and followed by the ‘l’ or ‘L’ letter:

decimal: -9223372036854775807l to 223372036854775806l
hexadecimal: 0x0L to 0xFFFFFFFFFFFFFFFFl

FLOAT

single-precision 32-bit IEEE 754 floating point. The range of its values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification.

LITERAL NOTATION: A float numeric can be expressed in decimal or scientific notation. The values may be preceded by a minus sign and must be followed by the letter ‘f’ or ‘F’:

decimal: 2.0 -4.44f
scientific: 2.4E-24F 3e2

DOUBLE

double-precision 64-bit IEEE 754 floating point. The range of its values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification.

LITERAL NOTATION: A double numeric can be expressed in decimal or scientific notation. The values can be preceded by a minus sign and followed by the letter ‘d’ or ‘D’:

decimal: 2.0 -4.44d
scientific: 2.4E-24D 3e2

BOOLEAN

has only two possible values: true and false.

LITERAL NOTATION: A Boolean value can be expressed using either upper or lowercase letters:

true or TRUE
false or FALSE

IPADDR

Contains an IPv4 or IPv6 address. Internally they are kept as IPv6 addresses (IPv4-mapped Ipv6 addresses)

LITERAL NOTATION:

Ipv4 addresses can be expressed using dot decimal notation:

192.168.1.0

IPv6 addresses can be expressed with hextet notation using IPADDR cast function:

IPADDR('2001:0db8:85a3:0000:0000:8a2e:0370:7334')
IPADDR('2001:db8:85a3::8a2e:370:7334')

You can use IS_IPV4() and IS_IPV6() functions to identify if an IPADDR contains IPv4 or IPv6 address.

IPSOCKET

A pair of ip-address and port.

LITERAL NOTATION:

IPv4 socket Expressed as an IPv4 address followed by a colon ‘:’ and a port number:

232.5.16.44:443

IPv6 socket. Expressed as an IPv6 address and port using IPSOCKET cast function. The address must be enclosed in square brackets as requested in https://tools.ietf.org/html/rfc5952#section-6 :

IPSOCKET(‘[2a00:1450:4010:c05::69]:443’)

You can use IS_IPV4() and IS_IPV6() functions to identify if IPSOCKET contains IPv4 or IPv6 address.

IPNET

IPv4 or IPv6 address range expressed using CIDR notation.

LITERAL NOTATION:

192.168.100.14/24
IPNET('2001:db8::/48')

Note that IPv6 network must be expressed using IPNET cast function.

You can use IS_IPV4() and IS_IPV6() functions to identify if IPNET contains IPv4 or IPv6 address.

MACADDR

MAC-48 address following standard IEEE notation.

LITERAL NOTATION:

01-23-45-67-89-ab
MACADDR('01:23:45:67:89:ab')

GEOPOINT

Pair of float numbers representing geographical longitude and latitude coordinates.

LITERAL NOTATION: A string with two comma separated float numerics in decimal notation.

'59.398401, 24.656800'

Geopoint coordinates can be converted to GEOPOINT data object using GEO() function and then used further by other network functions.

TIMESTAMP

A reference to a point in time with precision of millisecond.

LITERAL NOTATION: A string in the form of yyyy-MM-dd HH:mm:ss.SSS Z.

'2016-03-14 09:00:00.001 +0000'

TIMESTAMP_NANO

A reference to a point in time with precision of nanosecond.

LITERAL NOTATION: A string in the form of yyyy-MM-dd HH:mm:ss.SSSSSSSSS Z.

'2016-03-14 09:00:00.000000003 +0000'

STRING

Sequence of characters with a specified character set.

LITERAL NOTATION:

1. Quoted string - enclose the string in single quotes. Escape a single quote in the string with a backslash ‘ \ ‘ if needed. Example: specifying format string in TIMESTAMP matcher:

TIMESTAMP('yyyy-MM-dd')

2. Double quoted string - enclose the string in double quotes. Escape double quote in the string with a backslash ‘ \ ‘ if needed. Note that the string may contain single quotes.

Example 1:

1
2
3
4
5
6
7
8
//parsing timestamp from string:
$string = "year:2002, month:May, day:22, hour:22, minute:56";

$timePattern = PATTERN {
TIMESTAMP("'year:'yyyy', month:'MMM', day:'dd', hour:'HH', minute:'mm"):ts
};

dual.select(PARSE($timePattern, $string));
  1. heredoc syntax:

    After operator ‘<<<’, an identifier is provided, then a newline. The string itself follows, and then the same identifier again to close the quotation.

Example2: rewrite Example 1.

1
2
3
4
5
6
7
8
//parsing timestamp from string:
$string = "year:2002, month:May, day:22, hour:22, minute:56";

$timePattern = <<<END
TIMESTAMP("'year:'yyyy', month:'MMM', day:'dd', hour:'HH', minute:'mm"):ts
END;

dual.select(PARSE($timePattern, $string));

VARIANT

Data type which is evaluated dynamically at runtime. VARIANT can represent any of primitive data types.

Sometimes the type of an extracted data element is not known before the parser is run. For instance JSON structures contain also the type of data which could be used for extraction. In this case parser uses VARIANT data to capture extracted data elements.

Example 5:

1
2
3
4
5
$jsonString = '{"name":"John","age":33}';

dual
 .select(PARSE("JSON:person",$jsonString))
;

will result in following extracted fields:

person[name] = "John" (VARIANT:STRING)
person[age] = 33 (VARIANT:LONG)

Note that when using VARIANT data in functions you must cast the data into respective primitive types required by the function:

1
2
3
4
5
@[/user/examples/doc/query_lang/spectx_data_types/example5.sx]
 .select(STRLEN(STRING(person[name])),        //in computing length of a name we need to cast it to STRING
         LONG(person[age]) > 18 as is_adult   //in computing if person is adult we need to cast it to numeric primitive type
 )
;

Composite Types

ARRAY

One-dimensional array of one primitive type data. The array object is most commonly used when parsing repeated elements from the source data (see Array).

LITERAL NOTATION: enclose data elements in square brackets, separated by a comma:

[0, 2, 3]

A data element can be accessed by its position in the array. Example 3:

1
2
3
4
//accessing elements of an array
dual(5)
 .select(array)    //select the array from a generated dataset
 .select(array[0]) //select element 0 from the array (of IP-addresses)

VARIANT_ARRAY

Array of VARIANT type data. VARIANT_ARRAY is created when parsing Json arrays (see JSON_ARRAY). This is a two-dimensional array, where first column is the name of a Json array:

1
2
3
4
5
$jsonString = '[223423,-343.8e7,null,"1\u00393.40.32.33","3.14"]';

dual
 .select(PARSE("JSON_ARRAY:j", $jsonString))
;

will produce following VARIANT_ARRAY as result:

j = [223423,-3.438E9,null,193.40.32.33,3.14] (VARIANT_ARRAY)

You can access array elements by specifying the name and position:

1
2
@[/user/examples/doc/query_lang/spectx_data_types/example7.sx]
 .select(j[array][0] as elem_1, j[array][1] as elem_2, j[array][2] as elem_3, j[array][3] as elem_4, j[array][4] as elem_5)

VARIANT_OBJECT

A set of key-value pairs with STRING type keys and VARIANT type values. VARIANT_OBJECT is created when parsing Json objects (see JSON).

Example 9:

1
2
3
4
5
$jsonstr = '{"a":1, "b":"foo"}';

dual
 .select(PARSE("JSON:j EOS", $jsonstr))
 .select(TYPE(@1),*);

produces VARIANT_OBJECT as result (double click to see details):

type = 'VARIANT_OBJECT'
j[a] = 1L // (VARIANT:LONG)
j[b] = 'foo' // (VARIANT:STRING)

TUPLE

A set of key-value pair data whose value can be any of SpectX data types. The tuple object can be constructed using the STRUCTURE matcher when parsing source data.

LITERAL NOTATION: enclose data elements in curly braces separated by a comma:

{seq:1, name:'string', isMember:true}

Data elements can be accessed by the key. Example 4:

1
2
3
4
5
6
7
8
dual(5)
 .select({      //select elements into a tuple
    a:i,        //assign the field i from dual-generated records into the tuple member named 'a'
    b:"string", //tuple member 'b' contains the constant string "string"
    c:true      //tuple member 'c' contains the boolean value true
   } as s1      //name tuple to 's1'
 )
 .select(s1[a] as sequence, s1[b] as name, s1[c] as isMember)