Array

ARRAY { matcher_expr … }

The ARRAY matcher parses repeated sequences containing a variable number of data elements, as specified by the pattern supplied as its argument.

The specified pattern is applied repeatedly until:
  • an unmatch occurs, or
  • the maximum number of matches has been reached.
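
The repetition rule above can be sketched in Python. This is only an illustration of the two stop conditions, not the actual matcher implementation: `match_repeated`, `match_one`, and `match_digit` are hypothetical names, and the default limit of 32768 is taken from the quantifier description on this page.

```python
def match_repeated(match_one, text, pos=0, max_count=32768):
    """Apply match_one repeatedly and collect the matched values.

    match_one(text, pos) is a hypothetical callback that returns
    (value, new_pos) on a match, or None on an unmatch.
    """
    items = []
    while len(items) < max_count:      # stop once the maximum is reached
        result = match_one(text, pos)
        if result is None:             # stop on the first unmatch
            break
        value, pos = result
        items.append(value)
    return items, pos


def match_digit(text, pos):
    """Toy element matcher: consume exactly one decimal digit."""
    if pos < len(text) and text[pos] in "0123456789":
        return int(text[pos]), pos + 1
    return None


print(match_repeated(match_digit, "123x"))               # stops at the unmatch "x"
print(match_repeated(match_digit, "123x", max_count=2))  # stops at the maximum
```

The first call stops because "x" does not match; the second stops because the maximum of two matches has been reached, leaving the rest of the input unconsumed.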

output type:

ARRAY

quantifier:

no default value; the quantifier must be set explicitly. The array can hold a maximum of 32768 elements.

configuration:

locale = string specifying an IETF BCP 47 language tag, enclosed in single or double quotes (see the list here). The default locale is English.

charset = character set name enclosed in single or double quotes (for example, charset="ISO-8859-1").

ARRAY captures the exported data elements in an array data type. You must assign an export name to the ARRAY to make its exported members visible to the query.

Example: Consider data in which each line contains integers separated by a forward slash “/” (i.e., an array), where:

  • There is no separator after the last integer.
  • The number of integers can be different on each line but has to be no less than 3 and no more than 5.
  • Additionally, integers may be omitted (missing) from the array.

101/102/103
201//203//205
/302/303/304

In the pattern, we define an ARRAY (lines 1 and 4). Each element matches an integer (which may be missing, since the quantifier ‘*’ allows it to match zero times) followed by a forward slash (which may also be missing, since the optional modifier ‘?’ is applied so that the last element of the array can match). The quantifier expression on line 4 requires a minimum of 3 and a maximum of 5 elements to be matched, and it also assigns the export name int_array to the array expression. Our record ends with EOL, which matches a line feed:

1  ARRAY{
2      INT*:i
3      '/'?
4  }{3,5}:int_array
5  EOL;

Parsing evaluates line 1 to an array with three members, line 2 to an array with five members, and line 3 to an array with four members (note the null values for missing integers).

int_array                      _unmatched
[101, 102, 103]                NULL
[201, null, 203, null, 205]    NULL
[null, 302, 303, 304]          NULL
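
To make the example easier to check by hand, the following Python sketch mimics the pattern with a regular expression. It is an approximation under stated assumptions, not the matcher's real behavior: `parse_int_array` is a hypothetical helper, `None` stands in for null, and an element that consumes no input is assumed to count as an unmatch (otherwise the empty element would repeat until the maximum of 5).

```python
import re

# One array element: an optional integer followed by an optional '/'.
# This only loosely mirrors `INT*:i '/'?`.
ELEMENT = re.compile(r"(\d*)(/?)")


def parse_int_array(line, min_count=3, max_count=5):
    """Return the parsed list, or None if the line does not match."""
    values = []
    pos = 0
    while len(values) < max_count:
        m = ELEMENT.match(line, pos)
        # Assumption: an element that consumes no input is an unmatch.
        if m is None or m.end() == pos:
            break
        digits = m.group(1)
        values.append(int(digits) if digits else None)
        pos = m.end()
    # Both the lower bound of the quantifier and the end of line must hold.
    if len(values) < min_count or pos != len(line):
        return None
    return values


for line in ["101/102/103", "201//203//205", "/302/303/304"]:
    print(parse_int_array(line))
```

For the three input lines, this sketch produces lists of three, five, and four members respectively, with `None` where an integer is missing. The third input, `/302/303/304`, gives `[None, 302, 303, 304]`. A line with fewer than 3 or more than 5 elements returns `None`.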