Aggregate

Aggregate functions compute result from a collection of values. The collection is determined by group command in the query pipeline and can be one of the following:

  1. grouping by all non-aggregated fields in select command will yield computing aggregations for combinations of grouped field unique values:
1
... | select(a, b, COUNT(*)) | group(a, b)
  1. grouping by a subset of the non-aggregated fields in the select clause will yield computing aggregations for for combinations of grouped field unique values. The values of non-aggregated fields in resultset are undetermined (i.e the ordering of these fields are not determined):
1
... | select(a, b, COUNT(c*)) | group(a)
  1. when grouping is omitted the resultset will contain one computed aggregation for the whole selected dataset. The values of non-aggregated fields in resultset are undetermined.
1
... | select(COUNT(*))

AVG

AVG(numeric_expr)

Computes arithmetic mean (average) of not NULL numeric.

output type:DOUBLE

Example: compute the average of integers from 0 to 9:

1
dual(10) | select(average:AVG(i));
average
4.5

AVG(IF(boolean_expr, numeric_value))

Computes arithmetic mean (average) of numeric_value arguments, when boolean_expr evaluates to true.

Example: compute average of odd numbers between 0 to 9:

1
dual(10)| select(AVG(IF(i%2==0, i)));
avg
4.0

COUNT STAR

COUNT(*)

Computes count of all rows in result set.

output type:LONG

Example: count the lines of DUAL data generator:

1
dual(10) | select(count:count(*));
count
10

COUNT

COUNT(expr)

Computes count of not NULL values returned by expr.

output type:LONG

Example: compute count of integers greater than 5:

1
dual(10) | select(count:count(i>5));
count
4

CORRELATION

CORR(numeric_expr1, numeric_expr2)

Computes Pearson correlation coefficient between not NULL numeric values returned by numeric_expr1 and numeric_expr2.

output type:DOUBLE

Example: compute correlation coefficient between sequential value val1 from 0 to 99, and random value val2:

1
dual(100) | select(val1:d, val2:d*RANDOM()) | select(CORR(val1, val2));

FIRST

FIRST(column_name)

Returns first non-NULL value in the group

output type:the type of the value in the group

Example return first values of even and odd numbers in the range from 54 to 63:

1
dual(54, 10) | select(first:FIRST(i)) | group(i%2);
first
54
55

FIRST(IF(boolean_expr, column_name))

Returns first non-NULL value in the group, when boolean_expr evaluates to true.

Example: compute the first non-NULL value which is greater than 3:

1
dual(10)| select(FIRST(IF(i>3, i)));
first
4

LAST

LAST(column_name)

Returns last non-NULL value in the group.

output type:the type of the value in the group

Example return last values of even and odd numbers in the range from 54 to 63:

1
dual(54, 10) | select(last:LAST(i)) | group(i%2);
last
62
63

LAST(IF(boolean_expr, column_name))

Returns last non-NULL value in the group, when boolean_expr evaluates to true.

Example: compute the last non-NULL value which is greater than 3:

1
dual(10)| select(LAST(IF(i>3, i)));
last
9

MAX

MAX(numeric_expr)

Computes maximum value of not NULL numeric values returned by numeric_expr.

output type:DOUBLE

Example compute maximum between ten random numbers:

1
dual(10) | select(rand:RANDOM()) | select(MAX(rand));

MAX(IF(boolean_expr, column_name))

Returns maximum non-NULL value in the group, when boolean_expr evaluates to true.

Example: compute the maximum non-NULL value which is greater than 3:

1
dual(10)| select(MAX(IF(i>3, i)));
max
9

MIN

MIN(numeric_expr)

Computes minimum value of not NULL numeric values returned by numeric_expr.

output type:DOUBLE

Example compute minimum between ten random numbers:

1
dual(10) | select(rand:RANDOM()) | select(MIN(rand));

MIN(IF(boolean_expr, column_name))

Returns minimum non-NULL value in the group, when boolean_expr evaluates to true.

Example: compute the minimum non-NULL value which is less than 9:

1
dual(10)| select(MIN(IF(i<9, i)));
min
3

SUM

SUM(numeric_expr)

Computes sum of not NULL numeric_expr values.

output type:DOUBLE

Example compute the sum of even and odd integers in the range of 0 to 10:

1
dual(10) | select(sum:SUM(i)) | group(i%2);
sum
20.0
25.0

SUM(numeric_expr, …)

Computes the sum of numeric_expr arguments.

output type:DOUBLE

Example:

1
dual(2) | select(sum_args:SUM(i, 3));
sum_args
3.0
4.0

SUM(IF(boolean_expr, column_name))

Returns sum of non-NULL values in the group, when boolean_expr evaluates to true.

Example: compute the sum of even numbers between 0 and 9:

1
dual(10) | select(SUM(IF(i%2==0, i)));
sum
20.0

SUM_LONG

SUM_LONG(long)

Returns sum of not NULL LONG values.

output type:LONG

Example compute the sum of even and odd long values in the range of 0 to 10:

1
dual(10) | select(sum_long:SUM_LONG(l)) | group(l%2);
sum_long
20
25

SUM_LONG(IF(boolean_expr, column_name))

Returns sum of non-NULL values in the group, when boolean_expr evaluates to true.

Example: compute the sum of even long values between 0 and 9:

1
dual(10) | select(SUM_LONG(IF(l%2==0, l)));
sum_long
20

STDDEV

STDDEV(numeric_expr), STDDEV_SAMP(numeric_expr),

Computes the estimate of standard deviation for the population from a sample of not NULL numeric values.

output type:DOUBLE

Example compute standard deviation of even and odd integers in the range of 0 to 10:

1
dual(10) | select(stddev:STDDEV(i)) | group(i%2);
stddev
3.1622776601683795
3.1622776601683795

STDDEV_POP(numeric_expr)

Computes the population standard deviation from not NULL numeric values.

output type:DOUBLE

Example compute population standard deviation of even and odd integers in the range of 0 to 10:

1
dual(10) | select(stddev_pop:STDDEV_POP(i)) | group(i%2);
stddev_pop
2.8284271247461903
2.8284271247461903

VARIANCE

VARIANCE(numeric_expr), VAR_SAMP(numeric_expr)

Computes the estimate of the population variance from a sample of not NULL numeric values.

output type:DOUBLE

Example compute variance of even and odd integers in the range of 0 to 10:

1
dual(10) | select(variance:VARIANCE(i)) | group(i%2);
variance
10.0
10.0

VAR_POP(numeric_expr)

Computes the population variation of not NULL numeric values.

output type:DOUBLE

Example compute variance of even and odd integers in the range of 0 to 10:

1
dual(10) | select(var_pop:VAR_POP(i)) | group(i%2);
var_pop
8.0
8.0