A SpectX query expression consists of a sequence of Query Process Commands chained together with the pipe character (|). Conceptually, it is similar to Unix command pipelining, where the output of the command to the left of the pipe is used as the input of the command to its right.

The pipeline must always start with a query input command, which produces a stream of structured data for the commands that follow.


LIST('s3://spectx-docs/formats/***/*') | sort(length desc) | limit(10)

To better understand how Query Process Commands act on your data, it helps to visualize the data as a table. Each command redefines the shape of the table that is passed on to the next command.
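The table view can be modelled with a short Python sketch. This is only a conceptual illustration, not SpectX code: the function names (`list_files`, `sort_desc`, `limit`, `run`) and the sample rows are hypothetical, and the sketch assumes a table can be represented as a list of rows with named columns.

```python
from typing import Callable

Row = dict
Table = list[Row]
Stage = Callable[[Table], Table]


def list_files() -> Table:
    # Hypothetical stand-in for an input command: returns made-up rows.
    return [
        {"path": "a.log", "length": 120},
        {"path": "b.log", "length": 4500},
        {"path": "c.log", "length": 980},
    ]


def sort_desc(column: str) -> Stage:
    # Returns a stage that reorders rows by the given column, largest first.
    return lambda table: sorted(table, key=lambda row: row[column], reverse=True)


def limit(n: int) -> Stage:
    # Returns a stage that keeps only the first n rows.
    return lambda table: table[:n]


def run(source: Table, *stages: Stage) -> Table:
    # Feed the output table of each stage into the next one, like a pipe.
    table = source
    for stage in stages:
        table = stage(table)
    return table


result = run(list_files(), sort_desc("length"), limit(2))
```

Each stage receives a whole table and returns a new one, which mirrors how a sort followed by a limit first reorders the rows and then reduces their number.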


Each data column within the stream is always associated with a type. An extensive library of built-in Query Functions is available for manipulating and enriching data, and it can be extended further with User Defined Functions.

SpectX queries are written as a script. This allows you to compose complex analysis tasks that consist of many queries manipulating the retrieved data, all in an easily readable form accompanied by explanatory comments.

Query configuration lets you choose between different options for processing data, handling errors, and controlling various aspects of data formatting (for instance, timezone and locale).

Each saved script automatically becomes usable as a view, i.e., you can use it in other query scripts. Views can take input arguments, and they can be executed via the SpectX REST API.

SpectX processes data as a snapshot. With each query execution, data is read from the specified input resources. When caching is enabled, only the relevant delta of missing data is transferred.
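The effect of caching on repeated executions can be illustrated with a minimal Python sketch. It is a conceptual model only and does not describe how SpectX implements its cache: the `SnapshotReader` class is hypothetical, and caching at the granularity of whole resources is an assumption made for simplicity.

```python
class SnapshotReader:
    """Toy model: reads resources, transferring only what is not yet cached."""

    def __init__(self, remote: dict[str, bytes]):
        self.remote = remote                # stand-in for the input resources
        self.cache: dict[str, bytes] = {}   # locally cached copies
        self.transferred: list[str] = []    # log of what was actually fetched

    def read(self, paths: list[str]) -> dict[str, bytes]:
        # Only the delta (paths missing from the cache) is transferred.
        for path in paths:
            if path not in self.cache:
                self.cache[path] = self.remote[path]
                self.transferred.append(path)
        return {path: self.cache[path] for path in paths}


remote = {"a.log": b"one", "b.log": b"two"}
reader = SnapshotReader(remote)

reader.read(["a.log", "b.log"])            # first execution: both are fetched
remote["c.log"] = b"three"                 # a new resource appears later
snapshot = reader.read(["a.log", "b.log", "c.log"])  # only c.log is fetched
```

After the second read, `reader.transferred` lists each resource exactly once, even though both executions returned a complete snapshot.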