Developing Queries

Using Query Editor

The Query Editor tab window is divided in two: the upper half contains the editor for the query script, and the lower half contains three tabs:

  • Results - displays query results.
  • Explanation - displays detailed information about the query processing stages.
  • Query log - displays error or notification messages.

The action bar has buttons for:

  • Run - executes the query.
  • Stop - terminates the running query.
  • Save - saves the query script.
  • Chart - visualizes the query results.
  • Follow - opens a new query tab with the current results. This comes in handy when you want to work with intermediate results and return to the original script when done. Note that the results will change if the original script is changed and executed again. The result set is not guaranteed to be available after the original script tab is closed.
  • Data - opens the Data Browser.
  • Insert - helpers for creating an init() block to initialize variables or for inserting a TIMESTAMP() function.

When you execute a query, SpectX starts showing results immediately as they come in. Note that because of browsers' limited scrolling capability, the displayed result set is limited to approximately 1 000 000 rows (depending on the browser).

You can cancel the query at any time. The result set gathered up to that point remains available for scrolling and for further querying using Follow. So if a query takes unexpectedly long, you can safely terminate it and work with the results gathered so far.

Shortcut keys:

  • CTRL+E - execute the query
  • CTRL+P - display the selected rows of the result set in a selectable format (CSV, TSV, JSON, ...). Intended for copy/paste.
  • CTRL+S - save the query script
  • CTRL+SPACE - display context help on query and parse language commands and syntax

Use the Data Browser to select source data files and create a skeleton query script for ad-hoc queries. Use views to develop queries against an already defined virtual structure and source data location.

Refer to Query Language for how to write queries.

Query Execution Mode

Normally, queries are executed in so-called interactive mode: if the user logs out, or the client (web browser) connection is dropped for more than a minute during execution, query processing is cancelled.

The opposite is batch mode: the query is executed regardless of the client connection state. This is useful for long-running queries that store their results.

The execution mode is set with the query configuration parameter query.batch_mode in the query init block. See details here.
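
The difference between the two modes can be modelled as a small decision rule. The following Python sketch is only an illustration of the behaviour described above, not SpectX code: the names QueryState and should_cancel are hypothetical, and the 60-second threshold is taken from the "more than a minute" wording.

```python
from dataclasses import dataclass

# Grace period taken from the text above ("more than a minute").
DISCONNECT_GRACE_SECONDS = 60


@dataclass
class QueryState:
    """Hypothetical snapshot of a running query, for illustration only."""
    batch_mode: bool             # value of query.batch_mode for this query
    user_logged_out: bool        # True once the user has logged out
    disconnected_seconds: float  # how long the client connection has been down


def should_cancel(state: QueryState) -> bool:
    """Return True if the query would be cancelled under the rules above."""
    if state.batch_mode:
        # Batch mode ignores the client connection state entirely.
        return False
    # Interactive mode: cancel on logout or on a disconnect longer than a minute.
    return state.user_logged_out or state.disconnected_seconds > DISCONNECT_GRACE_SECONDS
```

In this model, a query in batch mode keeps running after a ten-minute disconnect, while the same query in interactive mode would be cancelled.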

Ad hoc vs. Regular Queries

The preparation phase when analysing new data is often long and complicated. You have to get access to data, import it to a system, discover the structure, do transformations and consider necessary enrichments even before starting to think about your analysis questions. SpectX makes this phase smooth and flexible in order to quickly conduct ad-hoc analysis. There are many features to support this: the ability to include data from very different storage locations, no need to worry about importing the data, automated discovery and interactive pattern developer for identifying structure in the underlying data, etc.

However, the queries you make on a regular basis are no less important. What are the features offered by SpectX here?

When making regular queries on data with known structure, it would be nice not to have to specify the pattern and location of source data for each query. This is what SpectX offers with views. It feels and looks like writing a query in a relational database. Capturing pattern and location in a view also allows role separation between source data management, data structure definition and analytics.

SpectX also provides an API for automating execution and integration with other applications.

Last but not least, when you’re dealing with log management in a large organization and experiencing long implementation cycles for changes in data structure, you might want to think about rearranging the log management process. SpectX’s real-time data extraction and transformation offers possibilities for eliminating bottlenecks in the data preparation pipeline. Read more in this whitepaper.

Understanding Timezones

Timezones come into play on two occasions: a) when time information enters SpectX (i.e. when parsing time fields) and b) when it leaves SpectX (i.e. when timestamps are displayed or output via the API).

During parsing, timezone information may or may not be present in the time field. By default, the timezone from the field is used if it is present; otherwise the default timezone (UTC) is applied. If you need a different timezone, specify it as an argument to the TIMESTAMP matcher. See more at parsing Date and Time.
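
The fallback rule for parsing can be sketched in Python as follows. This is a generic illustration using the standard datetime module, not the SpectX TIMESTAMP matcher; the function name parse_timestamp and its default_tz parameter are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(text: str, fmt: str, default_tz: timezone = timezone.utc) -> datetime:
    """Parse text with fmt; keep the field's own zone, else apply default_tz."""
    parsed = datetime.strptime(text, fmt)
    if parsed.tzinfo is None:
        # No zone in the field: fall back to the default (UTC unless overridden).
        parsed = parsed.replace(tzinfo=default_tz)
    return parsed


# Field carries its own offset, so the offset is kept.
with_zone = parse_timestamp("2021-03-01 12:00:00 +0200", "%Y-%m-%d %H:%M:%S %z")

# Field has no zone, so UTC is applied.
no_zone = parse_timestamp("2021-03-01 12:00:00", "%Y-%m-%d %H:%M:%S")

# Field has no zone and the caller overrides the default with UTC-5.
override = parse_timestamp(
    "2021-03-01 12:00:00", "%Y-%m-%d %H:%M:%S", timezone(timedelta(hours=-5))
)
```

The explicit default_tz argument plays the role that the timezone argument plays for the matcher: it only matters when the field itself carries no zone.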

When outputting to the SpectX browser-based UI or the API, timestamps are converted to strings using the timezone set in the User Properties Timezone setting.
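
The output step can be pictured the same way: the stored instant stays the same, and only its string form depends on the user's timezone. The Python sketch below is an illustration with a hypothetical render_timestamp function and an assumed output format; it does not reproduce SpectX's actual formatting.

```python
from datetime import datetime, timedelta, timezone


def render_timestamp(instant: datetime, user_tz: timezone) -> str:
    """Convert a timezone-aware instant to a string in the user's timezone."""
    if instant.tzinfo is None:
        # A naive value has no defined instant, so refuse to guess.
        raise ValueError("instant must be timezone-aware")
    return instant.astimezone(user_tz).strftime("%Y-%m-%d %H:%M:%S %z")


stored = datetime(2021, 3, 1, 10, 0, 0, tzinfo=timezone.utc)

# A user whose timezone setting is UTC+2 sees the same instant two hours later on the clock.
shown = render_timestamp(stored, timezone(timedelta(hours=2)))
print(shown)  # 2021-03-01 12:00:00 +0200
```

Two users with different timezone settings would therefore see different strings for the same underlying timestamp.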

For the case where the time field does not include a year, see here.