Executing Saved Script¶
When executing a saved script, its execution statement is run. A script that does not contain an execution statement cannot be executed. Saved scripts allow you to encapsulate a normalized view of parsed log data.
Example: suppose we have a script stored as /user/example.sx in the Resource Tree:
@src = dual(5) | select(datetime:t, string:s, ipaddr:ip);
@src;
Execute the saved script:
@[/user/example.sx] | filter(ipaddr != 0.0.0.0);
| datetime                      | string | ipaddr  |
|-------------------------------|--------|---------|
| 2019-10-17 10:43:04.423 +0000 | 1ho1   | 0.0.0.1 |
| 2019-10-17 10:43:04.424 +0000 | 2ho2   | 0.0.0.2 |
| 2019-10-17 10:43:04.425 +0000 | 3ho3   | 0.0.0.3 |
| 2019-10-17 10:43:04.426 +0000 | 4ho4   | 0.0.0.4 |
Executing Script With Arguments¶
You can pass values of input arguments to the called script. These must be declared in the INIT block of the called script.
@[./path/to/script](arg_name:value, …);
The input arguments are supplied as a comma-separated list of name-value pairs, enclosed in parentheses. The argument name and value are separated by a colon ":".
When calling a script via the SpectX REST API, the input arguments are supplied as URI parameters (for example, &customer=1ho1).
Example: a script stored as /user/example1.sx returns rows of data where the field user contains the value specified by the input argument "customer":
init(
    customer::STRING    // declare input argument 'customer'
);

// declare statement returning five rows of data consisting of three fields
@src = dual(5)
     | select(datetime:t, user:s, ipaddr:ip)
     | filter(user = $customer)
     ;

// execute declared statement
@src;
Execute the saved script, passing the value "1ho1" in the input argument "customer":

@[/user/example1.sx](customer:'1ho1');
| datetime                      | user | ipaddr  |
|-------------------------------|------|---------|
| 2019-10-17 11:15:15.123 +0000 | 1ho1 | 0.0.0.1 |
Stored scripts provide a way to implement business logic within a script file. The query script can be used by end users "as is", without the need to copy or re-implement it. While the code is abstracted away from the user, it remains captured in well-organized files in the Resource Tree, according to the needs and structure of teams in an enterprise. Furthermore, this enables a separation of roles between analytics and source data management.
There are many use cases related to stored scripts.
Normalized view of parsed source data
Data normalization means preparing parsed data for querying: cleaning (i.e. filtering out undesired values), enriching (i.e. computing new fields), transforming to the desired type or format, joining additional datasets, etc.
When working with large datasets it is worth implementing default limitations to prevent users from overloading SpectX with long-running queries.
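As a sketch, a saved normalization script might clean and enrich the parsed data before exposing it. This reuses the dual/select constructs from the examples above and the CC country-code function shown in the example further below; the field name host is illustrative:

```
// generate five sample rows, as in the earlier examples
@src = dual(5) | select(datetime:t, string:s, ipaddr:ip);

@src
| filter(ipaddr != 0.0.0.0)                                   // cleaning: drop records without a source address
| select(datetime, host:string, ipaddr, country:CC(ipaddr))   // enriching: rename field, add country code
;
```

End users can then query this saved script as-is, without knowing how the cleaning and enrichment are implemented.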
Speeding up query processing
Logs stored in files are naturally partitioned by the log rotation process. Their creation time (either the file's last-modified timestamp or a date embedded in the name or storage path) can be used to choose which files are relevant to a particular query, thereby avoiding a full scan of all source files.
The recommended way of implementing this is to apply a filter statement to the output of LIST, selecting relevant files based on their last-modified or path time. The desired period can be specified with script input arguments.
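A minimal sketch of this approach, assuming an illustrative bucket path and a parse pattern $pattern defined elsewhere in the script:

```
init(
    from::TIMESTAMP,
    to::TIMESTAMP
);

LIST(src:'s3s://example-bucket/logs/$yyyy$/$MM$$dd$.*.log')
| filter(path_time >= $from AND path_time <= $to)   // touch only files from the requested period
| PARSE(pattern:$pattern)
;
```

Because the filter is applied to the file listing before PARSE, files outside the requested period are never opened.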
Integrating enterprise applications
Executing stored queries over the SpectX API provides a simple way for enterprise applications to extract information from logs. The ability to supply scripts with input parameters makes it possible to control various aspects of the query (for instance, its scope) without implementing the query in the application itself.
Sharing data securely within an enterprise or with third parties
Stored scripts can expose log records with a restricted set of fields, leaving out sensitive data. Sensitive fields can also be anonymized.
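For instance, a wrapper script could call the saved script from the first example and pass through only non-sensitive fields (a sketch; which fields count as sensitive depends on your data):

```
// expose only the timestamp and address; the string field is omitted
@[/user/example.sx]
| select(datetime, ipaddr)
;
```

Users are then granted access to the wrapper script rather than to the underlying data source.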
Example: integrating SpectX with a customer support web application. To assist a customer with logins to the company-provided portal, the customer support application has a web page with three input parameter fields: from and to dates, and username. Pressing the "Get login records" button launches the stored SpectX query with the given input parameter values.
 1  init(
 2      from::TIMESTAMP,
 3      to::TIMESTAMP,
 4      user::STRING
 5  );
 6
 7  $mypattern = <<< END
 8  TIMESTAMP('yyyy-MM-dd HH:mm:ss Z'):time '\t'
 9  IPADDR:ip '\t'
10  LD:username '\t'
11  INT:response EOL
12  END;
13
14  LIST(src:'s3s://spectx-docs/logs/auth/$yyyy$/$MM$$dd$.*.*.log')
15  | filter(path_time >= $from AND path_time <= $to)
16  | PARSE(pattern:$mypattern)
17  | filter(time >= $from AND time <= $to AND username = $user)
18  | select(time, ip, country:CC(ip), username, response_code:response)
19  ;
- lines 2-4 declare the input arguments
- lines 7-12 define the pattern for extracting fields from the source data
- line 14 executes the LIST command using time patterns
- line 15 includes only those files whose path_time falls within the desired timeframe (avoiding parsing all files)
- line 16 parses the data from the listed files
- line 17 keeps only the records within the desired timeframe and with the matching username
- line 18 enriches the result set with the country code of the IP address
Save the script as /user/customer_login_view.sx and execute it over the SpectX API using the curl utility (right-click the saved script file to display Properties, then select the API access tab to generate the curl command):
curl -XPOST -G \
     -H "Accept: application/ccsv" \
     -H "Authorization: Bearer <insert_your_token_here>" \
     -d "scriptPath=%2Fuser%2Fcustomer_login_view.sx" \
     --data-urlencode "from=2016-01-01 00:00:00.000 +0000" \
     --data-urlencode "to=2016-02-01 00:00:00.000 +0000" \
     -d "user=overmelt" \
     http://localhost:8388/API/v1.0/
You should get the following result:
| time                          | ip             | country | username | response_code |
|-------------------------------|----------------|---------|----------|---------------|
| 2016-01-04 13:26:43.000 +0000 | 22.214.171.124   | JP      | overmelt | 404           |