Configuration¶
This section details the configuration settings for SpectX Server. They can be changed by overriding the default values specified during the first run, or by editing the SpectX configuration file later on.
Configuration items in the SpectX configuration file conf/sx.conf follow the Java properties format.
Values specified in the configuration file override the default values used by the server. Any change in the configuration requires a server restart to take effect.
The values can include any number of constructions of the form “${PROP}”, where PROP denotes the name of an environment variable or, if that variable is undefined, the name of a system property set using the -D command-line option of the Java virtual machine. When reading the configuration, SpectX substitutes these with the actual values of the environment variables or system properties. The environment variables can be set in the environment variable definition script.
One of the properties actively used in default configuration template is SPECTX_HOME, which is set by startup script to the value of environment variable of the same name.
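The lookup order described above (environment variable first, then system property) can be illustrated with a minimal Python sketch. This is not SpectX code: the function name `substitute`, the sample values, and the choice to leave unresolved placeholders untouched are assumptions made for illustration, since the text does not say how unresolved names are handled.

```python
import re

# Matches constructions of the form ${PROP}
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")

def substitute(value, env, system_props):
    """Replace each ${PROP} with env[PROP], falling back to system_props[PROP]."""
    def lookup(match):
        name = match.group(1)
        if name in env:
            return env[name]
        if name in system_props:
            return system_props[name]
        # Assumption: unresolved placeholders are left as they are.
        return match.group(0)
    return _PLACEHOLDER.sub(lookup, value)

# Hypothetical sample values
env = {"SPECTX_HOME": "/opt/spectx"}
props = {"SPECTX_HOME": "/ignored", "custom.dir": "/srv/data"}
resolved = substitute("${SPECTX_HOME}/data", env, props)
```

Here `resolved` takes the value from the environment variable, because the system property of the same name is consulted only when the environment variable is undefined.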
Configuration properties can be defined as environment variables, either instead of using a configuration file or to override certain properties of the configuration file. When setting environment variables, you need to prefix the configuration property key name with spectx., for example spectx.wgui.port or spectx.engine.pu_count.
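As a rough illustration of the prefix rule, the following Python sketch merges properties read from a file with environment variables whose names start with spectx. The function name `effective_config` and the sample values are hypothetical; the sketch only models the override behaviour described above and is not how SpectX itself is implemented.

```python
PREFIX = "spectx."

def effective_config(file_props, env):
    """Overlay environment variables named spectx.<key> on top of file properties."""
    merged = dict(file_props)
    for name, value in env.items():
        if name.startswith(PREFIX):
            # Strip the prefix to obtain the configuration property key
            merged[name[len(PREFIX):]] = value
    return merged

# Hypothetical sample values
file_props = {"wgui.port": "8388", "wgui.host": "127.0.0.1"}
env = {"spectx.wgui.port": "9000", "PATH": "/usr/bin"}
config = effective_config(file_props, env)
```

Environment variables without the prefix (such as PATH) are ignored by this sketch.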
NB! Do not rename the SpectX configuration file. SpectX searches for the configuration file only by the name ${SPECTX_HOME}/conf/sx.conf.
This file will not be modified or deleted during future upgrades.
Limiting usage of CPU cores¶
engine.pu_count
- integer value that sets the maximum number of CPU cores SpectX can use for processing queries. The value cannot exceed the number of real CPU cores in the machine or the number of CPU cores allowed by the license. The default value 0 instructs SpectX to use the maximum number of CPU cores (limited by the real core count and by the license).
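One way to read this rule is sketched below in Python. It assumes that the usable maximum is the smaller of the real core count and the licensed core count, and that a configured value above that maximum is clamped; the text does not state whether SpectX clamps or rejects such a value, so treat both points as assumptions. The function name is hypothetical.

```python
def effective_pu_count(configured, real_cores, licensed_cores):
    """Number of cores used for query processing under the assumptions above."""
    # Assumption: the ceiling is the lower of the two limits
    ceiling = min(real_cores, licensed_cores)
    if configured == 0:
        # 0 means "use the maximum available"
        return ceiling
    # Assumption: values above the ceiling are clamped rather than rejected
    return min(configured, ceiling)
```

For example, on an 8-core machine with a 4-core license, a setting of 0 would yield 4 cores and a setting of 2 would yield 2 cores in this sketch.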
Directories¶
sx.user_data.dir
- specifies name and location of the main directory of users’ resources (scripts, patterns, datastore definitions). Default: ${SPECTX_HOME}/data
sx.pu_data.dir
- specifies name and location of the processing data directory. Default: ${SPECTX_HOME}/pudata
sx.pu_data.temp.dir
- specifies name and location of the directory for temporary data. Default: ${SPECTX_HOME}/pudata/temp
sx.pu_data.store.dir
- specifies name and location of the directory for persisted data. Default: ${SPECTX_HOME}/pudata/store
sx.pu_data.inetdb.dir
- specifies name and location of the directory for temporary geoip data. Default: ${SPECTX_HOME}/pudata/inetdb
sx.pu_data.cache.dir
- specifies name and location of the directory for cache. Default: ${SPECTX_HOME}/pudata/cache
sx.db.dir
- specifies name and location of the directory of the user database (sxwgui.db). Default: ${SPECTX_HOME}
sx.engine_data.dir
- specifies name and location of a directory used by SpectX to keep data like query history and logs fetched via Google API. Default: ${SPECTX_HOME}/engine_data
sx.engine_data.user_history.dir
- specifies name and location of the directory for query history data. Default: ${SPECTX_HOME}/engine_data/user_history
sx.engine_data.google_api.dir
- specifies name and location of the directory for logs fetched via Google API (like G Suite reports). Default: ${SPECTX_HOME}/engine_data/google_api
sx.engine_data.microsoft_api.dir
- specifies name and location of the directory for logs fetched via Microsoft API (like Office 365 and Azure AD audit logs). Default: ${SPECTX_HOME}/engine_data/microsoft_api
sx.pu_data.cache.enabled
- enables or disables source data caching. Default: disabled.
sx.pu_data.cache.max_size
- specifies max disk space allocated for source data caching. Units: ‘GB’ - gigabytes, ‘MB’ - megabytes, ‘KB’ - kilobytes. Default value: 0GB, which means no limit.
Web UI server parameters¶
wgui.host
- specifies the hostname or IP address of the interface where the web UI server is listening. Default: 127.0.0.1
wgui.port
- specifies the listening port. Default: 8388
wgui.instanceName
- string to display in the Web UI under the SpectX logo on the login page and in the main view. Optional, empty by default. Available only in Server Edition.
wgui.maxReqHeaderSize
- maximum size of a request header in bytes. This is the size of the whole request header containing all header lines, not of an individual HTTP header line. Default value is 8192 bytes.
wgui.dataBrowser.preview_size
- specifies the number of bytes fetched for file preview in Data Browser. Default: 16Kb
wgui.dataBrowser.max_items_to_fetch
- specifies the max number of items for listing in Data Browser. When there are more items, a warning is displayed. Default: 4000
wgui.dataBrowser.download.enabled
- specifies whether downloading a blob is enabled in Data Browser. Default: true
wgui.dataBrowser.showBlobsDisallowedInACL
- if set to false (the default), blobs disallowed from reading by the blob read ACL are not displayed in the Data Browser.
wgui.remoteIPAddressHeader
- specifies the name of an HTTP header containing the client’s remote IP address. The header name is case insensitive. Use this only when SpectX runs behind a trusted frontend server which is configured to forward the real client remote address to the backend.
wgui.log.dir
- path to an existing writable directory to write server logs to. If not specified, filesystem logging is disabled.
wgui.log.rotate
- boolean parameter enabling automatic daily log rotation in the log directory. Default value is true.
wgui.log.tz
- time zone ID (as defined in the IANA Time Zone Database) to be used for creating log file names (when wgui.log.rotate = true) and timestamps in log file records. Default value is “UTC”.
wgui.log.level
- log level of debug logging. Possible values are: trace, debug, info, warn, error. The default is warn. Note that the specified log level can be overridden by -v command line switches, as described in the section on the verbosity of debug log.
wgui.userAdminGroup
- name of the group for UserAdmin role assignment.
wgui.api.allowedCORSOrigins
- comma-separated list of allowed CORS origins (wildcard * is allowed as a separate list element). If not specified, CORS is not supported.
wgui.api.query_retention_period
- timeunit. Period to retain an in-progress async API query for, after the last client status request. Default: 5 minutes.
wgui.api.result_retention_period
- timeunit. Period to retain a finished async API query result for, after the last client fetch request. Default: 5 minutes.
GeoIP, ASN, MAC databases¶
To perform GeoIP, ASN and MAC manufacturer lookups, SpectX needs the respective databases. The following configuration items control how these databases are downloaded and updated in different environments. For example, if the host has direct access to the Internet, the databases can be updated directly from the suppliers’ websites. In closed environments, the update location can be set to the local filesystem, which leaves control over the updating process entirely to the customer.
Note
Starting December 30, 2019, MaxMind will be requiring users of GeoLite2 databases (providing GeoIP and ASN information without charge) to register for a MaxMind account and obtain a license key in order to download GeoLite2 databases. See step-by-step instructions here.
inetdb.maxmindLicenseKey
- when set to your MaxMind license key, GeoIP and ASN database updates are fetched from MaxMind.
inetdb.geoip.resourceUrl
- an http/https URL or local filesystem path specifying the MaxMind GeoIP database update location. Takes precedence over maxmindLicenseKey.
inetdb.as.resourceUrl
- an http/https URL or local filesystem path specifying the MaxMind ASN database update location. Takes precedence over maxmindLicenseKey.
inetdb.macmanuf.resourceUrl
- an http/https URL or local filesystem path specifying the MAC manufacturers database update location. Default value is: http://update.spectx.com/mac_manuf/mac_manuf.tsv.gz
inetdb.*.updateInterval
- specifies the interval at which SpectX looks for updates of the respective databases. The value is given in one of the following time units: ms, sec, min, hour, day, week. Default value: 1 day.
Note
You cannot use resource URIs and the license key simultaneously. Pick one option and comment out the other(s).
Database connectivity drivers¶
The SQLite database connectivity driver is included in the default installation. Additional JDBC driver libraries (.jar) must be installed to the ${SPECTX_HOME}/lib directory. SpectX must be restarted before a newly added driver becomes active.
The drivers are loaded according to the list in the configuration entry engine.jdbc.allowed_drivers. Driver names are separated by colons, for instance:
engine.jdbc.allowed_drivers=oracle.jdbc.OracleDriver:org.postgresql.Driver
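To make the colon-separated format concrete, here is a small Python sketch that splits such a value into individual driver class names. The function name is hypothetical, and skipping empty entries is an assumption of this sketch rather than documented SpectX behaviour.

```python
def parse_allowed_drivers(value):
    """Split a colon-separated driver list into individual class names."""
    # Assumption: surrounding whitespace and empty entries are ignored
    return [name.strip() for name in value.split(":") if name.strip()]

drivers = parse_allowed_drivers("oracle.jdbc.OracleDriver:org.postgresql.Driver")
```

For the example entry above, `drivers` contains the two class names oracle.jdbc.OracleDriver and org.postgresql.Driver.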
Logging¶
SpectX produces logs of the following types:
- audit log - login, logout, password change, account modification events
- query execution log - query execution details
- query execution error log - contains failed/cancelled query execution events with stack traces
- debug log - containing details of query processing for debugging purposes.
Record format¶
The audit log contains single line records with the following tab-separated fields (length is limited to maximum 1000 chars):
- timestamp in format yyyy-MM-dd HH:mm:ss.SSS Z
- log_type: optional field containing the string value audit. It is present only when the log destination is set to stdout
- user’s IP address
- session ID
- action name (login/logout/passwordChange etc)
- username of a user performing the action
- outcome (ok/failure)
- authentication type
- optional descriptive message.
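The field order listed above can be used to split an audit record into named fields. The Python sketch below is an illustration only: the field names in `AUDIT_FIELDS`, the function name, and the sample record (including the authentication type value) are made up, and it assumes the Z part of the timestamp is rendered as a numeric offset such as +0000.

```python
from datetime import datetime

# Hypothetical names for the fields that follow the timestamp
AUDIT_FIELDS = ["ip", "session_id", "action", "username",
                "outcome", "auth_type", "message"]

def parse_audit_record(line):
    """Split one tab-separated audit log line into a dict of fields."""
    parts = line.rstrip("\n").split("\t")
    # yyyy-MM-dd HH:mm:ss.SSS Z, e.g. 2020-03-01 12:30:45.123 +0000
    timestamp = datetime.strptime(parts[0], "%Y-%m-%d %H:%M:%S.%f %z")
    rest = parts[1:]
    # The log_type field is present only when logging to stdout
    if rest and rest[0] == "audit":
        rest = rest[1:]
    record = dict(zip(AUDIT_FIELDS, rest))
    record["timestamp"] = timestamp
    return record

# Made-up sample record without the optional message field
sample = "2020-03-01 12:30:45.123 +0000\t10.0.0.5\tabc123\tlogin\talice\tok\tpassword"
record = parse_audit_record(sample)
```

Because the descriptive message is optional, the resulting dict simply lacks a message key when the record ends after the authentication type.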
Query execution log contains single line records with the following tab-separated fields (field value length is restricted to not exceed 1000 chars):
- timestamp in format yyyy-MM-dd HH:mm:ss.SSS Z
- log_type: optional field containing the string value execution. It is present only when the log destination is set to stdout
- user’s IP address
- session ID
- query ID
- action name (submit, schedule, exec, etc)
- outcome (ok/canceled/failure)
- username of a user performing the action
- JSON with payload depending on the action (executed script’s path and base64-encoded script, execution stats info)
- descriptive message, if any.
An execution error log record has the same format as a query execution log record, with one additional field that contains the error’s stack trace, which spans multiple lines. This type of logging is performed only for unsuccessful query execution events.
Debug log contains single line records with the following tab-separated fields:
- timestamp in format yyyy-MM-dd HH:mm:ss.SSS Z
- log record’s log level indicator
- thread name
- logger name (java class name)
- log message (can expand over multiple lines).
Destination¶
Unless the -q command-line switch is specified for the SpectX server Java process, the server prints all log messages to standard output. In this case, the log records can be distinguished by an additional log_type field inserted after the timestamp. The field contains one of the values audit, execution or execution_error. Note that debug log messages do not have the log_type field.
The logging directory path for standard output and error used by the startup scripts can be set in the SpectX environment variable definition script as the value of the SPECTX_STD_LOG_DIR variable.
Note that the SpectX startup script on Linux, Arch Linux ARM and Mac OSX does specify the -q switch.
To enable logging to files, you must specify a valid path to the logging directory in the configuration file using the wgui.log.dir option. The server then produces daily-rotated log files under that directory. Each file is placed in a monthly-rotated directory, which in turn is located in a yearly-rotated parent directory:
logs/
└── YYYY/
    ├── MM/
    │   ├── YYYY.MM.DD.debug.log
    │   ├── YYYY.MM.DD.audit.log
    │   ├── YYYY.MM.DD.execution.log
    │   ├── YYYY.MM.DD.execution_error.log
    │   └── ...
    └── ...
If the value of wgui.log.rotate is explicitly set to false, the layout of the log directory is flat, and the names of the produced log files do not contain timestamps:
logs/
├── debug.log
├── audit.log
├── execution.log
└── execution_error.log
In that case, log files can be rotated with external tools (e.g. logrotate) that support copy-and-truncate log rotation.
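The two directory layouts can be summarised with a short Python sketch that builds the expected path of a log file for a given day. The function name and its parameters are hypothetical and only mirror the naming scheme shown in the trees above; the date passed in is assumed to be already expressed in the time zone used for log file names.

```python
from datetime import date
from pathlib import PurePosixPath

def log_file_path(log_dir, log_type, day, rotate=True):
    """Return the expected path of a log file for the given day."""
    base = PurePosixPath(log_dir)
    if not rotate:
        # Flat layout: no timestamps in file names
        return base / f"{log_type}.log"
    # Rotated layout: <dir>/YYYY/MM/YYYY.MM.DD.<type>.log
    return base / f"{day:%Y}" / f"{day:%m}" / f"{day:%Y.%m.%d}.{log_type}.log"

rotated = log_file_path("logs", "audit", date(2020, 3, 1))
flat = log_file_path("logs", "audit", date(2020, 3, 1), rotate=False)
```

With these inputs, `rotated` is logs/2020/03/2020.03.01.audit.log and `flat` is logs/audit.log.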
Timestamps in log records printed to stdout are in the system default time zone, whereas timestamps in log file records and log file names are in the time zone specified by wgui.log.tz in the configuration.
Note that if the default log configuration is overridden by any external means, the -q command line argument is no longer supported, and neither are the configuration options wgui.log.dir, wgui.log.rotate and wgui.log.tz.
The verbosity of debug log¶
The verbosity of debug logging is controlled by the configuration setting wgui.log.level. However, the value set in the configuration can be overridden by specifying the command line argument -v as follows:
- -v sets log level to INFO
- -vv sets log level to DEBUG
- -vvv sets log level to TRACE
The argument can be specified as follows:
- On Windows it is supplied as an argument following the “run” argument to the startup script.
- On Linux, Arch Linux ARM and Mac OSX it is specified in the SPECTX_LAUNCHER_ARGS environment variable in the environment variable definition script.
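The interplay between wgui.log.level and the -v switches can be modelled with the following Python sketch. The function name is hypothetical, level names are written in lower case as in the wgui.log.level description, and the handling of more than three v's is an assumption, since the text only describes -v, -vv and -vvv.

```python
def debug_log_level(configured="warn", v_count=0):
    """Resolve the debug log level from the configured value and the -v count."""
    overrides = {1: "info", 2: "debug", 3: "trace"}
    # Assumption: counts other than 1 to 3 leave the configured level unchanged
    return overrides.get(v_count, configured)
```

For instance, a configuration value of error combined with -vv on the command line resolves to debug in this sketch.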
Note
The verbosity can be changed only for debug logging. Audit, query execution, and error logging take place with built-in verbosity.
Data Access Protocols¶
Access modes¶
The configuration key engine.da.protocol.<protocol> specifies the access mode of the named data access protocol for all SpectX users. The value of <protocol> in the key name must conform to the URI scheme standard and must be in lower case.
For each data access protocol, the corresponding key engine.da.protocol.<protocol> can have one of the following values:
- unmanaged - protocol targets can be specified arbitrarily by all SpectX users
- managed - protocol targets can be defined only using a data store located in /system/datastores (allowed only for users with the Administrator role).
- disabled - the protocol is disabled.
The default access mode of a protocol is unmanaged. To prevent arbitrary access to the local file system, the default configuration explicitly defines the file protocol as managed.
Note that the sx protocol is not configurable: specifying an access mode for it in the configuration causes an error at startup.
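The access mode rules can be expressed as a small Python sketch that validates engine.da.protocol.<protocol> entries and resolves the mode of a protocol. The function names and error messages are hypothetical, and the checks are a simplified model of the rules stated above, not the server's actual validation logic.

```python
KEY_PREFIX = "engine.da.protocol."
VALID_MODES = {"unmanaged", "managed", "disabled"}

def protocol_modes(config):
    """Collect and validate access modes from engine.da.protocol.* keys."""
    modes = {}
    for key, value in config.items():
        if not key.startswith(KEY_PREFIX):
            continue
        protocol = key[len(KEY_PREFIX):]
        if protocol == "sx":
            # The sx protocol is not configurable
            raise ValueError("access mode of the sx protocol cannot be configured")
        if protocol != protocol.lower():
            raise ValueError("protocol name must be in lower case: " + protocol)
        if value not in VALID_MODES:
            raise ValueError("unknown access mode: " + value)
        modes[protocol] = value
    return modes

def access_mode(modes, protocol):
    """Return the access mode of a protocol; unmanaged unless configured."""
    return modes.get(protocol, "unmanaged")

modes = protocol_modes({"engine.da.protocol.file": "managed", "wgui.port": "8388"})
```

With this sample configuration, the file protocol resolves to managed, while a protocol without an entry, such as https, falls back to unmanaged.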
HTTP User-Agent¶
Configuration keys engine.da.http.user-agent and engine.da.https.user-agent specify the values of the HTTP “User-Agent” header that SpectX uses when communicating with http and https datastores respectively.
The engine.da.https.user-agent key defaults to the value set for engine.da.http.user-agent, and the latter defaults to a string composed of “SpectX/” followed by the current software version string.
The values set by these keys can be overridden either by the respective datastore configuration or by specifying custom values in scripts.
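The default chain for the two keys can be sketched in Python as follows. The function name, the sample version string and the custom agent value are placeholders chosen for illustration; overrides from datastore configuration or scripts are outside the scope of this sketch.

```python
def user_agents(config, version):
    """Resolve the User-Agent values for http and https datastores."""
    # http defaults to "SpectX/" followed by the software version string
    http = config.get("engine.da.http.user-agent", "SpectX/" + version)
    # https defaults to whatever value http resolved to
    https = config.get("engine.da.https.user-agent", http)
    return {"http": http, "https": https}

# Hypothetical version string and custom agent value
defaults = user_agents({}, "1.4.51")
custom = user_agents({"engine.da.http.user-agent": "MyAgent/1.0"}, "1.4.51")
```

When only the http key is set, the https value follows it, as `custom` shows.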
License file¶
The SpectX license file is named spectx.lic and is located in the ${SPECTX_HOME} directory.
Prior to version v1.4.51, the license file was named spectx-license.jar and was located in the ${SPECTX_HOME}/lib directory.
Admin password reset¶
Should you need to reset the password of the ‘admin’ user, you can call the command-line startup script as follows:
On Linux and OSX:
user@host:~/spectx$ bin/spectx.sh passwordreset
On Windows:
PS C:\spectx> bin\spectx.exe run --passwordreset
This invokes the password reset dialog.
Deprecated¶
engine.fs_unmanaged_access
DEPRECATED. Enables or disables unmanaged file system access using the file:// protocol. When enabled, all SpectX users can use the file:// protocol to access the local file system within the rights of the local machine user under which SpectX is executed. When disabled, the file system can be accessed only through datastores defined in /system/datastores. (Note that defining datastores in /system/datastores is allowed only for users with the admin role.) Default: disabled.
For the sake of configuration backwards compatibility the values of these keys, unless the new key “engine.da.protocol.file” is explicitly specified, are being converted internally as follows:
engine.fs_access | engine.fs_unmanaged_access | engine.da.protocol.file |
---|---|---|
true | false or unset | managed |
true | true | unmanaged |
false or unset | any value | disabled |
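The conversion table can also be read as a decision function. The Python sketch below is an illustration with a hypothetical function name; it assumes that property values are the strings true and false, and that the last table row applies regardless of the value of engine.fs_unmanaged_access.

```python
def _is_true(value):
    """Treat only the string 'true' as enabled; unset (None) counts as false."""
    return value is not None and value.strip().lower() == "true"

def file_protocol_mode(config):
    """Derive the access mode of the file protocol from new or deprecated keys."""
    explicit = config.get("engine.da.protocol.file")
    if explicit is not None:
        # The new key, when specified, takes priority over the deprecated ones
        return explicit
    if not _is_true(config.get("engine.fs_access")):
        return "disabled"
    if _is_true(config.get("engine.fs_unmanaged_access")):
        return "unmanaged"
    return "managed"
```

For example, a configuration that sets only engine.fs_access=true resolves to managed, and an empty configuration resolves to disabled in this sketch.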