Configuration

This section details the configuration settings for SpectX Server. They can be changed by overriding the default values specified during the first run, or by editing the SpectX configuration file later on.

Configuration items in the SpectX configuration file conf/sx.conf follow the Java properties format. Values specified in the configuration file override the default values used by the server. Any configuration change takes effect only after a server restart.

Values can include any number of constructions of the form “${PROP}”, where PROP is the name of an environment variable or, if that variable is undefined, the name of a system property set with the -D command-line option of the Java virtual machine. When reading the configuration, SpectX substitutes these constructions with the actual values of the environment variables or system properties. Environment variables can be set in the environment variable definition script.

One property used heavily in the default configuration template is SPECTX_HOME, which the startup script sets to the value of the environment variable of the same name.
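
As a minimal sketch of the substitution described above, the fragment below uses SPECTX_HOME from the default template together with SPECTX_LOG_ROOT, a hypothetical variable name introduced here only for illustration:

```properties
# conf/sx.conf (Java properties format)

# ${SPECTX_HOME} is replaced with the value of the SPECTX_HOME
# environment variable when the configuration is read.
sx.user_data.dir=${SPECTX_HOME}/data

# SPECTX_LOG_ROOT is a hypothetical name, not part of SpectX. If no such
# environment variable is defined, a JVM system property of the same name
# (set with -DSPECTX_LOG_ROOT=...) would be looked up instead.
wgui.log.dir=${SPECTX_LOG_ROOT}/spectx
```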

NB! Do not rename the SpectX configuration file. SpectX searches for configuration file only by the name ${SPECTX_HOME}/conf/sx.conf.

This file will not be modified or deleted during upgrades in the future.

Limiting usage of CPU cores

  • engine.pu_count - integer value that sets the maximum number of CPU cores SpectX can use for processing queries. The value cannot exceed the number of real CPU cores in the machine or the number of CPU cores allowed by the license. The default value 0 instructs SpectX to use the maximum number of CPU cores available within these two limits.
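
For example, to cap query processing at four cores on a machine and license that allow at least that many (the value 4 is only an illustration):

```properties
# Use at most 4 CPU cores for query processing.
# 0 (the default) means: use as many cores as the machine and license allow.
engine.pu_count=4
```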

Directories

  • sx.user_data.dir - specifies name and location of the main directory of user resources (scripts, patterns, datastore definitions). Default: ${SPECTX_HOME}/data
  • sx.pu_data.dir - specifies name and location of processing data directory. Default: ${SPECTX_HOME}/pudata
  • sx.pu_data.temp.dir - specifies name and location of directory for temporary data. Default: ${SPECTX_HOME}/pudata/temp
  • sx.pu_data.store.dir - specifies name and location of directory for persisted data. Default: ${SPECTX_HOME}/pudata/store
  • sx.pu_data.inetdb.dir - specifies name and location of directory for temporary geoip data. Default: ${SPECTX_HOME}/pudata/inetdb
  • sx.pu_data.cache.dir - specifies name and location of directory for cache. Default: ${SPECTX_HOME}/pudata/cache
  • sx.db.dir - specifies name and location of directory of user database (sxwgui.db). Default: ${SPECTX_HOME}
  • sx.pu_data.cache.enabled - enables or disables source data caching. Default enabled.
  • sx.pu_data.cache.max_size - specifies max disk space allocated for source data caching. Units: ‘GB’ - gigabytes, ‘MB’ - megabytes, ‘KB’ - kilobytes. Default value: 0GB, which means no limit.
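
A sketch of how these keys might be combined to move processing data to a separate disk and limit the cache. The /var/lib/spectx paths are hypothetical, and the true/false spelling of the cache switch is an assumption (the text above only says "enabled"). The subdirectories are set explicitly because their defaults are given relative to ${SPECTX_HOME}:

```properties
# Hypothetical locations on a dedicated data disk.
sx.pu_data.dir=/var/lib/spectx/pudata
sx.pu_data.temp.dir=/var/lib/spectx/pudata/temp
sx.pu_data.store.dir=/var/lib/spectx/pudata/store
sx.pu_data.inetdb.dir=/var/lib/spectx/pudata/inetdb
sx.pu_data.cache.dir=/var/lib/spectx/pudata/cache

# Keep source data caching on, but cap it at 50 gigabytes
# (0GB, the default, means no limit). Boolean spelling is assumed.
sx.pu_data.cache.enabled=true
sx.pu_data.cache.max_size=50GB
```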

Web UI server parameters

  • wgui.host - specifies hostname or ip-address of interface where web UI server is listening. Default: 127.0.0.1
  • wgui.port - specifies listening port. Default: 8388
  • wgui.maxReqHeaderSize - maximum size of a request header in bytes. The limit applies to the whole request header containing all header lines, not to an individual HTTP header line. Default: 8192 bytes.

  • wgui.dataBrowser.preview_size - specifies the number of bytes fetched for file preview in Data Browser. Default: 16Kb
  • wgui.dataBrowser.max_items_to_fetch - specifies the maximum number of items listed in Data Browser. If there are more items, a warning is displayed. Default: 4000
  • wgui.dataBrowser.download.enabled - specifies if the download of a blob is enabled in Data Browser. Default: true
  • wgui.dataBrowser.showBlobsDisallowedInACL - if set to false (the default) then blobs disallowed from reading by blob read ACL are not displayed in the data browser

  • wgui.remoteIPAddressHeader - specifies the name of an HTTP header containing the client’s remote IP address. The header name is case insensitive. Use this only when SpectX runs behind a trusted frontend server that is configured to forward the real client address to the backend.

  • wgui.log.dir - path to existing writable directory to write server logs to. If not specified then filesystem logging is disabled.

  • wgui.log.rotate - boolean parameter enabling automatic daily log rotation in the log directory. Default value is true.

  • wgui.log.tz - time zone ID (as defined in IANA Time Zone Database) to be used for creating log file names (when wgui.log.rotate = true) and timestamps in log file records. Default value is “UTC”.

  • wgui.log.level - log level of debug logging. Possible values are: trace, debug, info, warn, error. The default is warn. Note that the specified log level can be overridden by -v command line switches, as described in the section on debug log verbosity.

  • wgui.userAdminGroup - name of the group for UserAdmin role assignment.
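
The following fragment sketches a web UI setup behind a reverse proxy. The address 192.0.2.10, the header name X-Forwarded-For, the item limit and the group name are illustrative assumptions, not values taken from SpectX:

```properties
# Listen on a specific interface instead of the default 127.0.0.1.
# 192.0.2.10 is a documentation-only example address.
wgui.host=192.0.2.10
wgui.port=8388

# Only meaningful behind a trusted frontend that sets this header.
# X-Forwarded-For is a commonly used header name, assumed here.
wgui.remoteIPAddressHeader=X-Forwarded-For

# Tighten Data Browser behaviour (example values).
wgui.dataBrowser.max_items_to_fetch=1000
wgui.dataBrowser.download.enabled=false

# Hypothetical group name for the UserAdmin role.
wgui.userAdminGroup=spectx-admins
```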

GeoIP, ASN, MAC databases

For performing GeoIP, ASN and MAC manufacturer information lookups SpectX needs respective databases. The following configuration items allow us to set up downloading and updating them in different environments. For example, if the host has direct access to the Internet, the databases can be updated directly from the supplier’s websites. In the case of closed environments, the update location can be set to the local filesystem, therefore, leaving control over the updating process entirely to the customer.

Note

Starting December 30, 2019, MaxMind will be requiring users of GeoLite2 databases (providing GeoIP and ASN information without charge) to register for a MaxMind account and obtain a license key in order to download GeoLite2 databases. See step-by-step instructions here.

  • inetdb.maxmindLicenseKey - when set to your MaxMind license key, geoip & ASN database updates are fetched from MaxMind.
  • inetdb.geoip.resourceUrl - an HTTP/HTTPS URL or local filesystem path specifying the MaxMind GeoIP database update location. Takes precedence over maxmindLicenseKey.
  • inetdb.as.resourceUrl - an HTTP/HTTPS URL or local filesystem path specifying the MaxMind ASN database update location. Takes precedence over maxmindLicenseKey.
  • inetdb.macmanuf.resourceUrl - an HTTP/HTTPS URL or local filesystem path specifying the MAC manufacturers database update location. Default value is: http://update.spectx.com/mac_manuf/mac_manuf.tsv.gz
  • inetdb.*.updateInterval - specifies the interval at which SpectX checks for updates of the respective database. The value is given in one of the following time units: ms, sec, min, hour, day, week. Default value: 1 day.

Note

You cannot use resource URIs and the license key simultaneously. Pick one option and comment out the other(s).
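
A sketch of the two alternatives, with one of them commented out as the note above requires. YOUR_LICENSE_KEY is a placeholder, the /srv/spectx-updates paths are hypothetical, and the fragment assumes that the * in inetdb.*.updateInterval stands for the same database name used in the resourceUrl keys (for example macmanuf) and that the interval is written as a number followed by a unit:

```properties
# Option 1: fetch GeoIP and ASN updates from MaxMind using a license key.
# YOUR_LICENSE_KEY is a placeholder, not a real key.
inetdb.maxmindLicenseKey=YOUR_LICENSE_KEY

# Option 2 (disabled here): take updates from a local path, e.g. in a
# closed environment. Both paths are hypothetical.
#inetdb.geoip.resourceUrl=/srv/spectx-updates/geoip-db
#inetdb.as.resourceUrl=/srv/spectx-updates/asn-db

# Check for MAC manufacturer updates weekly instead of daily.
# Key expansion and value spelling are assumptions, see above.
inetdb.macmanuf.updateInterval=1 week
```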

Database connectivity drivers

The SQLite database connectivity driver is included in the default installation. Additional JDBC driver libraries (.jar) must be installed in the ${SPECTX_HOME}/lib directory. SpectX must be restarted before a newly installed driver becomes active.

The drivers are loaded according to the list in the configuration entry engine.db_table.allowed_jdbc_drivers. Driver names are separated by colons, for instance:

engine.db_table.allowed_jdbc_drivers=oracle.jdbc.OracleDriver:org.postgresql.Driver

Logging

SpectX produces logs of the following types:

  • audit log - login, logout, password change, account modification events
  • query execution log - query execution details
  • query execution error log - contains failed/cancelled query execution events with stack traces
  • debug log - containing details of query processing for debugging purposes.

Record format

The audit log contains single line records with the following tab-separated fields (length is limited to maximum 1000 chars):

  • timestamp in format yyyy-MM-dd HH:mm:ss.SSS Z
  • log_type optional field containing string value audit. Is present only when log destination is set to stdout
  • user’s IP address
  • session ID
  • action name (login/logout/passwordChange etc)
  • username of a user performing the action
  • outcome (ok/failure)
  • authentication type
  • optional descriptive message.

Query execution log contains single line records with the following tab-separated fields (field value length is restricted to not exceed 1000 chars):

  • timestamp in format yyyy-MM-dd HH:mm:ss.SSS Z
  • log_type optional field containing string value execution. Is present only when log destination is set to stdout
  • user’s IP address
  • session ID
  • query ID
  • action name (submit, schedule, exec, etc)
  • outcome (ok/canceled/failure)
  • username of a user performing the action
  • JSON with payload depending on the action (executed script’s path and base64-encoded script, execution stats info)
  • descriptive message, if any.

An execution error log record has the same format as a query execution log record, with one additional field that contains the error’s stack trace, which spans multiple lines. This type of logging is performed only for unsuccessful query execution events.

Debug log contains single line records with the following tab-separated fields:

  • timestamp in format yyyy-MM-dd HH:mm:ss.SSS Z
  • log record’s log level indicator
  • thread name
  • logger name (java class name)
  • log message (can expand over multiple lines).

Destination

Unless the -q command-line switch is specified for the SpectX server Java process, the server prints all log messages to standard output. In this case, the log records can be distinguished by an additional log_type field inserted after the timestamp. The field contains values: audit/execution/execution_error. Note that debug log messages do not have log_type field.

The logging directory path for standard output and error used by startup scripts can be set in the SpectX environment variable definition script as a value for SPECTX_STD_LOG_DIR variable.

Note that the SpectX startup scripts on Linux, Arch Linux ARM and Mac OSX do specify the -q switch.

To enable logging to files, you must specify a valid path to the logging directory in the configuration file using the wgui.log.dir option. The server then produces daily-rotated log files under that directory. Each file is placed in a monthly-rotated directory, which in turn is located in a yearly-rotated parent directory:

logs/
└── YYYY/
    ├── MM/
    │   ├── YYYY.MM.DD.debug.log
    │   ├── YYYY.MM.DD.audit.log
    │   ├── YYYY.MM.DD.execution.log
    │   ├── YYYY.MM.DD.execution_error.log
    │   └── ...
    └── ...

If value of wgui.log.rotate is set explicitly to false, the layout of the log directory will be flat, and names of produced log files will not contain timestamps:

logs/
├── debug.log
├── audit.log
├── execution.log
└── execution_error.log

The rotation of log files then can be accomplished utilizing external tools (e.g. logrotate) supporting copy-and-truncate log rotation scenarios.

Timestamps in log records printed to stdout are in the system default time zone. Timestamps in log file records and log file names, however, are in the time zone specified by wgui.log.tz in the configuration.
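
Putting the file logging options together, a configuration might look like the sketch below. The directory /var/log/spectx is a hypothetical path that would have to exist and be writable, and Europe/Tallinn is just one example of an IANA time zone ID:

```properties
# Enable logging to files (disabled when wgui.log.dir is not set).
# Hypothetical path, must be an existing writable directory.
wgui.log.dir=/var/log/spectx

# true (the default) gives the YYYY/MM/YYYY.MM.DD.*.log layout;
# false gives a flat directory with debug.log, audit.log, etc.
wgui.log.rotate=true

# Time zone for log file names and for timestamps in log file records.
wgui.log.tz=Europe/Tallinn

# Debug log verbosity: trace, debug, info, warn (default) or error.
wgui.log.level=info
```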

Note that if the default log configuration is overridden by any external means, the -q command line argument is no longer supported, and neither are the configuration options wgui.log.dir, wgui.log.rotate and wgui.log.tz.

The verbosity of debug log

The verbosity of debug logging is controlled by configuration setting wgui.log.level. However, the value set in configuration can be overridden by specifying command line argument -v as follows:

  • -v sets log level to INFO
  • -vv sets log level to DEBUG
  • -vvv sets log level to TRACE

The argument is specified on the command line of the SpectX server Java process.

Note

The verbosity can be changed only for debug logging. Audit, query execution, and query execution error logging always use the built-in verbosity.

Data Access Protocols

Access modes

Configuration key engine.da.protocol.<protocol> specifies the mode of access of the named data access protocol for all SpectX users. The value of the <protocol> in the key name must conform to the URI scheme standard and must be in lower case.

For each data access protocol, the corresponding key engine.da.protocol.<protocol> can have one of the following values:

  • unmanaged - protocol targets can be specified arbitrarily by all SpectX users
  • managed - protocol targets can be defined only using data store located in /system/datastores (allowed only for users with Administrator role).
  • disabled - the protocol is disabled.

The default access mode of a protocol is unmanaged. To prevent arbitrary access to the local file system, the default configuration explicitly defines the file protocol as managed.

Note that the sx protocol is not configurable. Specifying an access mode for it in the configuration causes an error at startup.
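
As an illustration of the access modes, the fragment below keeps the file protocol managed, leaves https open to all users and turns plain http off. The choice of modes for http and https is an example only:

```properties
# Local file system: reachable only through datastores
# defined in /system/datastores.
engine.da.protocol.file=managed

# Example policy: allow arbitrary https targets, disable plain http.
engine.da.protocol.https=unmanaged
engine.da.protocol.http=disabled

# Do not add a key for the sx protocol: it is not configurable
# and would cause an error at startup.
```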

HTTP User-Agent

Configuration keys engine.da.http.user-agent and engine.da.https.user-agent specify values for HTTP “User-Agent” header SpectX must use when communicating with http and https datastores respectively. The engine.da.https.user-agent defaults to a value set for the engine.da.http.user-agent, and the latter defaults to a string composed of the current software version string prepended with “SpectX/”. The values set by these keys can be overridden either by a respective datastore configuration or by specifying custom values in scripts.
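
A minimal sketch of setting a custom header value. The string ExampleCorp-LogAnalysis/1.0 is a made-up value for illustration:

```properties
# Sent as the User-Agent header to http datastores.
# Hypothetical value; the default is "SpectX/" followed by the version.
engine.da.http.user-agent=ExampleCorp-LogAnalysis/1.0

# engine.da.https.user-agent is left unset here, so https datastores
# use the value of engine.da.http.user-agent.
```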

License file

SpectX license file is named spectx.lic and is located in ${SPECTX_HOME} directory.

Prior to version v1.4.51, the license file was named spectx-license.jar and was located in the ${SPECTX_HOME}/lib directory.

Deprecated

  • engine.fs_unmanaged_access - DEPRECATED. Enables or disables unmanaged file system access using the file:// protocol. When enabled, all SpectX users can use the file:// protocol to access the local file system within the rights of the local machine user under which SpectX is executed. When disabled, the file system can be accessed only through datastores defined in /system/datastores. (Note that defining datastores in /system/datastores is allowed only for users with the admin role.) Default: disabled.

For backward compatibility of existing configurations, the values of the deprecated keys are converted internally as follows, unless the new key “engine.da.protocol.file” is explicitly specified:

engine.fs_access    engine.fs_unmanaged_access    engine.da.protocol.file
true                false or unset                managed
true                true                          unmanaged
false or unset                                    disabled
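
Following the conversion above, an older configuration can be migrated by replacing the deprecated pair with the new key. The sketch below shows the first row of the mapping. The true/false spelling of the deprecated values is taken from the table:

```properties
# Old style (deprecated), equivalent to "managed":
#engine.fs_access=true
#engine.fs_unmanaged_access=false

# New style: state the access mode of the file protocol directly.
# Once this key is present, the deprecated keys are not converted.
engine.da.protocol.file=managed
```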