Configuration

This section details the configuration settings of SourceAgent’s ExtAPI plugin. The settings can be changed by overriding the default values in the plugin’s configuration wizard, or later by editing the plugin’s configuration file conf/sa-plugin-extapi.conf.

The configuration file uses the same format as SourceAgent’s configuration file, and SourceAgent processes it the same way as its own configuration file.

It contains the following settings.

Plugin settings

  • plugin.extapi.fetch.thread.count - optional integer declaring the maximum number of IO threads the plugin may use for fetching. A value of 0 (the default) means it uses twice as many threads as there are processors/cores available.
  • plugin.extapi.fetch.direct - optional boolean telling whether the plugin fetches the content directly to the target files (true), or first to temporary, randomly named files (false, the default).
  • plugin.extapi.db.file - optional string declaring a path to a writable database file to hold fetching metainfo. The default value is ${SA_HOME}/extapi.db.
  • plugin.extapi.compression.enabled - optional boolean telling whether sxgzip compression of fetched content is enabled. Default is true.
  • plugin.extapi.compression.thread.count - optional integer declaring the maximum number of IO threads the plugin may use for compressing fetched files. A value of 0 means it uses twice as many threads as there are processors/cores available. By default, the lesser of 2 and the number of cores is used.
  • plugin.extapi.compression.level - optional integer specifying the compression level. Allowed values range from 1 (fastest, worst compression) to 9 (slowest, best compression). Default is 6.
  • plugin.extapi.compression.ptBlockSize - optional integer specifying the length, in bytes, of a plain-text block to be compressed in parallel. Default is 1000000.
  • plugin.extapi.compression.includeFileInfo - optional boolean telling whether the original filename is stored in the compressed output file. Default is true.
  • plugin.extapi.compression.compressEmptyFiles - optional boolean telling whether empty files are compressed. Default is false (empty input files produce zero-sized output files).
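
As an illustration, a fragment of conf/sa-plugin-extapi.conf that overrides some of the plugin-level settings might look as follows. This is a sketch rather than a recommended setup: the values are arbitrary, and the key = value syntax is assumed to match the default listings shown later in this section.

```
plugin.extapi.fetch.thread.count = 4
plugin.extapi.fetch.direct = false
plugin.extapi.db.file = ${SA_HOME}/extapi.db
plugin.extapi.compression.enabled = true
plugin.extapi.compression.thread.count = 2
plugin.extapi.compression.level = 9
plugin.extapi.compression.includeFileInfo = true
```

Here fetching is capped at 4 IO threads and compression at 2, and level 9 trades speed for the best compression. The remaining lines simply restate the documented defaults.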

API fetching common settings

The plugin supports fetching from multiple APIs. To distinguish between the fetching configurations, each individual fetcher is given a name, and its settings are addressed as:

plugin.extapi.api.<name>.<parameter>

where <name> is an internal name for a fetching configuration, and <parameter> is the name of a specific parameter applied to that configuration.

Each fetching configuration shares a set of common parameters, and also uses parameters specific to the target API.

The common settings are:

  • plugin.extapi.api.<name>.type - mandatory string setting declaring the API type of the fetching configuration. The valid values are: microsoft_azure, microsoft_azure_ad, microsoft_office365, google_workspace_activities, google_workspace_reports.
  • plugin.extapi.api.<name>.container - string setting specifying the SourceAgent container used for storing fetched content. Note that the filesystem directory behind this container must be writable by the SourceAgent process. The default value is “extapi”; on startup it is always checked that a container with the specified name exists in SourceAgent’s configuration.
  • plugin.extapi.api.<name>.path - optional parametrized path specifying the layout of directories and files within the container specified above. The plugin fetches the content from the target API and stores it in the container according to this layout. The following placeholders are allowed in the value of this setting:

    • $app$ - will be replaced with a content feed name (“application”) in the API
    • $yyyy$, $MM$, $dd$, $HH$ - will be replaced with the year, month, day and hour, respectively, of the time period the fetched content is for. You can also use $yy$ for a 2-digit year, and $M$, $d$, $H$ for a month, day and hour written with 1 digit when the corresponding value is less than 10.

    Each API type has its own default value for this setting:

    • microsoft_azure - /microsoft/azure/$yyyy$/$MM$/$yyyy$$MM$$dd$.$app$.json
    • microsoft_azure_ad - /microsoft/azure-ad/$yyyy$/$MM$/$yyyy$$MM$$dd$.$app$.json
    • microsoft_office365 - /microsoft/office365/$yyyy$/$MM$/$yyyy$$MM$$dd$.$app$.json
    • google_workspace_activities - /google_workspace/activity/$yyyy$/$MM$/$yyyy$$MM$$dd$.$app$.json
    • google_workspace_reports - /google_workspace/reports/$yyyy$/$MM$/$yyyy$$MM$$dd$.$app$.json

    Each default thus stores the content in application-specific daily files under directories designating, from the innermost outwards, the month, year, API sub-type and API type. For high volumes we recommend hourly files, that is, including $HH$ in the layout specification.

  • plugin.extapi.api.<name>.timeZone - optional string specifying the name of the time zone (as defined in the IANA Time Zone Database) used for building the layout specified above. Default is the system default time zone.

  • plugin.extapi.api.<name>.apps - optional string containing a comma-separated list of colon-separated key-value pairs. The key is the “uri” name of a content feed (the value substituted for $app$ in plugin.extapi.api.<name>.path). The value is a |-separated triple: the API content feed name, the expected lag time (timeunit) and the so-called update frequency (timeunit). For example, exchange:Audit.Exchange|30 m|1 m maps the name exchange to the API feed Audit.Exchange with a lag time of 30 minutes and an update frequency of 1 minute. The update frequency reflects how often the API’s content feed is expected to be updated. It is determined empirically and should not exceed the availability lag time declared by the API maintainer. It is used to re-fetch events from a feed whose last fetch time is within the availability lag, once the time since the last fetch exceeds the update frequency.

    These values are API-specific and will be described below for each supported external API separately.

  • plugin.extapi.api.<name>.timeWindow - optional timeunit value specifying the period for which events are retrieved from the API’s content feeds. Each API has its own default value for this.

  • plugin.extapi.api.<name>.connectTimeout - optional non-negative long integer specifying connection timeout in milliseconds. A timeout of zero is interpreted as an infinite timeout. The default is 10000.

  • plugin.extapi.api.<name>.readTimeout - optional non-negative long integer specifying read timeout in milliseconds. A timeout of zero is interpreted as an infinite timeout. The default is 60000.

  • plugin.extapi.api.<name>.maxErrorRetries - optional integer specifying the number of times SpectX tries to access a requested resource before giving up when it is inaccessible due to network problems. The default is 3.

  • plugin.extapi.api.<name>.userAgent - optional string specifying the software agent name used when communicating with the external APIs. The default value is composed of the string “SpectX” and the current plugin version designator.

  • plugin.extapi.api.<name>.rate.limit and plugin.extapi.api.<name>.unit - specify the maximum number of requests to the API that the plugin may issue within a given time period. The first is a long integer (the number of requests), the second is a timeunit (the period). If both are unspecified, no limit is applied.
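
To make the naming scheme concrete, the common settings of a single fetcher could be written as in the sketch below. The fetcher name o365 and all values are illustrative only. The path variant is an assumption as well: it adds a daily directory and an hourly file name, and it presumes that the hour placeholder is written $HH$, in the same form as the other placeholders.

```
plugin.extapi.api.o365.type = microsoft_office365
plugin.extapi.api.o365.container = extapi
plugin.extapi.api.o365.path = /microsoft/office365/$yyyy$/$MM$/$dd$/$yyyy$$MM$$dd$$HH$.$app$.json
plugin.extapi.api.o365.timeZone = UTC
plugin.extapi.api.o365.connectTimeout = 10000
plugin.extapi.api.o365.readTimeout = 120000
plugin.extapi.api.o365.maxErrorRetries = 5
```

Under these assumptions, content of the feed named exchange for the hour starting at 07:00 UTC on 5 March 2024 would be expected in /microsoft/office365/2024/03/05/2024030507.exchange.json within the extapi container. The API-specific settings such a fetcher also needs are described in the following sections.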

API-specific settings

Google Workspace Admin SDK Reports API

The plugin uses Google’s Server-to-Server OAuth 2.0 protocol for authentication and authorization. You need a service account to authenticate SpectX to the Google Workspace Admin API. The service account must be granted domain-wide authority in order to access end users’ activity logs.

Hint

See step-by-step instructions for configuring access to Google Workspace.

For accessing both Activities and Reports, the plugin requires the following two settings to be defined in configuration:

  • plugin.extapi.api.<name>.email - a string specifying email address of a user having access to Google Admin APIs. The service account accesses the API on behalf of this user.
  • plugin.extapi.api.<name>.privateKey - a string containing the service account’s private key file content (JSON), encoded in Base64 format.

The default values for the common API fetching settings described above are as follows.

For Activities (named e.g. “gsuite-act”):

plugin.extapi.api.gsuite-act.timeWindow = 180 d
plugin.extapi.api.gsuite-act.apps = \
     admin:admin|2 m|1 m, \
     calendar:calendar|2 h|10 m, \
     drive:drive|2 m|10 m, \
     login:login|2 d|15 m, \
     mobile:mobile|12 h|15 m, \
     token:token|2 h|1 m, \
     groups:groups|2 h|15 m, \
     saml:saml|12 h|15 m, \
     chat:chat|3 d|15 m, \
     gplus:gplus|3 d|15 m, \
     rules:rules|2 h|15 m, \
     jamboard:jamboard|3 d|15 m, \
     meet:meet|20 m|10 m, \
     user_accounts:user_accounts|2 h|10 m, \
     access_transparency:access_transparency|2 h|10 m, \
     groups_enterprise:groups_enterprise|2 h|10 m, \
     gcp:gcp|3 d|15 m

For Reports (named e.g. “gsuite-rep”):

plugin.extapi.api.gsuite-rep.timeWindow = 450 d
plugin.extapi.api.gsuite-rep.apps = \
     customer:customer|3 d|15 m, \
     user:user|3 d|15 m

The lag times and update frequency values in the above default values are based on Google’s data availability lag times.
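
Putting the pieces together, a minimal pair of Google Workspace fetchers might be declared as sketched below. The names gsuite-act and gsuite-rep follow the examples above, the email address is a made-up placeholder, and the angle-bracket text stands for the real Base64 string, which is not shown here.

```
plugin.extapi.api.gsuite-act.type = google_workspace_activities
plugin.extapi.api.gsuite-act.email = admin@example.com
plugin.extapi.api.gsuite-act.privateKey = <Base64-encoded content of the service account JSON key file>

plugin.extapi.api.gsuite-rep.type = google_workspace_reports
plugin.extapi.api.gsuite-rep.email = admin@example.com
plugin.extapi.api.gsuite-rep.privateKey = <Base64-encoded content of the service account JSON key file>
```

Because timeWindow and apps are optional, leaving them out should make each fetcher use the defaults listed above.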

Microsoft APIs

The plugin uses the OAuth 2.0 protocol for authentication and authorization with Microsoft API endpoints. It needs to be registered in the Azure Management Portal and granted permissions to access the respective APIs and log data.

Hint

See step-by-step instructions for configuring access to Microsoft APIs.

For accessing all three supported Microsoft APIs, the plugin requires the following three settings to be defined in configuration:

  • plugin.extapi.api.<name>.tenantId - a string specifying ID of the tenant in the Azure AD, assigned during application registration.
  • plugin.extapi.api.<name>.clientId - a string specifying an Application (client) ID assigned by Azure AD during application registration.
  • plugin.extapi.api.<name>.secretKey - a string specifying client secret.

For accessing the Office 365 Management Activity API, the following additional setting is available:

  • plugin.extapi.api.<name>.plan - a string specifying a type of Microsoft 365 or Office 365 subscription plan for your organization (ent for Enterprise plan, gcc for GCC government plan, gcc_high for GCC High government plan, dod for DoD government plan). The setting is optional and defaults to ent.
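
As a sketch, the Microsoft-specific settings of an Office 365 fetcher could look like this. The name o365 is only an example, the angle-bracket values are placeholders for the identifiers obtained during application registration, and gcc is chosen arbitrarily to show a non-default plan.

```
plugin.extapi.api.o365.type = microsoft_office365
plugin.extapi.api.o365.tenantId = <tenant ID>
plugin.extapi.api.o365.clientId = <application (client) ID>
plugin.extapi.api.o365.secretKey = <client secret>
plugin.extapi.api.o365.plan = gcc
```

For an Enterprise plan the last line can be omitted, since plan defaults to ent.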

The default values for the common API fetching settings described above are as follows.

For Office 365 Management Activity API (named e.g. “o365”):

plugin.extapi.api.o365.timeWindow = 7d
plugin.extapi.api.o365.apps = \
     ad:Audit.AzureActiveDirectory|24 h|1 m, \
     exchange:Audit.Exchange|30 m|1 m, \
     sharepoint:Audit.SharePoint|30 m|1 m, \
     general:Audit.General|24 h|1 m, \
     dlp:DLP.All|30 m|1 m

For Azure Active Directory Activity reports API (named e.g. “azure-ad”):

plugin.extapi.api.azure-ad.timeWindow = 30 d
plugin.extapi.api.azure-ad.apps = \
     audits:audits|2 m|1 m, \
     interactivesignins:signins|2 m|1 m, \
     noninteractivesignins:ni-signins|2 m|1 m, \
     serviceprincipalsignins:sp-signins|2 m|1 m, \
     managedidentitysignins:mi-signins|2 m|1 m

For Azure Monitor REST API (named e.g. “azure”):

plugin.extapi.api.azure.timeWindow = 90 d
plugin.extapi.api.azure.apps = \
     Tenant-level:|2 m|1 m, \
     <Subscription log name>:<Subscription ID>|2 m|1 m

The latter uses an empty API content feed name in its plugin.extapi.api.<name>.apps to specify tenant-level logs.
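
For instance, an Azure Monitor fetcher that collects the tenant-level logs plus the logs of one subscription might be configured as sketched here. The log name production is hypothetical, and the subscription ID is left as a placeholder.

```
plugin.extapi.api.azure.type = microsoft_azure
plugin.extapi.api.azure.apps = \
     Tenant-level:|2 m|1 m, \
     production:<Subscription ID>|2 m|1 m
```

Since the key of each pair is what replaces $app$ in the path, the subscription’s logs would be expected in files named after production, for example 20240305.production.json under the default layout.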