gs://

The gs:// and gss:// schemes provide access to Google Cloud Storage (GCS). The latter, gss://, uses TLS to secure the underlying communication with GCS.

Using gs:// in SpectX

The gs:// and gss:// implementations do not support host-less URI notation: the host part of the URI must always be either a Google Cloud Storage bucket name or a datastore name. In other words, a URI takes the form gs://<bucket-or-datastore-name>/<path>, never gs:///<path>.

Datastore configuration

UI

Configuration parameters for gs:// and gss:// datastore definitions:

Name                 Description
Store name           Unique name among all defined datastores.
Bucket               Name of the target Google Cloud Storage bucket storing the blobs.
Private Key          Google service account private key. Optional; needed for accessing private buckets.
Directory delimiter  Directory separator in a file name. Optional; when empty, the default “/” is assumed.
Is cacheable         Enables caching of data by Processing Units.
Hot cache period     Limits time-related data caching to the period specified.
Read ACL             Specifies the blob read ACL.

Filesystem

gs:// and gss:// datastore definition files are JSON documents in the following formats, respectively (optional parameters can be omitted):

{
  "type": "GS",
  "gsStore": {
    "bucket": "<bucket>",
    "privateKey": "<privateKey>",
    "directoryDelimiter": "<directoryDelimiter>",
    "isCacheable": <isCacheable>,
    "hotCachePeriod": "<hotCachePeriod>",
    "connectTimeout": <connectTimeout>,
    "readTimeout": <readTimeout>,
    "maxErrorRetries": <maxErrorRetries>,
    "userAgent": "<userAgent>",
    "acl": {<rACL>}
  }
}

{
  "type": "GSS",
  "gssStore": {
    "bucket": "<bucket>",
    "privateKey": "<privateKey>",
    "directoryDelimiter": "<directoryDelimiter>",
    "isCacheable": <isCacheable>,
    "hotCachePeriod": "<hotCachePeriod>",
    "connectTimeout": <connectTimeout>,
    "readTimeout": <readTimeout>,
    "maxErrorRetries": <maxErrorRetries>,
    "userAgent": "<userAgent>",
    "anonymousTtl": "<anonymousTtl>",
    "acl": {<rACL>}
  }
}

where

  • <bucket> is the name of the target Google Cloud Storage bucket storing the blobs. Mandatory. A string.
  • <privateKey> is the Google service account private key. Optional; needed for accessing private buckets. A string.
  • <directoryDelimiter> is the directory separator in a file name. Optional; when empty, the default “/” is assumed. A string.
  • <isCacheable> enables caching of data by Processing Units. Optional; the default is “false”. A boolean (“true” or “false”).
  • <hotCachePeriod> limits time-related data caching to the period specified. A time period.
  • <connectTimeout> is the connection timeout in milliseconds. A timeout of zero is interpreted as an infinite timeout. The default is 10000. A non-negative long integer.
  • <readTimeout> is the read timeout in milliseconds. A timeout of zero is interpreted as an infinite timeout. The default is 60000. A non-negative long integer.
  • <maxErrorRetries> is the number of times SpectX retries access to a requested resource that is inaccessible due to network problems before giving up. The default is 3. An integer.
  • <userAgent> is the software agent name to use when communicating with the cloud. The default value is composed of the string “SpectX” and the current software version designator. A string.
  • <anonymousTtl> is the TTL, in milliseconds, of the anonymous access token used by Processing Units for processing blob content during query execution. The default is 30000. A non-negative long integer.
  • <rACL> is the definition of the blob read ACL for the datastore. A map.
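
As an illustration only, a gs:// definition file for a publicly readable bucket might look like the sketch below. The bucket name example-public-bucket is hypothetical. The timeout, retry and caching values simply spell out the defaults listed above. The sketch assumes that every parameter other than the bucket may be left out, so privateKey, hotCachePeriod, userAgent and acl are omitted.

```json
{
  "type": "GS",
  "gsStore": {
    "bucket": "example-public-bucket",
    "directoryDelimiter": "/",
    "isCacheable": false,
    "connectTimeout": 10000,
    "readTimeout": 60000,
    "maxErrorRetries": 3
  }
}
```

Under the same assumption, the file could presumably be reduced further to just "type" and the "bucket" entry inside "gsStore", leaving all other settings at their defaults.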