AVRO

Parses files in the Apache Avro format.

AVRO(src:uri_expr, schema:schema_str)
where:
  • uri_expr specifies the URI(s) of the input files. It can either use the same format as the src argument of the LIST command or be a reference to a LIST command.
  • schema_str is a JSON-formatted STRING specifying the schema.

Data is deserialized according to the types defined in the schema. The schema is extracted automatically from the file header and can then be modified.
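
To illustrate what "extracted from the header" means, here is a minimal Python sketch that reads the schema out of an Avro object container file using only the standard library. It follows the container layout described in the Apache Avro specification (magic bytes, a metadata map, a sync marker). It is not the implementation behind AVRO(); the function names read_long, read_bytes and read_header_schema are hypothetical and exist only for this example.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag-encoded)."""
    shift = 0
    acc = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("truncated Avro long")
        byte = chunk[0]
        acc |= (byte & 0x7F) << shift
        if not (byte & 0x80):  # high bit clear: last byte of this number
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed byte string."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("truncated Avro bytes value")
    return data


def read_header_schema(stream):
    """Return the schema stored in the file header as a Python object."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero-length block terminates the map
            break
        if count < 0:  # negative count is followed by the block size in bytes
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return json.loads(meta["avro.schema"])
```

Assuming a local copy of an Avro file, read_header_schema(open(path, "rb")) should return the schema as a dict; serialized back to JSON, that is the kind of text expected in schema_str.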

Example: Open the Input Data Browser and navigate to https://github.com/Teradata/kylo/blob/master/samples/sample-data/avro/userdata1.avro?raw=true. Pressing “Prepare Query” generates the following query:

$schema = <<<AVRO_SCHEMA
{"type":"record","name":"kylosample","doc":"Schema generated by Kite",
 "fields":[{"name":"registration_dttm","type":"string","doc":"Type inferred from '2016-02-03T07:55:29Z'"},
           {"name":"id","type":"long","doc":"Type inferred from '1'"},
           {"name":"first_name","type":"string","doc":"Type inferred from 'Amanda'"},
           {"name":"last_name","type":"string","doc":"Type inferred from 'Jordan'"},
           {"name":"email","type":"string","doc":"Type inferred from 'ajordan0@com.com'"},
           {"name":"gender","type":"string","doc":"Type inferred from 'Female'"},
           {"name":"ip_address","type":"string","doc":"Type inferred from '1.197.201.2'"},
           {"name":"cc","type":["null","long"],"doc":"Type inferred from '6759521864920116'","default":null},
           {"name":"country","type":"string","doc":"Type inferred from 'Indonesia'"},
           {"name":"birthdate","type":"string","doc":"Type inferred from '3/8/1971'"},
           {"name":"salary","type":["null","double"],"doc":"Type inferred from '49756.53'","default":null},
           {"name":"title","type":"string","doc":"Type inferred from 'Internal Auditor'"},
           {"name":"comments","type":"string","doc":"Type inferred from '1E+02'"}]}
AVRO_SCHEMA;

@list    = LIST('https://github.com/Teradata/kylo/blob/master/samples/sample-data/avro/userdata1.avro?raw=true');
@kylosample = AVRO(schema:$schema, src:@list);

@kylosample | limit(1000);