Visualization

SpectX can visualize resultset data using Line Chart , Bar Chart or geographical Map charts.

Line and bar can be displayed by clicking on Chart menu button (made visible when resultset is created).

Map can be displayed by clicking on Map menu button (made visible when resultset contains at least one column with GEOPOINT data type).

Line Chart

Line chart is displayed when you click on Chart menu button and the resultset contains:

  1. at least one monotonic sequence column (i.e containing monotonically nonincreasing or nondecreasing numeric values), which must appear before (left of) from unique categorical values column (if any). This column is used for values on x-axis. When there is more than one monotonic sequence columns in the resultset then the leftmost column is used for values on x-axis.
  2. at least one numeric column, which will be used for values on y-axis. Columns containing non-numerical values are ignored.

The column aliases or names will be used as series labels. The name of the query script is used for chart title.

Example 1. Let’s compute sin and cos over angle from 0 to 1000 degrees. We can use DUAL generator to generate angle values. To demonstrate that non-numerical data is ignored let’s add a column with categorical values (country code strings):

1
2
3
4
5
6
7
dual(1000)          //generate 1000 rows of data
 .select(
   sin(RADIANS(i))  //non-monotonic numeric column is used as data series with values on y-axis.
   ,cos(RADIANS(i)) //non-monotonic numeric column is used as data series with values on y-axis.
   ,angle:i         //monotonic sequence column is used as x-axis values.
   ,country:cc(IPADDR(i*100000000)) //categorical values are ignored
);

Lower diagram displays full range of data. Grayed box marks zoom area which is displayed in upper diagram. Zoomed area can be changed by dragging or moving grayed box. Hovering mouse pointer over upper diagram will display values of data points. Clicking on data series labels on lower left corner will hide/display respective series.

The menu on upper left corner allows you to save the chart as picture.

../../_images/line_chart.png

Bar Chart

Bar chart is displayed when you click on Chart menu button and the resultset contains:

  1. At least one unique categorical values [1] column, which must appear before (left of) from monotonic sequence column (if any). This column is used for values on x-axis. If there is more than one such column the leftmost will be used for values on x-axis.
  2. at least one numeric column, which will be used for values on y-axis. Columns containing non-numerical values are ignored.

Example 2. Let’s display counts of country codes of a set of generated ip-addresses. As categorical values column (country) appears before the mononic sequence column (rowid) we will get a bar chart. To demonstrate that the categorical values column does not have to be the very first we add a numeric column of random values (foo):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
dual(1000)
 .select(IPADDR(i*100000000) as ipaddr)  //compute 1000 ip addresses
 .select(CC(ipaddr) as country,count(*) as cnt).group(country) //compute unique country codes and counts
 .select(
    foo:RANDOM()*100  //numeric columns can appear before categorical
   ,country           //unique categorical values column: country code strings
   ,ROWID()           //monotonically increasing numeric column
   ,cnt               //numeric column: count of country codes
 )
;
../../_images/bar_chart.png
[1]A set of unique non-numeric values (string, ipaddr, etc).

Map

Map chart is intended to display numerical information related to a geographical location. Hence the menu button Map is made visible only when resultset contains at least one column with GEOPOINT data. When there is more than one GEOPOINT column in the resultset then the leftmost is used.

Map can display only one series numerical data.

Map computes grouped minimum, maximum, average, sum or count. To do so it looks for explicit column names in the resultset: min, max, avg, sum. If none of those is present then count is performed. If there are more than one of explicitly names columns in resultset then the leftmost is used.

To display:

  1. minimum include in resultset a numerical column named min. For each value in GEOPOINT column the minimum of numeric values is computed and displayed.
  2. maximum include in resultset a numerical column named max. For each value in GEOPOINT column the maximum of numeric values is computed and displayed.
  3. average include in resultset a numerical column named avg. For each value in GEOPOINT column the average of numeric values is computed and displayed.
  4. sum include in resultset a numerical column named sum. For each value in GEOPOINT column the sum of numeric values is computed and displayed.
  5. count is computed directly on GEOPOINT column values when none of the explicit columns are present in resultset.
  6. arbitrary numeric value place it in the column named count (or sum). The values must be computed for unique GEOPOINT values (to prevent Map summing them).

The aggregation range is set by the zoom area.

Example 3. Displaying count of requests from different geolocations. We’ll use apache access log from our examples dataset. We just need to add GEOPOINT column to the parsed data:

1
2
3
4
5
6
7
8
$pattern = $[/user/examples/patterns/apache_access.sxp];
@list    = LIST('sx:/user/examples/data/apache_access.log.sx.gz');
@stream  = PARSE(pattern:$pattern, src:@list);

@stream
 .select(GEO(clientIp), *)  //add a GEOPOINT column to resultset: geolocation of customer ip-address
 .sort(timestamp)           //sort resultset by time
;

Lower diagram shows the range of data: the row numbers of resultset. Grayed box marks the range of data used for computing aggregated values displayed in the map above. You can change the range by stretching or moving the gray box.

You can also zoom in or out on the map by clicking ‘+’ or ‘-‘ icons. The polygon icon allows you to select an area, which produces a GEO_POINT_IN() function statement which can be copy-pasted to filter or WHERE statements.

Note that since our dataset is ordered by time we can nicely let the map play the activity over time. Just set the zoom area and map mode (cluster or heatmap) according to your taste and click on play.

../../_images/map.png

Example 4. Displaying arbitrary values on map. Let’s visualize the population of 10 biggest cities of the world.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
@cities = VALUES([
   {city: 'Shanghai', lat: 31.252282D, lon: 121.472832D, population: 24256800}
  ,{city: 'Beijing', lat: 39.918204D, lon: 116.396002D, population: 21516000}
  ,{city: 'Delhi', lat: 28.672417D, lon: 77.052872D, population: 16787941}
  ,{city: 'Lagos', lat: 6.511167D, lon: 3.311421D, population: 16060303}
  ,{city: 'Tianjin', lat: 39.139482D, lon: 117.151161D, population: 15200000}
  ,{city: 'Karachi', lat: 24.888304D, lon: 67.000581D, population: 14910352}
  ,{city: 'Istanbul', lat: 41.009915D, lon: 28.972359D, population: 14160467}
  ,{city: 'Tokyo', lat: 35.726500D, lon: 139.729800D, population: 13513734}
  ,{city: 'Guangzhou', lat: 23.147794D, lon: 113.252625D, population: 13080500}
  ,{city: 'Mumbai', lat: 18.969996D, lon: 72.819438D, population: 12442373}
]);

@cities
 .select(GEO(lat, lon) as loc  //compute GEOPOINT column
         ,population as count  //name column with population as 'count'
);
../../_images/map1.png