How can I find patterns in logs?ΒΆ

Sample scripts: scripts.tar.gz Sample data: sample_data.tar.gz

Logs contain a lot of behavioural information as they reflect the behaviour of a system that generates them. In many cases, it also includes the behaviour of the end user interacting with the system. Finding behavioural patterns is necessary in many ways. For example, to find new target groups for sales, to improve user experience, or to find fraudulent usage.

Just as sounds are characterized by oscillations of different frequencies, logs contain certain data elements that characterise the behaviour of interacting components of the system. For example, usernames, IP addresses and local time are usually relevant to the end user behaviour. The frequency of their occurrence in time (or more precisely, the distribution of frequencies, a histogram) is important behaviour metrics. It mostly serves as an indicator of change in behaviour. When you observe that your daily visitors have changed from being mostly from North-America to being mostly from Asia, it certainly means a change in the end user behaviour.

Still, it could be quite complicated to trace reasons behind this change based on the dynamics of data elements alone. The behaviour of the entities you’re interested in is usually just one contributing component to overall frequencies. In this situation, looking at relative frequencies (i.e how often a data element appears in relation to another one) is proven to be very helpful. For instance, successful/failed logins per IP address, different user agents per username, etc. Their trends, outliers and other statistical metrics provide a fertile ground to narrow down your search space; to re-iterate the process and to derive common criteria or just to visually observe patterns of events.

Let’s take an example. Suppose you’re running an e-commerce web service and want to explore the behaviour of your customers. This is especially relevant when it comes to fraud prevention - “know your customer” is the basic principle here. Let’s have a look at our customers’ logins on the basis of this sample log <https://docs.spectx.com/_downloads/psw_scan.log.sx.gz>. It is a simple tab-separated log, where each customer login is recorded as one row, detailing its time and date, username, IP address, the user agent of the browser and the result of the login event. This very simple pattern will extract all these fields:

// define pattern to extract fields from login service log
$srcp = <<<PATTERN_END
(
  TIMESTAMP('yyyy-MM-dd HH:mm:ss.SSS Z'):dateTime
  LD:userName
  IPV4:ipAddr
  DQS:userAgent
  UPPER:result
)(fs='\t')
EOL
PATTERN_END;

// use PARSE command to execute parsing.
@src = PARSE(pattern:$srcp,
             src:'https://docs.spectx.com/_downloads/psw_scan.log.sx.gz');

Now, being in the shoes of a fraud analyst, we can start by looking at our login event rates:

//compute login event rate per minute
@loginRate = @src
 .select(
   dateTime[1 min] as period,    //truncate dateTime value to 1 min precision
   count(*)  as total_logins,
   count(result='SUCCESS') as successful_logins,
   count(result='FAILED') as failed_logins
 )
 .group(@1);

@loginRate;

Successful login rate looks very consistent at about 140 - 160 events per minute. Failed logins seem to fluctuate mostly between 10 to 25 events per minute, however, during brief 12 minutes between 15:40 and 15:52 EET they seem to be a little higher around 30 - 35 events per minute.

../../../_images/login_rates.png

Just looking at those events does not tell us much, so let’s keep this time period in the back of our minds for now.

Next, we want to know how mobile (or stationary) our customers are - i.e how often do they change IP addresses? Let’s compute how many different IP addresses there are per username:

@ipAddrPerUserName = @src
 .select(userName, ipAddr, count(*) as ipPerUnCnt)
 .group(userName, ipAddr)       //count the userName and ipAddress pair occurrences
 .select(ipPerUnCnt, count(*))
 .group(@1)                     //compute the distribution of different ipAddress per userName frequencies
;

@ipAddrPerUserName;

gives us:

ipsPerUser f_2
1 6872
2 1665

Nothing really stands out. Right, let’s move on and have a look at the opposite: how many unique users do the IP addresses “produce”:

@usernamesPerIpaddrFrequency = @src
 .select(userName, ipAddr, count(*))
 .group(ipAddr, userName)
 .select(ipAddr, count(*) as unIpCnt).group(@1)
 .sort(@2 desc)
;

@ipAddrStats = @usernamesPerIpaddrFrequency
 .select(min(unIpCnt), max(unIpCnt), avg(unIpCnt) as avgUsersPerIp)  //compute min, max and average
;
@ipAddrStats;
min_1 max_1 avg_1
1 10 1.021538829723585

Notice how close the average is to the minimum? This means there should be very few ipAddr’s producing max number of users. Let’s have a look at only those with a username count above the average:

ipAddr usersPerIp
41.231.53.101 10
46.182.106.190 10
78.24.221.230 10
78.142.175.70 10
93.174.93.133 10
104.223.123.98 10
106.187.37.101 10
146.0.43.126 10
162.247.72.201 10
163.172.217.50 10
171.25.193.132 10
176.10.104.240 10
185.100.85.61 10
185.100.86.217 10
193.90.12.87 10
193.171.202.150 10
198.50.159.155 10
199.249.223.62 10
216.218.134.12 10
223.26.48.248 10

So there are 20 (out of 1685 total) ipAddr’s “producing” 5-10 times more usernames than average. Looking at their login records seems definitely worthwhile:

@suspects = @src
 .select(dateTime, ipAddr, userName, result, userAgent)
 .filter(ipAddr in (@aboveNormalFrequencyIpAddresses.select(ipAddr)))
;
@suspects;
dateTime ipAddr userName result userAgent
2017-05-03 15:41:59.759 +0300 162.247.72.201 aam FAILED Mozilla/5.0 (Linux; Android 6.0.1; SM-G920V Build/MMB29K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36.lsc
2017-05-03 15:42:02.346 +0300 162.247.72.201 aani FAILED Mozilla/5.0 (Linux; Android 5.1.1; SM-G928X Build/LMY47X) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.83 Mobile Safari/537.36.lsc
2017-05-03 15:42:06.000 +0300 162.247.72.201 aardvark FAILED Mozilla/5.0 (Windows Phone 10.0; Android 4.2.1; Microsoft; Lumia 950) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Mobile Safari/537.36 Edge/13.10586.lsc
2017-05-03 15:42:06.764 +0300 162.247.72.201 anna3535 SUCCESS Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246.lsc
2017-05-03 15:42:09.195 +0300 162.247.72.201 aaren FAILED Mozilla/5.0 (Linux; Android 6.0.1; E6653 Build/32.2.A.0.253) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36.lsc
2017-05-03 15:42:15.000 +0300 162.247.72.201 aarhus FAILED Mozilla/5.0 (Linux; Android 6.0; HTC One M9 Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36.lsc
2017-05-03 15:42:18.000 +0300 162.247.72.201 aarika FAILED Mozilla/5.0 (Linux; Android 7.0; Pixel C Build/NRD90M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/52.0.2743.98 Safari/537.36.lsc
2017-05-03 15:42:20.507 +0300 162.247.72.201 berta3535 SUCCESS Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246.lsc
2017-05-03 15:42:24.000 +0300 162.247.72.201 aaronic FAILED Mozilla/5.0 (Linux; Android 5.1.1; SHIELD Tablet Build/LMY48C) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Safari/537.36.lsc
2017-05-03 15:42:27.000 +0300 162.247.72.201 aaronical FAILED Mozilla/5.0 (Linux; Android 5.0.2; SAMSUNG SM-T550 Build/LRX22G) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/3.3 Chrome/38.0.2125.102 Safari/537.36.lsc
2017-05-03 15:42:30.000 +0300 198.50.159.155 aaronite FAILED Mozilla/5.0 (Linux; Android 4.4.3; KFTHWI Build/KTU84M) AppleWebKit/537.36 (KHTML, like Gecko) Silk/47.1.79 like Chrome/47.0.2526.80 Safari/537.36.lsc
2017-05-03 15:42:33.000 +0300 198.50.159.155 celia3535 SUCCESS Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246.lsc
2017-05-03 15:42:36.000 +0300 198.50.159.155 aaru FAILED Mozilla/5.0 (CrKey armv7l 1.5.16041) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.0 Safari/537.36.lsc
2017-05-03 15:42:39.000 +0300 198.50.159.155 aau FAILED Mozilla/5.0 (Linux; U; Android 4.2.2; he-il; NEO-X5-116A Build/JDQ39) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Safari/534.30.lsc
2017-05-03 15:42:42.000 +0300 198.50.159.155 ab FAILED Mozilla/5.0 (Linux; Android 4.2.2; AFTB Build/JDQ39) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.173 Mobile Safari/537.22.lsc
2017-05-03 15:42:43.498 +0300 198.50.159.155 donna3535 SUCCESS Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246.lsc
2017-05-03 15:42:48.000 +0300 198.50.159.155 ababa FAILED AppleTV5,3/9.1.1.lsc
2017-05-03 15:42:48.506 +0300 198.50.159.155 ababdeh FAILED Mozilla/5.0 (Nintendo WiiU) AppleWebKit/536.30 (KHTML, like Gecko) NX/3.0.4.2.12 NintendoBrowser/4.3.1.11264.US.lsc
2017-05-03 15:42:53.784 +0300 198.50.159.155 ababua FAILED Mozilla/5.0 (Windows Phone 10.0; Android 4.2.1; Xbox; Xbox One) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Mobile Safari/537.36 Edge/13.10586.lsc
2017-05-03 15:42:54.322 +0300 198.50.159.155 elena3535 SUCCESS Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246.lsc
2017-05-03 15:43:00.000 +0300 199.249.223.62 abaca FAILED Mozilla/5.0 (PlayStation Vita 3.61) AppleWebKit/537.73 (KHTML, like Gecko) Silk/3.2.lsc
2017-05-03 15:43:01.022 +0300 199.249.223.62 abacate FAILED Mozilla/5.0 (Nintendo 3DS; U; ; en) Version/1.7412.EU.lsc
. . .        

This looks clearly as someone is trying out passwords for the accounts. First you notice that they change their ip-address after 10 attempts, probably to avoid ip-address-based velocity check. Also, they do a successful login after three failed attempts - with username ending with ‘3535’. How long has this been going on and how many accounts have they managed to compromise?

@scanSuccess = @suspects
 .filter(userName not like '%3535') //exclude accounts owned by attackers that they use for fooling velocity checks
 .select(min(dateTime),
         max(dateTime),
         count(result='SUCCESS') as successCnt,
         count(*) as total)
 .select(*, STRING((successCnt/FLOAT(total))*100) || '%' as successRatio)
;
min_1 max_1 successCnt total successRatio
2017-05-03 15:41:59.759 +0300 2017-05-03 15:51:57.000 +0300 5 153 3.2679738562091507%

Aha - seems that the timeframe with slightly increased failed login rate was indeed related to a malicious activity.

Can you spot another careless mistake the attacker has made and that would help us to identify their password scanning pattern in the future?

Disclaimer: all data used in this example is artificially generated. Any correlation to real life usernames, IP addresses or any other data is entirely coincidental