How can I find patterns in logs?¶
Logs contain a lot of behavioural information as they reflect the behaviour of a system that generates them. In many cases, it also includes the behaviour of the end user interacting with the system. Finding behavioural patterns is necessary in many ways. For example, to find new target groups for sales, to improve user experience, or to find fraudulent usage.
Just as sounds are characterized by oscillations of different frequencies, logs contain certain data elements that characterise the behaviour of interacting components of the system. For example, usernames, IP addresses and local time are usually relevant to the end user behaviour. The frequency of their occurrence in time (or more precisely, the distribution of frequencies, a histogram) is important behaviour metrics. It mostly serves as an indicator of change in behaviour. When you observe that your daily visitors have changed from being mostly from North-America to being mostly from Asia, it certainly means a change in the end user behaviour.
Still, it could be quite complicated to trace reasons behind this change based on the dynamics of data elements alone. The behaviour of the entities you’re interested in is usually just one contributing component to overall frequencies. In this situation, looking at relative frequencies (i.e how often a data element appears in relation to another one) is proven to be very helpful. For instance, successful/failed logins per IP address, different user agents per username, etc. Their trends, outliers and other statistical metrics provide a fertile ground to narrow down your search space; to re-iterate the process and to derive common criteria or just to visually observe patterns of events.
Let’s take an example. Suppose you’re running an e-commerce web service and want to explore the behaviour of your customers. This is especially relevant when it comes to fraud prevention - “know your customer” is the basic principle here. Let’s have a look at our customers’ logins on the basis of this sample log <https://logsamples.spectx.com/custom/psw_scan.log.sx.gz>. It is a simple tab-separated log, where each customer login is recorded as one row, detailing its time and date, username, IP address, the user agent of the browser and the result of the login event. This very simple pattern will extract all these fields:
// define pattern to extract fields from login service log $srcp = <<<PATTERN_END ( TIMESTAMP('yyyy-MM-dd HH:mm:ss.SSS Z'):dateTime LD:userName IPV4:ipAddr DQS:userAgent UPPER:result )(fs='\t') EOL PATTERN_END; // use PARSE command to execute parsing. @src = PARSE(pattern:$srcp, src:'https://logsamples.spectx.com/custom/psw_scan.log.sx.gz');
Now, being in the shoes of a fraud analyst, we can start by looking at our login event rates:
//compute login event rate per minute @loginRate = @src .select( dateTime[1 min] as period, //truncate dateTime value to 1 min precision count(*) as total_logins, count(result='SUCCESS') as successful_logins, count(result='FAILED') as failed_logins ) .group(@1); @loginRate;
Successful login rate looks very consistent at about 140 - 160 events per minute. Failed logins seem to fluctuate mostly between 10 to 25 events per minute, however, during brief 12 minutes between 15:40 and 15:52 EET they seem to be a little higher around 30 - 35 events per minute.
Just looking at those events does not tell us much, so let’s keep this time period in the back of our minds for now.
Next, we want to know how mobile (or stationary) our customers are - i.e how often do they change IP addresses? Let’s compute how many different IP addresses there are per username:
@ipAddrPerUserName = @src .select(userName, ipAddr, count(*) as ipPerUnCnt) .group(userName, ipAddr) //count the userName and ipAddress pair occurrences .select(ipPerUnCnt, count(*)) .group(@1) //compute the distribution of different ipAddress per userName frequencies ; @ipAddrPerUserName;
Nothing really stands out. Right, let’s move on and have a look at the opposite: how many unique users do the IP addresses “produce”:
@usernamesPerIpaddrFrequency = @src .select(userName, ipAddr, count(*)) .group(ipAddr, userName) .select(ipAddr, count(*) as unIpCnt).group(@1) .sort(@2 desc) ; @ipAddrStats = @usernamesPerIpaddrFrequency .select(min(unIpCnt), max(unIpCnt), avg(unIpCnt) as avgUsersPerIp) //compute min, max and average ; @ipAddrStats;
Notice how close the average is to the minimum? This means there should be very few ipAddr’s producing max number of users. Let’s have a look at only those with a username count above the average:
So there are 20 (out of 1685 total) ipAddr’s “producing” 5-10 times more usernames than average. Looking at their login records seems definitely worthwhile:
@suspects = @src .select(dateTime, ipAddr, userName, result, userAgent) .filter(ipAddr in (@aboveNormalFrequencyIpAddresses.select(ipAddr))) ; @suspects;
|2017-05-03 15:41:59.759 +0300||22.214.171.124||aam||FAILED||Mozilla/5.0 (Linux; Android 6.0.1; SM-G920V Build/MMB29K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36.lsc|
|2017-05-03 15:42:02.346 +0300||126.96.36.199||aani||FAILED||Mozilla/5.0 (Linux; Android 5.1.1; SM-G928X Build/LMY47X) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.83 Mobile Safari/537.36.lsc|
|2017-05-03 15:42:06.000 +0300||188.8.131.52||aardvark||FAILED||Mozilla/5.0 (Windows Phone 10.0; Android 4.2.1; Microsoft; Lumia 950) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Mobile Safari/537.36 Edge/13.10586.lsc|
|2017-05-03 15:42:06.764 +0300||184.108.40.206||anna3535||SUCCESS||Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246.lsc|
|2017-05-03 15:42:09.195 +0300||220.127.116.11||aaren||FAILED||Mozilla/5.0 (Linux; Android 6.0.1; E6653 Build/32.2.A.0.253) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36.lsc|
|2017-05-03 15:42:15.000 +0300||18.104.22.168||aarhus||FAILED||Mozilla/5.0 (Linux; Android 6.0; HTC One M9 Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36.lsc|
|2017-05-03 15:42:18.000 +0300||22.214.171.124||aarika||FAILED||Mozilla/5.0 (Linux; Android 7.0; Pixel C Build/NRD90M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/52.0.2743.98 Safari/537.36.lsc|
|2017-05-03 15:42:20.507 +0300||126.96.36.199||berta3535||SUCCESS||Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246.lsc|
|2017-05-03 15:42:24.000 +0300||188.8.131.52||aaronic||FAILED||Mozilla/5.0 (Linux; Android 5.1.1; SHIELD Tablet Build/LMY48C) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Safari/537.36.lsc|
|2017-05-03 15:42:27.000 +0300||184.108.40.206||aaronical||FAILED||Mozilla/5.0 (Linux; Android 5.0.2; SAMSUNG SM-T550 Build/LRX22G) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/3.3 Chrome/38.0.2125.102 Safari/537.36.lsc|
|2017-05-03 15:42:30.000 +0300||220.127.116.11||aaronite||FAILED||Mozilla/5.0 (Linux; Android 4.4.3; KFTHWI Build/KTU84M) AppleWebKit/537.36 (KHTML, like Gecko) Silk/47.1.79 like Chrome/47.0.2526.80 Safari/537.36.lsc|
|2017-05-03 15:42:33.000 +0300||18.104.22.168||celia3535||SUCCESS||Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246.lsc|
|2017-05-03 15:42:36.000 +0300||22.214.171.124||aaru||FAILED||Mozilla/5.0 (CrKey armv7l 1.5.16041) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.0 Safari/537.36.lsc|
|2017-05-03 15:42:39.000 +0300||126.96.36.199||aau||FAILED||Mozilla/5.0 (Linux; U; Android 4.2.2; he-il; NEO-X5-116A Build/JDQ39) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Safari/534.30.lsc|
|2017-05-03 15:42:42.000 +0300||188.8.131.52||ab||FAILED||Mozilla/5.0 (Linux; Android 4.2.2; AFTB Build/JDQ39) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.173 Mobile Safari/537.22.lsc|
|2017-05-03 15:42:43.498 +0300||184.108.40.206||donna3535||SUCCESS|
|2017-05-03 15:42:48.000 +0300||220.127.116.11||ababa||FAILED||AppleTV5,3/9.1.1.lsc|
|2017-05-03 15:42:48.506 +0300||18.104.22.168||ababdeh||FAILED||Mozilla/5.0 (Nintendo WiiU) AppleWebKit/536.30 (KHTML, like Gecko) NX/22.214.171.124.12 NintendoBrowser/126.96.36.19964.US.lsc|
|2017-05-03 15:42:53.784 +0300||188.8.131.52||ababua||FAILED||Mozilla/5.0 (Windows Phone 10.0; Android 4.2.1; Xbox; Xbox One) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Mobile Safari/537.36 Edge/13.10586.lsc|
|2017-05-03 15:42:54.322 +0300||184.108.40.206||elena3535||SUCCESS|
|2017-05-03 15:43:00.000 +0300||220.127.116.11||abaca||FAILED||Mozilla/5.0 (PlayStation Vita 3.61) AppleWebKit/537.73 (KHTML, like Gecko) Silk/3.2.lsc|
|2017-05-03 15:43:01.022 +0300||18.104.22.168||abacate||FAILED||Mozilla/5.0 (Nintendo 3DS; U; ; en) Version/1.7412.EU.lsc|
|. . .|
This looks clearly as someone is trying out passwords for the accounts. First you notice that they change their ip-address after 10 attempts, probably to avoid ip-address-based velocity check. Also, they do a successful login after three failed attempts - with username ending with ‘3535’. How long has this been going on and how many accounts have they managed to compromise?
@scanSuccess = @suspects .filter(userName not like '%3535') //exclude accounts owned by attackers that they use for fooling velocity checks .select(min(dateTime), max(dateTime), count(result='SUCCESS') as successCnt, count(*) as total) .select(*, STRING((successCnt/FLOAT(total))*100) || '%' as successRatio) ;
|2017-05-03 15:41:59.759 +0300||2017-05-03 15:51:57.000 +0300||5||153||3.2679738562091507%|
Aha - seems that the timeframe with slightly increased failed login rate was indeed related to a malicious activity.
Can you spot another careless mistake the attacker has made and that would help us to identify their password scanning pattern in the future?
Disclaimer: all data used in this example is artificially generated. Any correlation to real life usernames, IP addresses or any other data is entirely coincidental