Parsing LDAP Logs

Our objective is to extract user-activity records from LDAP logs and provide a basis for normalizing them later in the query layer (for example, by joining the originating IP address from connect records to action records).

Note

We’ve already prepared the pattern; you can find it at https://github.com/spectx/resources/blob/master/examples/patterns/ldap.sxp.

Here’s a sample from an OpenLDAP log:

Dec 31 16:53:53 server1 slapd[1010]: conn=7448 fd=43 connection from IP=192.168.4.36:40629 (IP=:: 389) accepted.
Dec 31 16:53:53 server1 slapd[1010]: conn=7448 op=0 BIND dn="uid=user1,ou=people,dc=example,dc=com" method=128
Dec 31 16:53:53 server1 slapd[1010]: conn=7448 op=0 RESULT tag=97 err=0 text=
Dec 31 16:53:53 server1 slapd[1010]: conn=7448 op=1 SRCH base="ou=people,dc=example,dc=com" scope=2 filter="(objectClass=*)"
Dec 31 16:53:53 server1 slapd[1010]: conn=7448 op=1 SEARCH RESULT tag=101 err=0 text=
Dec 31 16:53:54 server1 slapd[1010]: conn=7448 op=2 UNBIND
Dec 31 16:53:54 server1 slapd[1010]: conn=-1 fd=43 closed

Note

You can find the sample OpenLDAP log by navigating the Input Data Browser to s3s://spectx-docs/formats/log/ldap/2015-12-31_slapd_access-syslog.log

Let’s walk through the pattern:

Because every record (a line in the log) begins with the same elements, let’s declare them once as $hdr. This makes them easier to reuse and keeps the code less cluttered:

// every record begins with a timestamp, server and process info; let's capture these as the header:

$hdr =
  TIMESTAMP('MMM d HH:mm:ss'):timestamp ' ' LD:host ' ' LD:process '[' INT:pid ']:'
  LD                    //skip spaces between pid and conn
  'conn=' INT:connId    // connection id will be our session identifier
;
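
For readers more used to regular expressions, the header can be approximated in Python. This is only a rough analogue and not SpectX syntax: the regex and the names `HDR`, `header` and `rest` are assumptions made for illustration, and the SpectX matchers (TIMESTAMP, LD, INT) do more than the character classes used here, such as converting values to typed fields.

```python
import re

# Rough regex analogue of $hdr (illustration only, not SpectX syntax).
HDR = re.compile(
    r"(?P<timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "  # e.g. 'Dec 31 16:53:53'
    r"(?P<host>\S+) "                                            # server name
    r"(?P<process>[^\[\s]+)\[(?P<pid>\d+)\]:"                    # e.g. 'slapd[1010]:'
    r"\s+conn=(?P<connId>-?\d+)"                                 # session identifier
)

line = ('Dec 31 16:53:53 server1 slapd[1010]: conn=7448 op=0 BIND '
        'dn="uid=user1,ou=people,dc=example,dc=com" method=128')

m = HDR.match(line)
header = m.groupdict()   # all values are strings in this sketch
rest = line[m.end():]    # what the record-specific patterns still have to match

print(header)
print(repr(rest))
```

Everything after the header (`rest`) is what distinguishes one record type from another.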

Next, declare the three record types: connect, operation and close. To make them easier to filter in queries, each declaration adds a meta field named type that identifies the record type:

// connect record:
$connRec =
 <'connect'>:type           //metafield 'type' (added after parsing) contains record type: 'connect'
 $hdr                       //record begins with header
 ' fd=' INT:fd              //followed by 'fd=' key value pair
 LD* ' connection from IP=' //skip everything until string const (note: the * modifier also allows skipping 0 bytes)
 IPV4SOCKET:c_sock          //client socket exported as 'c_sock'
 LD                         //skip until
 EOL                        //the end of line
;

// operation record:
$opRec =
 <'op'>:type                //metafield 'type' (added after parsing) contains record type: 'op'
 $hdr                       //record begins with header
 ' op=' INT:opId            //followed by 'op=' key value pair
 ' ' [A-Z ]+:op             //followed by single space and uppercase keyword describing operation
 LD*:details                //capture details as string exported as 'details'
 EOL                        //until the end of line
 ;

// connection close record:
$closeRec =
 <'close'>:type             //metafield 'type' (added after parsing) contains record type: 'close'
 $hdr                       //record begins with header
 ' fd=' INT:fd              //followed by 'fd=' key value pair
 ' closed'                  //followed by string constant
 EOL                        //until the end of line
 ;
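
To see what the operation record splits out, here is a hedged Python regex analogue of the part of $opRec that follows the header. It is an illustration rather than SpectX syntax; `OP_TAIL`, `tails` and `parsed` are hypothetical names, and the input strings are the record tails taken from the sample log above.

```python
import re

# Regex analogue of the tail of $opRec (illustration only, not SpectX syntax).
OP_TAIL = re.compile(r" op=(?P<opId>\d+) (?P<op>[A-Z ]+)(?P<details>.*)$")

# Record tails from the sample log, i.e. the text that follows 'conn=7448'.
tails = [
    ' op=0 BIND dn="uid=user1,ou=people,dc=example,dc=com" method=128',
    " op=1 SEARCH RESULT tag=101 err=0 text=",
    " op=2 UNBIND",
]

parsed = []
for tail in tails:
    m = OP_TAIL.match(tail)
    # In this regex the class [A-Z ] also consumes the space that follows
    # the keyword, so the sketch strips it from the captured value.
    parsed.append((int(m["opId"]), m["op"].strip(), m["details"]))

for row in parsed:
    print(row)
```

Because the character class includes a space, multi-word keywords such as SEARCH RESULT are captured as a single operation name, and records without arguments (UNBIND) simply yield empty details.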

Finally, bring the declarations together in the root expression, which says that the pattern consists of three alternative record types: connect, op or close.

/* root pattern expression:
   uses an alternative group, i.e. a record may be either a connection, an operation or a close record.
   Everything else is left unmatched
*/
( $connRec | $opRec | $closeRec );
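
The alternative group can be pictured as trying each record type in turn and keeping the first one that matches. The Python sketch below mimics that idea with regular expressions. It is an assumption-laden analogue, not how SpectX evaluates patterns: `ALTERNATIVES` and `parse_line` are hypothetical names, and the last input line is made up to show the unmatched case.

```python
import re

# Shared header, reduced to the connection id for brevity (illustration only).
HDR = (r"^\w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \S+ [^\[\s]+\[\d+\]:"
       r"\s+conn=(?P<connId>-?\d+)")

# One regex per record type, tried in order like ( $connRec | $opRec | $closeRec ).
ALTERNATIVES = [
    ("connect", re.compile(
        HDR + r" fd=(?P<fd>\d+).*? connection from IP=(?P<c_sock>[\d.]+:\d+).*$")),
    ("op", re.compile(
        HDR + r" op=(?P<opId>\d+) (?P<op>[A-Z ]+)(?P<details>.*)$")),
    ("close", re.compile(
        HDR + r" fd=(?P<fd>\d+) closed$")),
]

def parse_line(line):
    """Return a dict with a 'type' field, or None if no alternative matches."""
    for rec_type, regex in ALTERNATIVES:
        m = regex.match(line)
        if m:
            return {"type": rec_type, **m.groupdict()}
    return None

lines = [
    "Dec 31 16:53:53 server1 slapd[1010]: conn=7448 fd=43 connection from "
    "IP=192.168.4.36:40629 (IP=:: 389) accepted.",
    "Dec 31 16:53:54 server1 slapd[1010]: conn=7448 op=2 UNBIND",
    "Dec 31 16:53:54 server1 slapd[1010]: conn=-1 fd=43 closed",
    "this line matches none of the three alternatives",  # hypothetical input
]

types = [(parse_line(l) or {"type": "unmatched"})["type"] for l in lines]
print(types)
```

Adding the record type as a field at parse time is what lets later queries filter on it without re-inspecting the raw text.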

That’s it, we’re done. See how this pattern can be used in analyzing LDAP logs.
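
As a rough picture of the normalization mentioned at the start, the sketch below joins the client address from connect records onto operation records by connection id. It is written in Python purely for illustration and is not the SpectX query language. The records are hand-written to resemble parsed output using the field names from the pattern (type, connId, c_sock, opId, op), while `client_by_conn`, `enriched` and `client_ip` are hypothetical names.

```python
# Records as they might look after parsing the sample log (hand-written here).
records = [
    {"type": "connect", "connId": 7448, "c_sock": "192.168.4.36:40629"},
    {"type": "op", "connId": 7448, "opId": 0, "op": "BIND"},
    {"type": "op", "connId": 7448, "opId": 1, "op": "SRCH"},
    {"type": "op", "connId": 7448, "opId": 2, "op": "UNBIND"},
]

# Build a lookup from connection id to client IP (port stripped from the socket).
client_by_conn = {
    r["connId"]: r["c_sock"].rsplit(":", 1)[0]
    for r in records
    if r["type"] == "connect"
}

# Attach the originating IP to every operation record on the same connection.
enriched = [
    {**r, "client_ip": client_by_conn.get(r["connId"])}
    for r in records
    if r["type"] == "op"
]

for r in enriched:
    print(r["opId"], r["op"], r["client_ip"])
```

A real query would probably also need to bound the join in time or by host, since the same connection id can appear in more than one log.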