Parsing LDAP Logs

Our objective is to extract user-activity records from LDAP logs and to provide a basis for normalizing them later in the query layer (i.e. joining the originating IP address from connect records to the other action records).

We’ve already prepared the pattern; you can find it in /user/examples/patterns/ldap.sxp (see how to extract examples here). Here’s a sample from an OpenLDAP log:

Jul 17 16:53:53 server1 slapd[1010]: conn=7448 fd=43 connection from IP=192.168.4.36:40629 (IP=:: 389) accepted.
Jul 17 16:53:53 server1 slapd[1010]: conn=7448 op=0 BIND dn="uid=user1,ou=people,dc=example,dc=com" method=128
Jul 17 16:53:53 server1 slapd[1010]: conn=7448 op=0 RESULT tag=97 err=0 text=
Jul 17 16:53:53 server1 slapd[1010]: conn=7448 op=1 SRCH base="ou=people,dc=example,dc=com" scope=2 filter="(objectClass=*)"
Jul 17 16:53:53 server1 slapd[1010]: conn=7448 op=1 SEARCH RESULT tag=101 err=0 text=
Jul 17 16:53:54 server1 slapd[1010]: conn=7448 op=2 UNBIND
Jul 17 16:53:54 server1 slapd[1010]: conn=-1 fd=43 closed

Let’s walk through the pattern:

Each record (a line in the log) begins with the same elements, so let’s declare them once as $hdr. This makes them easier to reuse and keeps the code less cluttered:

// declare our record parsers
// every record begins with timestamp, server and process info, let's capture it as header:
$hdr =
  TIMESTAMP(format='MMM d HH:mm:ss', tz='UTC'):dateTime
  LD                    // we don't care about server and process info, don't export it
  'conn=' INT:connId    // connId will be our session identifier
;
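
For readers less familiar with the pattern syntax, the header can be approximated with a regular expression. The following is a rough Python sketch, not the pattern language itself: HDR_RE, parse_header and the year parameter are names made up for this illustration, and the year has to be assumed because the log timestamps do not carry one.

```python
import re
from datetime import datetime, timezone

# Rough regex counterpart of $hdr: timestamp, skipped server/process info, conn=<int>.
HDR_RE = re.compile(
    r"^(?P<dateTime>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "  # e.g. 'Jul 17 16:53:53'
    r".+? "                                                      # server and process info, not captured
    r"conn=(?P<connId>-?\d+)"                                    # session identifier
)


def parse_header(line, year=2000):
    """Return (timestamp, connId) for a log line, or None if the header does not match."""
    m = HDR_RE.match(line)
    if m is None:
        return None
    # The timestamp has no year, so the caller supplies one (assumption of this sketch).
    ts = datetime.strptime(f"{year} {m.group('dateTime')}", "%Y %b %d %H:%M:%S")
    # Interpret the time as UTC, mirroring tz='UTC' in the pattern.
    return ts.replace(tzinfo=timezone.utc), int(m.group("connId"))
```

As in $hdr, only the timestamp and the connection id are kept; the server and process fields are matched but discarded.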

Our main objective is to distinguish connect records from operation records and to parse the originating IP address. To make it simpler to separate connect records from the others in queries, we add a metafield identifying the record type:

// connect record:
$connRec =
 <'connect'>:recType        //metafield (added after parsing) recType contains record type: 'connect'
 $hdr                       //record begins with header
 ' fd=' INT:fd              //followed by 'fd=' key value pair
 LD* ' connection from IP=' //skip everything up to this string constant (the * modifier also allows skipping 0 bytes)
 IPV4SOCKET:ipFrom          //originating ip-address exported as 'ipFrom'
 LD                         //skip until
 EOL                        //the end of line
;

// operation record:
$opRec =
 <'op'>:recType             //metafield (added after parsing) recType contains record type: 'op'
 $hdr                       //record begins with header
 ' op=' INT:opId            //followed by 'op=' key value pair
 ' ' UPPER:op               //followed by single space and uppercase keyword describing operation
 LD*:message                //capture details as string exported as 'message'
 EOL                        //until the end of line
;

// connection close record:
$closeRec =
 <'close'>:recType          //metafield (added after parsing) recType contains record type: 'close'
 $hdr                       //record begins with header
 LD*                        //skip everything until
 ' fd=' INT:fd              //the 'fd=' key value pair
 ' closed'                  //followed by string constant
 LD*:message                //capture details as string exported as 'message'
 EOL                        //until the end of line
;

Finally, we bring the declarations together in the root expression, which says that our pattern consists of three different record types: connect, op or close.

/* root pattern expression:
   uses an alternatives group, i.e. each record is either a connect, an operation or a close record.
   Everything else is left unmatched
*/
( $connRec | $opRec | $closeRec );
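
The alternatives group behaves like trying each record parser in turn and keeping the first one that matches. As a hedged illustration only, here is a Python sketch of that idea using regular expressions. The regexes are approximations of the declarations above, and HDR, RECORD_TYPES and parse_record are hypothetical names, not part of the pattern language.

```python
import re

# Approximation of $hdr; the server and process info is skipped, not captured.
HDR = r"^(?P<dateTime>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) .+? conn=(?P<connId>-?\d+)"

# One regex per record type, listed in the same order as the root expression.
RECORD_TYPES = [
    ("connect", re.compile(
        HDR + r" fd=(?P<fd>\d+).*? connection from IP="
        r"(?P<ipFrom>\d{1,3}(?:\.\d{1,3}){3}:\d+).+$")),
    ("op", re.compile(
        HDR + r" op=(?P<opId>\d+) (?P<op>[A-Z]+)(?P<message>.*)$")),
    ("close", re.compile(
        HDR + r".*? fd=(?P<fd>\d+) closed(?P<message>.*)$")),
]


def parse_record(line):
    """Return a dict with recType plus captured fields, or None if nothing matches."""
    for rec_type, regex in RECORD_TYPES:
        m = regex.match(line)
        if m:
            # Captured values are left as strings in this sketch.
            return {"recType": rec_type, **m.groupdict()}
    return None  # corresponds to an unmatched line
```

Note that for a line such as `op=1 SEARCH RESULT ...` only the first uppercase word is captured as the operation, and the rest ends up in message.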

That’s it, we’re done. See how this pattern can be used in analyzing LDAP logs.
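
To make the normalization goal from the introduction concrete, the sketch below shows the kind of join the parsed fields enable: the ipFrom value of a connect record is copied onto the other records with the same connId. It is plain Python over already-parsed records, not query-layer code, and attach_ip is a hypothetical helper written for this illustration.

```python
def attach_ip(records):
    """Copy ipFrom from each connect record onto later records sharing its connId."""
    ip_by_conn = {}
    out = []
    for rec in records:
        if rec["recType"] == "connect":
            # Remember the originating address for this session.
            ip_by_conn[rec["connId"]] = rec["ipFrom"]
        # Records without a known connect record get ipFrom=None.
        ip = rec.get("ipFrom") or ip_by_conn.get(rec["connId"])
        out.append({**rec, "ipFrom": ip})
    return out


# Hand-written records mirroring the sample log above.
sample = [
    {"recType": "connect", "connId": 7448, "ipFrom": "192.168.4.36:40629"},
    {"recType": "op", "connId": 7448, "op": "BIND"},
    {"recType": "op", "connId": 7448, "op": "UNBIND"},
    {"recType": "close", "connId": -1},
]
joined = attach_ip(sample)
```

In the sample log the close record carries conn=-1, so a join on connId alone leaves it without an address; matching it to its session would need another key, such as fd.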