IIS Log – Building a Baseline

The first step in anomaly detection is to determine what is normal. To detect abnormal web traffic hitting the web server – whether it is strange connection patterns or unexplained high traffic loads – there has to be an understanding, and a common consensus, of what normal looks like. This is where the network administrator can put the vast amounts of data in the IIS logs to work. By analyzing the IIS logs and creating a baseline, administrators will have benchmark figures for comparison when assessing a potential security event. For example, the IIS logs show 23,456 Syncs at 4%. Is that a good figure?

Protect your IIS Server and take a cyber security class at Udemy.com

The answer is that it is impossible to say without context and a baseline for comparison, so the first step an administrator must take when applying security is to determine what normal behavior is by creating a baseline.

Fortunately, the sheer volume of data in the IIS log can provide the basis for creating a benchmark. In order to parse and analyze the log, the administrator requires tools that can mine the data and reduce it to summary totals. The tool used in this article is Microsoft’s own product for parsing logs, Log Parser.

Log Parser uses a universal query language to access text-based data in log files. An understanding of SQL is helpful but not required, as there are many sample scripts that an administrator can tweak to fit his own environment. The scripts in this article are standard Log Parser scripts that an administrator can use on any IIS log file.

The Log Parser syntax is

Logparser.exe file:sample.sql?logfile=\\servername\wwwlog\w3svc1\ex1401*.log


The command line above will analyze the log files matching ex1401*.log, which are the logs for January 2014. This will provide initial data for the baseline; however, the larger the sample, the more accurate the baseline, and a larger sample can be obtained simply by widening the log file wildcard, as in the example after this paragraph. When creating a baseline for a web application, there are certain characteristics and criteria that are of interest at the web application layer: unique client IPs, top client IPs, user agent characteristics, IP-to-user-agent characteristics and the total number of requests. Therefore, the first parsing of the IIS log will extract that data by focusing on the number of hits per page.
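For example (assuming the same log share and the standard exYYMM log file naming shown above), this command would pull in all of 2014 for a much larger sample:

Logparser.exe file:sample.sql?logfile=\\servername\wwwlog\w3svc1\ex14*.log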


The query to get the number of ‘URI (Uniform Resource Identifier) Hits’ is shown below. This is a standard Log Parser script that the administrator can run as-is, without changing any parameters.
URI HITS SCRIPT

SELECT
    cs-uri-stem AS URIStem,
    cs-host AS HostName,
    COUNT(*) AS Hits
INTO DATAGRID
FROM %logfile%
GROUP BY cs-uri-stem, cs-host
ORDER BY Hits DESC


This will return a list of web pages with the host name and hit count, similar to this example:

  HostName     URIStem          Hits
  Udemy.com    /default.aspx    1200
  Udemy.com    /default.aspx    3200
  Udemy.com    /default.aspx    2500
  Udemy.com    /default.aspx    2670
  Udemy.com    /default.aspx    2340
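As a usage illustration (the file name urihits.sql is just an example), the URI hits script above could be saved to a file and run with the command-line syntax shown earlier; changing INTO DATAGRID to a file name such as urihits.csv, together with the -o:CSV switch, should write the results to a CSV file that can be kept as part of the baseline record.

Logparser.exe -o:CSV file:urihits.sql?logfile=\\servername\wwwlog\w3svc1\ex1401*.log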


In order to understand the normal request patterns directed towards a site, an administrator should consider how requests are distributed across clients. The following query will extract that information.

CLIENT_IP to REQUESTS SCRIPT

SELECT
    c-ip AS ClientIP,
    cs-host AS HostName,
    cs-uri-stem AS URIStem,
    sc-status AS Status,
    cs(User-Agent) AS UserAgent,
    COUNT(*) AS Requests
INTO DATAGRID
FROM %logfile%
GROUP BY c-ip, cs-uri-stem, cs-host, cs(User-Agent), sc-status
ORDER BY Requests DESC


This query extracts the Client IP, Host Name, URI, Status and User Agent, counts the requests for each combination, and outputs the results to the display grid.

Need to know SQL? Take a course at Udemy.com

An example of a typical result set from the query above would look something like this:

ClientIP  HostName     URIStem          Status  UserAgent       Requests
10.1.1.1  Udemy.com    /default.aspx    200     Windows+XP…     20289
10.1.1.2  Udemy.com    /default.aspx    200     Windows+XP…     20228
10.1.1.4  Udemy.com    /default.aspx    200     Mozilla/4.0…    20166
10.1.1.5  Udemy.com    /default.aspx    200     Windows+XP…     21384
10.1.1.6  Udemy.com    /default.aspx    200     Mozilla/4.0…    25069


The figures returned are useful for establishing what normal traffic looks like, and a significant deviation can indicate abnormal request patterns from individual clients. One way to get a bird's-eye view of the ratio of unique client IPs to total requests is to run this query.

UNIQUE/TOTAL REQUEST SCRIPT

SELECT
    COUNT(DISTINCT c-ip) AS UniqueIPs,
    COUNT(ALL c-ip) AS TotalRequests
INTO DATAGRID
FROM %logfile%


This script counts the unique client IPs and then counts all requests, which provides a benchmark figure of unique clients to total requests. A significant change in this pattern may indicate unusual request patterns from one or more clients. If the administrator does detect suspicious request patterns from one host or a group of hosts, the client IP to requests script could be amended to include a reverse DNS lookup to identify the domain behind the IP address. A reverse DNS lookup is slow, so it is not advisable to run it against full logs.
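As a rough sketch of that amendment (REVERSEDNS is a built-in Log Parser function, but the client IP being filtered on here is purely illustrative), the script could be narrowed to a single suspicious client and extended with a reverse lookup:

SELECT
    c-ip AS ClientIP,
    REVERSEDNS(c-ip) AS ClientDomain,
    cs-host AS HostName,
    cs-uri-stem AS URIStem,
    COUNT(*) AS Requests
INTO DATAGRID
FROM %logfile%
WHERE c-ip = '10.1.1.6'
GROUP BY c-ip, REVERSEDNS(c-ip), cs-host, cs-uri-stem
ORDER BY Requests DESC

Because the WHERE clause restricts the query to one client, the slow reverse lookup only runs against the rows of interest rather than the full log.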

By collecting and analyzing the IIS logs, an administrator can build up a baseline and a good understanding of what normal, well-behaved traffic looks like. However, to get a picture of typical levels of poorly behaved requests, the administrator must also create a baseline for the HTTP.sys error log. This is important because IIS will not show rejected requests: if an attacker made 456 successful requests from IP 10.1.1.5, these would appear in the IIS log, but if the attacker also made 209874 requests that were rejected, those would not be shown in the IIS log; they would, however, be recorded in the HTTP.sys error log. Similarly, the UrlScan logs should be analyzed and made part of the baseline.
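As a starting point for that HTTP.sys baseline (a minimal sketch, assuming Log Parser's HTTPERR input format and the default log location under %windir%\System32\LogFiles\HTTPERR), rejected requests can be summarized by reason in much the same way. The script below would be run with the -i:HTTPERR switch and pointed at the httperr*.log files rather than the IIS logs.

SELECT
    s-reason AS Reason,
    COUNT(*) AS Rejected
INTO DATAGRID
FROM %logfile%
GROUP BY s-reason
ORDER BY Rejected DESC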

When building a baseline, the administrator must collate a sufficient body of data over time to leverage the power of large numbers in leveling out discrepancies. The larger the sample, collected over many months, the more accurate the baseline will be. IIS logs provide a wealth of information by default (they can be configured to collect even more), and by using Microsoft's Log Parser an administrator can build an accurate baseline of what is normal.
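One way to put those month-on-month comparisons on a like-for-like footing (a sketch using Log Parser's QUANTIZE and TO_TIMESTAMP functions; the one-hour bucket size is an arbitrary choice) is to summarize request volumes per hour and watch how the hourly profile evolves as the baseline grows:

SELECT
    QUANTIZE(TO_TIMESTAMP(date, time), 3600) AS Hour,
    COUNT(*) AS Requests
INTO DATAGRID
FROM %logfile%
GROUP BY QUANTIZE(TO_TIMESTAMP(date, time), 3600)
ORDER BY Hour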

Get a better understanding for cyber security with a class at Udemy.com