Originally published July 29, 2017 @ 9:07 pm

Consider a common situation: you want to select lines from a log file that match a specific string, but only within a specific time frame. For example, my /var/log/messages contains entries from one of the firewalls. I would like to see every ACCESS BLOCK entry with a source address external to my network, logged on July 29, 2017 between nine and ten in the morning.

Here’s a sample of the data from /var/log/messages:

The simplest way is to use regex for the timestamp. For example:
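A minimal sketch of the regex approach, assuming every line starts with the traditional syslog timestamp (such as "Jul 29 09:15:02") and that matching entries contain the literal string ACCESS BLOCK. The function name blocked_in_hour is hypothetical. Because classic syslog timestamps carry no year, this also assumes the file only holds entries from 2017. Filtering on an external source address depends on the firewall's field layout, so it is left out here.

```shell
# Print ACCESS BLOCK entries logged on Jul 29 between 09:00:00 and 09:59:59.
# Assumption: each line starts with a syslog-style timestamp, e.g. "Jul 29 09:15:02".
blocked_in_hour() {
    # $1 = log file
    grep -E '^Jul 29 09:[0-5][0-9]:[0-5][0-9] ' "$1" | grep -F 'ACCESS BLOCK'
}
```

It would be called as blocked_in_hour /var/log/messages. The regex pins the hour to 09, so the upper bound of the window is exclusive.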

A more laborious approach is to convert the timestamp fields from the usual date +'%b %d %H:%M:%S' format to the epoch time (date +'%s'). Then you can use integer comparison in bash to select the relevant lines from the log. Here’s an example:
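A rough sketch of the epoch approach, assuming GNU date (for the -d option) and a log whose lines begin with "Mon DD HH:MM:SS". The function name select_window and its argument order are hypothetical. The year has to be passed in, since the syslog timestamp does not include one.

```shell
# Print lines whose timestamp falls in [start, end), prefixed with the epoch value.
# Assumptions: GNU date; lines start with "Mon DD HH:MM:SS"; year supplied by caller.
select_window() {
    # $1 = log file, $2 = year, $3 = window start, $4 = window end (date strings)
    start=$(date -d "$3" +%s) || return 1
    end=$(date -d "$4" +%s) || return 1
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        # Rebuild the timestamp as "DD Mon YYYY HH:MM:SS" so date can parse it.
        ts=$(printf '%s\n' "$line" | awk -v y="$2" '{print $2, $1, y, $3}')
        epoch=$(date -d "$ts" +%s 2>/dev/null) || continue   # skip unparsable lines
        if [ "$epoch" -ge "$start" ] && [ "$epoch" -lt "$end" ]; then
            printf '%s %s\n' "$epoch" "$line"
        fi
    done < "$1"
}
```

For the example in this post, the call might look like select_window /var/log/messages 2017 '2017-07-29 09:00:00' '2017-07-29 10:00:00' | grep -F 'ACCESS BLOCK'. Spawning date once per line is what makes this approach slow on large files.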

The advantage here is that you can work with log files where entries are not in chronological order, as is often the case when multiple remote syslog sources write to the same file. Not only can you identify the correct time range, you can also easily rearrange the matched entries in chronological order. Integer timestamps have other advantages as well: the length of an interval is a simple subtraction, and comparing two timestamps is a plain numeric test.
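As a small illustration of the reordering step, assuming each line has already been prefixed with its epoch timestamp as the first space-separated field (the function name chronological is hypothetical):

```shell
# Sort epoch-prefixed lines numerically on the first field, then drop the epoch
# so the original log lines come out in chronological order.
chronological() {
    # reads the files given as arguments, or stdin if there are none
    sort -k1,1n "$@" | cut -d' ' -f2-
}
```

Leaving out the cut stage keeps the epoch column, which is handy if more arithmetic on the timestamps is planned.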

Let’s imagine a slightly more sophisticated task: we want to run the search above and then build a frequency table that would show a count of matched records for multiple sixty-minute intervals from the current time going back three hours. So, something like this (assuming right now is Jul 29 15:45:23):

Time Range                           Blocked Connections
Jul 29 14:45:23 – Jul 29 15:45:23    3246
Jul 29 13:45:23 – Jul 29 14:45:23    2435
Jul 29 12:45:23 – Jul 29 13:45:23    4582

Here’s one way to get this done:
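A sketch of how such a table could be produced, assuming the log has already been converted so that every line starts with an epoch timestamp, and assuming GNU date (for -d @EPOCH). The function name hourly_counts and its arguments are hypothetical, and each window is treated as (start, end] so that an entry on a boundary is counted once.

```shell
# Count ACCESS BLOCK entries per 60-minute window, newest window first.
# Assumptions: GNU date; input lines look like "EPOCH original log line".
hourly_counts() {
    # $1 = epoch-prefixed log, $2 = reference "now" as epoch, $3 = number of windows
    i=0
    while [ "$i" -lt "$3" ]; do
        end=$(( $2 - i * 3600 ))
        start=$(( end - 3600 ))
        # Count matching lines with start < epoch <= end; n+0 prints 0 when none match.
        count=$(awk -v s="$start" -v e="$end" \
            '$1 > s && $1 <= e && /ACCESS BLOCK/ { n++ } END { print n + 0 }' "$1")
        printf '%s - %s %s\n' \
            "$(date -d "@$start" +'%b %d %H:%M:%S')" \
            "$(date -d "@$end" +'%b %d %H:%M:%S')" \
            "$count"
        i=$(( i + 1 ))
    done
}
```

With a converted file named, say, messages.epoch, the three-hour table would come from hourly_counts messages.epoch "$(date +%s)" 3.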

The time-consuming portion of this process is the conversion of the original log file into a version using epoch timestamps. On a multi-core system you can speed things up by using xargs with its -P option to run several conversions in parallel. Just keep in mind that you’ll need to run the combined output of all xargs processes through sort -k1,1n if you want the resulting file to remain chronological. The more cores you have, the faster this will generally work. Here’s an example:
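One possible shape for the parallel conversion, assuming GNU coreutils (split -n l/N and date -f) and an xargs that supports -P. The function name parallel_to_epoch is hypothetical. This sketch also assumes every line starts with a valid "Mon DD HH:MM:SS" timestamp, because date -f and paste are matched up line by line and a line that fails to parse would shift the pairing.

```shell
# Convert a syslog file to epoch-prefixed lines using several processes,
# then merge the pieces back into chronological order.
parallel_to_epoch() {
    # $1 = log file, $2 = year of the entries, $3 = number of parallel jobs
    dir=$(mktemp -d) || return 1
    # Split into $3 chunks without breaking lines (GNU split).
    split -n "l/$3" "$1" "$dir/chunk."
    # For each chunk: rebuild the timestamps, convert them all with one date
    # process, and paste the epochs back in front of the original lines.
    printf '%s\n' "$dir"/chunk.* | xargs -P "$3" -I{} sh -c '
        awk -v y="$1" "{print \$2, \$1, y, \$3}" "$2" |
            date -f - +%s | paste -d" " - "$2" > "$2.epoch"
    ' sh "$2" {}
    sort -k1,1n "$dir"/chunk.*.epoch
    rm -rf "$dir"
}
```

A call such as parallel_to_epoch /var/log/messages 2017 4 > messages.epoch would use four jobs. Feeding a whole chunk to a single date -f process also avoids starting date once per line, which may matter as much as the parallelism.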

Finally, and mostly for fun: a quick ASCII gnuplot chart of the data. I added a few more data points for visual impact:
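A sketch of how such a chart might be set up, assuming gnuplot is installed and the counts have been written to a two-column file of "EPOCH COUNT" pairs. That file layout and the function name ascii_chart_script are assumptions made for illustration. The function only prints the gnuplot commands, so it can be inspected before being piped into gnuplot.

```shell
# Print a gnuplot script that draws an ASCII chart on the "dumb" terminal.
# Assumption: the data file has two columns, window end (epoch) and count.
ascii_chart_script() {
    # $1 = data file
    cat <<EOF
set terminal dumb
set xdata time
set timefmt "%s"
set format x "%H:%M"
set xlabel "Window end"
set ylabel "Blocked connections"
plot "$1" using 1:2 with linespoints title "ACCESS BLOCK"
EOF
}
```

Running ascii_chart_script counts.dat | gnuplot should draw the chart in the terminal. The axis labels will probably be shown in UTC rather than local time.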

And the result: