LogCluster

LogCluster is an experimental Perl-based tool for clustering log files and mining line patterns from them. The development of LogCluster was inspired by SLCT, but LogCluster includes a number of novel features and data processing options.

LogCluster is distributed under the terms of the GNU GPL; the latest version is 0.10 (released on March 20, 2019).

To install LogCluster, copy the 'logcluster.pl' file from the distribution to an appropriate directory. Run logcluster.pl --help for detailed help on usage and command line options.

A detailed discussion of the LogCluster algorithm and its application to security log analysis can be found in papers published at CNSM 2015 and MILCOM 2016. Also, experiments with a C-based LogCluster prototype are outlined here.

A paper published at NOMS 2018 describes an unsupervised anomaly detection framework for syslog messages that is powered by LogCluster, while a paper from ICCWS 2020 presents a network anomaly detection framework that employs LogCluster for mining NetFlow data sets.

Finally, chapter 5 of the PhD thesis by Mauno Pihelgas and a paper from CSR 2023 present comparisons of LogCluster with other recent log analysis algorithms.