Siddharthan, Shunmuga Prabhu and Dix, Marcel and Sprick, Barbara and Klöpper, Benjamin - Archives of Data Science, Series A

Article Details

Title Summarizing Industrial Log Data with Latent Dirichlet Allocation
Authors Siddharthan, Shunmuga Prabhu and Dix, Marcel and Sprick, Barbara and Klöpper, Benjamin
Year 2020
Volume 6(1)
Abstract Industrial systems and equipment produce large log files recording their activities and possible problems. This data is often used for troubleshooting and root cause analysis, but raw log data is poorly suited to direct human analysis. Existing approaches based on data mining and machine learning focus on troubleshooting and root cause analysis. However, if a good summary of industrial log files were available, the files could be used to monitor equipment and industrial processes and to act more proactively on problems. This contribution shows how a topic modeling approach based on Latent Dirichlet Allocation (LDA) helps to understand, organize and summarize industrial log files. The approach was tested on a real-world industrial dataset and evaluated quantitatively by direct annotation.