Automated identification of network anomalies and their causes with interpretable machine learning: The CIAN methodology and TTrees implementation
Articles
Electronic International Standard Serial Number (EISSN)
1873-703X
abstract
Leveraging machine learning (ML) for the detection of network problems dates back to handling call-dropping issues in telephony. However, troubleshooting cellular networks is still a manual task, assigned to experts who monitor the network around the clock. To help in this task we present CIAN (from Causality Inference of Anomalies in Networks), a practical and interpretable ML methodology, which we implement in the form of a software tool named TTrees (from Troubleshooting Trees). We have designed CIAN to automate the identification of the causes of performance anomalies in cellular networks. Our methodology is unsupervised and combines multiple ML algorithms (e.g., decision trees and clustering) and Kolmogorov complexity-inspired data analysis tools that we have developed for this work. CIAN can be used with small volumes of data and is quick at training. Our experiments use diverse data sets obtained from measurements in operational commercial mobile networks. They show that the TTrees implementation of CIAN can automatically identify and accurately classify network anomalies (e.g., cases for which a network low performance is not apparently justified by operational conditions) training with just a few hundreds of data samples. The resulting information hence enables precise troubleshooting actions. In particular, we showcase how TTrees can be flexibly used to monitor the performance of TCP and QUIC protocols when they are adopted to serve mobile users.