Abstract
To perform robust statistical anomaly detection in cyber-security, we must build realistic models of the traffic patterns within a computer network. It is therefore important to understand the dependencies that exist between the large number of routinely interacting communication pathways within such a network. Pairs of interacting nodes in any directed communication network can be modelled as point processes, where each event in a process indicates information being sent between two nodes. For two processes A and B, each denoting the interactions between a distinct pair of computers (an edge), we wish to assess whether events occurring in A trigger subsequent events in B. We introduce a test to detect such dependence when only a subset of the events in A exhibits a triggering effect on process B; this test allows us to detect even weakly correlated edges within a computer network graph. Since computer network events occur as a high-frequency data stream, we consider the asymptotics of this problem as the number of events goes to infinity whilst the proportion exhibiting dependence goes to zero, and examine the performance of tests that are provably consistent in this framework. An example of how this method can be used to detect genuine causal dependencies is provided using real-world event data from the enterprise computer network of Los Alamos National Laboratory.
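The abstract does not state the test statistic itself. As a rough illustration of the sparse, weak-signal setting it describes, the sketch below computes the standard higher criticism statistic of Donoho and Jin (2004) from a collection of p-values, which may differ from the variant used in the paper. The function name `higher_criticism`, the parameter `alpha0`, the lower cut-off at 1/n, and the simulated p-values are all assumptions made for this example; in particular, how a p-value would be derived for each event in A is not specified here.

```python
import numpy as np


def higher_criticism(p_values, alpha0=0.5):
    """Standard higher criticism statistic (Donoho and Jin, 2004).

    Hypothetical helper, not taken from the paper. It maximises the
    standardised gap between the empirical and uniform distribution
    functions over the smallest alpha0 fraction of the sorted p-values,
    ignoring p-values below 1/n to stabilise the extreme lower tail.
    """
    p = np.sort(np.asarray(p_values, dtype=float))
    n = p.size
    i = np.arange(1, n + 1)
    k = max(1, int(np.floor(alpha0 * n)))
    # Keep only order statistics that are in range and give a finite score.
    valid = (i <= k) & (p >= 1.0 / n) & (p < 1.0)
    if not valid.any():
        return float("-inf")
    pv, iv = p[valid], i[valid]
    z = np.sqrt(n) * (iv / n - pv) / np.sqrt(pv * (1.0 - pv))
    return float(np.max(z))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 10_000

    # Null case (assumed): no triggering, so every p-value is uniform on (0, 1).
    null_p = rng.uniform(size=n)

    # Sparse alternative (assumed): 1% of events in A carry a triggering
    # effect and therefore yield unusually small p-values.
    n_signal = n // 100
    sparse_p = np.concatenate([
        rng.uniform(0.0, 1e-3, size=n_signal),
        rng.uniform(size=n - n_signal),
    ])

    print("HC under independence:     ", higher_criticism(null_p))
    print("HC under sparse dependence:", higher_criticism(sparse_p))
```

The statistic is large when an unexpectedly high number of p-values is small, even if those p-values are too few to noticeably shift the overall mean. That is the regime the abstract targets: many events, with a vanishing proportion exhibiting dependence.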
Citation information
Price-Williams, M., Heard, N. A. and Rubin-Delanchy, P. (2018) Detecting weak dependence in computer network traffic patterns using higher criticism. In: Journal of the Royal Statistical Society, Series C.