In-Network PCA and Anomaly Detection
Complete Citation
- Ling Huang, XuanLong Nguyen, Minos Garofalakis, Anthony Joseph, Michael Jordan, and Nina Taft. In Proceedings of Advances in Neural Information Processing Systems (NIPS) 19, Vancouver, B.C., December 2006.
Abstract
We consider the problem of network anomaly detection in large distributed systems. In this setting, Principal Component Analysis (PCA) has been proposed as a method for discovering anomalies by continuously tracking the projection of the data onto a residual subspace. This method was shown to work well empirically in highly aggregated networks, that is, those with a limited number of large nodes and at coarse time scales. This approach, however, has scalability limitations. To overcome these limitations, we develop a PCA-based anomaly detector in which adaptive local data filters send to a coordinator just enough data to enable accurate global detection. Our method is based on a stochastic matrix perturbation analysis that characterizes the tradeoff between the accuracy of anomaly detection and the amount of data communicated over the network.
Annotations
Problem: too much data is sent across the network. The goal is to send as little data as possible without reducing anomaly detection accuracy too much.
Volume anomalies: unusual traffic load levels in a network, caused by events such as worms, DDoS attacks, device failures or misconfigurations, etc.
PCA is a projection method that maps a given set of data points onto principal components ordered by the amount of data variance they capture. The input is an m x n matrix, where the m rows are a sliding window of m data points and the n columns correspond to the n monitors.
Due to the high level of traffic aggregation on ISP backbone links, volume anomalies can often go unnoticed, buried within normal traffic patterns. Normal traffic actually lies in a very low-dimensional subspace, so separating out this normal traffic subspace using PCA (to find the principal traffic components) makes it much easier to identify volume anomalies in the remaining subspace.
For the Abilene network with 41 links, most of the data variance can be captured by the first k = 4 principal components. The remaining n - k components span the abnormal traffic subspace.
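The split into a normal and an abnormal subspace can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the data is synthetic rather than the Abilene trace, the function name `residual_spe` and the values of m, n, and k are assumptions chosen for the example, and PCA is computed with a plain SVD of the centered matrix. The score computed per time step is the squared norm of the projection onto the residual subspace; how to threshold that score is left out because these notes do not specify it.

```python
import numpy as np


def residual_spe(Y, k):
    """Squared norm of each row's projection onto the residual subspace."""
    Yc = Y - Y.mean(axis=0)                  # center each monitor's column
    # Rows of Vt are the principal directions, ordered by captured variance.
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    P = Vt[:k].T                             # n x k basis of the normal subspace
    residual = Yc - Yc @ P @ P.T             # part of each row outside that subspace
    return np.sum(residual ** 2, axis=1)


# Synthetic stand-in for link traffic: k latent flows mixed onto n links.
rng = np.random.default_rng(0)
m, n, k = 200, 8, 2                          # illustrative sizes, not Abilene's
factors = rng.normal(scale=10.0, size=(m, k))
mixing = rng.normal(size=(k, n))
Y = factors @ mixing + rng.normal(scale=0.1, size=(m, n))

Y[150, 3] += 20.0                            # inject a volume anomaly on one link

spe = residual_spe(Y, k)
print("time step with largest residual energy:", int(np.argmax(spe)))
```

Because the normal traffic occupies only k dimensions, the injected spike is small compared with the total load but dominates the residual, which is why the residual score is a convenient place to look for volume anomalies.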
A monitor sends the coordinator an update of its data only when a local constraint is violated, so the coordinator receives an approximate (imprecise), or “perturbed”, view of the global state. The authors use stochastic matrix perturbation theory to analyze the effect of this perturbation on PCA-based anomaly detection. They validate the approach with their own trace-driven simulator on a one-week trace collected from the Abilene network (traffic load measured every 10 minutes on 41 links). With a relative eigen-error of 1.5% (yielding a 4% missed detection rate and a 6% false alarm rate), they can filter out more than 90% of the data.
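The local filtering idea can be illustrated with a small sketch. It assumes the simplest possible local constraint: a monitor stays silent while its reading is within a fixed slack `delta` of the last value it reported. The class `LocalMonitor`, the helper `run_filter`, the slack value, and the readings are all hypothetical; the paper's filters are adaptive and its actual constraint may differ.

```python
class LocalMonitor:
    """Hypothetical monitor that suppresses updates within +/- delta of the last value sent."""

    def __init__(self, delta):
        self.delta = delta        # assumed local slack (filter width)
        self.last_sent = None     # value the coordinator currently holds

    def observe(self, value):
        """Return True if this reading violates the constraint and must be sent."""
        if self.last_sent is None or abs(value - self.last_sent) > self.delta:
            self.last_sent = value
            return True
        return False


def run_filter(readings, delta):
    """Replay readings through one monitor; return (coordinator_view, messages_sent)."""
    monitor = LocalMonitor(delta)
    view, sent = [], 0
    for value in readings:
        if monitor.observe(value):
            sent += 1
        view.append(monitor.last_sent)   # coordinator keeps the last reported value
    return view, sent


readings = [100, 101, 99, 102, 150, 151, 149, 100]
view, sent = run_filter(readings, delta=5)
print(view, sent)
```

In this sketch the coordinator's view never differs from the true reading by more than `delta`, which is the kind of bounded perturbation whose effect on the PCA result the paper analyzes. A larger slack saves more messages at the cost of a more perturbed view.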
Handout