Closed-loop network anomaly detection
Zhou, Qinghai
Permalink
https://hdl.handle.net/2142/121978
Description
- Title
- Closed-loop network anomaly detection
- Author(s)
- Zhou, Qinghai
- Issue Date
- 2023-11-14
- Director of Research (if dissertation) or Advisor (if thesis)
- Tong, Hanghang
- Doctoral Committee Chair(s)
- Tong, Hanghang
- Committee Member(s)
- Sun, Jimeng
- Zhai, ChengXiang
- Chau, Duen Horng
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- data mining
- graph mining
- anomaly detection
- graph neural networks
- Abstract
- Anomalies are defined as rare observations that significantly deviate from the majority. In recent years, with networked data becoming ubiquitous, network anomaly detection (NAD), which aims to identify rare objects in networks, has attracted remarkable attention in a variety of high-impact applications, ranging from social network analysis (e.g., social spammer detection) and online review systems (e.g., opinion spam detection) to financial fraud (e.g., credit card fraud detection). Generally speaking, an NAD algorithm involves three major components: (1) networks, (2) supervision, and (3) users. The vast majority of existing NAD techniques take networks and supervision as input and deliver the detection results (e.g., a top-k list) to the end user. Despite tremendous advances, three key challenges remain. First (rich networks), real-world networks are often sourced from multiple instances or evolve dynamically, whereas the majority of existing NAD approaches are designed for single or multiple static aligned network(s). How to detect anomalies in rich (e.g., multiple, dynamic) networks remains a nascent problem. Second (weak supervision), existing NAD methods are predominantly developed in an unsupervised manner due to the lack of supervision. Nevertheless, how to leverage low-cost weak supervision (e.g., a limited number of labels, labels of coarse granularity) to design supervised algorithms has not been well studied. Third (user interaction), existing methods primarily regard users as the passive receiving end of an NAD algorithm. It is imperative to bring users into the NAD loop to boost both interpretability and detection accuracy. The close interactions between the key challenges in NAD naturally necessitate four major tasks, namely predicting, auditing, augmenting and interpreting.
First, predicting aims to advance detection performance in complex networks by mining crucial knowledge from weak supervision signals. Second, the auditing task studies how user-based anomalous activities and the corresponding alterations on graphs impact network systems. Third, augmenting correlates users and networks, and explores reinforcing the supervision and network information to improve NAD algorithms. Fourth, the goal of interpreting is to help end users understand the outcome of mining techniques through quantitative uncertainty estimation and intuitive visual explanations. The theme of my Ph.D. research is to collectively address the above key challenges in network anomaly detection through these four major tasks. Specifically, for predicting, we have developed GDN to learn anomalous patterns from limited labeled anomalies, and Meta-GDN, which realizes effective meta-knowledge transfer across multiple networks by equipping GDN with a meta-learning algorithm. In addition, we design a generic framework, Wedge, which is capable of identifying node-level anomalies given coarse-grained subgraph supervision. Second, for auditing, we have designed a family of scalable algorithms, Admiring, to analyze the impact of anomalous activities on multi-network systems and on graph learning results. Furthermore, we develop Attent, a generic influence-based query strategy to actively obtain user feedback. Third, for augmenting, we develop G-ADAM, a mixup-based NAD approach that can augment the original limited training data by adaptively interpolating data instances in the embedding space. Moreover, we have studied the problem of dynamically optimizing the user network (e.g., teams) with reinforcement learning. For the interpreting task, we have proposed JuryGCN, which is the first frequentist-based approach to quantify node uncertainty of graph convolutional networks without model training.
JuryGCN has demonstrated superiority in both active learning on node classification and semi-supervised node classification, achieving the best effectiveness and the lowest memory usage among the competing methods. We also develop Extra, an interactive visualization tool, to provide intuitive visual explanations for results in the team recommendation scenario.
- Graduation Semester
- 2023-12
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2023 Qinghai Zhou
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois