RT-APT: A Real-time APT Anomaly Detection Method for Large-scale Provenance Graph

Author: Zhengqiu Weng (翁正秋) Source: School of Data Science and Artificial Intelligence Date added: 2024-11-13

No.: WZUT-2024-21

Title: RT-APT: A Real-time APT Anomaly Detection Method for Large-scale Provenance Graph

Accession number: WOS:001343483700001

CAS journal ranking: Zone 2

Author from this school: Zhengqiu Weng (first author)

Source publication: Journal of Network and Computer Applications, Volume 233

Publication year: 2025

Keywords: APT attack; Provenance graph; Anomaly detection; Clustering analysis

Representative figures:


Figure 1: Architecture of the APT attack detection system based on provenance graphs


Figure 2: The provenance graph of real attack cases


Abstract:

Advanced Persistent Threats (APTs) are prevalent among cyber attacks: attackers employ advanced techniques to control targets and exfiltrate data without being detected by the system. Existing APT detection methods rely heavily on expert rules or specific training scenarios and therefore lack both generality and reliability. This paper proposes RT-APT, a novel real-time APT anomaly detection system for large-scale provenance graphs. First, a provenance graph is constructed from kernel logs, and the WL subtree kernel algorithm aggregates the contextual information of each node in the graph to produce vector representations. Second, the FlexSketch algorithm transforms the streaming provenance graph into a sequence of feature vectors. Finally, K-means clustering is performed on benign feature vector sequences, where each cluster represents a different system state, so that abnormal behaviors can be identified during system execution. RT-APT can thus detect unknown attacks and capture long-term system behaviors. Experiments explore the parameter settings under which RT-APT performs best. In addition, RT-APT is compared with state-of-the-art approaches on three datasets: Laboratory, StreamSpot and Unicorn. The results show that the proposed method outperforms the state-of-the-art approaches in terms of runtime performance, memory overhead and CPU usage.
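The three stages named in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the function names (`wl_labels`, `sketch`, `fit`, `is_anomalous`), the toy graph encoding, and all parameter values are assumptions made for illustration. In particular, the `sketch` step uses plain feature hashing as a stand-in for FlexSketch, and the anomaly rule (distance to the nearest benign centroid exceeding the largest distance seen in training) is one simple choice among many.

```python
import hashlib
import math
import random
from collections import Counter


def wl_labels(adj, labels, iterations=2):
    """Weisfeiler-Lehman relabeling: in each round, a node's new label is a
    hash of its own label plus the sorted labels of its neighbors.
    Returns a histogram of every label seen across all rounds."""
    counts = Counter(labels.values())
    current = dict(labels)
    for _ in range(iterations):
        nxt = {}
        for node in labels:
            neigh = sorted(current[n] for n in adj.get(node, ()))
            sig = current[node] + "|" + ",".join(neigh)
            nxt[node] = hashlib.md5(sig.encode()).hexdigest()[:8]
        current = nxt
        counts.update(current.values())
    return counts


def sketch(counts, dim=16):
    """Hash the label histogram into a fixed-size, L2-normalized vector.
    Assumption: simple feature hashing, used here in place of FlexSketch."""
    vec = [0.0] * dim
    for label, c in counts.items():
        idx = int(hashlib.md5(label.encode()).hexdigest(), 16) % dim
        vec[idx] += c
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, rounds=20, seed=0):
    """Plain Lloyd's algorithm; empty clusters keep their previous centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(rounds):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            [sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def fit(benign_vectors, k=2):
    """Cluster benign vectors; the radius is the largest distance from any
    benign vector to its nearest centroid."""
    centroids = kmeans(benign_vectors, k)
    radius = max(min(dist(p, c) for c in centroids) for p in benign_vectors)
    return centroids, radius


def is_anomalous(vec, centroids, radius, slack=1e-9):
    """Flag a vector that lies farther from every centroid than any benign one."""
    return min(dist(vec, c) for c in centroids) > radius + slack
```

A graph is passed as an adjacency mapping plus a node-label mapping, for example `adj = {"p1": ["f1"]}` with `labels = {"p1": "process", "f1": "file"}`. Each snapshot of the streaming graph would be turned into one vector via `sketch(wl_labels(adj, labels))`, the model fitted on vectors from benign runs only, and later snapshots checked with `is_anomalous`.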

 

Link: https://doi.org/10.1016/j.jnca.2024.104036