Novel Feature Selection Method for APT Detection

Hasanain M. J. Alfouadi; Hiba Abdulrazzak Ahmed

doi:10.47134/jtsi.v3i2.5931

Authors

Hasanain M. J. Alfouadi, University of Al-Qadisiyah
Hiba Abdulrazzak Ahmed, University of Al-Qadisiyah

Keywords:

APT, Cybersecurity, IDS, Feature Selection Techniques, SHAP

Abstract

An Advanced Persistent Threat (APT) is a multistage, highly sophisticated, and covert cyber threat that gains unauthorized access to a network in order to steal valuable data or disrupt the targeted network. These threats often remain undetected for extended periods, which makes early detection critical for mitigating their consequences. In this work, we propose a feature selection method for developing a lightweight intrusion detection system that can identify APTs at the initial compromise stage. Our approach combines the XGBoost algorithm with Explainable Artificial Intelligence (XAI), specifically the SHAP (SHapley Additive exPlanations) method, to identify the features most relevant to the initial compromise stage. On SCVIC-APT-2021, a benchmark dataset for APT detection that contains both benign and malicious traffic records, the proposed method reduced the number of selected features from 77 to just four while maintaining consistent evaluation metrics for the suggested system: 97% precision, 100% recall, and a 98% F1 score. The proposed method not only helps prevent the consequences of a successful APT but also improves understanding of APT behavior at early stages.
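The idea of ranking features by their Shapley contributions and keeping only the top few can be illustrated with a minimal sketch. It does not reproduce the authors' pipeline: the abstract names XGBoost and SHAP on SCVIC-APT-2021, whereas this example assumes a hand-written toy scoring function (`toy_model`), four hypothetical binary features, and brute-force Shapley enumeration instead of a tree-specific SHAP explainer. All function and variable names are illustrative assumptions.

```python
from itertools import combinations, product
from math import factorial


def toy_model(x):
    """Hypothetical stand-in for a trained classifier's score.
    Only features 0 and 2 influence the output; 1 and 3 are noise."""
    if x[0] > 0.5 and x[2] > 0.5:
        return 1.0
    return 0.2 if x[0] > 0.5 else 0.0


def coalition_value(model, x, subset, background):
    """Mean model output when features in `subset` are taken from x
    and every other feature is taken from each background row."""
    total = 0.0
    for b in background:
        mixed = [x[j] if j in subset else b[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values for one sample by enumerating all coalitions.
    Cost grows exponentially with the number of features, so this is
    only practical for a toy example."""
    d = len(x)
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for s in combinations(others, size):
                without_i = coalition_value(model, x, set(s), background)
                with_i = coalition_value(model, x, set(s) | {i}, background)
                phi[i] += weight * (with_i - without_i)
    return phi


def rank_features(model, samples, background):
    """Global importance of each feature: mean absolute Shapley value."""
    d = len(samples[0])
    importance = [0.0] * d
    for x in samples:
        for i, p in enumerate(shapley_values(model, x, background)):
            importance[i] += abs(p) / len(samples)
    return importance


def select_top_k(importance, k):
    """Indices of the k most important features, in ascending index order."""
    order = sorted(range(len(importance)),
                   key=lambda i: importance[i], reverse=True)
    return sorted(order[:k])


# Synthetic data (an assumption): every combination of four binary features.
data = [list(row) for row in product([0, 1], repeat=4)]
importance = rank_features(toy_model, data, data)
selected = select_top_k(importance, k=2)
print("mean |Shapley| per feature:", importance)
print("selected feature indices:", selected)
```

In this toy setting the two features the scoring function ignores receive zero importance, so only features 0 and 2 are kept. With a real gradient-boosted model, the brute-force enumeration would typically be replaced by a tree-aware explainer, and the reduced feature set would be used to retrain and re-evaluate the detector.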
