MLSTM: A GraphSAGE-based Android Malware Detection Method Integrating Mean Aggregation and LSTM

Abstract

With the popularity of smartphones and mobile applications, the threat of Android malware is increasingly serious. To detect Android malware efficiently and accurately, this paper proposes MLSTM, a detection method that combines a mean aggregator with long short-term memory (LSTM). The method is built on the graph sample and aggregate (GraphSAGE) framework. A homogeneous graph of each application is constructed from its permission requests and third-party library call features. MLSTM first averages the features of adjacent nodes and then uses the LSTM to process sequence information, generating the final node embeddings for classification. Tests on the AndroZoo and VirusShare datasets show that, compared with baseline models including the mean aggregator, LSTM aggregator, max-pooling aggregator, graph convolutional network, and gated recurrent unit (GRU), MLSTM has the smallest mean absolute error and root mean square error, 3.84 and 6.26 respectively, giving the best detection performance. With permission features and third-party library features, the accuracy of MLSTM is (98.85 ± 0.12)% and (92.58 ± 0.25)%, respectively, significantly higher than that of GRU under the same permission features. In addition, under the projected gradient descent attack, the attack success rate against MLSTM is 35.28%, indicating good robustness against adversarial attacks. Overall, the proposed method offers good detection performance, robustness, and stability. Applied to real Android devices, it can improve the security and privacy protection of user data, and it provides a reference direction for Android malware detection.
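The two-stage idea in the abstract (average the neighbors' features, then pass a sequence through an LSTM to obtain a node embedding) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The abstract does not say how the LSTM input sequence is built, so the sketch assumes a two-step sequence consisting of the node's own features followed by the mean of its neighbors' features. The toy graph, the feature values, the `LSTMCell` class, and the random untrained weights are all hypothetical.

```python
import math
import random


def mean_aggregate(features, neighbors, node):
    """Element-wise mean of the feature vectors of `node`'s neighbors."""
    dim = len(features[node])
    nbrs = neighbors[node]
    if not nbrs:
        return [0.0] * dim
    return [sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LSTMCell:
    """A standard LSTM cell with random, untrained weights (illustration only)."""

    GATES = ("i", "f", "o", "g")  # input, forget, output, candidate

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        size = input_dim + hidden_dim
        self.hidden_dim = hidden_dim
        self.W = {
            g: [[rng.uniform(-0.5, 0.5) for _ in range(size)]
                for _ in range(hidden_dim)]
            for g in self.GATES
        }
        self.b = {g: [0.0] * hidden_dim for g in self.GATES}

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state
        pre = {
            g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
            for g in self.GATES
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new


def embed_node(features, neighbors, node, cell):
    """Assumed sequence: [own features, mean of neighbor features]."""
    sequence = [features[node], mean_aggregate(features, neighbors, node)]
    h = [0.0] * cell.hidden_dim
    c = [0.0] * cell.hidden_dim
    for x in sequence:
        h, c = cell.step(x, h, c)
    return h  # the final hidden state serves as the node embedding


# Hypothetical toy graph: three nodes with 3-dimensional binary features.
features = {0: [1.0, 0.0, 1.0], 1: [0.0, 1.0, 1.0], 2: [1.0, 1.0, 0.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}

cell = LSTMCell(input_dim=3, hidden_dim=4)
embedding = embed_node(features, neighbors, 0, cell)
```

In a trained model the cell weights would be learned jointly with a classifier placed on top of the embeddings. The sketch only shows the order of operations, mean aggregation first and the recurrent step second.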

Authors

  • Yi Liu, Jiujiang Polytechnic University of Science and Technology, Jiujiang 332020, China, and School of Graduate Studies, Management and Science University, Shah Alam 40100, Malaysia
  • Md Gapar Md Johar, School of Graduate Studies, Management and Science University, Shah Alam 40100, Malaysia
  • Jacquline Tham, Software Engineering and Digital Innovation Centre, Management and Science University, Shah Alam 40100, Malaysia

DOI:

https://doi.org/10.31449/inf.v50i11.10831

Published

04/23/2026

How to Cite

Liu, Y., Johar, M. G. M., & Tham, J. (2026). MLSTM: A GraphSAGE-based Android Malware Detection Method Integrating Mean Aggregation and LSTM. Informatica, 50(11). https://doi.org/10.31449/inf.v50i11.10831