Classification and Online Clustering of Zero-Day Malware

Jurečková, Olha; Jureček, Martin; Stamp, Mark; Di Troia, Fabio; Lórencz, Róbert

Computer Science > Cryptography and Security

arXiv:2305.00605 (cs)

[Submitted on 1 May 2023 (v1), last revised 3 Aug 2023 (this version, v2)]

Authors: Olha Jurečková, Martin Jureček, Mark Stamp, Fabio Di Troia, Róbert Lórencz

Abstract: New malware is generated constantly and in large volumes; it must not only be distinguished from benign samples but also classified into malware families. This requires investigating how existing malware families develop and examining emerging families. This paper focuses on the online processing of incoming malicious samples, assigning them to existing families or, in the case of samples from new families, clustering them. We experimented with seven prevalent malware families from the EMBER dataset: four in the training set and three additional new families in the test set. Based on the classification score of the multilayer perceptron, we determined which samples would be classified and which would be clustered into new malware families. We classified 97.21% of the streaming data with a balanced accuracy of 95.33%. We then clustered the remaining data using a self-organizing map, achieving a purity ranging from 47.61% for four clusters to 77.68% for ten clusters. These results indicate that our approach has the potential to be applied to the classification and clustering of zero-day malware into malware families.
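
The abstract names three measurable steps: routing samples by classifier score, scoring the classified part with balanced accuracy, and scoring the clustered part with purity. The following is a minimal sketch of those three quantities in plain Python, not the authors' implementation. The fixed threshold rule, the function names, and the toy inputs are all assumptions made for illustration; the abstract does not state how the score is turned into a routing decision. Balanced accuracy and purity follow their standard definitions (mean per-class recall, and the share of samples carrying their cluster's majority label).

```python
from collections import Counter


def route_by_score(scores, threshold):
    """Split sample indices by the classifier's highest class score.

    scores: one list of per-class scores per sample (e.g. MLP outputs).
    threshold: hypothetical cut-off; samples below it go to clustering.
    Returns (classify_idx, cluster_idx).
    """
    classify_idx, cluster_idx = [], []
    for i, row in enumerate(scores):
        if max(row) >= threshold:
            classify_idx.append(i)
        else:
            cluster_idx.append(i)
    return classify_idx, cluster_idx


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def purity(cluster_ids, y_true):
    """Fraction of samples that carry their cluster's majority label."""
    by_cluster = {}
    for c, y in zip(cluster_ids, y_true):
        by_cluster.setdefault(c, Counter())[y] += 1
    majority_total = sum(max(cnt.values()) for cnt in by_cluster.values())
    return majority_total / len(y_true)


# Toy data (invented): four samples, four known families.
toy_scores = [
    [0.97, 0.01, 0.01, 0.01],  # confident: classify
    [0.40, 0.35, 0.15, 0.10],  # uncertain: cluster
    [0.02, 0.95, 0.02, 0.01],  # confident: classify
    [0.30, 0.30, 0.20, 0.20],  # uncertain: cluster
]
to_classify, to_cluster = route_by_score(toy_scores, threshold=0.9)
```

In this sketch a lower threshold sends more samples to the classifier and fewer to clustering, so the share of classified data and the balanced accuracy on it trade off against each other.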

Subjects:	Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as:	arXiv:2305.00605 [cs.CR]
	(or arXiv:2305.00605v2 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2305.00605

Submission history

From: Martin Jureček
[v1] Mon, 1 May 2023 00:00:07 UTC (873 KB)
[v2] Thu, 3 Aug 2023 12:04:46 UTC (1,528 KB)
