Neural Networks Optimizations Against Concept and Data Drift in Malware Detection

Maillet, William; Marais, Benjamin

Computer Science > Cryptography and Security

arXiv:2308.10821 (cs)

[Submitted on 21 Aug 2023]

Abstract: Despite the promising results of machine learning models in malware detection, they face the problem of concept drift due to the constant evolution of malware. This leads to a decline in performance over time, as the data distribution of new files differs from that of the training data, requiring regular model updates. In this work, we propose a model-agnostic protocol to improve a baseline neural network so that it handles the drift problem. We show the importance of feature reduction and of training with the most recent validation set possible, and propose a loss function named Drift-Resilient Binary Cross-Entropy, an improvement on the classical Binary Cross-Entropy that is more effective against drift. We train our model on the EMBER dataset (2018) and evaluate it on a dataset of recent malicious files collected between 2020 and 2023. Our improved model shows promising results, detecting 15.2% more malware than a baseline model.
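The abstract does not define the Drift-Resilient Binary Cross-Entropy, so the sketch below covers only the two ingredients it names that are standard: the classical Binary Cross-Entropy that the proposed loss is said to improve on, and a chronological split that keeps the most recent samples for validation. The function names, the `(timestamp, features, label)` tuple layout, and the `val_fraction` parameter are assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import Any, List, Sequence, Tuple

# Hypothetical sample layout: (timestamp, features, label)
Sample = Tuple[Any, Any, int]


def binary_cross_entropy(y_true: Sequence[int],
                         y_prob: Sequence[float],
                         eps: float = 1e-12) -> float:
    """Mean classical BCE: -[y*log(p) + (1-y)*log(1-p)], averaged over samples.

    This is the baseline loss only; it is NOT the paper's drift-resilient variant.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clamp probabilities so log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def chronological_split(samples: Sequence[Sample],
                        val_fraction: float = 0.25
                        ) -> Tuple[List[Sample], List[Sample]]:
    """Sort by timestamp and hold out the most recent slice as validation."""
    ordered = sorted(samples, key=lambda s: s[0])
    n_val = max(1, int(len(ordered) * val_fraction))
    cut = len(ordered) - n_val
    return ordered[:cut], ordered[cut:]
```

With such a split, every validation sample is at least as recent as every training sample, which is one plausible reading of "the most recent validation set possible"; the paper itself may construct its validation set differently.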

Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2308.10821 [cs.CR]
	(or arXiv:2308.10821v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2308.10821

Submission history

From: Benjamin Marais
[v1] Mon, 21 Aug 2023 16:13:23 UTC (1,167 KB)
