Malware memory analysis (CIC-MalMem-2022)

Obfuscated malware is malware that hides itself to avoid detection and removal. The obfuscated malware dataset is designed to test methods that detect obfuscated malware through memory analysis. The dataset was created to represent a real-world situation as closely as possible, using malware that is prevalent in the real world. Made up of Spyware, Ransomware, and Trojan Horse malware, it provides a balanced dataset that can be used to test obfuscated malware detection systems.

The memory dumps are captured in debug mode so that the dumping process itself does not show up in the dumps. This gives a more accurate example of what an average user would have running at the time of a malware attack.

1. Introduction

The obfuscated malware dataset focuses on simulating real-world scenarios. Figure 1 shows the breakdown of benign and malicious memory dumps. Figure 2 shows which malware families are used in each malware category: Spyware (2A), Ransomware (2B), and Trojan Horse (2C). Figure 3 shows the malware families used across the whole dataset.

Figure 1: Memory Dump Categories

Figure 2A: Spyware Families

Figure 2B: Ransomware Families

Figure 2C: Trojan Horse Families

Figure 3: Complete dataset breakdown

2. Dataset details

The dataset is balanced: 50% of the memory dumps are malicious and 50% are benign. It contains a total of 58,596 records, 29,298 benign and 29,298 malicious. The breakdown by malware family is shown in the table below. Figure 4 shows the total count of each malware family from each malware category.
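The balance described above can be checked once the data is loaded. The following is a minimal sketch using only the Python standard library. It assumes the dataset is a CSV file with a label column named `Class`; the column name, the label values, and the file name in the commented usage are assumptions rather than details stated on this page, so adjust them to the file you actually download.

```python
import csv
from collections import Counter
from typing import Iterable, TextIO


def class_counts(rows: Iterable[dict], label_column: str = "Class") -> Counter:
    """Count records per label. `label_column` is an assumed column name."""
    return Counter(row[label_column] for row in rows)


def is_balanced(counts: Counter) -> bool:
    """True when there are at least two labels and all have equal counts."""
    return len(counts) > 1 and len(set(counts.values())) == 1


def load_counts(handle: TextIO, label_column: str = "Class") -> Counter:
    """Read CSV data from an open text handle and count records per label."""
    return class_counts(csv.DictReader(handle), label_column)


# Hypothetical usage; "malmem.csv" is a placeholder file name:
# with open("malmem.csv", newline="") as f:
#     counts = load_counts(f)
#     print(counts, is_balanced(counts))
```

If the file matches the figures given above, each of the two labels should come out at 29,298 records.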

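Malware category	Malware family	Count
Trojan Horse	Zeus	195
Trojan Horse	Emotet	196
Trojan Horse	Refroso	200
Trojan Horse	Scar	200
Trojan Horse	Reconyc	157
Spyware	180Solutions	200
Spyware	Coolwebsearch	200
Spyware	Gator	200
Spyware	Transponder	241
Spyware	TIBS	141
Ransomware	Conti	200
Ransomware	MAZE	195
Ransomware	Pysa	171
Ransomware	Ako	200
Ransomware	Shade	220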

Figure 4: Malware Table Breakdown
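As a quick arithmetic check on the table above, the per-family counts can be summed per category. The sketch below hard-codes the counts from the table; the dictionary layout and the names `FAMILY_COUNTS` and `category_totals` are illustrative only. These per-family counts add up to a few thousand rather than to the 29,298 malicious records, and the page does not state how the two figures relate.

```python
# Per-family counts copied from the table above.
FAMILY_COUNTS = {
    "Trojan Horse": {"Zeus": 195, "Emotet": 196, "Refroso": 200, "Scar": 200, "Reconyc": 157},
    "Spyware": {"180Solutions": 200, "Coolwebsearch": 200, "Gator": 200, "Transponder": 241, "TIBS": 141},
    "Ransomware": {"Conti": 200, "MAZE": 195, "Pysa": 171, "Ako": 200, "Shade": 220},
}


def category_totals(table: dict) -> dict:
    """Sum the family counts within each malware category."""
    return {category: sum(families.values()) for category, families in table.items()}


totals = category_totals(FAMILY_COUNTS)
for category, total in totals.items():
    print(f"{category}: {total}")
print(f"All families: {sum(totals.values())}")
# Prints:
# Trojan Horse: 948
# Spyware: 982
# Ransomware: 986
# All families: 2916
```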

License

Tristan Carrier, Princy Victor, Ali Tekeoglu, Arash Habibi Lashkari, “Detecting Obfuscated Malware using Memory Feature Engineering”, The 8th International Conference on Information Systems Security and Privacy (ICISSP), 2022.

Download the dataset
