IoT-DIAD 2024 | Datasets | Research | Canadian Institute for Cybersecurity | UNB

Canadian Institute for Cybersecurity

CIC IoT-DIAD 2024 dataset

A dual-function dataset for IoT device identification and anomaly detection

The primary goal of this research is to introduce a comprehensive IoT attack dataset designed for both IoT device identification and anomaly detection, aiming to advance security analytics applications for real-world IoT environments. To achieve this, 33 distinct attacks are conducted within an IoT topology comprising 105 devices at the Canadian Institute for Cybersecurity.

These attacks are classified into seven categories: DDoS, DoS, Recon, Web-based, Brute Force, Spoofing, and Mirai. All attacks are executed by malicious IoT devices targeting other IoT devices.

The proposed approach combines packet-based and flow-based feature extraction to obtain a diverse set of features for robust anomaly detection and device classification. This combined feature set draws on attributes from several domains, including HTTPS-related features, handshake information, and User Agent strings, which are extracted specifically for IoT device identification. The feature set also includes attributes specialized for anomaly detection, such as stream, channel, and jitter metrics, which are calculated over different time intervals to improve the model’s anomaly detection capabilities. The following workflow illustrates the integrated framework for the IoT Device Identification and Anomaly Detection System.

Data descriptions

  • Benign
  • DDoS
  • Brute Force
  • Spoofing
  • DoS
  • Recon
  • Web-based
  • Mirai

IoT device tables

Researchers focusing on IoT device identification and anomaly detection can directly utilise the extracted features stored in CSV files to train machine learning and deep learning models, with specified labels provided for each task.
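As a rough illustration of that workflow, the sketch below reads one feature CSV and separates the label column from the feature columns, using only the Python standard library. The column name `Label` and the feature names in the usage example are assumptions made for illustration; the actual column names are documented in the README.txt files that ship with the dataset.

```python
import csv
from collections import Counter


def load_features(csv_file, label_column="Label"):
    """Split an open CSV file into feature rows and a list of labels.

    `label_column` is a hypothetical name; check the dataset README
    for the real label column of each task.
    """
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or label_column not in reader.fieldnames:
        raise KeyError(f"column {label_column!r} not found in CSV header")
    rows, labels = [], []
    for record in reader:
        # Remove the label from the record so only features remain.
        labels.append(record.pop(label_column))
        rows.append(record)
    return rows, labels


def class_counts(labels):
    """Count how many samples each class has, to spot class imbalance."""
    return Counter(labels)
```

A typical call would be `with open(path, newline="") as f: rows, labels = load_features(f)`. Values are returned as strings, so numeric features still need to be converted before they are passed to a model.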

Dataset directories

The main dataset directory (CIC IoT-DIAD 2024) contains two subdirectories, each holding network traffic features extracted from Pcap files with a different feature extraction approach:

  • AD_Flow-based-features: Contains features extracted using CICFlowMeter (.csv files). This is expected to be used in Anomaly Detection (AD) and attack classification studies for IoT devices.
  • DI_AD_Packet-based-features: Contains features extracted using Packet-per-Packet analysis from Pcap files (.csv files). This dataset can be simultaneously used in both Device Identification (DI) and Anomaly Detection (AD) studies for IoT devices.
  • README.txt: Each subdirectory contains a README.txt file that provides a description of the features available in the corresponding .csv files.
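The layout described above can be indexed with a short script. This is a minimal sketch that assumes the archive has been unpacked locally and that the two subdirectory names match the ones listed above exactly; the function name `index_dataset` and the recursive search for `.csv` files are illustrative choices, not part of the dataset.

```python
from pathlib import Path

# Subdirectory names as given in the dataset description.
SUBDIRS = ("AD_Flow-based-features", "DI_AD_Packet-based-features")


def index_dataset(root):
    """Map each feature subdirectory to its CSV files and README path.

    Missing subdirectories yield an empty file list, and a missing
    README.txt is reported as None, so a partial download still works.
    """
    root = Path(root)
    index = {}
    for name in SUBDIRS:
        sub = root / name
        readme = sub / "README.txt"
        index[name] = {
            # Search recursively in case CSVs are grouped in nested folders.
            "csv_files": sorted(sub.rglob("*.csv")) if sub.is_dir() else [],
            "readme": readme if readme.is_file() else None,
        }
    return index
```

For example, `index_dataset("CIC IoT-DIAD 2024")["AD_Flow-based-features"]["csv_files"]` would list the flow-based feature files, assuming the directory was unpacked under that name in the current working directory.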

Acknowledgements

The authors express their gratitude to Mastercard Vancouver Tech Hub and the Canadian Institute for Cybersecurity (CIC) for their financial and educational support.

CIC IoT dataset 2023: Neto EC, Dadkhah S, Ferreira R, Zohourian A, Lu R, Ghorbani AA. CICIoT2023: A real-time dataset and benchmark for large-scale attacks in IoT environment. Sensors. 2023 Jun 26;23(13):5941.

Citation

More details and information on the feature descriptions, feature extraction methodologies, and baseline machine learning models used for evaluation and comparison are available in the following paper. Researchers using this dataset are requested to cite the associated research publication.

M. Rabbani, J. Gui, F. Nejati, Z. Zhou, A. Kaniyamattam, M. Mirani, G. Piya, I. Opushnyev, R. Lu, A. A. Ghorbani. "Device Identification and Anomaly Detection in IoT Environments," IEEE Internet of Things Journal, Dec 2024.

Download the dataset