Most of these datasets have been downloaded
-
https://github.com/ocatak/malware_api_class : malware dataset generated with Cuckoo Sandbox from analysis of Windows OS API calls
-

-
Network packets - IoT-23 dataset : https://www.stratosphereips.org/datasets-iot23
- preprocessed dataset (Kaggle) : https://www.stratosphereips.org/datasets-iot23
-
Harvard HPC Ransomware Dataset : https://dataverse.harvard.edu/dataverse/ransomware
- IO_5Events_7Rounds IO_41Events_5Rounds (This set is a collection of 8 block I/O events. We collected the data on 6 different VM loads, 7 rounds each for 22 ransomware, 3 benignware, 1 dry run. Total number of traces in this dataset includes: 6x7x26=1092. Go through the README file in this directory for full details.)
- HPC_5Events_7Rounds HPC_41Events_5Rounds This set is a collection of 41 HPC events. We collected the data on 6 different VM loads, 5 rounds each for 22 ransomware, 3 benignware, 1 dry run. Total number of traces in this dataset includes: 6x5x26=780. This data is used for feature selection (reducing from 41 to 5 events)…
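
The trace counts quoted for the two sets follow from VM loads x rounds x programs run, where the 26 programs are 22 ransomware + 3 benignware + 1 dry run. A minimal sketch that checks the arithmetic; the function and variable names are illustrative only and are not part of the dataset or its README:

```python
# Illustrative check of the trace counts quoted above.
# trace_count and its parameter names are hypothetical helpers, not dataset tooling.
def trace_count(vm_loads: int, rounds: int, ransomware: int,
                benignware: int, dry_runs: int) -> int:
    """Traces = VM loads x rounds x programs (ransomware + benignware + dry runs)."""
    programs = ransomware + benignware + dry_runs
    return vm_loads * rounds * programs


# Block I/O set: 6 VM loads, 7 rounds each
io_traces = trace_count(vm_loads=6, rounds=7, ransomware=22, benignware=3, dry_runs=1)
# HPC set: 6 VM loads, 5 rounds each
hpc_traces = trace_count(vm_loads=6, rounds=5, ransomware=22, benignware=3, dry_runs=1)

print(io_traces, hpc_traces)  # 1092 780
```

After downloading, the same numbers can serve as a sanity check on how many trace files each set should contain, assuming one file per trace.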
-
BODMAS: An Open Dataset for Learning based Temporal Analysis of PE Malware :
-
EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models : https://github.com/elastic/ember/tree/master?tab=readme-ov-file#download
-
Malware & Goodware Dynamic Analysis Reports https://www.kaggle.com/datasets/greimas/malware-and-goodware-dynamic-analysis-reports
-
A curated dataset of malware and benign Windows executable samples for malware researchers https://github.com/Mayachitra-Inc/MaleX (mailing dataset)
MISC (still to download these)
- https://cybersciencelab.com/internet-of-things-malware-dataset/
- API calls generated by dynamic malware analysis : https://www.kaggle.com/datasets/marcuscarpenter97/api-calls-generated-by-dynamic-malware-analysis
- Malware Detection in Network Traffic Data (uses IoT-23) https://www.kaggle.com/datasets/agungpambudi/network-malware-detection-connection-analysis
- [CIC-AndMal-2020] Static-Dynamic Malware analysis https://www.kaggle.com/datasets/albertozorzetto/cic-andmal-2020-dynamic-static-analysis
- Quo Vadis: Dynamic Malware Analysis Dataset https://www.kaggle.com/datasets/dmitrijstrizna/quo-vadis-malware-emulation
Least priority :
- Malware Analysis Datasets: API Call Sequences https://www.kaggle.com/datasets/ang3loliveira/malware-analysis-datasets-api-call-sequences
- Malware Analysis Datasets: Top-1000 PE Imports https://www.kaggle.com/datasets/ang3loliveira/malware-analysis-datasets-top1000-pe-imports