The AndroOBFS Dataset

Description:

With the large-scale adoption of Android OS and ever-increasing contributions in the Android application space, Android has become the number one target of malware authors. In recent years, a large number of automatic malware detection and classification systems have evolved to tackle the dynamic nature of malware growth using either static or dynamic analysis techniques. The performance of static malware detection methods degrades under obfuscation attacks. Although many benchmark datasets are available to measure the performance of malware detection and classification systems, only a single obfuscated malware dataset (PRAGuard) is available to showcase the efficacy of existing malware detection systems against obfuscation attacks. PRAGuard contains outdated samples, dating up to 2013, and does not represent the latest application categories. Moreover, PRAGuard does not provide family information for its malware, so it cannot be used to evaluate the efficacy of malware family classification systems. Hence, we create and release AndroOBFS, a time-tagged obfuscated malware dataset with family information spanning three years, from 2018 to 2020.

The AndroOBFS dataset contains 16279 unique real-world obfuscated malware samples in six categories, viz. (i) Trivial, (ii) Renaming, (iii) Encryption, (iv) Reflection, (v) Code, and (vi) Mix (a mix of two or more methods from (i) to (v)). Out of the 16279 unique obfuscated malware samples, 14579 samples are distributed across 158 families, with at least two unique malware samples in each family. We store all the information about the obfuscated malware and its families in two CSV files: one corresponds to all 16279 samples (16279.csv) and the other to the 14579 familial malware samples (14579.csv). We release this dataset to aid Android malware research in designing robust and obfuscation-resilient malware detection and classification systems.
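As a rough illustration of how the family CSV might be sanity-checked against the figures above, the following minimal Python sketch counts samples per family and lists any family with fewer than two samples. The column layout of 14579.csv is not described here, so the column name `family` and the helper names `count_families` and `families_below` are assumptions made for illustration only; adjust them to the actual header.

```python
import csv
from collections import Counter
from typing import Iterable, List


def count_families(lines: Iterable[str], family_column: str = "family") -> Counter:
    """Count samples per family in CSV text that starts with a header row.

    `family_column` is a hypothetical column name; the real header of
    14579.csv may differ.
    """
    reader = csv.DictReader(lines)
    if reader.fieldnames is None or family_column not in reader.fieldnames:
        raise KeyError(f"column {family_column!r} not found in CSV header")
    # Skip rows whose family field is empty or missing.
    return Counter(row[family_column] for row in reader if row[family_column])


def families_below(counts: Counter, minimum: int = 2) -> List[str]:
    """Return the sorted names of families with fewer than `minimum` samples."""
    return sorted(name for name, n in counts.items() if n < minimum)


# Usage, assuming 14579.csv has been downloaded to the working directory:
#
#     with open("14579.csv", newline="") as handle:
#         counts = count_families(handle)
#     print(len(counts), "families,", sum(counts.values()), "samples")
#     print("families with fewer than two samples:", families_below(counts))
```

If the assumed column name matches, the reported totals can be compared with the 158 families and 14579 samples stated above, and the list of under-populated families would be expected to be empty.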
