Name: Using Generative Adversarial Networks to Harden Phishing Classifiers
Start: 2019-01-09T09:30:00-0500
End: 2019-01-09T10:00:00-0500

Using Generative Adversarial Networks to Harden Phishing Classifiers

As machine learning classifiers are increasingly deployed for defensive cybersecurity purposes, there is a growing interest in using adversarial machine learning to allow for the safe use of these classifiers. One area of focus is on building classifiers that are robust to evasion attacks, where evasion attacks are adversarial examples specifically crafted to defeat a machine learning model.

In this presentation, we explore the use of generative adversarial networks (GANs) to construct synthetic phishing domains as potential evasion attacks, and test the value of including these generated domains in the training set of a machine learning classifier designed to correctly label phishing and non-phishing domains. Specifically, we test the hypothesis that by training a classifier on an augmented set of data that includes generated domains, we will build a more robust classifier for the task of identifying phishing domains. To perform this testing, we construct several random forest classifiers, all of which use the same set of hand-engineered features.
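As a rough illustration of what "hand-engineered features" for a domain classifier might look like, the sketch below maps a domain name to a small numeric vector. The abstract does not list the features the speakers used, so every feature here (length, label count, digit count, hyphen count, character entropy) and both function names are assumptions chosen for illustration only. Vectors of this kind could then be passed to a random forest implementation.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(text: str) -> float:
    """Shannon entropy, in bits per character, of the characters in text."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(domain: str) -> List[float]:
    """Map a domain to a fixed-length feature vector.

    Hypothetical feature set; not the features used in the talk.
    """
    name = domain.lower().strip(".")
    labels = name.split(".")
    return [
        float(len(name)),                         # total length, dots included
        float(len(labels)),                       # number of dot-separated labels
        float(sum(ch.isdigit() for ch in name)),  # count of digit characters
        float(name.count("-")),                   # count of hyphens
        shannon_entropy(name),                    # character-level entropy
    ]
```

Because every classifier in the study shares one feature set, a single function like this would be applied identically to benign, phishing, and generated domains.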

We first develop an initial classifier that is trained on a corpus of benign and phishing domains, with no generated examples. We then develop three additional random forest classifiers: two are trained on corpora augmented with synthetic examples generated by different GANs, and one is trained on a corpus that includes additional genuine phishing domains. The purpose of including this final model is to determine whether any performance gains achieved by the GAN-augmented classifiers can be explained by a simple increase in the size of the training data.

We test all four models on a holdout set of domains, which includes benign, phishing, and generated domains. While these test set results indicate that GAN-hardened classifiers are more robust to potential evasion attacks, we also use results obtained by deploying all four models in an operational prototype environment to determine the real value proposition. Real-world testing allows us to move beyond academic validation to concretely demonstrate whether this type of approach shows meaningful promise in improving the safe adoption of defensive machine learning classifiers. We conclude with a discussion of areas for future work and extensions.
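The four-model comparison described above can be sketched as four training corpora built from the same base data. This is a minimal sketch under stated assumptions: the function name, the dictionary keys, and the choice to label generated domains as phishing (1) are all hypothetical, and the abstract does not say how the speakers sized or labeled their corpora.

```python
from typing import Dict, List, Sequence, Tuple

LabeledDomain = Tuple[str, int]  # (domain, label); 1 = phishing, 0 = benign


def build_training_sets(
    benign: Sequence[str],
    phishing: Sequence[str],
    extra_phishing: Sequence[str],
    gan_a: Sequence[str],
    gan_b: Sequence[str],
) -> Dict[str, List[LabeledDomain]]:
    """Build one training corpus per classifier in the comparison.

    Assumption: generated domains are labeled as phishing, since they are
    meant to stand in for potential evasion attacks.
    """
    base = [(d, 0) for d in benign] + [(d, 1) for d in phishing]
    return {
        # Initial classifier: real benign and phishing domains only.
        "baseline": list(base),
        # Two GAN-augmented classifiers, one per generator.
        "gan_a_augmented": base + [(d, 1) for d in gan_a],
        "gan_b_augmented": base + [(d, 1) for d in gan_b],
        # Control: extra genuine phishing domains, to check whether any
        # gain comes only from a larger training set.
        "more_real_phishing": base + [(d, 1) for d in extra_phishing],
    }
```

Keeping the base corpus identical across all four sets means that any difference in holdout performance can be attributed to the added examples rather than to a change in the underlying data.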

What will attendees learn?
Although there is a lot of hype surrounding deep learning methods, they are most often discussed in the context of image generation and recognition problems, which can make it difficult to understand their potential use and value for cybersecurity problems. This talk will highlight the potential advantages as well as the challenges of applying advanced machine learning methods like generative adversarial networks to a relevant problem in security. Attendees will gain an understanding of the value of using these techniques to develop robust machine learning classifiers, and they will leave with suggestions and ideas for how to apply them in their own security operations.

Speakers

Jen Burns

Senior Cybersecurity Engineer, The MITRE Corporation

Jen Burns is a senior cybersecurity engineer who joined MITRE shortly after earning her master’s degree in information security at Carnegie Mellon University. She's a technical lead on MITRE’s cyber threat intelligence strategy, focusing on the efforts to move ATT&CK publication...

Emily Heath

Capability Area Lead, The MITRE Corporation

Emily Heath is the Capability Area Lead for Cyber Data Analytics and Malware in the Defensive Operations Department at the MITRE Corporation. Her work focuses on the application of machine learning, analytics, and optimization approaches to problems in cybersecurity, ranging from...

3. Using Generative Adversarial Networks to Harden Phishing Classifiers PPTX

Wednesday January 9, 2019 9:30am - 10:00am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

General Session, Domain Analysis

Survey https://sei.az1.qualtrics.com/jfe/form/SV_bpXnQIul8bws95b

FloCon 2019