FloCon 2019 has ended


General Session
Wednesday, January 9
 

8:30am

Monitoring Massive Network Traffic Using Bayesian Inference
Monitoring network logs, from DNS requests to TCP connections, is challenging because these logs are both large and noisy, which hinders efforts to identify malicious traffic. In a sizable network, for example, it is common to see thousands of requests made to one destination; at one time the frequency is cyclical and at another sporadic. This random behavior in network connections causes most unsupervised and supervised statistical modeling to fail. In this talk we discuss methods for performing large-scale Bayesian inference on DNS logs aggregated into count data, representing the number of requests from tens of millions of stub IPs made to hundreds of millions of domains. We describe novel mixtures of common discrete distributions, or hidden Markov processes, that model some of the most sporadic network traffic volumes to domain names. For example, we discuss how the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) distributions, and their more generalized forms, provide parameters we can use to differentiate traffic volumes associated with day-to-day threats from spam and malvertising to widespread threats arising from botnets. Using Apache Spark and Stripe’s newly released Rainier, a powerful Bayesian inference library for the JVM, we run tens of thousands of simulations per domain, fitting the underlying distribution of requests, then repeating this for millions of domains. We profile the performance by fitting a variety of mixtures of distributions to different sporadic traffic volumes. Running simulations often, we then show how to efficiently trend parameter estimates using exponential moving averages to model day/night and weekday/weekend traffic distributions. With hundreds of thousands of simulated and archived traffic patterns associated with benign and malicious network traffic, we show how to reduce false alarms to effectively monitor evolving online threats and masquerading malicious traffic.
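
As a rough illustration of the zero-inflated Poisson idea mentioned in the abstract, the sketch below fits a ZIP distribution to per-interval request counts for a single domain. It is not the speaker's method and does not use Rainier or Spark: it swaps in a plain expectation-maximization point estimate using only the Python standard library, and the function names (`zip_pmf`, `fit_zip`) and the sample counts are hypothetical.

```python
import math


def zip_pmf(k, pi, lam):
    """P(X = k) for a zero-inflated Poisson: with probability pi the count is a
    structural zero, otherwise it is drawn from Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi * (k == 0) + (1.0 - pi) * poisson


def fit_zip(counts, iterations=200):
    """Estimate (pi, lam) from non-negative integer counts by EM.

    This is a maximum-likelihood point estimate, standing in for the full
    Bayesian inference described in the talk.
    """
    n = len(counts)
    if n == 0:
        raise ValueError("counts must be non-empty")
    n_zero = sum(1 for c in counts if c == 0)
    total = sum(counts)
    pi, lam = 0.5, max(total / n, 1e-9)  # crude starting point
    for _ in range(iterations):
        # E-step: probability that an observed zero is a structural zero.
        z = pi / (pi + (1.0 - pi) * math.exp(-lam))
        # M-step: re-estimate parameters from the expected structural zeros.
        structural = n_zero * z
        pi = structural / n
        lam = total / (n - structural)
    return pi, lam


# Hypothetical hourly request counts to one domain: mostly idle, bursty when active.
hourly_requests = [0, 0, 0, 12, 9, 0, 0, 11, 0, 0, 10, 8, 0, 0, 0, 13]
pi_hat, lam_hat = fit_zip(hourly_requests)
print(f"inflation={pi_hat:.3f} rate={lam_hat:.2f}")
```

Under this reading, the inflation parameter captures how often a domain is idle and the rate captures how heavily it is queried when active, which are the kinds of parameters the abstract describes using to tell traffic patterns apart.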

Attendees will learn:
• The latest advances in Bayesian inference on the JVM using Stripe’s open-source Rainier project
• To scale Bayesian inference to internet-scale datasets using Apache Spark
• To build time-dependent risk and severity metrics identifying network anomalies associated with pernicious threats like spam, malvertising, and botnets

The session also covers mathematical concepts needed to:
• Model sporadic network traffic using discrete probability distributions
• Build Hidden Markov Models (HMMs) capturing idle/active states of network traffic
• Use Markov chain Monte Carlo (MCMC) methods
• Handle outliers, false alarms, and time-dependent trends
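
The time-dependent trends in the list above can be pictured with a small exponential moving average sketch. It assumes each simulation run yields one parameter estimate tagged with a bucket such as "day" or "night"; the function names (`ema_trend`, `ema_by_bucket`), the smoothing factor, and the bucket labels are illustrative assumptions, not details taken from the talk.

```python
def ema_trend(estimates, alpha=0.2):
    """Exponentially weighted moving average of a sequence of estimates.

    alpha close to 1 follows new estimates quickly; alpha close to 0
    smooths more heavily and damps isolated outliers.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    current = None
    for value in estimates:
        current = value if current is None else alpha * value + (1.0 - alpha) * current
        smoothed.append(current)
    return smoothed


def ema_by_bucket(observations, alpha=0.2):
    """Keep one running EMA per bucket (for example 'day' and 'night').

    observations is an iterable of (bucket, estimate) pairs in time order.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    state = {}
    for bucket, value in observations:
        previous = state.get(bucket)
        state[bucket] = value if previous is None else alpha * value + (1.0 - alpha) * previous
    return state


# Hypothetical rate estimates from successive fits, tagged by time of day.
runs = [("day", 10.0), ("night", 2.0), ("day", 20.0), ("night", 3.0)]
print(ema_by_bucket(runs, alpha=0.5))
```

Keeping only the latest smoothed value per bucket means the trend can be updated after each new run without storing the full history of estimates.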

Speakers

David Rodriguez

Senior Research Engineer, Cisco Systems, Inc
David Rodriguez works as a Senior Research Engineer at Cisco Umbrella (OpenDNS). He has co-authored multiple pending patents with Cisco in distributed machine learning applications centered around deep learning and behavioral analytics. He has an MA in Mathematics from San Francisco...



Wednesday January 9, 2019 8:30am - 9:00am
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

9:00am

Arbitrary Albatross: Neutral Naming of Vulnerabilities at Scale
Vulnerability identification is critical defensive security infrastructure. We have CVE, which is improving in scope and coverage, but CVE assigns numbers, and people like words. Phrases. Names. From Heartbleed to Efail, there’s a trend in security research to market disclosure events with catchy brand names. Some are annoyed by this trend. Is annoyance justified? Names imply importance. Is the claimed importance justified? It may be that a more human-oriented handle is beneficial. We explore the issues around named vulnerabilities and present a system to generate names separate from implied importance.

Speakers

Leigh Metcalf (Software Engineering Institute)

Senior Network Security Research Analyst, CERT Division - Software Engineering Institute
Leigh Metcalf has a PhD from Auburn University in Mathematics. She has been at CERT for over 8 years as a cybersecurity researcher and is the co-Editor-in-Chief of ACM Digital Threats: Research and Practice. She is also the primary author of the book Cybersecurity and Applied Mathematics...



Wednesday January 9, 2019 9:00am - 9:30am
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

9:30am

Using Generative Adversarial Networks to Harden Phishing Class
As machine learning classifiers are increasingly deployed for defensive cybersecurity purposes, there is a growing interest in using adversarial machine learning to allow for the safe use of these classifiers. One area of focus is on building classifiers that are robust to evasion attacks, where evasion attacks are adversarial examples specifically crafted to defeat a machine learning model.

In this presentation, we explore the use of generative adversarial networks (GANs) to construct synthetic phishing domains as potential evasion attacks, and test the value of including these generated domains in the training set of a machine learning classifier designed to correctly label phishing and non-phishing domains. Specifically, we test the hypothesis that by training a classifier on an augmented set of data that includes generated domains, we will build a more robust classifier for the task of identifying phishing domains. To perform this testing, we construct several random forest classifiers, all of which use the same set of hand-engineered features.

We first develop an initial classifier that is trained on a corpus of benign and phishing domains, with no generated examples. We develop three additional random forest classifiers, two of which are trained on synthetic examples generated by different GANs and one of which is trained on a corpus that includes additional genuine phishing domains. The purpose of including this final model is to determine whether any performance gains achieved by the GAN-augmented classifiers can be explained by a simple increase in the size of the training data. We test all four models on a holdout set of domains, which includes benign, phishing, and generated domains. While these test set results indicate that GAN-hardened classifiers are more robust to potential evasion attacks, we also use results obtained by deploying all four models in an operational prototype environment to determine the real value proposition. Real-world testing allows us to move beyond academic validation to concretely demonstrate whether this type of approach shows meaningful promise in improving the safe adoption of defensive machine learning classifiers. We conclude with a discussion on areas for future work and extensions.

What will attendees learn?
Although there is a lot of hype surrounding deep learning methods, they are most often discussed in the context of image generation and recognition problems, which can make it difficult to understand their potential use and value for cybersecurity problems. This talk will highlight the potential advantages as well as the challenges of applying advanced machine learning methods like generative adversarial networks to a relevant problem in security. Attendees will gain an understanding of the value of using these techniques to develop robust machine learning classifiers, and they will leave with suggestions and ideas for how to apply them in their own security operations.


Speakers

Jen Burns

Senior Cybersecurity Engineer, The MITRE Corporation
Jen Burns is a senior cybersecurity engineer who joined MITRE shortly after earning her master’s degree in information security at Carnegie Mellon University. She's a technical lead on MITRE’s cyber threat intelligence strategy, focusing on the efforts to move ATT&CK publication...

Emily Heath

Capability Area Lead, The MITRE Corporation
Emily Heath is the Capability Area Lead for Cyber Data Analytics and Malware in the Defensive Operations Department at the MITRE Corporation. Her work focuses on the application of machine learning, analytics, and optimization approaches to problems in cybersecurity, ranging from...



Wednesday January 9, 2019 9:30am - 10:00am
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

10:00am

Hunting Frameworks
In this talk, I will be discussing the type of information that should be continuously collected and kept on-hand for investigative value in the case of a network compromise. I plan to address the value of such artifacts in an investigation. Additionally, I plan to note several open source solutions and resources that exist to assist in these endeavors. I will likely also touch on the different forms of "hunting" (indicator-based vs hypothesis-driven).

Hunting has been a buzzword for a few years. Talks abound on how to find anomalies within datasets using various methods. However, rarely does a talk present a framework for hunting. How do I actually get started within the field? What data should be collected and centralized? Can the data be enriched? How do you hunt with this data?

Fortunately, lots of great resources exist for building out a functional environment for hunting. Once the environment exists, resources like MITRE's ATT&CK and testing tools such as red team emulation tools allow teams to quickly build and validate capabilities. In this talk, we will put all these pieces together to establish a framework for hunting by discussing the key points of a hunt: the types of data that are important, how to learn from and enrich data in your own environment, and hunting concepts driven by various methods. This talk aims to empower operators everywhere in their network defense capacities.

Attendees will learn:
* The benefits of holistic log aggregation for incident validation, incident response, and hunting
* Hunting concepts
* Resources available for hunting

Speakers

David Gainey

Defense Information Systems Agency (DISA)
David Gainey has been responding to system and network compromises for 10+ years with DISA. His work involves analyzing isolated, compromised systems and malware; increasing defensive posture; maturing incident response tactics, techniques and procedures (TTPs); and sharing knowledge...



Wednesday January 9, 2019 10:00am - 10:30am
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130