FloCon 2019 has ended


General Session
Tuesday, January 8

8:30am EST

FloCon 2019 Chair Angela Horneman will kick off the conference.

Angela Horneman

Network Intelligence Analyst, CERT Division - Software Engineering Institute
Angela Horneman is a Network Intelligence Analyst for the CERT division of the SEI. Her focus is on helping others understand network cyber security topics and solve related problems. They can then make better decisions, improve their security posture, and better interact in the cyber...

Tuesday January 8, 2019 8:30am - 9:00am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

9:00am EST

Cutting Through the Hype: How to Effectively Apply ML to Cybersecurity
Current cybersecurity challenges represent a machine-scale problem, and solving it requires large amounts of automation. Data scales will continue to grow, further compounding the challenge. Defenders need to use the internal network and host log data already at their disposal, cross-network and cross-host, to discover the presence of sophisticated adversaries. This talk will detail a machine-learning-based approach to this difficult problem: automated internal network monitoring, with low false positive rates, to find sophisticated adversaries and their campaigns.

It will discuss the three fundamental requirements to achieve effective monitoring with a reasonable, practical amount of resources:
  1. Focus on the adversary campaign holistically: Using a campaign-oriented framework for monitoring also simplifies what needs to be monitored. You only monitor behaviors the adversary must perform, the ones they cannot avoid, to succeed in their mission. This reduces the noise, false positives, and level of effort required of analysts.
  2. Automation, machine learning, and interpretability: It is not possible today to model the problem directly: the community does not have enough examples of “known bad” (identified, true APT campaigns), and networks are too complex and varied. To frame this as a machine learning problem, it must be broken down into multiple sub-problems of monitoring for individual surprising behaviors. For example: Is this an unusual number of pings? Is this an unusually large data movement?
  3. Adapt to ever-changing environments and adversaries: The training of models must be automatic and not require human intervention, meaning models must train on data in situ, be retrained and updated frequently to stay relevant, support a variety of raw data sources, and be easily updatable to account for the latest adversary tactics.

Any approach that lacks these necessary pieces will not scale to large networks or will lag behind evolving adversaries.
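
The sub-problem framing in point 2 can be made concrete. The sketch below is our own illustration, not the speaker's actual method: it fits a per-host Poisson baseline from that host's own history (the in-situ training of point 3) and scores how surprising a new count, such as a burst of pings, would be.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for a Poisson(lam) variable, via the complement of the CDF."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf

def surprise_score(history, observed):
    """Score how surprising `observed` is against a host's own history.

    The baseline rate is estimated in situ from past counts, so the
    model retrains trivially as new data arrives.
    """
    lam = sum(history) / len(history)  # MLE for the Poisson rate
    return poisson_sf(observed, lam)   # small value => surprising

# A host that normally sends ~5 pings per interval suddenly sends 40.
baseline = [4, 6, 5, 5, 7, 3, 5, 6]
print(surprise_score(baseline, 40) < 1e-6)  # extremely surprising
```

Monitoring many such narrow behaviors, each with its own simple baseline, is what keeps the per-detector false positive rate manageable.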

Attendees will Learn:
1) Knowledge of the ways ML can be effectively and ineffectively applied to the challenges of cybersecurity, so they are better equipped to evaluate different tools for their unique environments
2) A strategic understanding of how to frame the problem of advanced threat detection so that machine learning can be effectively applied
3) A more in-depth understanding of the core behaviors in the adversary campaign, and how that enables a reduction in false positives

Jason Kichen

Vice President of Advanced Security Concepts, eSentire
Jason Kichen serves as the Vice President of Advanced Security Concepts at eSentire; prior to its acquisition by eSentire, Mr. Kichen was the Director of Security Research & Operations at Versive. Previously, Mr. Kichen spent nearly 15 years working in the U.S. intelligence community...

Tuesday January 8, 2019 9:00am - 9:30am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

9:30am EST

Improved Hunt Seeding with Specific Anomaly Scoring
As the practice of hunting has spread through enterprise cyber security, interest in generalized anomaly detectors to seed hunts has also increased. The generally accepted premise seems to be that security events are rare and rare events are almost always anomalous. Therefore, if one seeds hunts with anomalous events, then the hunts are more likely to uncover activity of interest. However, implementing directed hunts in this manner requires the ability to define and detect anomalous behavior within complex systems. This is typically done probabilistically and there are several products available that employ machine learning approaches, such as neural networks, to define anomalous network activity. However, unsupervised learning approaches are inescapably plagued by high false alarm rates which, in turn, lead to analyst alert fatigue. To decrease false alarm rates, one may tune such products to only alert in cases of extremely anomalous events. But this still fails to address the heart of the problem which is that generalized anomaly detection for an entire network is probably not an optimal approach. Rather, defenders should develop a number of specific models. What is required is a scalable and repeatable framework for doing so. We present an open source approach for cyber security experts and data scientists/statistical engineers to collaboratively develop specific anomaly scoring models. Our approach utilizes a non-parametric kernel density estimator to evaluate the distributions of security logs. Once the desired distribution has been learned, analysts may use it to score records with a single-number, probabilistic measure of anomalousness. Logs can then be filtered based on this anomalousness score and rare events can be utilized to seed hunts. After sufficient validation, models may be transitioned to detectors which alert defenders when some criterion is met.
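
As a toy illustration of the kernel-density idea (the presented tool is multivariate and open source; this one-dimensional sketch is our own assumption), a Gaussian KDE can be fit to a log field and used to assign each record a single-number anomalousness score:

```python
import math

class GaussianKDE1D:
    """Minimal 1-D Gaussian kernel density estimator (illustrative only)."""

    def __init__(self, samples, bandwidth=None):
        self.samples = list(samples)
        n = len(self.samples)
        if bandwidth is None:
            # Silverman's rule of thumb for the bandwidth.
            mean = sum(self.samples) / n
            var = sum((x - mean) ** 2 for x in self.samples) / n
            std = math.sqrt(var) or 1.0
            bandwidth = 1.06 * std * n ** (-1 / 5)
        self.h = bandwidth

    def density(self, x):
        n, h = len(self.samples), self.h
        k = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in self.samples)
        return k / (n * h * math.sqrt(2 * math.pi))

    def anomaly_score(self, x):
        """Fraction of training points with higher density than this record
        (1.0 = most anomalous); a single-number, probabilistic-style score."""
        d = self.density(x)
        higher = sum(1 for s in self.samples if self.density(s) > d)
        return higher / len(self.samples)

# Bytes-transferred values from historical logs; one new record stands out.
kde = GaussianKDE1D([100, 120, 110, 95, 105, 130, 115, 98])
print(kde.anomaly_score(5000) > kde.anomaly_score(110))
```

Records scoring above a chosen threshold become hunt seeds; after validation, the same model can be promoted to an alerting detector.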

Attendees will Learn:
Attendees will be presented with a flexible, open source tool for non-parametrically modeling multivariate densities of network logs. Once constructed, such models can be utilized to score the anomalousness of log records and facilitate directed hunting. More subtly, attendees will gain insight into the potential benefits available through iteratively collaborating with statistical engineers/data scientists, such as the construction of highly customizable models for specific phenomena on specific networks.

Brenden Bishop

Data Scientist, Columbus Collaboratory
Brenden Bishop is a data scientist at Columbus Collaboratory. He focuses on developing prototype solutions for network defenders and enterprise IT in a variety of problem areas, including network anomaly detection and Active Directory tidying. He is a graduate of The Ohio State University...

Tuesday January 8, 2019 9:30am - 10:00am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

10:00am EST

Using Triangulation to Evaluate Machine Learning Models
There are few industries using machine learning models with more at stake than network security. Having a high performing statistical model is critical: a false positive error leads to unnecessary work for the network security team while a false negative error increases exposure to malware, threat actors and/or other types of threats. Since there are no perfect machine learning models, as data scientists our task is to first convince ourselves and then convince others that we have a statistical model worthy of defending the network. Persuasion, though, can be difficult because many of the steps and assumptions that go into training a statistical model from data are difficult, if not impossible, to accurately share with the ultimate consumers of the model. As machine learning and other advanced statistical techniques become more widespread within the network analysis community, the need for accurate assessment of models for threat detection is also increasing.

Drawing on ideas from the philosophy of science such as falsifiability and counterfactuals, we present a framework for triangulating the performance of machine learning models using a series of questions to help establish the validity of performance claims. In navigation tasks, triangulation can be used to determine one’s current location based on the angle and distance from other landmarks with known position. We believe triangulation of a different sort is necessary to determine the performance of machine learning models. Each of the steps that go into making a machine learning model including input data selection, sampling, outcome variable selection, feature creation, model selection and evaluation criteria shape the final model and provide necessary context for interpreting the performance results. Our framework highlights ways to uncover assumptions hidden in those choices, identify higher performing models, and ultimately better defend our networks.

Attendees will Learn:
Attendees will be given a series of questions and data queries that can be used to determine the parameters of effectiveness for a machine learning model. Using our framework can help security operators better understand the performance characteristics of machine learning models, helping to avoid unnecessary errors.

Andrew Fast

Chief Data Scientist, CounterFlow AI, Inc.
Andrew Fast is the Chief Data Scientist and co-founder of CounterFlow AI, where he leads the implementation of streaming machine learning algorithms on CounterFlow AI's ThreatEye cloud-native analytics platform for Encrypted Traffic Analysis. Previously, Dr. Fast served as the Chief...

Tuesday January 8, 2019 10:00am - 10:30am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

2:30pm EST

Cybersecurity Data Science: Best Practices from the Field
Cybersecurity data science (CDS) is a fast-emerging professional discipline. The field seeks to apply data analytics methods and processes to goals and practices associated with cybersecurity. As an emerging domain, many aspects of mature professions (standards, best practices, a body of knowledge) are still evolving.

The rapid evolution of technical infrastructure and tools, cyber threats, and data science methods, as well as political, regulatory, legal, and organizational complexities, combine to make this a challenging domain. Collaboration is difficult as practitioners often work in secrecy, organizational isolation, and under tight tactical pressures.

This presentation seeks to derive and categorize a set of common threads which characterize the emerging professional discipline from the perspective of practitioners. A comprehensive examination of the nascent profession is offered with a view to iterating towards professionalization.

As a central anchor, the presentation reports on research into cybersecurity data science best practices based on interviews with a representative sample of recognized global practitioners conducted in 2018.

Through the results of the interview research, the presentation seeks to address the questions:
• What is the professional status of cybersecurity data science?
• What are perceived central challenges?
• What methodological and technical trends are emerging?
• What are key best practices based on the collective experiences of peers?
• What aspects of data science are appearing on the adversarial side?

The objective of this research is to better understand and report on key factors underlying cybersecurity data science as an emerging profession. Utilizing qualitative research methods, results have been examined quantitatively to identify trends, challenges, and best practices resident in the nascent field.

As this research will lead to a forthcoming book publication, the hope is to gain active feedback from the community through discussion and debate on the best practices and challenges identified.

Attendees will Learn:
This talk seeks to take a step back from methodological insights and case studies to ask larger questions concerning the status of cybersecurity data science as an emerging profession.

Is the discipline a temporary trend, a solution in search of problems, or an enduring and expanding phenomenon? To resolve these disparate views, a social science based qualitative research initiative was undertaken.

A representative sample of global cyber security data scientists were interviewed to gain insights into the professional status of the domain. Qualitative feedback from practitioners was analyzed through quantitative methods to derive a set of key trends, challenges, and best practices.

Scott Mongeau

Cybersecurity Data Scientist, SAS Institute
Scott Mongeau is a Cybersecurity Data Scientist - Principal Business Solutions Manager at SAS Institute. He has three decades of experience in designing and deploying data-intensive solutions in a range of industries, including management consulting, software and services, financial...

Tuesday January 8, 2019 2:30pm - 3:00pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

3:00pm EST

Four Machine Learning Techniques that Tackle Scale (And Not Just By Increasing Accuracy)
Because many of the most prominent successes of machine learning have been in the area of prediction via supervised learning, there has been a disproportionately large emphasis in the security realm on using machine learning to identify maliciousness. In the lab, analysis of a new model often looks promising, with any metric greater than 99% being deemed a success. Attempts at implementation in a real environment and at scale often run into irritating and humdrum issues: you can’t get the content you need in the right place, collecting features takes too long, you get some of the data but there are gaps, you didn’t realize that the real data would be so different from your training samples, your model seems to be oddly confident that things are bad but you can’t figure out why. And the most classic: with a billion samples, 99% isn’t so great. Striving for better accuracy in your model may help with the 99% problem, but does little for the other issues.
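
The closing point ("with a billion samples, 99% isn't so great") is simple base-rate arithmetic. The numbers below are assumptions for illustration, not figures from the talk:

```python
# Illustrative numbers: a detector with a 99% true-positive rate and 99%
# specificity (1% false-positive rate) applied to a billion events, of
# which only a tiny fraction are actually malicious.
events = 1_000_000_000
malicious = 1_000            # assumed base rate: 1 in a million
fp_rate = 0.01
tp_rate = 0.99

false_positives = (events - malicious) * fp_rate
true_positives = malicious * tp_rate

# Precision: of the alerts raised, how many are real?
precision = true_positives / (true_positives + false_positives)
print(f"{false_positives:,.0f} false alerts; precision = {precision:.6f}")
```

Nearly ten million false alerts drown a thousand real events, which is why accuracy alone says little about operational viability.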

This emphasis on classification accuracy overlooks the other ways that machine learning techniques can help. Several contemporary approaches lend themselves to helping with these issues of scale. In some cases, these techniques provide additional context that reduces the load on human analysis. For example, techniques that deal with the problem of adversarial examples can also be used to flag results that come from a previously unseen distribution. Bayesian approaches can provide insight about levels of confidence in conclusions. Also, techniques aimed at model explainability can provide more rapid troubleshooting of results. In other cases, architectures can enable scalable structures. Multi-stage machine learning models allow for distributed models and effectively merge goals of reducing scaling costs with achieving good model performance. Towards a similar goal, techniques have been developed to reduce the footprint of models, thereby allowing for wider distribution.

This work presents an overview of the ways in which recent machine learning techniques can provide ancillary value—value beyond accurate predictions—that helps with the problems of scaling real-world implementations. In addition to an overview of the research, this work will provide specific examples of some of these techniques applied to security data.

Attendees will Learn:
Attendees will learn about ways in which recently developed machine learning techniques can help with some of the messier aspects of trying to apply a classification model to large-scale data. Learning about these issues and some of the potential remedies ahead of time will make the implementation of machine learning models to real-world security operations environments more likely to succeed.

Lindsey Lack

Principal Security Engineer, Gigamon
Lindsey Lack is a Principal Security Engineer at Gigamon, where he focuses on the application of data science to information security. Lindsey has over nineteen years of experience in information security, having led a data science and threat modeling team, performed malware reverse...

Tuesday January 8, 2019 3:00pm - 3:30pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

3:30pm EST

The Power of Cyber Threat Intelligence and its Influence on Executive Decision Making
Executives are inundated with an abundance of cyber threat intelligence from several sources, but what does it all mean? How this information is conveyed to them matters. Too much or too little data has the same effect: sub-optimal decision making that does not allow executives to make choices that posture the organization for the future and can become costly over time. How can leaders use the tools and systems at their disposal to prevent, anticipate, or mitigate the next cyber-attack from an evolving actor? Could data analytics or threat intelligence have prevented stock values from being affected?

C-Suite executives must calculate the cyber risks to their companies and justify their return on investment when looking to employ expensive cybersecurity ecosystems. Government leaders need to understand the threats to the USG from adversaries and have the task of defending against and proactively posturing the nation to be successful against future cyber-attack. Learn how to anticipate the right questions and convey the right information to executives through case studies that highlight the power of what cyber threat intelligence can do to drive executive decision making. Examples of this play out in both the private and government spheres where time is of the essence and questions surrounding attribution, liability, potential repercussions and mitigation force executives to make challenging decisions that could affect company reputation or impact policy considerations.

Eboni Thamavong

Lead Associate - Commercial Cyber Team, Booz Allen Hamilton
Eboni Thamavong has worn many hats throughout her career and is at the forefront of transformation in cybersecurity operations, analysis, and strategy. She is known for identifying areas for development and growth to move organizations forward. Ms. Thamavong is known for her insights...

Tuesday January 8, 2019 3:30pm - 4:00pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

4:00pm EST

The Generation and Use of TLS Fingerprints
There are many TLS implementations in use by different applications and operating systems, each of which evolves as that protocol does. TLS fingerprints offer a way to identify client implementations from passive observations of sessions, and thus to make valuable inferences about the applications, libraries, and operating systems in use. However, to do so reliably requires a complete and regularly updated database of TLS fingerprints, accurate models of the prevalence of and relationships between libraries and processes, and a fingerprint definition that accommodates GREASE and admits a similarity measure. In this presentation, we describe a TLS fingerprinting system that meets these requirements. By fusing detailed network flow data and managed endpoint telemetry from an enterprise network, we have developed the first large-scale system to generate a TLS fingerprint database automatically and continuously. Additionally, our fingerprints naturally capture the intricacies of the information a TLS fingerprint conveys, i.e., each fingerprint is associated with a list of application names, hashes, and version numbers observed utilizing the specified ClientHello parameters, sorted by their empirical prevalence. The fingerprint database is open-source and regularly updated. After the first month of data collection, we had generated nearly 1,000 unique TLS fingerprints that provide coverage for nearly 5,000 unique processes.

Additionally, we will present an analysis of TLS fingerprints in the wild, which our fingerprint database makes possible. First, we analyze the stability of the fingerprint database, i.e., the rate that the environment introduces new TLS fingerprints and the database’s attribution efficacy over time. When our system observes a TLS session in the wild and the database lacks attribution information for that session, we return a set of the closest known fingerprints by using a similarity metric over the space of TLS fingerprints. We leverage our longitudinal data to quantify the effectiveness of this approach. Next, we analyze cases where the TLS fingerprint provides application attribution versus library attribution, and relatedly, the set of fingerprints that uniquely identifies a single application versus a set of applications. Finally, we will present a graph analysis, derived from the fingerprint database, that highlights the evolutionary relationships between TLS fingerprints and different application versions.
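
A fingerprint definition that "accommodates GREASE and admits a similarity measure" can be sketched roughly as follows. This is our own illustration, not the presented system's actual metric: GREASE values (0x0A0A, 0x1A1A, ..., 0xFAFA) are normalized to a placeholder, and two ClientHello parameter lists are compared with Jaccard similarity:

```python
def is_grease(value):
    """GREASE values have equal high and low bytes, low nibble 0xA
    (0x0A0A, 0x1A1A, ..., 0xFAFA)."""
    return (value >> 8) == (value & 0xFF) and (value & 0x0F) == 0x0A

def normalize(params):
    """Replace GREASE values with a placeholder so two ClientHellos from
    the same implementation compare equal despite randomized GREASE."""
    return tuple("GREASE" if is_grease(v) else v for v in params)

def similarity(fp_a, fp_b):
    """Jaccard similarity over normalized ClientHello parameters; a
    stand-in for whatever metric the presented system actually uses."""
    a, b = set(normalize(fp_a)), set(normalize(fp_b))
    return len(a & b) / len(a | b) if a | b else 1.0

# Two observations of the same client, differing only in GREASE values.
hello_1 = [0x1A1A, 0x1301, 0x1302, 0xC02B]
hello_2 = [0x2A2A, 0x1301, 0x1302, 0xC02B]
print(similarity(hello_1, hello_2))  # 1.0
```

The same similarity function supports the nearest-fingerprint lookup described above when an observed session has no exact database match.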

Attendees will Learn:
Attendees will learn about a new open-source database that provides application attribution via the TLS ClientHello. Furthermore, the audience will learn about common pitfalls when using this type of information and analysis techniques that make effective use of our newly open-sourced TLS fingerprint database.

Blake Anderson

Senior Technical Leader, Cisco
Blake Anderson currently works as a Senior Technical Leader in Cisco’s Advanced Security Research team. Since starting at Cisco in early 2015, he has participated in and led projects aimed at improving the analysis of encrypted network traffic, which has resulted in open source...

Tuesday January 8, 2019 4:00pm - 4:30pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130
Wednesday, January 9

8:30am EST

Monitoring Massive Network Traffic Using Bayesian Inference
Monitoring network logs, from DNS requests to TCP connections, is challenging because these logs are both large and noisy, hindering efforts to identify malicious traffic. In a sizable network, for example, it is common to see thousands of requests made to one destination; at one time the frequency is cyclical, at another sporadic. This random behavior in network connections causes most unsupervised and supervised statistical modeling to fail. In this talk we discuss methods for performing large-scale Bayesian inference on DNS logs aggregated into count data, representing the number of requests from tens of millions of stub IPs made to hundreds of millions of domains. We describe novel mixtures of common discrete distributions, or hidden Markov processes, that model some of the most sporadic network traffic volumes to domain names. For example, we discuss how the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) distributions, and their more generalized forms, provide parameters we can use to differentiate traffic volumes associated with day-to-day threats, from spam and malvertising to widespread threats arising from botnets. Using Apache Spark and Stripe’s newly released Rainier, a Bayesian inference library for the JVM, we run tens of thousands of simulations per domain, fitting the underlying distribution of requests, then repeat this for millions of domains. We profile the performance by fitting a variety of mixtures of distributions to different sporadic traffic volumes. Running simulations often, we then show how to efficiently trend parameter estimates using exponential moving averages to model day/night and weekday/weekend traffic distributions. With hundreds of thousands of simulated and archived traffic patterns associated with benign and malicious network traffic, we show how to reduce false alarms and effectively monitor evolving online threats and masquerading malicious traffic.
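
The zero-inflated Poisson mentioned above can be fit with a short EM loop. The sketch below, on assumed toy data, is nothing like the scale of the Rainier/Spark pipeline, but it shows how the two ZIP parameters separate idle periods from bursty request rates:

```python
import math

def fit_zip(counts, iters=200):
    """Fit a zero-inflated Poisson by EM (illustrative sketch only).

    Model: with probability pi the count is a structural zero; otherwise
    it is Poisson(lam). Returns (pi, lam).
    """
    n = len(counts)
    pi, lam = 0.5, max(sum(counts) / n, 1e-6)
    for _ in range(iters):
        # E-step: responsibility that each observed zero is structural.
        p0 = math.exp(-lam)
        r = pi / (pi + (1 - pi) * p0)
        zeros = sum(1 for c in counts if c == 0)
        # M-step: update the mixing weight and the Poisson rate.
        pi = (zeros * r) / n
        lam = sum(counts) / (n - zeros * r)
    return pi, lam

# Sporadic traffic: long idle stretches punctuated by bursts of requests.
counts = [0] * 80 + [5, 6, 4, 7, 5, 6, 4, 5, 6, 5] * 2
pi, lam = fit_zip(counts)
print(round(pi, 2), round(lam, 2))
```

Domains whose fitted (pi, lam) drift away from their trended baseline are candidates for the botnet- and malvertising-style monitoring described above.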

Attendees will Learn:
In this session, you’ll learn:
• The latest advances in Bayesian inference on the JVM using Stripe’s open sourced Rainier project
• To scale Bayesian inference to internet scale datasets using Apache Spark
• To build time dependent risk and severity metrics identifying network anomalies associated with pernicious threats like spam, malvertising and botnets

And cover mathematical concepts to:
• Model sporadic network traffic using discrete probability distributions
• Build Hidden Markov Models (HMMs) capturing idle/active states of network traffic
• Use Markov chain Monte Carlo (MCMC) methods
• Handle outliers, false alarms, and time dependent trends

David Rodriguez

Senior Research Engineer, Cisco Systems, Inc.
David Rodriguez works as a Senior Research Engineer at Cisco Umbrella (OpenDNS). He has co-authored multiple pending patents with Cisco in distributed machine learning applications centered around deep learning and behavioral analytics. He has an MA in Mathematics from San Francisco...

Wednesday January 9, 2019 8:30am - 9:00am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

9:00am EST

Arbitrary Albatross: Neutral Naming of Vulnerabilities at Scale
Vulnerability identification is critical defensive security infrastructure. We have CVE, which is improving scope and coverage, but CVE assigns numbers and people like words. Phrases. Names. From Heartbleed to Efail, there’s a trend in security research to market disclosure events with catchy brand names. Some are annoyed by this trend. Is annoyance justified? Names imply importance. Is the claimed importance justified? It may be that a more human-oriented handle is beneficial. We explore the issues around named vulnerabilities and present a system to generate names separate from implied importance.
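
Generating names "separate from implied importance" can be sketched as a deterministic hash from a CVE ID into neutral word lists. The lists and mapping below are our own illustrative assumptions, not the presented system:

```python
import hashlib

# Hypothetical word lists; a real deployment would draw from much larger
# curated lists so that names rarely collide.
ADJECTIVES = ["arbitrary", "placid", "oblique", "mundane", "tepid",
              "neutral", "plain", "quiet"]
ANIMALS = ["albatross", "heron", "newt", "vole", "tapir",
           "plover", "skink", "marmot"]

def neutral_name(cve_id):
    """Derive a deterministic, importance-free name from a CVE ID.

    Hashing keeps the mapping arbitrary: the name carries no signal about
    severity, only a stable human-friendly handle.
    """
    digest = hashlib.sha256(cve_id.encode()).digest()
    adj = ADJECTIVES[digest[0] % len(ADJECTIVES)]
    animal = ANIMALS[digest[1] % len(ANIMALS)]
    return f"{adj} {animal}"

print(neutral_name("CVE-2014-0160"))  # same input always yields same name
```

Because the mapping is a pure function of the identifier, anyone can regenerate the same name without a central naming authority.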

Leigh Metcalf

Senior Network Security Research Analyst, CERT Division - Software Engineering Institute
Leigh Metcalf has a PhD from Auburn University in Mathematics. She has been at CERT for over 8 years as a Cybersecurity researcher and is the co-Editor-in-chief of ACM Digital Threats: Research and Practice. She is also the primary author of the book Cybersecurity and Applied Mathematics...

Wednesday January 9, 2019 9:00am - 9:30am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

9:30am EST

Using Generative Adversarial Networks to Harden Phishing Classifiers
As machine learning classifiers are increasingly deployed for defensive cybersecurity purposes, there is a growing interest in using adversarial machine learning to allow for the safe use of these classifiers. One area of focus is on building classifiers that are robust to evasion attacks, where evasion attacks are adversarial examples specifically crafted to defeat a machine learning model.

In this presentation, we explore the use of generative adversarial networks (GANs) to construct synthetic phishing domains as potential evasion attacks, and test the value of including these generated domains in the training set of a machine learning classifier designed to correctly label phishing and non-phishing domains. Specifically, we test the hypothesis that by training a classifier on an augmented set of data that includes generated domains, we will build a more robust classifier for the task of identifying phishing domains. To perform this testing, we construct several random forest classifiers, all of which use the same set of hand-engineered features.
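
The talk does not disclose its hand-engineered feature set; the sketch below computes a few features commonly used for domain classification, as an illustration of the kind of inputs such random forests consume:

```python
import math
from collections import Counter

def domain_features(domain):
    """A few typical hand-engineered features for a phishing classifier.

    These are common, illustrative choices only, not the talk's actual
    feature set.
    """
    name = domain.lower().rstrip(".")
    labels = name.split(".")
    counts = Counter(name)
    total = len(name)
    # Shannon entropy of the character distribution.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return {
        "length": total,
        "num_labels": len(labels),
        "num_digits": sum(ch.isdigit() for ch in name),
        "num_hyphens": name.count("-"),
        "entropy": round(entropy, 3),
    }

print(domain_features("secure-paypa1-login.example.com"))
```

Because every classifier in the experiment shares one feature function like this, any performance difference can be attributed to the training corpus rather than the representation.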

We first develop an initial classifier that is trained on a corpus of benign and phishing domains, with no generated examples. We develop three additional random forest classifiers, two of which are trained on synthetic examples generated by different GANs and one of which is trained on a corpus that includes additional genuine phishing domains. The purpose of including this final model is to determine whether any performance gains achieved by the GAN-augmented classifiers can be explained by a simple increase in the size of the training data. We test all four models on a holdout set of domains, which includes benign, phishing, and generated domains. While these test set results indicate that GAN-hardened classifiers are more robust to potential evasion attacks, we also use results obtained by deploying all four models in an operational prototype environment to determine the real value proposition. Real-world testing allows us to move beyond academic validation to concretely demonstrate whether this type of approach shows meaningful promise in improving the safe adoption of defensive machine learning classifiers. We conclude with a discussion on areas for future work and extensions.

Attendees will Learn:
Although there is a lot of hype surrounding deep learning methods, they are most often discussed in the context of image generation and recognition problems, which can make it difficult to understand their potential use and value for cybersecurity problems. This talk will highlight the potential advantages as well as the challenges of applying advanced machine learning methods like generative adversarial networks to a relevant problem in security. Attendees will gain an understanding of the value of using these techniques to develop robust machine learning classifiers, and they will leave with suggestions and ideas for how to apply them in their own security operations.

Jen Burns

Senior Cybersecurity Engineer, The MITRE Corporation
Jen Burns is a senior cybersecurity engineer who joined MITRE shortly after earning her master’s degree in information security at Carnegie Mellon University. She's a technical lead on MITRE’s cyber threat intelligence strategy, focusing on the efforts to move ATT&CK publication...

Emily Heath

Capability Area Lead, The MITRE Corporation
Emily Heath is the Capability Area Lead for Cyber Data Analytics and Malware in the Defensive Operations Department at the MITRE Corporation. Her work focuses on the application of machine learning, analytics, and optimization approaches to problems in cybersecurity, ranging from...

Wednesday January 9, 2019 9:30am - 10:00am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

10:00am EST

Hunting Frameworks
In this talk, I will be discussing the type of information that should be continuously collected and kept on-hand for investigative value in the case of a network compromise. I plan to address the value of such artifacts in an investigation. Additionally, I plan to note several open source solutions and resources that exist to assist in these endeavors. I will likely also touch on the different forms of "hunting" (indicator-based vs hypothesis-driven).

Hunting has been a buzzword for a few years. Talks abound on how to find anomalies within datasets utilizing various methods. However, rarely does a talk present a framework for hunting. How do I actually get started within the field? What data should be collected and centralized? Can the data be enriched? How do you hunt with this data?

Fortunately, lots of great resources exist for building out a functional environment for hunting. Once the environment exists, resources like MITRE’s ATT&CK and testing tools like Red Team emulation tools allow teams to quickly build and validate capabilities. In this talk, we will put all these pieces together to establish a framework for hunting by discussing key points of hunting: the types of data that are important, how to learn from and enrich data in your own environment, and hunting concepts driven by various methods. This talk aims to empower operators everywhere in their network defense capacities.

Attendees will Learn:
* The benefits of holistic log aggregation for incident validation, incident response, and hunting
* Hunting concepts
* Resources available for hunting

David Gainey

Defense Information Systems Agency (DISA)
David Gainey has been responding to system and network compromises for 10+ years with DISA. His work involves analyzing isolated, compromised systems and malware; increasing defensive posture; maturing incident response tactics, techniques and procedures (TTPs); and sharing knowledge...

Wednesday January 9, 2019 10:00am - 10:30am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

1:30pm EST

Network Telescopes Revisited: From Loads of Unwanted Traffic to Threat Intelligence
A network telescope (a.k.a. darknet) is a monitored but otherwise unused IP space that should not receive any legitimate network traffic. In practice, many packets can be observed there: our network telescope deployed at NASK (Research and Academic Computer Network, Poland), which consists of more than 100,000 unused IP addresses, receives about 30 million packets per hour on average. This presentation will introduce a comprehensive system we developed to analyze malicious traffic on a large scale and produce actionable results in close to real time. We will present case studies where data from our network telescope is used for threat hunting and improving situational awareness.

Presentation plan:
1) Architecture and design
At the beginning, we will discuss basic concepts concerning the architecture of the system and present our approach to data analysis and aggregation.

2) Scanning activity and mass exploitation campaigns
Because we monitor a large number of IP addresses, we can continuously observe and analyze trends in scanning activity. Just looking at the dynamics of targeted ports contributes to better situational awareness, but more in-depth analysis reveals much more. We will cover the following case studies:
a) GitHub Memcached DRDoS attack: can scanning patterns indicate an upcoming attack?
b) How the publication of vulnerability PoCs or CVEs translates into observed exploitation campaigns.
c) Recognizing the different groups responsible for scanning activity by analyzing their methods and technical capabilities.
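The port-dynamics idea in case study (a) can be sketched with a toy analytic: bucket darknet packets by time and destination port, then surface ports whose counts jump between buckets, which can hint at a new scanning campaign. Field names, the bucket size, and the spike threshold are illustrative, not NASK's actual system.

```python
# Hedged sketch of port-trend analysis over darknet packet metadata.
from collections import Counter, defaultdict

def port_trends(packets, bucket=3600):
    """Count packets per (time bucket, destination port)."""
    counts = defaultdict(Counter)
    for p in packets:
        counts[p["ts"] // bucket][p["dport"]] += 1
    return counts

def spiking_ports(counts, prev_bucket, cur_bucket, factor=5):
    """Ports whose count grew at least `factor`-fold between buckets."""
    prev = counts.get(prev_bucket, Counter())
    cur = counts.get(cur_bucket, Counter())
    return {port for port, n in cur.items()
            if n >= factor * max(prev.get(port, 0), 1)}

packets = ([{"ts": 10, "dport": 23}] * 3 +        # steady Telnet scanning
           [{"ts": 3700, "dport": 23}] * 3 +
           [{"ts": 3700, "dport": 11211}] * 20)   # sudden Memcached interest
c = port_trends(packets)
print(spiking_ports(c, 0, 1))  # {11211}
```

A sudden surge in port 11211 interest, as in this toy data, is exactly the kind of precursor pattern the Memcached case study examines.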

3) Denial of Service attacks
A significant part of the traffic we observe is backscatter generated by DoS attacks (for example, TCP SYN or DNS floods) that use spoofed source addresses. We are able to identify the victims and estimate the duration and magnitude of attacks. We will show examples of interesting DoS attacks and demonstrate how data from network telescopes can be combined with other sources, such as DRDoS honeypots, to obtain a global view of volumetric attacks on the internet.
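The backscatter mechanism can be illustrated with a small sketch: when an attacker spoofs darknet addresses, the telescope receives the victim's replies (SYN-ACK, RST), so the *source* address of those reply packets identifies the victim, and their timestamps bound the attack. The flag encoding and threshold below are hypothetical simplifications.

```python
# Toy backscatter analysis: group reply-flagged packets by source to
# estimate victim, duration, and magnitude. Field names are illustrative.
def backscatter_victims(packets, min_pkts=2):
    stats = {}
    for p in packets:
        if p["flags"] in ("SA", "RA", "R"):  # typical reply flags
            s = stats.setdefault(p["src"], {"count": 0, "first": p["ts"], "last": p["ts"]})
            s["count"] += 1
            s["first"] = min(s["first"], p["ts"])
            s["last"] = max(s["last"], p["ts"])
    return {ip: {**s, "duration": s["last"] - s["first"]}
            for ip, s in stats.items() if s["count"] >= min_pkts}

pkts = [{"src": "198.51.100.9", "flags": "SA", "ts": t} for t in (0, 30, 90)]
pkts.append({"src": "192.0.2.1", "flags": "S", "ts": 10})  # a scan, not backscatter
print(backscatter_victims(pkts))
```

Scaling the per-victim counts by the fraction of address space the telescope covers gives a rough magnitude estimate for the whole attack.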

4) Fingerprinting packet generation algorithms
Software for network scanning and DoS attacks (including malware) usually has custom code for generating packets. We will show how certain features of packets in live traffic can be analyzed to automatically build signatures that fingerprint individual tools. This approach has been successfully applied to darknet traffic to create multiple signatures, and to traffic from malware sandboxes to link some of those signatures to malware families.
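A widely documented example of such a packet-generation fingerprint: the Mirai botnet's scanner sets the TCP sequence number equal to the destination IPv4 address, so a single integer comparison tags its probes. The sketch below shows only this one published signature; the talk's system derives many such signatures automatically.

```python
# Check the published Mirai scanner fingerprint: TCP sequence number
# equals the destination IP interpreted as a 32-bit integer.
import ipaddress

def looks_like_mirai(pkt):
    """True if the TCP seq matches the destination IP as an integer."""
    return pkt["seq"] == int(ipaddress.IPv4Address(pkt["dst"]))

probe = {"dst": "192.0.2.5", "seq": int(ipaddress.IPv4Address("192.0.2.5"))}
other = {"dst": "192.0.2.5", "seq": 123456}
print(looks_like_mirai(probe), looks_like_mirai(other))  # True False
```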

Attendees will Learn:
Attendees will learn methods for deriving actionable threat intelligence from traffic collected through network telescopes. We will explain how packet characteristics can be used to fingerprint network traffic (scanning or flooding) generated by particular malware families. The talk has a mostly practical focus, which should be useful for members of CERTs/SOCs. From a research perspective, we will cover recent advancements in the analysis of network telescope traffic.

avatar for Piotr Bazydlo

Piotr Bazydlo

Head of Network Security Methods Team, Research and Academic Computer Network (NASK, Poland)
Piotr Bazydlo earned a master's degree from Warsaw University of Technology in the faculty of Electronics and Information Technology in 2016. His adventures with cybersecurity started in the NASK (Research and Academic Computer Network) as a researcher in the Network Security Methods... Read More →
avatar for Adrian Korczak

Adrian Korczak

Network Security Researcher, Research and Academic Computer Network (NASK, Poland)
Adrian Korczak is a network security researcher at Research and Academic Computer Network in Poland (NASK). He finished his BS in Network Systems at the University of California Irvine. His interests cover subjects like malware analysis, sandboxing, and DGA.
avatar for Pawel Pawliński

Pawel Pawliński

Principal Security Specialist, CERT Polska / NASK
Paweł Pawliński is a principal specialist at CERT.PL. His past job experience include data analysis, threat tracking, and automation. He is responsible for the design and implementation of the n6 platform for sharing security-related data and has also designed systems for large-scale... Read More →

Wednesday January 9, 2019 1:30pm - 2:00pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

2:00pm EST

Data as Evidence: Analysis of Logs for Litigation
Security goes well beyond the operational need to identify activity and decide whether it should be allowed to continue unencumbered, further scrutinized, or halted. When it comes to identifying responsible actors and making victims whole, remedies largely depend on criminal and civil adjudication. Successful prosecution and recovery of damages require that data be admissible as evidence in the legal record, that the means of analyzing the data withstand scrutiny, and that counsel, court, and jurors understand the story the data analyst finds. Furthermore, careless or myopic analysis used in real-time security operations can have disastrous effects when that analysis is scrutinized in litigation.

In this presentation, we consider three case studies where the author led a team that analyzed system
logs, developed findings from the data that were relevant to the nature, scope, and severity of the alleged damage, and presented those results. We focus on the legal processes at work in securing data for analysis, methods for assessing and making use of data, the legal standards for offering expert opinion, and techniques for effectively presenting findings to legal professionals and lay jurors.  The cases are: 
  • Pharmatrak Privacy Litigation, United States Court of Appeals, First Circuit, 329 F.3d 9, in which plaintiffs alleged that pharmaceutical companies collected and sent personal information to undisclosed third parties in violation of their privacy policies. Forensic analysis of operational system logs led to critical findings that set standards for the application of federal wiretap statutes to web technology.
  • Ford, et al. v. SBC Communications Inc. and SBC Internet Services, Inc. d/b/a AT&T Internet Services, Inc., Circuit Court of St. Louis County (Missouri), Cause No. 06CC-003325, Division No. 6, in which disparate datasets were analyzed to find any cases where fees were collected for service that could not be provided.
  • New York Stock Exchange Specialists Litigation, U.S. District Court, Southern District of New York, 405 F. Supp. 2d 2, in which the California Public Employees Retirement System (CalPERS) represented a class of investors allegedly harmed by securities specialists interpositioning themselves into otherwise executable trades. Analysis of tick-by-tick data from the systems that capture, relay, and display orders for the entire New York Stock Exchange over a five-year period made possible the findings needed to address the allegations.
We discuss techniques for analysis, present examples from the case studies, and conclude with principles for data analysts both to support operational needs and to create the foundation to protect the organization in subsequent litigation.

Attendees Will Learn: 
• When system data and analysis can be exposed to the scrutiny of an adverse party.
• How adverse parties can use the data in unexpected ways.
• How to identify both operational needs and long-term impacts of data collection, analysis, and presentation.
• How to present findings that will withstand not only internal questions but adversarial inquiry.

avatar for Matthew Curtin

Matthew Curtin

Founder, Interhack Corporation
C. Matthew Curtin is the founder of Interhack Corporation, a computer expert firm based in Columbus, OH.  His practice helps attorneys and executives in high-stakes situations to understand and make use of computer technology and relevant data.  He has appeared as an expert witness... Read More →

Wednesday January 9, 2019 2:00pm - 2:30pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

2:30pm EST

Simulating Your Way to Security - One Detector at a Time
Covering a network with sensors is the first step towards security, but the massive flood of unprocessed, raw data points is frequently as paralyzing as having no visibility at all. To find actionable signal in the noise, one has to first define signal and noise. Threat detection must be motivated from a problem-first mentality, rather than a data-first mentality. Using this approach, "Big Data" problems tend to become small, relevant data problems, facilitating accurate and scalable detection solutions. We demonstrate the aforementioned problem-first approach with a case study of a password spray attack against an Active Directory (AD) system. We examine the nature of the attack: how it works, why it works and how its parameter settings interact with attacker style. In the resulting threat model, the "signal" is a sequence of failed authentication attempts from a particular device and the "noise" is the rest of the LDAP traffic.

To understand detectability of a dynamic password spray attack in a variable environment, the central idea is to gather samples of attack and merge them with records of the baseline enterprise network traffic. This may be accomplished by mapping timestamps and IP addresses of simulated and real flow data. For successful detection, signal must be discriminable from noise, so we demonstrate how to use time-series and probability density plots, combined with faceting and animation techniques, to visually examine the separation of signal from noise, across the sample of devices. Next, we show how constraints that come from details of the threat model suggest how to reduce the signal into a filtered, low-dimensional summary that preserves discriminability and allows detection to scale to a large network of devices. Finally, we show how the signal summary can be used to construct heuristic and statistical detection methods, and evaluate their efficacy, using accuracy and time-to-detection metrics.
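The signal/noise separation described above can be reduced to a simple heuristic: a password spray appears as one source attempting *many distinct accounts* with *few failures each*, unlike a brute force against a single account. The sketch below is an illustrative detector over hypothetical failed-authentication records, not the talk's actual statistical method, and its thresholds are arbitrary.

```python
# Hedged sketch of a password-spray heuristic over failed auth events.
from collections import Counter, defaultdict

def spray_suspects(failed_auths, min_accounts=5, max_per_account=2):
    """Sources trying many accounts with at most a few failures each."""
    per_src = defaultdict(Counter)
    for e in failed_auths:
        per_src[e["src"]][e["account"]] += 1
    return [src for src, accts in per_src.items()
            if len(accts) >= min_accounts
            and max(accts.values()) <= max_per_account]

events = ([{"src": "10.9.8.7", "account": f"user{i}"} for i in range(8)] +
          [{"src": "10.1.1.1", "account": "admin"}] * 30)  # brute force, not spray
print(spray_suspects(events))  # ['10.9.8.7']
```

Evaluating such a filtered, low-dimensional summary against merged simulated-attack and baseline traffic, as the talk describes, is what turns the heuristic into a measurable detector.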

Attendees will Learn:
Attendees will learn how to determine whether an attack is detectable and how to quantify a detector's quality using accuracy and time-to-detection metrics. This can improve security operations by focusing investment on reliable detection.

avatar for Slava Nikitin

Slava Nikitin

Data Scientist, Columbus Collaboratory
Slava Nikitin is applying statistics and high-performance computing to bring the future back to now.  He is a Data Scientist at Columbus Collaboratory, working on statistical and machine learning modeling, software engineering, and interactive information displays.  He also is... Read More →

Wednesday January 9, 2019 2:30pm - 3:00pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130
Thursday, January 10

8:30am EST

Detecting Lateral Movement with a Compute-intense Graph Kernel
Both successful intruders and internal abusers of computer networks seek to move laterally in an enterprise network to discover other sources of valuable information; detecting lateral movement remains a valuable analytic for cybersecurity analysts. We calculate a maximum independent set, an NP-hard graph kernel, on a graph composed of point-to-point (e.g., ssh and RDP) connections to detect lateral movement. In addition to assessing whether atypical lateral movement is tree-like and suspect, we display it in the network graph context so an analyst can judge the likely risk. We seek data with known lateral movement to validate the analytic. This work extends the cybersecurity trend of applying more computing to a smaller fraction of the data, as with O(n^2) analytics such as betweenness centrality. This trend anticipates the rapidly growing computational performance of early quantum computers from D-Wave Systems, enabling the use of graph kernels with exponential computational cost on small (by cyber standards) datasets. We discuss the implications of using these more compute-intense kernels.
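To make the kernel concrete: maximum independent set is NP-hard, so exact solutions only scale to small graphs, which matches the talk's point about spending heavy compute on a small, filtered slice of the data. The brute-force sketch below (not the talk's D-Wave formulation) finds an exact answer on a tiny connection graph.

```python
# Exact maximum independent set by subset enumeration; exponential cost,
# so it is only practical for very small (filtered) graphs.
from itertools import combinations

def max_independent_set(nodes, edges):
    """Largest node subset with no edge between any two members."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(len(nodes), 0, -1):
        for subset in combinations(nodes, size):
            if all(frozenset(p) not in edge_set
                   for p in combinations(subset, 2)):
                return set(subset)
    return set()

# Point-to-point (e.g., ssh/RDP) connection graph: a path a-b-c-d,
# the kind of chain-like structure lateral movement can produce.
print(max_independent_set(["a", "b", "c", "d"],
                          [("a", "b"), ("b", "c"), ("c", "d")]))
```

On production scales this enumeration is infeasible, which is exactly why the talk turns to specialized hardware for kernels of this complexity class.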

Attendees will Learn: 
  • How analytic kernels that detect global characteristics, which analysts may not have considered, can be useful
  • An additional tool for the analytic toolbox

avatar for Steve Reinhardt

Steve Reinhardt

Director of Customer Applications, D-Wave Government Inc.
Steve Reinhardt has built hardware and software systems that deliver new levels of performance:  usable via conceptually simple interfaces, including Cray Research’s T3E distributed-memory systems, ISC’s Star-P parallel-MATLAB software, and YarcData/Cray’s Urika graph-analytic... Read More →

Thursday January 10, 2019 8:30am - 9:00am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

9:00am EST

Time-based Correlation of Malicious Events and their Connections
In the cybersecurity arena, many events of interest occur in conjunction with network connection events. For example, a connection to a suspected malware command-and-control node might precede a hidden process disabling security logging on a compromised computer. Associating such malicious events with their related connections is a critical task in network forensics. Often, a suspicious connection can tip off investigators to previously overlooked events, and vice versa. However, in many cases, associating events with corresponding connections is difficult due to network layering, dynamic addressing, or gaps in sensor coverage. Inevitably, the investigator will invoke timestamps to help correlate events with possible connections. In this presentation, we discuss automating this approach with a Time-Based Correlation big data analytic that uses a statistical approach to gauge independence between events and possibly related connections. We include the results of a validating discrete event simulation that identifies the conditions under which this approach provides the best performance and fewest false positives. We discuss scaling this analytic to the DoD enterprise level and its use in helping detect various anomalies.
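The core idea can be sketched in a toy form: associate an event (say, logging being disabled) with connections that end shortly before it, and score how surprising that proximity is given the connection's base rate. The talk's analytic uses a proper statistical independence test at enterprise scale; this hypothetical version just computes a Poisson-style chance probability.

```python
# Hedged sketch of time-based event/connection correlation.
import math

def correlate(event_ts, conn_times, window=60.0, horizon=86400.0):
    """Connections within `window` seconds before the event, plus the
    probability of at least one such hit occurring by chance under a
    Poisson model of the connection rate over `horizon` seconds."""
    rate = len(conn_times) / horizon            # connections per second
    hits = [t for t in conn_times if 0 <= event_ts - t <= window]
    p_chance = 1 - math.exp(-rate * window)
    return hits, p_chance

conns = [100.0, 5000.0, 40000.0]                # rare, beacon-like connections
hits, p = correlate(event_ts=130.0, conn_times=conns)
print(hits, round(p, 4))  # [100.0] 0.0021
```

A small chance probability, as here, suggests the event and connection are unlikely to be coincidental; frequent connections inflate the probability, which is when time alone stops being definitive and must be combined with other methods.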

Attendees will learn:
Attendees will learn how to automate the use of statistics to help link events and connections in a timeline during an incident or forensic investigation. This includes under which conditions time can be definitive in linking events and when it must be combined with other methods.

avatar for Steven Henderson

Steven Henderson

Lead Data Scientist, Enlighten IT Consulting
Steve Henderson is the Lead Data Scientist at Enlighten IT Consulting, where he supervises petabyte-scale data science analytics in support of DoD cyber operations for USCYBERCOM, ARCYBERCOM, and DISA. Steve is a 23-year Army veteran who is an expert in data science systems engineering... Read More →
avatar for Brittany Nicholls

Brittany Nicholls

Cloud Software Engineer, Enlighten IT Consulting
Brittany Nicholls is a Technical Lead who oversees a team of software engineers at Enlighten IT Consulting, LLC, an Alion Company. She and her team are currently focused on advancing the fusion of cloud analytics and visualizations. These innovative tools assist Defensive Cyber Operations... Read More →

Thursday January 10, 2019 9:00am - 9:30am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

9:30am EST

Quantum Approach to Inverse Malware Eradication
A quantum approach to malware eradication addresses the needs of organizations that face a shortage of cybersecurity staff and resources while tackling an increasing and dynamic cyber threat in a distributed, mobile computing environment. This approach closes the existing security gap and provides a layer of security and protection between the end user and the internet. It also provides a new sensing capability: a novel vantage point on threats in near real time, with that visibility shared through standardized methodologies.

The quantum approach to malware eradication inverts current common practice by rewriting binaries and documents to drive inbound and outbound files into compliance with permitted behaviors, i.e., an organization's pre-established file risk parameters. The approach borrows from a variety of reductionist models introduced over the last few decades across the physical, biological, and social sciences to analyze, describe, and at times control the emergent properties of complex adaptive systems at their most fundamental, constituent levels. Positing that a file, including its content and behavior, emerges from the complex interactions of its constituent parts, the approach reduces it to predictable building blocks and then regenerates them in accordance with a controlled, pre-established rule set, without impact to content but with risk-based behavior controls.

By interdicting files before they reach an endpoint, the quantum approach can significantly reduce the vulnerability introduced into enterprises by human users, who are susceptible to a variety of social engineering attacks. It is the ultimate "left of boom" method, eliminating as much malware as all retroactive detection methods combined with no human interaction. Combining these methods is the future. It is scalable enough that small and medium-sized organizations can afford it, and flexible enough to be applied across multiple use cases.

Attendees will Learn:
Quantum rewriting of binaries and documents to support permitted behaviors is the inverse of malware response, in which content must be detected, analyzed, and/or detonated. It is the ultimate "left of boom" method, eliminating as much malware as all retroactive detection methods combined, with no human interaction. The goal is to show attendees that a 'pass only known good' methodology applied through a quantum approach simplifies the solution, and that the future of information security will benefit from an inverted approach to security.

avatar for Daniel Medina

Daniel Medina

Director, Strategic & Technical Engagement, Glasswall Solutions Inc.
Daniel V. Medina is currently the Director of Strategic and Technical Engagements at Glasswall Solutions Inc. In his current role, Mr. Medina is responsible for developing and leading strategic engagement, thought leadership, and business development for Glasswall Solutions Inc... Read More →
avatar for Matthew Shabat

Matthew Shabat

U.S. Strategy Manager, Glasswall Solutions
Matt Shabat is the U.S. Strategy Manager for Glasswall Solutions. He served for nearly 10 years in the U.S. Department of Homeland Security's Office of Cybersecurity and Communications, most recently as a cybersecurity strategist and as the Director of Performance Management, and... Read More →

Thursday January 10, 2019 9:30am - 10:00am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

10:30am EST

Identifying Automatic Flows
One of the limitations of relying solely on flow metadata (e.g., Netflow) for network analysis is the difficulty of differentiating flows generated by user activities from flows generated by automatic processes. Most personal computers generate network flows continuously, performing actions such as checking for system updates, new messages, or network resources. We investigated how to identify automatic flows as a means of enhancing Netflow-based analyses of user behaviors; this approach, however, can also be used to isolate and evaluate non-user-generated flows. To develop this methodology, we created two virtual machines, one Windows 7 and one Ubuntu, and performed typical user activities on each VM while capturing the resultant flow data. User actions were scripted, with times logged and actions separated by intervals long enough for user-initiated flows to complete. This allowed us to label all captured flow data as either automatic or user generated. The labeled data was assessed and used to develop and test algorithms to identify and label automatic flows. The resulting algorithms do not depend on the ports or platform used. We present our observations on the discriminators we identified, the algorithms we generated, and how well they performed.
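One plausible port- and platform-independent discriminator (an illustrative guess, not necessarily one of the paper's) is timing regularity: automatic flows such as update checks tend to recur at near-constant intervals, so a low coefficient of variation of inter-arrival times suggests machine-generated traffic.

```python
# Hedged sketch: flag a series of flow start times as automatic when
# the inter-arrival times are nearly constant (low coefficient of
# variation). The threshold is an arbitrary illustration.
import statistics

def is_periodic(start_times, max_cv=0.1):
    """True if inter-arrival coefficient of variation is small."""
    if len(start_times) < 3:
        return False
    gaps = [b - a for a, b in zip(start_times, start_times[1:])]
    return statistics.stdev(gaps) / statistics.mean(gaps) <= max_cv

beacon = [0, 300, 600, 901, 1199]      # ~5-minute automatic check-in
browsing = [0, 12, 340, 360, 2000]     # bursty human activity
print(is_periodic(beacon), is_periodic(browsing))  # True False
```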

Attendees will Learn:
Attendees will learn about specific Netflow-derived features that can be used to discriminate between flows generated by user actions and those generated automatically by applications or systems. This can improve security operations by enabling analysts to focus on either set of flows.

avatar for Jeffrey Dean

Jeffrey Dean

Electrical Engineer, USAF
Jeffrey Dean received his PhD in Computer Science from the Naval Postgraduate School in 2017. His dissertation focused on evaluating the use of organizational roles in comparing user network behaviors, using Netflow as source data. He served in the U.S. Air Force as an officer (active... Read More →

Thursday January 10, 2019 10:30am - 11:00am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

11:00am EST

Network throughput and complexity are increasing with the growing number of devices and data-driven applications, especially at universities and Research and Education (R&E) networks. In this talk we present InSight2, an open platform intended to monitor these large-scale networks and facilitate the development of network analytics for them. University and R&E networks face a deficiency in operational and security awareness. Real-time behavioral visibility and analysis are crucial to detect problems, predict patterns, and protect data and critical assets. Conventional monitoring techniques and tools do not scale well in these environments, so novel analytics must be developed to understand traffic behavior and security issues while addressing the complexity and throughput of these networks. Network managers, operators, and analysts have difficulty finding tools to analyze the amount of data they collect, and researchers and educators encounter a barrier to entry in developing network analytics. These issues can be addressed by an open platform that facilitates collaboration among the global community for the development and improvement of network analytics. We present two analytics modules: the predictive analytics module forecasts network utilization and enables the detection of unexpected behavior, and the botnet detection module identifies botnet activity in network traffic. Results from various deployments, as well as benchmarks, are also presented.
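The predictive-analytics workflow can be illustrated with a toy version: forecast the next utilization sample from a moving average and flag samples that deviate beyond a tolerance as unexpected behavior. InSight2's actual models are presumably more sophisticated; the window and tolerance below are invented for illustration.

```python
# Hedged sketch of utilization forecasting with anomaly flagging.
def forecast_and_flag(series, window=4, tolerance=0.5):
    """Return (next-sample forecast, indices of anomalous samples)."""
    anomalies = []
    for i in range(window, len(series)):
        predicted = sum(series[i - window:i]) / window
        if abs(series[i] - predicted) > tolerance * predicted:
            anomalies.append(i)
    forecast = sum(series[-window:]) / window
    return forecast, anomalies

util = [10, 11, 10, 12, 11, 30, 11, 10]   # utilization samples; index 5 spikes
print(forecast_and_flag(util))  # (15.5, [5])
```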

avatar for Angel Kodituwakku

Angel Kodituwakku

PhD candidate Computer Engineering, concentrating in Cybersecurity, The University of Tennessee, Knoxville
Angel Kodituwakku is currently a PhD candidate in Computer Engineering with a concentration in Cybersecurity at the University of Tennessee, Knoxville. He served as a Research Associate for two years on a National Science Foundation funded project. He received his MS in Computer Engineering... Read More →

Thursday January 10, 2019 11:00am - 11:30am EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

11:30am EST

Good and interesting research starts with good and interesting data. Jeff Schmidt will introduce a U.S. Department of Homeland Security (DHS) program called the Information Marketplace for Policy and Analysis of Cyber-risk & Trust (IMPACT). The IMPACT project supports the global cyber-risk research community by coordinating and developing real-world data and information-sharing capabilities: tools, models, and methodologies. To accelerate solutions around cyber-risk issues and infrastructure security, IMPACT enables empirical data and information sharing between and among the global cybersecurity research and development (R&D) community in academia, industry, and government. Importantly, IMPACT also addresses the cybersecurity decision-analytic needs of Homeland Security Enterprise (HSE) customers facing high-volume, high-velocity, high-variety, and/or high-value data through its network of Decision Analytics-as-a-Service Providers (DASPs). Each of these resources is a service, technology, or tool capable of supporting the following types of analytics: descriptive (what happened), diagnostic (why it happened), predictive (what will happen), and prescriptive (what should happen).

avatar for Jeff Schmidt

Jeff Schmidt

VP, Chief Cyber Security Innovator, Columbus Collaboratory
Jeff is an accomplished cybersecurity expert with a background in security and risk management. He was the founder and CEO of JAS Global Advisors LLC, a security consulting firm in Chicago, and founded Authis, a provider of innovative risk-managed identity services for the financial... Read More →

Thursday January 10, 2019 11:30am - 12:00pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

1:30pm EST

Dynamically Repurposed and Programmable Network Monitoring
Effective NetOps and SecOps system architectures require collecting and analyzing network traffic data in real time. With the advent of programmable switching fabrics like the Barefoot Networks Tofino ASIC, alongside packet-processing domain-specific languages (DSLs) like P4, deep real-time network instrumentation is now feasible. We have developed a YANG model-driven system for network monitoring and visibility, based on a commodity switch, that harnesses the power of a fully programmable data plane to filter, aggregate, and shape packet frames while simultaneously generating telemetry natively. These components can be combined to build powerful streaming analytic pipelines implemented with on-premise or cloud-based in-memory computing architectures. Scalable collection, analysis, and processing of high-speed, high-port-count networks using commodity, dynamically repurposed and programmed switches and servers is now achievable.
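The talk's pipeline is implemented in P4 on switching hardware; purely as a language-agnostic illustration, the same filter-then-aggregate stages can be modeled as a streaming pipeline over packet records. All field names below are hypothetical, and the in-memory Counter stands in for an in-fabric flow counter.

```python
# Toy model of a filter -> aggregate telemetry pipeline.
from collections import Counter

def filter_stage(records, port):
    """Keep only packets to the given destination port (match stage)."""
    return (r for r in records if r["dport"] == port)

def aggregate_stage(records):
    """Aggregate bytes per (src, dst) flow key, like a fabric counter."""
    counts = Counter()
    for r in records:
        counts[(r["src"], r["dst"])] += r["bytes"]
    return counts

stream = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dport": 443, "bytes": 1500},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dport": 443, "bytes": 900},
    {"src": "10.0.0.3", "dst": "10.0.0.2", "dport": 53, "bytes": 80},
]
print(aggregate_stage(filter_stage(stream, 443)))
```

In the real system these stages run at line rate in the data plane, and only the aggregated telemetry reaches the analytic servers.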

Attendees will Learn:
How to build a streaming analytic pipeline for high-speed, high-port-count network monitoring. SecOps is improved by providing the design of the component pieces needed to support real-time decisions.

avatar for Michael Reed

Michael Reed

VP of Engineering, MantisNet, Inc.
Michael Reed has 20+ years of academic and commercial experience in system design, networking and security. He has been a Scientist at the Verite Group, Technical Director at SPARTA/Parsons, Senior Software Engineer Extreme Networks, and Scientist at the Naval Research Laboratory... Read More →

Thursday January 10, 2019 1:30pm - 2:00pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

2:00pm EST

Backwaters: Security Streaming Platform
Backwaters is a project devoted to the transportation of security data across Comcast's enterprise. The platform, built on Apache Kafka, provides Comcast with a secure, highly available, cloud-agnostic data pipeline.

This talk will highlight why a distributed streaming platform is important. It will discuss how to utilize open source tools and describe how security engineering used Apache Kafka's APIs to perform correlation and alerting of events.
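The correlation-and-alerting idea can be sketched without a broker: join two event streams on a shared key within a time window, emitting an alert when both halves appear, as a Kafka Streams-style windowed join would. Stream contents and field names are illustrative, not Comcast's actual topics.

```python
# Broker-free sketch of windowed stream correlation for alerting.
def correlate_streams(auth_failures, firewall_denies, window=60):
    """Alert when the same source shows up in both streams within
    `window` seconds of itself."""
    alerts = []
    for a in auth_failures:
        for f in firewall_denies:
            if a["src"] == f["src"] and abs(a["ts"] - f["ts"]) <= window:
                alerts.append({"src": a["src"],
                               "reason": "auth-fail + fw-deny"})
    return alerts

auth = [{"src": "203.0.113.9", "ts": 100}]
fw = [{"src": "203.0.113.9", "ts": 130}, {"src": "198.51.100.4", "ts": 500}]
print(correlate_streams(auth, fw))
```

In a production deployment the nested loop would be replaced by a keyed, windowed state store so the join scales with stream volume.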

Attendees will Learn:
Attendees will learn which open source software was used for creating private and public cloud infrastructure as well as which Apache Kafka APIs were utilized for security operations.

avatar for Chris Maenner

Chris Maenner

Principal Security Developer, Comcast
Chris Maenner is a member of Comcast's TPX Security Solutions Engineering group, which provides Security DevOps, Data Engineering, Data Science, and Software Development resources for enterprise and commercial services. The group's primary objective is to help support Comcast with... Read More →
avatar for Will Weber

Will Weber

Senior Security Developer, Comcast

Thursday January 10, 2019 2:00pm - 2:30pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

2:30pm EST

Automated Cluster Testing and Optimization
How do you know if your cluster can handle the load you want to put into it? What is the optimal way to configure the cluster to handle the specific data and the way it is coming in?

We will show the process we used and, more importantly, some of the tools that can run on the cluster to get the answers you need.
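The load-testing loop the talk describes can be reduced to a minimal sketch: generate synthetic records at volume and measure the achieved ingest rate, the basic measurable entity that overall operations break down into. The in-memory list below is a hypothetical stand-in for the real cluster sink.

```python
# Toy load generator and throughput measurement for cluster testing.
import time

def generate_records(n):
    """Yield n synthetic records with a fixed-size payload."""
    return ({"id": i, "payload": "x" * 100} for i in range(n))

def measure_ingest(sink_append, records):
    """Append every record to the sink; return records/second achieved."""
    start = time.perf_counter()
    count = 0
    for r in records:
        sink_append(r)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")

sink = []  # stand-in for the cluster's ingest endpoint
rate = measure_ingest(sink.append, generate_records(10_000))
print(len(sink), rate > 0)  # 10000 True
```

Sweeping the record size, rate, and cluster configuration while repeating this measurement is what turns "can the cluster handle the load?" into a measurable answer.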

Attendees will Learn:
  • Cluster tools
  • DevOps tools for automation
  • Thought process of breaking down overall operations to measurable entities
  • Data generation tools

avatar for Brad Powell

Brad Powell

Senior Security Engineer, CERT Division - Software Engineering Institute
Brad Powell is a member of the technical staff and the Development and Test Environment (DTE) team in SEI’s Security Automation Directorate. His responsibilities include:Implementing testing tools and frameworks to support operational activities and the Security Automation Engineering... Read More →

Thursday January 10, 2019 2:30pm - 3:00pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130