FloCon 2019 has ended

General Session
Tuesday, January 8

2:30pm EST

Cybersecurity Data Science: Best Practices from the Field
Cybersecurity data science (CDS) is a fast-emerging professional discipline. The field seeks to apply data analytics methods and processes to goals and practices associated with cybersecurity. Because the domain is still emerging, many hallmarks of a mature profession (standards, best practices, a body of knowledge) are still evolving.

The rapid evolution of technical infrastructure and tools, cyber threats, and data science methods, as well as political, regulatory, legal, and organizational complexities, combine to make this a challenging domain. Collaboration is difficult because practitioners often work in secrecy, in organizational isolation, and under tight tactical pressure.

This presentation seeks to derive and categorize a set of common threads which characterize the emerging professional discipline from the perspective of practitioners. A comprehensive examination of the nascent profession is offered with a view to iterating towards professionalization.

As a central anchor, the presentation reports on research into cybersecurity data science best practices, based on interviews conducted in 2018 with a representative sample of recognized global practitioners.

Through the results of the interview research, the presentation seeks to address the questions:
• What is the professional status of cybersecurity data science?
• What are perceived central challenges?
• What methodological and technical trends are emerging?
• What are key best practices based on the collective experiences of peers?
• What aspects of data science are appearing on the adversarial side?

The objective of this research is to better understand and report on key factors underlying cybersecurity data science as an emerging profession. Interview results gathered through qualitative research methods have been examined quantitatively to identify trends, challenges, and best practices in the nascent field.

As this research will lead to a forthcoming book publication, the hope is to gain active feedback from the community through discussion and debate on the best practices and challenges identified.

Attendees will Learn:
This talk seeks to take a step back from methodological insights and case studies to ask larger questions concerning the status of cybersecurity data science as an emerging profession.

Is the discipline a temporary trend, a solution in search of problems, or an enduring and expanding phenomenon? To resolve these disparate views, a qualitative research initiative grounded in social science methods was undertaken.

A representative sample of global cybersecurity data scientists was interviewed to gain insights into the professional status of the domain. Qualitative feedback from practitioners was analyzed through quantitative methods to derive a set of key trends, challenges, and best practices.

Scott Mongeau

Cybersecurity Data Scientist, SAS Institute
Scott Mongeau is a Cybersecurity Data Scientist - Principal Business Solutions Manager at SAS Institute. He has three decades of experience in designing and deploying data-intensive solutions in a range of industries, including management consulting, software and services, financial...

Tuesday January 8, 2019 2:30pm - 3:00pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

3:00pm EST

Four Machine Learning Techniques that Tackle Scale (And Not Just By Increasing Accuracy)
Because many of the most prominent successes of machine learning have been in the area of prediction via supervised learning, there has been a disproportionately large emphasis in the security realm on using machine learning to identify maliciousness. In the lab, analysis of a new model often looks promising, with any metric greater than 99% being deemed a success. Attempts at implementation in a real environment and at scale often run into irritating and humdrum issues: you can’t get the content you need in the right place, collecting features takes too long, you get some of the data but there are gaps, you didn’t realize that the real data would be so different from your training samples, your model seems to be oddly confident that things are bad but you can’t figure out why. And the most classic: with a billion samples, 99% isn’t so great. Striving for better accuracy in your model may help with the 99% problem, but does little for the other issues.
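The "99% isn't so great" point above is base-rate arithmetic, and it can be worked through with a minimal sketch. All figures here are hypothetical assumptions chosen for illustration and do not come from the talk: one billion samples, 100,000 of them malicious, a 99% true positive rate, and a 1% false positive rate.

```python
def alert_breakdown(total, malicious, true_positive_rate, false_positive_rate):
    """Split the alerts a classifier raises into true and false alerts."""
    benign = total - malicious
    true_alerts = malicious * true_positive_rate    # malicious samples that get flagged
    false_alerts = benign * false_positive_rate     # benign samples that get flagged
    precision = true_alerts / (true_alerts + false_alerts)
    return true_alerts, false_alerts, precision


# Hypothetical inputs (assumptions, not figures from the abstract).
true_alerts, false_alerts, precision = alert_breakdown(
    total=1_000_000_000,
    malicious=100_000,
    true_positive_rate=0.99,
    false_positive_rate=0.01,
)

print(f"true alerts:  {true_alerts:,.0f}")
print(f"false alerts: {false_alerts:,.0f}")
print(f"precision:    {precision:.2%}")
```

Under these assumed inputs, false alerts number roughly ten million and outnumber true alerts by about a hundred to one, so fewer than 1% of raised alerts are actually malicious.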

This emphasis on classification accuracy overlooks the other ways that machine learning techniques can help. Several contemporary approaches lend themselves to helping with these issues of scale. In some cases, these techniques provide additional context that reduces the load on human analysis. For example, techniques that deal with the problem of adversarial examples can also be used to flag results that come from a previously unseen distribution. Bayesian approaches can provide insight about levels of confidence in conclusions. Also, techniques aimed at model explainability can provide more rapid troubleshooting of results. In other cases, architectures can enable scalable structures. Multi-stage machine learning models allow for distributed models and effectively merge goals of reducing scaling costs with achieving good model performance. Towards a similar goal, techniques have been developed to reduce the footprint of models, thereby allowing for wider distribution.
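One way to picture the multi-stage idea above is a two-stage cascade: an inexpensive model settles the cases it is confident about, and only the uncertain remainder is sent to a costlier model. The sketch below is an illustration under assumptions, not the speaker's architecture. The function name `cascade`, the score thresholds, and the toy scoring functions are all hypothetical.

```python
def cascade(samples, cheap_score, costly_score, low=0.1, high=0.9):
    """Two-stage classification sketch.

    cheap_score and costly_score are callables returning a maliciousness
    score in [0, 1]. Samples the cheap model scores confidently (at or
    outside the low/high band) are decided immediately; the rest are
    escalated to the costly model. Samples must be hashable.
    """
    verdicts = {}
    escalated = 0
    for sample in samples:
        score = cheap_score(sample)
        if low < score < high:          # cheap model is unsure
            score = costly_score(sample)
            escalated += 1
        verdicts[sample] = score >= 0.5
    return verdicts, escalated


# Toy stand-ins for real models (hypothetical scores).
cheap_scores = {"a": 0.02, "b": 0.50, "c": 0.97}
costly_scores = {"b": 0.80}

verdicts, escalated = cascade(["a", "b", "c"], cheap_scores.get, costly_scores.get)
print(verdicts, escalated)
```

The design choice being illustrated is that the expensive stage only sees the fraction of traffic the first stage cannot decide, which is where the scaling savings would come from.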

This work presents an overview of the ways in which recent machine learning techniques can provide ancillary value (value beyond accurate predictions) that helps with the problems of scaling real-world implementations. In addition to an overview of the research, this work will provide specific examples of some of these techniques applied to security data.

Attendees will Learn:
Attendees will learn about ways in which recently developed machine learning techniques can help with some of the messier aspects of trying to apply a classification model to large-scale data. Learning about these issues and some of the potential remedies ahead of time will make the implementation of machine learning models to real-world security operations environments more likely to succeed.

Lindsey Lack

Principal Security Engineer, Gigamon
Lindsey Lack is a Principal Security Engineer at Gigamon, where he focuses on the application of data science to information security. Lindsey has over nineteen years of experience in information security, having led a data science and threat modeling team, performed malware reverse...

Tuesday January 8, 2019 3:00pm - 3:30pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

3:30pm EST

The Power of Cyber Threat Intelligence and its Influence on Executive Decision Making
Executives are inundated with cyber threat intelligence from several sources, but what does it all mean? How this information is conveyed to them matters. Too much or too little data has the same effect: sub-optimal decision making that keeps executives from making choices that posture the organization for the future, and that can become costly over time. How can leaders use the tools and systems at their disposal to prevent, anticipate, or mitigate the next cyber-attack from an evolving actor? Could data analytics or threat intelligence have prevented stock values from being affected?

C-Suite executives must calculate the cyber risks to their companies and justify their return on investment when looking to employ expensive cybersecurity ecosystems. Government leaders need to understand the threats that adversaries pose to the U.S. government (USG), and they are tasked with defending against those threats and proactively posturing the nation to succeed against future cyber-attacks. Learn how to anticipate the right questions and convey the right information to executives through case studies that highlight how cyber threat intelligence can drive executive decision making. Examples of this play out in both the private and government spheres, where time is of the essence and questions surrounding attribution, liability, potential repercussions, and mitigation force executives to make challenging decisions that could affect company reputation or impact policy considerations.

Eboni Thamavong

Lead Associate - Commercial Cyber Team, Booz Allen Hamilton
Eboni Thamavong has worn many hats throughout her career and is at the forefront of transformation in cybersecurity operations, analysis, and strategy. She is known for identifying areas for development and growth to move organizations forward. Ms. Thamavong is known for her insights...

Tuesday January 8, 2019 3:30pm - 4:00pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130

4:00pm EST

The Generation and Use of TLS Fingerprints
There are many TLS implementations in use by different applications and operating systems, each of which evolves as that protocol does. TLS fingerprints offer a way to identify client implementations from passive observations of sessions, and thus to make valuable inferences about the applications, libraries, and operating systems in use. However, to do so reliably requires a complete and regularly updated database of TLS fingerprints, accurate models of the prevalence of and relationships between libraries and processes, and a fingerprint definition that accommodates GREASE and admits a similarity measure. In this presentation, we describe a TLS fingerprinting system that meets these requirements. By fusing detailed network flow data and managed endpoint telemetry from an enterprise network, we have developed the first large-scale system to generate a TLS fingerprint database automatically and continuously. Additionally, our fingerprints naturally capture the intricacies of the information a TLS fingerprint conveys, i.e., each fingerprint is associated with a list of application names, hashes, and version numbers observed utilizing the specified ClientHello parameters, sorted by their empirical prevalence. The fingerprint database is open-source and regularly updated. After the first month of data collection, we had generated nearly 1,000 unique TLS fingerprints that provide coverage for nearly 5,000 unique processes.
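To make the GREASE requirement and the prevalence-sorted attribution list concrete, here is a minimal sketch. It is not the format used by the database described in the abstract: the fingerprint string layout, the placeholder value, and the names `fingerprint` and `FingerprintDatabase` are hypothetical choices for illustration. The GREASE test follows the reserved values defined in RFC 8701 (0x0A0A, 0x1A1A, through 0xFAFA).

```python
from collections import Counter, defaultdict

GREASE_PLACEHOLDER = 0x0A0A  # assumption: every GREASE value collapses to this one


def is_grease(value):
    """True for the 16 reserved GREASE values of RFC 8701."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def normalize(values):
    """Replace randomly chosen GREASE values with a fixed placeholder."""
    return [GREASE_PLACEHOLDER if is_grease(v) else v for v in values]


def fingerprint(version, cipher_suites, extensions):
    """Build a hypothetical fingerprint string from ClientHello parameters."""
    def hexes(values):
        return "".join(f"{v:04x}" for v in normalize(values))
    return f"({version:04x})({hexes(cipher_suites)})({hexes(extensions)})"


class FingerprintDatabase:
    """Maps each fingerprint to the processes observed using it."""

    def __init__(self):
        self._processes = defaultdict(Counter)

    def observe(self, fp, process_name):
        self._processes[fp][process_name] += 1

    def attribute(self, fp):
        """Process names for a fingerprint, most prevalent first."""
        counts = self._processes.get(fp, Counter())
        return [name for name, _ in counts.most_common()]


# Two ClientHellos that differ only in their random GREASE values.
fp_a = fingerprint(0x0303, [0x1A1A, 0x1301, 0x1302], [0x2A2A, 0x0000, 0x000A])
fp_b = fingerprint(0x0303, [0xFAFA, 0x1301, 0x1302], [0x8A8A, 0x0000, 0x000A])

db = FingerprintDatabase()
db.observe(fp_a, "browser.exe")   # hypothetical process names
db.observe(fp_a, "browser.exe")
db.observe(fp_b, "updater.exe")
print(fp_a == fp_b, db.attribute(fp_a))
```

Without the normalization step, a client that picks a fresh GREASE value per connection would produce a different fingerprint each time, which is why the fingerprint definition has to accommodate it.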

Additionally, we will present an analysis of TLS fingerprints in the wild, which our fingerprint database makes possible. First, we analyze the stability of the fingerprint database, i.e., the rate at which the environment introduces new TLS fingerprints and the database’s attribution efficacy over time. When our system observes a TLS session in the wild and the database lacks attribution information for that session, we return a set of the closest known fingerprints by using a similarity metric over the space of TLS fingerprints. We leverage our longitudinal data to quantify the effectiveness of this approach. Next, we analyze cases where the TLS fingerprint provides application attribution versus library attribution, and relatedly, the set of fingerprints that uniquely identifies a single application versus a set of applications. Finally, we will present a graph analysis, based on a graph derived from the fingerprint database, that highlights the evolutionary relationships between TLS fingerprints and different application versions.
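The fallback lookup described above, returning the closest known fingerprints when there is no exact match, can be sketched as follows. The abstract does not say which similarity metric is used, so Jaccard similarity over sets of ClientHello features is swapped in here purely as an assumption. The feature labels and the function name `closest_fingerprints` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature collections (1.0 means identical sets)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def closest_fingerprints(query, known, k=3):
    """Return up to k (similarity, label) pairs, most similar first.

    known maps a fingerprint label to its set of features.
    """
    scored = [(jaccard(query, features), label) for label, features in known.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical database entries: cipher suite and extension codes as features.
known = {
    "fp_a": {"cs:1301", "cs:1302", "ext:0000"},
    "fp_b": {"cs:c02f", "ext:0000"},
    "fp_c": {"cs:002f"},
}
query = {"cs:1301", "cs:1302", "ext:0000", "ext:002b"}

matches = closest_fingerprints(query, known, k=2)
print(matches)
```

Jaccard similarity ignores the order in which parameters are offered. A sequence-aware measure such as edit distance would be a reasonable alternative if ordering carries signal.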

Attendees will Learn:
Attendees will learn about a new open-source database that provides application attribution via the TLS ClientHello. Furthermore, the audience will learn about common pitfalls when using this type of information and analysis techniques that make effective use of our newly open-sourced TLS fingerprint database.

Blake Anderson

Senior Technical Leader, Cisco
Blake Anderson currently works as a Senior Technical Leader in Cisco’s Advanced Security Research team. Since starting at Cisco in early 2015, he has participated in and led projects aimed at improving the analysis of encrypted network traffic, which has resulted in open source...

Tuesday January 8, 2019 4:00pm - 4:30pm EST
Grand Ballroom 300 Bourbon St, New Orleans, LA 70130