Deep learning and neural networks have gained incredible popularity in recent years. The technology has grown to be the most talked-about and least well-understood branch of machine learning. Aside from its highly publicized victories in playing Go, numerous successful applications of deep learning in image and speech recognition have kickstarted movements to integrate it into critical fields like medical imaging and self-driving cars. In the security field, deep learning has shown good experimental results in malware/anomaly detection, APT protection, spam/phishing detection, and traffic identification. This DEF CON 101 session will guide the audience through the theory and motivations behind deep learning systems. We look at the simplest form of neural networks, then explore how variations such as convolutional neural networks and recurrent neural networks can be used to solve real problems with unreasonable effectiveness. Then, we demonstrate that most deep learning systems are not designed with security and resiliency in mind, and can be duped by any patient attacker with a good understanding of the system. The efficacy of applications using machine learning should be measured not only by precision and recall, but also by their malleability in an adversarial setting. After diving into popular deep learning software, we show how it can be tampered with to do what you want it to do, while avoiding detection by system administrators.
Besides giving a technical demonstration of deep learning and its inherent shortcomings in an adversarial setting, we will focus on tampering with real systems to expose weaknesses in critical systems built on the technology. In particular, this demo-driven session will focus on manipulating an image recognition system built with deep learning at its core, and on exploring the difficulties of attacking systems in the wild. We will introduce a tool that helps deep learning hackers generate adversarial content for arbitrary machine learning systems, which can in turn help make models more robust. By discussing defensive measures that should be put in place to prevent the class of attacks demonstrated, we hope to address the hype behind deep learning from the context of security, and look towards a more resilient future for the technology, in which developers can use it safely in critical deployments.
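The kind of adversarial manipulation described above can be illustrated with the fast gradient sign method (FGSM) on a toy linear classifier: because the gradient of a linear model's score with respect to its input is just the weight vector, stepping each input feature slightly against the sign of its weight pushes the score across the decision boundary. This is a minimal sketch for illustration only; it is not the tool or the image recognition system from the talk, and every value in it is made up.

```python
# Toy linear "classifier": label = sign(w . x + b).
# All weights and inputs here are illustrative, not from any real model.
w = [1.0, -2.0, 3.0]   # model weights
b = 0.0                # bias term
x = [0.5, 0.2, 0.4]    # a benign input the model classifies as +1

def sign(t):
    return 1.0 if t > 0 else (-1.0 if t < 0 else 0.0)

def predict(v):
    """Return the model's label (+1 or -1) for input v."""
    score = sum(wi * vi for wi, vi in zip(w, v)) + b
    return 1 if score > 0 else -1

y = predict(x)  # original label: +1 (score = 0.5 - 0.4 + 1.2 = 1.3)

# FGSM step: perturb each feature by epsilon against y * sign(w),
# which moves the score directly toward the decision boundary.
epsilon = 0.5
x_adv = [xi - epsilon * y * sign(wi) for wi, xi in zip(w, x)]

print(predict(x), predict(x_adv))  # → 1 -1: a small perturbation flips the label
```

A change of at most 0.5 per feature is enough to flip the label here; against real image classifiers the same idea works with perturbations small enough to be invisible to a human, which is what makes precision and recall alone a misleading measure of robustness.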
Clarence Chio graduated with a B.S. and M.S. in Computer Science from Stanford, specializing in data mining and artificial intelligence. He currently works as a Security Research Engineer at Shape Security, building a product that protects high-value web assets from automated attacks. At Shape, he works on the data analysis systems used to tackle this problem. Clarence has spoken on machine learning and security at PHDays, BSides Las Vegas and NYC, Code Blue, SecTor, and Hack in Paris. He has been a community speaker with Intel, and is the founder and organizer of the ‘Data Mining for Cyber Security’ meetup group, the largest gathering of security data scientists in the San Francisco Bay Area. Twitter: @cchio