Adversarial AI and Cybersecurity


The spy-vs-spy game of AI and machines attacking and eluding each other is here. Adversarial AI is a crucial topic in cybersecurity that we must understand. It is the foundation of using machine learning for cyberwarfare.

Stuart McClure is the founder and CEO of Cylance, which uses AI and mathematical models for endpoint security. He is also the author of the highly respected and influential book, Hacking Exposed.

This video is a short excerpt from Stuart’s appearance on episode 324 of CxOTalk.


Michael Krigsman: The spy versus spy game of AI and machines attacking and eluding each other is here. Adversarial AI is a crucial topic in cybersecurity that we must understand. I spoke with one of the world’s top cybersecurity experts to learn more about adversarial AI. Stuart McClure is the founder and CEO of Cylance, which uses AI and mathematical models for endpoint security. Listen to my conversation with Stuart as we discuss adversarial AI.

What is Adversarial AI?

Stuart McClure: Adversarial AI or offensive AI, sometimes it’s called. I just call it AI versus AI. We have yet to see an adversary of any sophistication leveraging AI in the wild today to defeat AI.

We know that that’s coming. We certainly have anticipated it for many, many years. We actually have a team dedicated to adversarial AI research to build in a sort of preparation for that type of technique going after us.

It will happen. We know that. But for now, we haven’t seen it and we are very, very ready for that and have anticipated that for quite some time.

The way that we do that is we actually try to break our own models, our own AI. By trying to break our own AI, we’re actually anticipating how the adversary would try to break us as well. We do this in real time in the cloud in thousands of computers inside of Amazon AWS. By doing that, we can actually predict and prevent new forms of AI adversarial attacks.
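To make the idea of breaking your own model concrete, here is a minimal sketch. It is not Cylance's method. The detector, its `WEIGHTS` and `THRESHOLD`, and the `find_evasion` helper are all hypothetical, and plain random search stands in for whatever attack techniques a real research team would use.

```python
import random

# Hypothetical toy detector: flags a sample when a weighted sum of its
# features reaches a threshold. Weights and threshold are made up.
WEIGHTS = [0.9, 0.6, 0.3]
THRESHOLD = 1.0


def score(features):
    """Weighted sum of the feature values."""
    return sum(w * x for w, x in zip(WEIGHTS, features))


def is_flagged(features):
    """True when the toy detector would block the sample."""
    return score(features) >= THRESHOLD


def find_evasion(sample, budget=0.5, tries=2000, seed=0):
    """Random search for a nearby variant that the detector misses.

    Each feature may move by at most `budget`. Returns the first
    evasive variant found, or None if the search fails.
    """
    rng = random.Random(seed)
    for _ in range(tries):
        candidate = [max(0.0, x + rng.uniform(-budget, budget)) for x in sample]
        if not is_flagged(candidate):
            return candidate
    return None


malicious = [1.0, 0.8, 0.5]          # flagged by the toy detector
evasion = find_evasion(malicious)    # a variant that slips past it
print("original flagged:", is_flagged(malicious))
print("evasive variant found:", evasion is not None)
```

In a defensive loop, any variant found this way would be labeled as malicious and fed back into training, so the next version of the model closes that gap before an attacker finds it.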

When Will We See These Attacks?

Stuart McClure: In the next probably three to five years, I believe we will absolutely start to see this in the real world being very successful to bypass other technologies, I’m hoping not ours, but possibly ours, to bypass these technologies and to be able to gain a foothold. Right now, we are years and years ahead of the adversary because of this technique. I would say we’re at least three years ahead.

Now, that window might shrink. When it does, then we will have a challenge. But again, we’re spending more research, more time, more effort to make sure that we understand all of the different adversarial techniques and then building that into our improving learning math models will ultimately keep us ahead of the bad guys.

Michael Krigsman: How do you build a system to bypass AI defenses?

Stuart McClure: Well, it really takes three things to build a proper AI or a bypass AI model.

  • The first is the data itself. That’s what you might call resources, at least the first implementation of it is the data, so the examples of what would bypass us. That has to be created somehow.
  • Now, the second thing is the security domain expertise, the ability to know what is an attack that’s successful and what’s not an attack that’s successful and being able to label all of those elements properly.
  • Then the last is the actual learning algorithms and the platform that you use, the dynamic learning system that you’ve created to be able to do this very, very quickly and rapidly.

You need all three elements.

Now, a nation-state could absolutely provide the first and the third without much struggle or problem. The second, which is the domain expertise problem, that is an age-old issue. If you go into the entire security industry right today and you ask, “Well, what percentage of people,” let’s say adversaries in security, “actually know how to create it, find a zero-day, exploit it, and use it?” just a simple example of something that’s quite complex, you’re probably talking about 0.1% of the hackers out there in the world that can do that kind of thing.

Similarly, in the world of defense, the folks that can actually detect a zero-day, prevent a zero-day, and move on to clean it up are probably similar. We’re in the low single digits. Domain expertise is a much more difficult problem to scale. While certainly a large country–China, Russia, what have you–with a lot of resources at hand and a lot of smart people could start to catch up, it becomes just a very difficult scale problem because humans are not easily scalable.

Mathematical Cybersecurity

Stuart McClure: We’ve gone through many evolutions of our algorithms. We use many different types of techniques. Right now, we’ve settled on two great groups of techniques. The first is traditional deep learning algorithms like neural networks. That’s sort of our primary go-to usage. But we also use more sort of anomaly-based algorithms like Gaussian and Bayesian, for example. It just depends on the use.
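As a rough illustration of what a Gaussian anomaly-based check can look like, the sketch below fits a mean and standard deviation to a baseline and flags values far from it. The baseline numbers, the function names, and the cutoff of three standard deviations are assumptions chosen for the example, not details from the interview.

```python
import statistics


def fit_gaussian(values):
    """Estimate the mean and sample standard deviation of a baseline."""
    return statistics.mean(values), statistics.stdev(values)


def is_anomaly(x, mean, stdev, z_cutoff=3.0):
    """Flag x when it lies more than z_cutoff standard deviations from the mean."""
    return abs(x - mean) / stdev > z_cutoff


# Hypothetical baseline, e.g. events per minute on a healthy machine.
baseline = [98, 102, 101, 99, 100, 97, 103, 100]
mean, stdev = fit_gaussian(baseline)

print(is_anomaly(101, mean, stdev))  # close to the baseline
print(is_anomaly(140, mean, stdev))  # far outside the baseline
```

Unlike a classifier, this kind of check needs no labeled attacks: it only models what normal looks like and reports departures from it.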

We’ve applied, now, AI mathematics into, I think, over a dozen different features inside of the technology today to catch all kinds of different attacks. And so, how these algorithms work, it’s really, really simple. You take a large data set of data. You take then the characteristics of all of that data. Then you feed the characteristics, along with the labels, into these learning algorithms. It’ll tell you what are the predictive features that are most predictive of a classification set.
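The pipeline described above (data, characteristics, labels, then a ranking of predictive features) can be sketched in a few lines. Everything here is illustrative: the feature names `packed`, `signed`, and `has_icon`, the eight samples, and the scoring rule (difference in how often a feature appears in each class) are assumptions, and a production system would use far richer features and learning algorithms.

```python
def rank_features(samples, labels):
    """Rank binary features by how differently they occur in the two classes.

    samples: list of dicts mapping feature name -> 0 or 1
    labels:  list of 0 (benign) or 1 (malicious), same length as samples
    """
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    scores = {}
    for name in samples[0]:
        rate_pos = sum(s[name] for s in pos) / len(pos)
        rate_neg = sum(s[name] for s in neg) / len(neg)
        scores[name] = abs(rate_pos - rate_neg)
    # Highest score first: the feature that best separates the classes.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical labeled data set: four malicious files, then four benign ones.
samples = [
    {"packed": 1, "signed": 0, "has_icon": 1},
    {"packed": 1, "signed": 0, "has_icon": 0},
    {"packed": 1, "signed": 1, "has_icon": 1},
    {"packed": 0, "signed": 0, "has_icon": 0},
    {"packed": 0, "signed": 1, "has_icon": 1},
    {"packed": 0, "signed": 1, "has_icon": 0},
    {"packed": 1, "signed": 1, "has_icon": 1},
    {"packed": 0, "signed": 1, "has_icon": 0},
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

ranking = rank_features(samples, labels)
for name, value in ranking:
    print(f"{name}: {value:.2f}")
```

In this toy data set, `signed` separates the classes best and `has_icon` carries no signal at all, which is the kind of answer the learning step is meant to produce.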

AI Models

Michael Krigsman: Explain your approach to AI models.

Stuart McClure: One of the greatest examples I give is I usually tell people, “Just look outside or look out your window and look at people walking by on the street. Now I’m going to give you a challenge. Think of three qualities of each person walking by that would give you a high probability detection that they are a man or a woman.”

Now, of course, this is a controversial topic but something that is, I think, quite interesting to talk about. You could look at them and say, “Well, look, long hair tends to be predictive of women or females, but not necessarily. It’s maybe only 90%. Facial hair might be highly predictive of men. Not 100%, but maybe 90%.” Adam’s apple, clothes, you name it, there are all kinds of qualities that you would probably come up with as you start to look through this.

Now, just take those three or four features, these characteristics. Now plot that in a three-dimensional graph or a four-dimensional graph if you have four qualities. Then now stick these learning algorithms into that graphing matrix in memory and start to learn from it.

What’ll happen is, you keep training each new sample that this is a woman, this is a man, this is a woman, this is a man, and you pull all these features. You’ll start to learn that, yes, truly, these characteristics–hair length, Adam’s apple, things like dress–are highly predictive of a man versus a woman. Now, it doesn’t mean it’s 100%, but if you learn enough from enough people around the world, you can probably get to 99.99%, and that’s the same kind of concept.
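One way to see why several imperfect features can add up to a confident answer is a naive Bayes calculation. The sketch below is only an illustration: it assumes the features are independent, that each one is 90% predictive (the figure used in the example above), and that both classes are equally likely beforehand. The `posterior` function is a hypothetical helper, not part of any product.

```python
from math import prod


def posterior(likelihoods_a, likelihoods_b, prior_a=0.5):
    """P(class A | observed features), assuming independent features.

    likelihoods_a[i] is P(feature i observed | class A), and
    likelihoods_b[i] is P(feature i observed | class B).
    """
    joint_a = prior_a * prod(likelihoods_a)
    joint_b = (1 - prior_a) * prod(likelihoods_b)
    return joint_a / (joint_a + joint_b)


# One feature that is 90% predictive on its own.
one = posterior([0.9], [0.1])

# Three such features, all observed together.
three = posterior([0.9] * 3, [0.1] * 3)

print(f"one feature:    {one:.4f}")
print(f"three features: {three:.4f}")
```

Under these assumptions a single feature gives 90% confidence, while three agreeing features give roughly 99.9%. Real features are rarely independent, so actual gains are smaller, but the direction of the effect is the same.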

Now, instead of three or four features of a classification, for us, we mapped over two million features. That’s how advanced the machine learning and the feature extraction has become in our world.

Michael Krigsman: I hope you’ve enjoyed this conversation on adversarial AI. Right now, please subscribe to our newsletter and subscribe to our YouTube channel.