Cybersecurity Sessions #7: AI in Cybersecurity – A Double-Edged Sword
Learn the latest in artificial intelligence with a Mimecast data scientist.
It’s likely that we encounter artificial intelligence more often than we realize. Just as AI can be used to facilitate fraud and spread misinformation via deepfakes and sophisticated identity theft, it can also be used to develop algorithms that detect cyber-attacks in the blink of an eye.
In this episode of the Cybersecurity Sessions, Andy is joined by Elaine Lee (Data Scientist, Mimecast) to delve into the benefits and risks of AI in cybersecurity, examining how artificial intelligence can be used both as an offensive weapon by adversaries and by security teams to defend against attacks.
Elaine Lee, Data Scientist at Mimecast
Elaine has worked in industries ranging from government and healthcare to, now, cybersecurity with Mimecast. Her math degree has been her passport to many exciting opportunities. She enjoys methodical problem solving and believes that solutions can always be found in the data.
- Why AI adoption is rapidly increasing across industries
- How AI is being used to make fraud more sophisticated and widespread
- The cybersecurity defenses built using AI and machine learning algorithms
- The role of humans to supervise and keep AI controlled in the future
Andy Still 00:00
Good day everyone and welcome. Here we are again for the latest instalment of the Cybersecurity Sessions, our regular podcast talking about all things cybersecurity with myself, Andy Still, CTO and co-founder of Netacea, the world's first fully agentless bot management product. So today we're talking about email, a tool that we all rely upon, but also one that's historically been the entry point for many cyber-attacks, from malware distribution to phishing attacks, and with the growth of AI these attacks are getting ever more sophisticated. We're lucky today to be joined by Elaine Lee, data scientist with Mimecast, who can tell us some more about how AI is being harnessed as an attack vector, but also as a means of defense. So welcome, Elaine. Great pleasure to talk to you today.
Elaine Lee 00:42
Thank you, Andy. Happy to be on the talk today.
Andy Still 00:45
Before we start, could you quickly introduce yourself for our listeners?
Elaine Lee 00:49
My name is Elaine and I'm a data scientist at Mimecast, primarily focused on incorporating AI and machine learning into email security products. This is actually my first stint in the cybersecurity field and I'm really enjoying the dynamic aspect of it, especially since there's an adversarial and fast-moving component that keeps us on our toes, really. Previously, I have worked in healthcare and fintech and the federal government. So yeah, happy to be here and to share my knowledge about this topic.
Andy Still 01:18
Thank you very much. So I think everywhere we turn at the moment, there seems to be talk of new problems being solved with AI. So as an opening question, is AI taking over the world?
Elaine Lee 01:28
Is it taking over the world? Well, we're definitely seeing more of it, that is true. There has been an accelerated adoption, thanks to the performance improvements of hardware, just leaps and bounds over the last decade, hardware is getting more powerful, storage is getting cheaper, compute power is getting faster, and also cheaper. So this hardware advancement has really facilitated some of the growth. And also we have a lot of experts and practitioners who have built a lot of commercial off the shelf tools that allow anyone to basically incorporate some form of AI in their products.
Andy Still 02:04
Yeah, I think it's definitely moved out of the regime of pure academic or very sophisticated expertise into much more mainstream, so you can with minimum kind of knowledge go on to on demand systems, such as AWS, etc., and easily get up and running with image recognition or something, that previously was the domain of sophisticated AI, those kinds of things like you say they are an absolute game changer. My other question around a lot of the AI systems, speaking from the background of having built a complex AI system, and we see our product go head-to-head with other products that claim to have AI, how real do you think a lot of the claims of things that purport to be true artificial intelligence actually are?
Elaine Lee 02:47
That's a good question. I think it's helpful to distinguish these buzzwords that we're seeing a lot more of, especially in product documentation and marketing materials. AI specifically has actually been around for a long time. At its core, the definition of AI is really a system that contains rules or instructions that tell a computer how to perform a task. So depending on what it is, even just a simple computer program of the kind people have been writing for the last 40 years could fall under the classification of AI, if it's just instructing the computer to perform a specific task, often a categorization task, for example. So broadly speaking, a lot of things could be defined as AI. Machine learning, I like to think of it as a subset of AI. So machine learning is less about receiving a well-defined set of instructions on how to perform a task and more about receiving a large set of examples to learn from, to infer patterns from, and then using those inferred patterns to perform the task itself. So that's the difference between machine learning and artificial intelligence. And regarding these marketing materials that companies are putting out about their products containing AI: yes, I believe that's a relatively low bar to achieve in this day and age. So that's not particularly groundbreaking or informative in terms of product capabilities. But if they do mention machine learning, that might be worth paying attention to. There might be something special going on there.
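The distinction Elaine draws here, hand-written instructions versus patterns inferred from examples, can be sketched in a few lines of Python. This is only an illustrative toy, not anything described in the episode: the word list, the example messages, and all function names are hypothetical, and the "learning" is nothing more than counting which words show up in suspicious versus benign examples.

```python
from collections import Counter

# Rule-based approach: a person decides in advance which words are suspicious.
SUSPICIOUS_WORDS = {"prince", "wire", "urgent"}  # hypothetical hand-written rule


def rule_based_is_suspicious(text):
    """Flag the text if it contains any word from the hand-written list."""
    return any(word in SUSPICIOUS_WORDS for word in text.lower().split())


def learn_word_scores(examples):
    """Infer a score per word from labeled examples.

    A word's score is the number of suspicious examples containing it
    minus the number of benign examples containing it.
    """
    bad, good = Counter(), Counter()
    for text, is_suspicious in examples:
        target = bad if is_suspicious else good
        target.update(set(text.lower().split()))
    return {word: bad[word] - good[word] for word in set(bad) | set(good)}


def learned_is_suspicious(text, scores):
    """Flag the text if its words lean suspicious overall (unknown words score 0)."""
    return sum(scores.get(word, 0) for word in set(text.lower().split())) > 0


# Hypothetical labeled examples: (message, is_suspicious)
EXAMPLES = [
    ("please send gift cards today", True),
    ("buy gift cards for the client now", True),
    ("meeting notes for today attached", False),
    ("lunch with the client on friday", False),
]
scores = learn_word_scores(EXAMPLES)
```

With these examples, the hand-written rule misses "send gift cards" because nobody listed those words, while the learned scores flag it because the pattern was present in the examples.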
Andy Still 04:28
Yeah, and the other buzzword you hear as well is deep learning. What's your take on the difference between machine learning and deep learning?
Elaine Lee 04:35
Deep learning is a type of machine learning that borrows inspiration from biology, specifically neuroscience. In fact, deep learning used to be more commonly known as artificial neural networks. And as you can guess from the name, a neural network is mimicking the behavior of the human brain and how the human brain learns. Now why is it called deep learning? What does deep actually refer to exactly? The deep refers to the different layers of perception that are in a deep learning system. So kind of mimicking how a human brain works when it observes something in front of itself, it doesn't notice everything at the same time. For example, if I saw an animal in front of me, I might notice first how big it is, whether it's large, small, medium sized, that's like the first thing I would notice. And then the second thing I would notice is probably its skin, whether it's very smooth, or scaly, etc., etc. And then I might notice some other details about its face or its tail. And you know, all this stuff is perceived gradually. And the deep learning system mimics this behavior, by learning things gradually, that is. So for example, one of the first popular deep learning models was actually built to detect handwritten digits. This, as you can imagine, was very useful and practical for the Postal Service, which was still processing mail with handwritten digits on the envelopes representing the zip codes. So if you were to peek under the hood of this deep learning model, you may see a layer that corresponds to looking for horizontal straight lines that go left to right. So some digits will have this feature represented very strongly, such as the number five, with its little hat at the top of the digit. The next layer may be looking for very straight and vertical lines that go top to bottom. So some digits that might exhibit this feature strongly are the number nine, for example, or the number seven. And then the third layer may be looking for curves in the digits.
So digits that have this feature very strongly, as you can imagine, are the number eight, and also the number two to a lesser degree. So a deep learning system, or an artificial neural network that is trained to identify digits, will perceive the features of the digits in this manner. As it relates to cybersecurity, I just gave two visual examples, images of digits, and also, you know, a very visual perception of animals in front of me. So as you can imagine, this can be applied to, you know, anything that's visual, images, video, etc. Another common application is typed or written text, human text, if you will. A deep learning system can pick up on attributes about the text in question. So in a nutshell, that is what deep learning is.
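The idea of detectors that respond to horizontal or vertical strokes can be illustrated with a very small sketch. It is a simplification under stated assumptions: the 5x5 images and the two stroke patterns below are hand-made for illustration, and the patterns are fixed by hand, whereas a trained network would learn its own filters from many example digits.

```python
# 5x5 binary images (1 = ink). Hand-drawn and purely illustrative.
SEVEN = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
ONE = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

# Hand-set stroke patterns; a real network would learn these.
HORIZONTAL = [[1, 1, 1]]      # three ink pixels side by side
VERTICAL = [[1], [1], [1]]    # three ink pixels stacked


def response(image, kernel):
    """Slide the kernel over the image and count windows it matches completely."""
    kh, kw = len(kernel), len(kernel[0])
    needed = sum(sum(row) for row in kernel)
    hits = 0
    for r in range(len(image) - kh + 1):
        for c in range(len(image[0]) - kw + 1):
            overlap = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            if overlap == needed:
                hits += 1
    return hits


def stroke_features(image):
    """Return (horizontal response, vertical response) for an image."""
    return response(image, HORIZONTAL), response(image, VERTICAL)
```

On these toy images, the seven responds to both detectors because of its top bar and its stem, while the one responds only to the vertical detector. Later layers in a real network would combine such responses into a decision about which digit is present.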
Andy Still 07:33
Yeah, I think we're looking at the power of AI. And I think, you know, there's a view of artificial intelligence, and then, intelligence is a very wide-ranging word. But what artificial intelligence does is it allows specific tasks to be solved very intelligently, beyond a set of simple instructions. So it can learn how to do that in a relatively nuanced way. But these systems tend to be very focused on solving very specific problems. And I think that segues quite nicely into talking about your experience in cybersecurity, and particularly around some of the challenges in email security as you're looking at it now. Can you go into a bit more detail about the kind of AI approaches that you're seeing to evolving threats in the email area?
Elaine Lee 08:17
So attackers are definitely incorporating more AI into their attacks. That's for sure. One strategy I could think of, it's a bit more nuanced, is basically gathering vast amounts of information about the target in question. This usually informs a sophisticated social engineering type attack, where they definitely have to do a lot of research and investigative collection on the target first, and then secondly, they use that information to craft a social engineering attack that is likely to entrap the target in question. Machine learning enables them to craft a more convincing attack. I mean, the attack would have happened regardless, but it may have happened with less finesse if they did not incorporate machine learning or artificial intelligence into it. You know, before, it would just be a less finessed attack. Maybe it would just be an email from some random person like the Nigerian prince, if you will, asking you for money. That's like a not very finessed attack. That's the before. And now, after, with machine learning capability, or what have you, there's just access to more data about people, about the target. You can figure out who their CEO is, that's like an easy one, but, you know, that's made possible by the data that's now available online, such as LinkedIn. You could create a more targeted attack that way: instead of pretending to be a Nigerian prince, you pretend to be the CEO. And then maybe a step further, if you get your hands on the information, you can pretend to be their direct manager. Crafting the attack such that the sender seems to have a close relationship with the target makes those attacks more convincing, and the subject more likely to fall for it. So yeah, this is all possible based on the availability of information. With some machine learning techniques applied, they can identify who's close to the target, and then impersonate that person.
So the attacker has a greater likelihood of succeeding.
Andy Still 10:16
Back in the past, you would have kind of phishing attacks, which will be very much kind of scattergun, they would just fire off tens of thousands of emails without much intelligence. Then you might have spear phishing attacks where there would be a kind of human involvement to target a particular person, learn more about that person, their potential weaknesses. From what you're saying, it sounds like what the use of AI has done, is allow the spear phishing attack to be much wider and much closer to a phishing attack, because they can gather that data, they can apply that human like intelligence to that data to make it a targeted attack. So you're getting the benefits of the spear phishing attack, but with the effort of a phishing attack.
Elaine Lee 10:58
Yep. I could give another example of how machine learning has enabled attackers in a way. So this one's a little bit more nuanced. And it involves the attacker suspecting that the target is running some sort of machine learning model, or even just AI defense, some sort of defense system at the target site. And then the attack strategy then becomes how do I trick that defense system to lower its guard, so my email has a greater chance of landing in the inbox? And then from there, you know, once you're in the gate, then if the user clicks on it, then the rest is history, right? So there's that sort of attack, basically, AI versus AI, right. So it's trying to fight the defense system. And depending on what it is, it's, yeah, they could use AI, or a bunch of automation to barrage the defense system and get it to lower its guard to trick the system in some ways.
Andy Still 11:54
A lot of these techniques are just about getting people to lower their guard, to put their trust in what they're seeing. I know one of the things that was mentioned, and I think this is some research that Mimecast have done, was around the use of deep fakes as part of phishing attacks. Could you just share some more about how deep fakes are used as part of this?
Elaine Lee 12:14
Deep fakes are really just made possible by all this widely available data that's being recorded, such as this conversation, for example. So you know, all this data, all this audio and video and image footage is out there on the internet now, and it's getting easier and easier to find it and to categorize it, to find recordings of you, Andy, online, and know that it is actually you, and to just create these convincing bot-like entities that can go about and pretend to be you and just do all sorts of awful things. Yeah, it's made possible by all this available data. And that does give rise to deep fakes. At its most basic, it could just be the splicing together of various words to make a coherent sentence that way. That is at its most basic; obviously, you know, it's probably not going to sound very good. But again, back to the whole advancement in computing, the reduction in cost of computing technologies, the powerful computers that are available to people, you can use all these fancy audio and video processing features to just smooth out that content, that fake content, and deliver it to the targets in question. We did say in one of our articles that Mimecast put out, specifically the one in Intelligent CISO back in 2019, that a cloned voice sample can be developed with four seconds of footage, which is pretty amazing when you think about it. I mean, just me saying the sentence was probably four seconds already.
Andy Still 13:53
I mean, I think it's what you talked about before, about getting people to lower their guard against things. One of the things that people have inherently always trusted is when they could actually hear the voice of the person. If you got the email telling you to transfer thousands of dollars to another account, you would doubt it. But if you got a voice message from the person, again, that adds another layer. So I think it's just constantly looking at getting people to lower their guard. And the idea that these could be scouring the internet right now to get appropriate voice signals. And almost everyone is recording something and putting some kind of video out there; particularly senior members of C-suites of most companies have a presence out there. So the idea that it's as little as four seconds that it takes to be able to do that. Again, you just need to be aware that this is something that can be happening and adjust your processes around it. But awareness of that, I think, is very fascinating, sort of how easily these things can be generated.
Elaine Lee 14:49
Andy Still 14:49
I think we've talked a lot about how AI is being used for bad and how AI is targeting bypassing defenses on emails, your day is spent actually trying to protect our email. So how are you actually using AI as part of your cybersecurity defense approach?
Elaine Lee 15:07
In summary, the best way that AI helps out from a defense standpoint is we just have to play to the strengths of AI. So AI systems, they are computers after all, they are very good at processing vast amounts of data. They're good at remembering things that humans can't, yeah, humans can't remember everything. But AI systems can remember things pretty well. They're also very good at processing a bunch of information at the same time. So that combination makes them very good at anomaly detection. So they can pick up on things, deviations from normal behavior, much, much more easily than a human could. So a lot of strategies are centered around that sort of theme. So as a result, a lot of AI systems that we in the cybersecurity world build are centered around what's weird. Let's build a system that's very good at alerting when things are a little weird. And let's tell it to look at these sorts of characteristics. And if there are any deviations in these sorts of characteristics, let's raise a red flag and get a human to look at it. So at a high level, that's how AI helps us from a defense standpoint.
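The "flag what's weird and get a human to look at it" idea can be sketched with a basic statistical check. This is a minimal illustration and an assumption on our part, not a description of how Mimecast's products work: it compares a new observation (say, the number of emails a sender delivered today) against that sender's history using a z-score, and the function name, threshold, and sample numbers are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, new_value, threshold=3.0):
    """Return True if new_value deviates strongly from the historical baseline.

    history: past observations (at least two values).
    threshold: how many standard deviations count as "weird" (assumed value).
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # No variation in the past, so any change at all is unusual.
        return new_value != baseline
    return abs(new_value - baseline) / spread > threshold


# Hypothetical daily email counts from one sender over the past week.
daily_counts = [10, 12, 11, 9, 10, 11, 12]

routine_day = is_anomalous(daily_counts, 11)   # within the usual range
burst_day = is_anomalous(daily_counts, 40)     # far outside the usual range
```

A flagged value here is only a prompt for review. As the conversation notes, it may turn out to be a false positive, which is why a human stays in the loop.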
Andy Still 16:18
So would you say that AI is now a fundamental part of your defense strategy?
Elaine Lee 16:24
Yes, yes, it is. And we have various products that basically play off of that theme I just described. These products are all situated in various parts of the security defense systems. For example, I primarily work on inbound emails going to our customers' users, just building defense systems around that, analyzing content and communication patterns between our customers' users and the senders of the emails, just simply looking at communication patterns, a little bit of content analysis. That's just me specifically. There are others at the organization who do a deeper analysis on the contents of the emails, and they build machine learning and AI models around content analysis, looking at attachments, identifying URLs embedded within emails as potential risks or potentially safe. There's a lot of AI work going on across Mimecast.
Andy Still 17:18
So I'm thinking about the kinds of phishing attacks, I'm thinking particularly of the kinds of emails sent to people in finance departments, asking them to make payments, claiming to be on behalf of the CEO. Is that the sort of thing you're looking at, even down to the actual kind of language that is used in those emails? Are you training those based on known attack vectors? Or are you training those on the typical content that the user will be looking at? Or is it a combination of those?
Elaine Lee 17:43
Definitely a combination of those. We do have specific teams that are dedicated to researching, identifying and incorporating knowledge about known attack vectors into the systems, there's definitely that aspect of the work going on. But in order to be agile and to respond to novel attack types, we do have to look at machine learning and AI to incorporate those components into the system to, again, going back to the anomaly detection type of theme. If they see something unusual flag it, maybe it's a false positive, but at least flag it. And then if it ended up being actually malicious, then that's a good thing that we flagged it. The AI and machine learning system is definitely crucial for identifying things that are never seen before. Whereas these known attack vectors, well, they are known for a reason. They've been seen before. So yeah, so we definitely have to use a combination of both.
Andy Still 18:36
Okay. I think one of the things whenever I've looked at AI systems, AI is good for solving certain problems. Humans are good for solving certain problems. And the combination of humans plus AI is usually the way to go. Does that kind of reflect your view of the world as well?
Elaine Lee 18:52
Yes, I totally agree with that. Humans definitely need to be working closely with the AI systems and to, you know, be very involved in the development of these AI systems. We definitely need that human in the loop, because I think it kind of does go back to something we spoke about earlier in the conversation about AI taking over the world. There seems to be that perception, yes. But in order to prevent that from truly happening, we definitely need the humans to stay closely involved and to monitor the AI systems, make sure they don't adapt too quickly and in bad ways. Honestly, the little joke that I like to make is, if you've ever written a program that infinitely looped, you know what that's like, that's a system running amok. And AI systems also can do something similar, where they could just get into their own little local maxima or local minima, they just get stuck somewhere. And then they just keep doing the same thing over and over again. You know, examples that we have seen in real life of an AI system going amok like that typically have been in recommendation systems. For example, there were some criticisms about the YouTube recommendation algorithm going down very dark paths, shall we say, just the quality of the recommendations getting kind of bizarre and strange and not desirable. So that's an example of, you know, a system that needs maybe not so much direct human supervision, but definitely some safeguards engineered by humans, you know, just put into that system. And that's why humans definitely need to work closely with the AI systems that they build.
Andy Still 20:24
Yeah, I absolutely agree with that. Like we've said earlier, intelligence is a very nuanced word. And artificial intelligence tends to solve very specific problems. But it doesn't bring that human intuition with it, or, for want of a better word, the common sense that you would get from having a human vet your YouTube recommendations. And there are plenty of other examples where you see people trying to solve problems with AI, and then humans have got involved and have managed to game the AI to do something that it clearly wasn't intended to do. Just before we wind up, is there anything else that you would like to share with us today around the subject of AI in cybersecurity?
Elaine Lee 21:01
Yes, yeah, I would also like to share, AI has been used to accelerate the sophistication of defense strategies. So we did talk a little bit about this already. But something that I also want to point out that we do at Mimecast is we're also using AI to craft more convincing awareness training modules for our customers' users. So you know, this kind of goes back to the human and defense theme, where humans can sometimes be the best defense against these attacks. So, you know, if humans are really the best ones, then we should make sure they have the information and the know-how to defend themselves. Actually, awareness training is a pretty big part of our product. And as the name implies, we craft these scenarios, these fake but real-looking emails that we deploy to the users at the customer site periodically. They're crafted to look very convincing. And if users interact with one, then they're notified that they failed the exercise and taught what happened: this is what an attack could look like, and in the future, to protect yourself against this attack, use another method of verification of the content in your email. So these awareness training exercises, it's not a novel concept. But we have started using AI and machine learning to make more nuanced scenarios. Kind of like, you know, awareness training used to be the Nigerian prince emails. But now it's more of, oh, let's craft a scenario where your direct manager is emailing you to Venmo them 200 bucks or something like that. So yeah, awareness training is even more important. Yeah, so it's just educating the humans, and they are often the best defense, especially against zero-day attacks or novel types of attacks. So we cannot underestimate how important humans are in this whole defense strategy.
Andy Still 22:58
No, I think that's really good, because I think humans are easily the weak point, but can also be the strong point if you're appropriately training people. I mean, I've been on the other side of those awareness attacks, of having fake phishing emails sent to our address, and some of them are very convincing.
Elaine Lee 23:17
Andy Still 23:17
And having seen examples of real phishing emails, you can absolutely understand why people are falling for them, that they play on known weaknesses, like not wanting to question the boss or, you know, like you say, the amount seeming reasonable. One thing we've touched on with the deep fakes thing resonated with me. We've seen quite a lot recently of people just requesting WhatsApp numbers from people.
Elaine Lee 23:40
Andy Still 23:41
And I presume that is a way of them bypassing defenses, because things go straight to WhatsApp, which doesn't have the same kind of protection that we have around emails, things like that. I don't know if that's something that you've seen rising as well.
Elaine Lee 23:55
Oh, well, I have gotten a lot more spammy messages on WhatsApp recently. I'm not entirely sure why, but yeah, that's, yeah, I've definitely seen that a lot recently. And, yeah, these attackers, if there's, you know, there's so many different media to reach people. So they're gonna try everything. I guess we just always have to just be vigilant and, and you know, us humans, we have a pretty good intuition of what's normal versus what's not normal of the people that we spend time with, including our coworkers. So if something seems a little off, we should heed our intuition and proceed with caution.
Andy Still 24:30
Yep, definitely use your human intuition. Thank you very much, Elaine, for joining us today. Thank you everyone else for listening in. If you've got any feedback, please tweet to our Twitter account @CyberSecPod. Subscribe, leave a review, and for any questions, you may want to use good old-fashioned email, which will be protected by Mimecast. You can get to us at firstname.lastname@example.org. So thank you very much for joining us today.
Elaine Lee 24:55
Thank you, Andy. Thank you for having me. This was a wonderful conversation. And of course if you're in the market for email security products, definitely check out Mimecast at mimecast.com. And we also have our own podcast called Phishy Business, which you can find on Spotify or wherever you listen to podcasts. So again, thank you, Andy. This was wonderful.
Andy Still 25:15
Thank you very much, Elaine. And thank you everyone, and we will see you again in the next episode.