Combining Machine Learning With Behavioural-Based Bot Mitigation

By Richard Jones / 11th Dec 2018

Machine learning and Artificial Intelligence (AI) are often used interchangeably and sprinkled, like fairy dust, over bot detection security products and services. Anyone who has recently attended a security trade show will know only too well the difficulty of separating the vendor hype from the reality.

In this blog, we outline exactly how our machine learning works to protect your enterprise estate.

Netacea Focuses on Machine Learning to Determine The Intent of Attackers.

We’ve built our behavioural engine from the ground up as a machine learning tool for detecting bots and anomalous traffic behaviour for enterprise estates. Our adaptive threat architecture (ATA) adapts itself to the actual threat on your web estate using machine learning and is at the heart of what we do.

Both traditional rules-based systems and supervised machine learning models are based on historical data which has been labelled in the past. Every time we complete an image Captcha by clicking on the road sign, tree or hills in the photographs, we are labelling the images for Google to help train the image recognition software with correct data.

This is a very time-consuming and laborious task, but the exhaustive training does eventually produce very effective results. In this use case, the machine learning might be detecting the subtle difference between a roof with a tree next to it and a wooded hill in the background. The significant difference from cybersecurity is that the roof isn’t at risk of being deliberately camouflaged to look like a hill.

In the cybersecurity world, labelling data becomes much harder. Cybercriminals actively look to defeat all detection methods, including machine learning. They employ sophisticated reconnaissance techniques to gain the intelligence needed to work around existing rule sets, and if the machine learning is based purely on historical analysis, they will morph the attack too quickly for it to keep up. They simply find the weak spots in the defence and organise a new attack that takes maximum advantage of them.

Many of the latest attacks we are seeing - particularly the massive Account Takeover (ATO) attacks - change IPs and user agents so rapidly, across large multi-country botnets, that any traditional rules-based approach, geographic filter, or rate-limiting rule just won’t work.

Worse, these rules-based approaches often create very high false-positive rates. The rules are often put in place in an emergency to stop a particular attack. They lack any rigorous statistical validation process and are, for the most part, just ad-hoc policies created on the fly for each incident. Over time, these rules mutate into an endless list of blocked IPs and user agents, which is out of date the moment it is written and becomes more and more unmanageable.

To address the limitations of a purely historical approach, one of the newer machine learning techniques is unsupervised machine learning, which looks at the total population of events and uses anomaly-based detection methods to score which actors fall outside a known set of behavioural patterns.

The trigger here is not previously labelled data, so the unsupervised models work effectively to determine the anomalous behaviour. This goes far beyond simply looking for increases in traffic levels. The machine learning is analysing all the behaviour of each input in the overall system, and is effectively using the standard deviation away from the normal behaviour to detect any outliers. This method is particularly effective at detecting the ‘slow and low’ types of attacks that are programmed by the cybercriminals to evade rules-based detection, and works particularly well for any new attack types that have not been seen before.
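The standard-deviation idea can be illustrated with a minimal sketch. It is not Netacea's implementation: the single behavioural feature (login attempts per session), the function name `zscore_outliers`, and the threshold of 3 are all assumptions made for illustration. It scores each visitor by how many standard deviations it sits from the population mean, with no labelled data involved.

```python
from statistics import mean, pstdev


def zscore_outliers(feature_by_visitor, threshold=3.0):
    """Return {visitor: z_score} for visitors far from the population mean.

    `feature_by_visitor` maps a visitor id to one numeric behavioural
    feature (hypothetical here: login attempts per session).
    """
    values = list(feature_by_visitor.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # Every visitor behaves identically, so nothing is anomalous.
        return {}
    return {
        visitor: (value - mu) / sigma
        for visitor, value in feature_by_visitor.items()
        if abs((value - mu) / sigma) > threshold
    }


# Hypothetical traffic: twenty ordinary visitors and one suspect.
traffic = {f"visitor-{i}": count for i, count in enumerate([9, 10, 11, 10] * 5)}
traffic["suspect"] = 500
outliers = zscore_outliers(traffic)
```

A production system would score many features per visitor rather than one, but the principle of measuring distance from normal behaviour instead of matching a known signature is the same.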

While the results from the unsupervised models are often very effective, machine learning requires very large volumes of data analysis and considerable processing overhead. From a cybersecurity perspective, this means the response is inevitably going to be delayed. If you are under attack, a purely behavioural approach won’t give you the response times you need.

At Netacea our machine learning approach is to combine both the supervised and unsupervised models into one overall ensemble model.

This allows us to have the best of both models. Most attackers use standard tools and don’t bother to re-tool for each attack. They just use the same tool on lots of websites until they succeed. The supervised model will detect the vast majority of these attacks, as we will have seen this type of attack before. Adding the unsupervised model’s data on new attacks, which fall outside established behaviour in an anomalous way, gives us the best of both worlds. We can thwart known attack types and have a sophisticated baseline that measures anomalous behaviour, so we have an early warning for new attack types.
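One simple way to picture combining the two models is sketched below. This is an illustrative assumption, not a description of Netacea's ensemble: the names `Verdict` and `ensemble_verdict`, the 0-to-1 score ranges, and the alert threshold are all hypothetical. The combination rule used here is "either model can raise the alarm": the higher of the two scores wins, and the verdict records which model drove it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    score: float  # combined score in [0, 1]
    source: str   # which model drove the decision
    alert: bool   # True if the combined score crosses the threshold


def ensemble_verdict(known_attack_prob, anomaly_score, alert_threshold=0.8):
    """Combine a supervised and an unsupervised score (both assumed in [0, 1]).

    known_attack_prob: supervised model's probability of a known attack type.
    anomaly_score: unsupervised model's measure of deviation from the baseline.
    """
    if known_attack_prob >= anomaly_score:
        score, source = known_attack_prob, "supervised (known attack type)"
    else:
        score, source = anomaly_score, "unsupervised (anomalous behaviour)"
    return Verdict(score=score, source=source, alert=score >= alert_threshold)


# A recycled tool the supervised model has seen before.
recycled_tool = ensemble_verdict(known_attack_prob=0.95, anomaly_score=0.30)
# A new attack: no known signature, but far outside normal behaviour.
new_attack = ensemble_verdict(known_attack_prob=0.10, anomaly_score=0.92)
# An ordinary visitor.
ordinary = ensemble_verdict(known_attack_prob=0.05, anomaly_score=0.12)
```

Taking the maximum is only one possible rule; a weighted average or a trained meta-model are common alternatives when the two scores should temper each other.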

We also add a third machine learning module to improve the overall efficacy of the detection. We use a co-training module to examine the results of both the supervised and unsupervised models, which helps identify the potential false positives and negatives generated by the ensemble model and helps us with further training.

The problem with the core behavioural analysis in both machine learning methods - including combining them in one ensemble approach - is the speed of detection and the sheer amount of processing power required. Determining the behaviour of each and every visitor according to the specific site path, time of day, and user behavioural characteristics takes massive processing power and just can’t be done in-line quickly enough to stop attacks instantly.

Our approach at Netacea is to add all the deep security fingerprinting and signature-based detection at the network edge as data enrichments into the ensemble behavioural layers. This goes beyond simple binary checks, which require no machine learning whatsoever. For example, a JavaScript fingerprint may seek to prove that a particular agent is self-reporting correctly, and the agent will either pass or fail this particular test. However, any JavaScript is by its very nature subject to reverse engineering and possible abuse. This adaptive threat architecture (ATA) gives us the ability to react instantly to known bad actors while using our predictive supervised layer to find and prevent anomalous new attack types.

Adaptive Threat Architecture

The adaptive threat architecture (ATA) means that our cloud-based behavioural layer does all the heavy lifting and is completely independent of your estate. All the security elements are at the edge to minimise latency, and we don’t sit in line or operate any customer premises equipment that is expensive and hard to install and maintain. This means we don’t operate a reverse proxy or add any other element of risk to your existing network. Instead of using a whole suite of JavaScript-based scripts that have to be amended and updated as cybercriminals reverse engineer them, we use a JavaScript API to add the tag as and when it’s needed, according to the actual threat and the business priorities set by the estate owner.

We’ve also worked very hard to ensure that we support as many CDNs and WAFs as possible, so we can integrate with your estate and deliver the intelligence you need through your existing infrastructure.

By combining all the security data, such as IP digital provenance, known botnets, JavaScript-based client fingerprints, and the telemetry from captcha services, as data enrichments into the core behavioural engine, we can greatly improve the speed of detection and improve the core machine learning with additional labels.

This adaptive threat architecture (ATA) is outlined below.

Netacea Adaptive Threat Architecture

All the data is finally expressed as an overall threat score, which can then be applied dynamically in a central business ruleset.

Instead of relying on on-the-fly policies that can’t be managed, we allow our customers to finally regain control over the business rules and logic. Controlling these core business priorities allows our customers to put in place policies that reflect changes to regulatory requirements and the appetite for risk across the estate.

For each mitigation, you can then clearly see the entire mitigation chain - from the business rule, to the threat type and finally the action taken on that particular visitor type.
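As a rough sketch of how a threat score might be applied in a central business ruleset, consider the example below. Everything in it is hypothetical: the rule names, threat types, score thresholds, actions, and the `mitigate` function are illustrative assumptions, not Netacea's product configuration. Each result carries the full chain of business rule, threat type and action, so the decision can be traced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BusinessRule:
    name: str         # human-readable rule owned by the business
    threat_type: str  # threat category the rule applies to
    min_score: float  # threat score at or above which the rule fires
    action: str       # mitigation to apply


@dataclass(frozen=True)
class Mitigation:
    rule: str
    threat_type: str
    action: str


# Hypothetical ruleset, ordered from most to least severe per threat type.
RULES = (
    BusinessRule("protect-login", "account_takeover", 0.8, "block"),
    BusinessRule("challenge-suspicious-login", "account_takeover", 0.5, "captcha"),
    BusinessRule("limit-scrapers", "scraping", 0.7, "rate_limit"),
)


def mitigate(threat_type: str, score: float, rules=RULES) -> Optional[Mitigation]:
    """Return the first matching rule's mitigation chain, or None to allow."""
    for rule in rules:
        if rule.threat_type == threat_type and score >= rule.min_score:
            return Mitigation(rule.name, rule.threat_type, rule.action)
    return None


blocked = mitigate("account_takeover", 0.9)
challenged = mitigate("account_takeover", 0.6)
allowed = mitigate("scraping", 0.2)
```

Because the thresholds and actions live in one ordered list rather than in scattered emergency rules, changing the appetite for risk means editing a single value, and every mitigation can be traced back to the rule that produced it.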

Together, the business rules work with the adaptive architecture layer to finally provide a comprehensive framework for truly managing your visitors, be they friend or foe.

View our Adaptive Threat Architecture datasheet to learn more about our adaptive behavioural approach.
