Blog | 20th Jan 2022 / 09:00

Netacea’s approach to machine learning: unsupervised and supervised models

Sian Roach Cybersecurity Content Specialist

Our world is driven by technological innovation. Recent years have seen many companies adopt artificial intelligence (AI) and machine learning technology to analyze larger data sets and perform more complex tasks with faster and more accurate results. This is not limited to technology-based industries such as computer science. Many industries now work continuously to enhance their technology to keep up with consumer expectations, with data-based decision making often central to this drive.

Designed to imitate the way that humans learn, machine learning uses data and algorithms to gather knowledge and gradually improve accuracy over time. Of the many approaches to machine learning, the two most commonly used are supervised learning and unsupervised learning. This post outlines the differences between them, the benefits and drawbacks of each, and how Netacea combines the two with anomaly detection in our unique approach to bot management.

Machine learning models

Supervised machine learning

Supervised learning is a machine learning approach characterized by its use of labeled data, which teaches algorithms to classify data or predict outcomes accurately. Supervised learning algorithms generally fall into two types:

  • Classification
  • Regression

Classification uses an algorithm to assign new data to specific categories, based on training data. Regression predicts continuous values, again based on the initial training data. Supervised learning algorithms are best suited to situations where a set of labeled reference points is available on which to train the model. That said, data does not always fit neatly into predefined categories or labels; when this is the case, unsupervised machine learning can provide a solution.
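
To make the two types concrete, here is a minimal sketch in Python using only the standard library. The data, the feature (requests per minute), the labels, and the function names are all hypothetical and chosen for illustration; nearest-centroid classification and a least-squares line are simply two of the simplest supervised techniques, not a description of any production system.

```python
from statistics import mean

# Hypothetical labeled training data: (requests per minute, label).
labeled_sessions = [
    (2.0, "human"), (3.5, "human"), (5.0, "human"),
    (40.0, "bot"), (55.0, "bot"), (70.0, "bot"),
]

def train_centroids(samples):
    """Classification, training step: learn the mean feature value per label."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def classify(centroids, value):
    """Classification, prediction step: pick the label with the nearest centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def fit_line(xs, ys):
    """Regression: ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

centroids = train_centroids(labeled_sessions)
print(classify(centroids, 4.0))   # category predicted for a slow session
print(classify(centroids, 60.0))  # category predicted for a fast session

# Hypothetical continuous target: total requests observed in each hour.
hours = [1, 2, 3, 4]
requests = [110, 205, 290, 410]
slope, intercept = fit_line(hours, requests)
print(slope * 5 + intercept)      # continuous value predicted for hour 5
```

The classifier returns a category, while the regression returns a number on a continuous scale. Both depend entirely on the labeled examples they were trained on.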

Unsupervised machine learning

Unsupervised learning algorithms are used to analyze and group sets of unlabeled data. Unsupervised machine learning models can surface previously unseen or undetected patterns within data without labeled examples and without being explicitly programmed to look for them. There are three common types of unsupervised machine learning algorithms:

  • Clustering
  • Association
  • Dimensionality reduction

“Clustering” looks for similarities and differences within the data and will then use this information to form groups or ‘clusters’ of data. Similarly, “association” is an unsupervised machine learning algorithm that uses different rules or rulesets to find relationships between variables within the data. If the number of features in a set of data is too high, “dimensionality reduction” can be used to reduce the number of inputs to a more manageable size. Dimensionality reduction is sometimes used as a pre-processing step for supervised machine learning models.
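
As an illustration of clustering only (association and dimensionality reduction are not shown), the sketch below implements a basic k-means loop with the Python standard library. The sample points, the two features, the starting centroids, and the function names are hypothetical; k-means is one common clustering algorithm among many.

```python
from math import dist

def assign(points, centroids):
    """Group each point with its nearest centroid (Euclidean distance)."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
        clusters[nearest].append(point)
    return clusters

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids to cluster means."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        centroids = [
            # An empty cluster keeps its previous centroid.
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, assign(points, centroids)

# Hypothetical unlabeled sessions: (requests per minute, pages per visit).
sessions = [(2, 5), (3, 6), (4, 4), (50, 1), (55, 2), (60, 1)]

# Starting centroids are picked by hand here to keep the example deterministic.
centroids, clusters = k_means(sessions, centroids=[(2, 5), (60, 1)])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No labels are supplied at any point: the two groups emerge purely from how close the points are to one another.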

Unsupervised machine learning allows you to find and group previously unknown patterns within the data, without any initial manual input of labels or categories.

Benefits and drawbacks of each machine learning model

While each approach has its merits, there are also some drawbacks to using one machine learning model over the other.

Supervised learning is the simpler method, beneficial when the goal is to predict outcomes for new data and the type of result to expect is already known. Although supervised learning lets you make predictions and optimize performance criteria once the initial labels are in place, supervised models can be time-consuming to train, and labeling the initial inputs often requires expertise.

Unsupervised learning is beneficial when the goal is to gather insights from large volumes of new, previously uncategorized data, or for anomaly detection. Whilst unsupervised learning is more adaptive and allows you to discover previously unknown patterns from data and find features for categorization, results from unsupervised learning require expert human intervention and analysis to validate.

Why Netacea uses both

Netacea’s multi-dimensional approach to bot management has our team of data scientists and bot experts using a combination of both supervised and unsupervised machine learning as well as anomaly detection to keep ahead of the continuously evolving bot threat.

Supervised learning allows us to ask, “Does this attack match a known attack pattern?” We can then compare the data streams from our clients with those within our Active Threat Database, giving us the ability to stop known bot attacks, as well as predict and prevent future attacks.

While supervised learning allows us to detect known attacks, unsupervised learning allows us to detect suspicious behavior, or patterns of behavior relating to new or previously unknown attack vectors, by comparing the behavior of one user to others in the system. We use real-time clustering to group similar users, allowing us to spot when new clusters are created, highlight odd or atypical behavior, and constantly re-evaluate what a ‘normal’ pattern of behavior looks like.
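
The general idea of comparing one user to everyone else can be sketched as follows. This is not Netacea's implementation: it is a generic outlier check based on the median and the median absolute deviation (the "modified z-score"), and the user names, request rates, threshold, and function name are all hypothetical.

```python
from statistics import median

def flag_outliers(rates, threshold=3.5):
    """Return users whose rate sits far from the population median.

    Uses the modified z-score: 0.6745 * |value - median| / MAD, where MAD is
    the median absolute deviation. The 3.5 cut-off is a common rule of thumb
    and is an assumption here, not a tuned value.
    """
    values = list(rates.values())
    centre = median(values)
    mad = median(abs(value - centre) for value in values)
    if mad == 0:
        return []  # at least half the users share the same rate; nothing to scale by
    return [
        user for user, value in rates.items()
        if 0.6745 * abs(value - centre) / mad > threshold
    ]

# Hypothetical requests per minute for each user in the current time window.
requests_per_minute = {"u1": 4, "u2": 5, "u3": 6, "u4": 5, "u5": 7, "u6": 90}
print(flag_outliers(requests_per_minute))
```

Because the baseline is recomputed from whatever users are present, what counts as 'normal' shifts with the traffic itself, which is the property the paragraph above describes.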

Learn more about Netacea’s multifaceted approach to machine learning in our on-demand Technical Showcase webinar, “Netacea’s Approach to Machine Learning in Advanced Bot Management.”

By using both methods of machine learning, Netacea maximizes the benefits of AI and offsets the drawbacks of relying on one type of machine learning alone. Our Intent Analytics® engine, powered by these machine learning algorithms, provides an innovative and profoundly more effective alternative to traditional “black box” or JavaScript-reliant solutions. This allows Netacea to always stay one step ahead of the bots.
