Arbitrage (Arb) Bots – What’s The Threat?
The betting industry is rife with bot attacks. So, what can the gambling industry do to tackle the bot threat, and what exactly does that threat look like?
It doesn’t take long for a gambling organisation to earn a lot of money – we’re talking upwards of six figures – during a high-profile event, and all that potential can be lost just as quickly to an arbitrage bot (arb-bot) attack.
What’s the bot threat in gambling?
A large bookmaker recently asked the Netacea team for help with arbitrage-betting (arb-betting), a bot attack technique that had been repeatedly rearing its head.
How does arbitrage-betting work?
Arbitrage (arb) betting exploits imbalances in the odds across multiple bookmakers, placing bets on multiple outcomes to guarantee that the bets don’t lose money (and may win).
For example, in a World Cup football match between England and Sweden with no possibility of a draw:
Bookmaker A has Sweden at 8/15 (naturally, Sweden are the best!) and England at 12/7, and bookmaker B has Sweden at 2/5 and England at 9/4.
You bet £100 on Sweden with bookmaker A and £47.18 on England with bookmaker B.
If Sweden wins, you get a return of £100 × (1 + 8/15) ≈ £153.33; if England wins, you get a return of £47.18 × (1 + 9/4) ≈ £153.33. Your total stake is £147.18, so you are guaranteed a profit of about £153.33 – £147.18 = £6.15.
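The arithmetic above can be checked with a short Python sketch. The function names (`decimal_odds`, `arb_stakes`) are illustrative only and not part of any Netacea tooling; the sketch assumes a two-outcome market and fractional odds written as strings.

```python
def decimal_odds(fractional):
    """Convert fractional odds such as '8/15' to the total return per £1 staked."""
    num, den = fractional.split("/")
    return 1 + int(num) / int(den)


def arb_stakes(odds_a, odds_b, stake_a):
    """Return (stake_b, payout, profit) for a two-outcome arb, or None if no arb exists.

    odds_a is the price taken on one outcome, odds_b the price taken on the
    other outcome (usually at a different bookmaker).
    """
    d_a, d_b = decimal_odds(odds_a), decimal_odds(odds_b)
    # An arbitrage exists only if the implied probabilities sum to less than 1.
    if 1 / d_a + 1 / d_b >= 1:
        return None
    payout = stake_a * d_a      # return if outcome A wins
    stake_b = payout / d_b      # stake on B that yields the same return
    profit = payout - (stake_a + stake_b)
    return round(stake_b, 2), round(payout, 2), round(profit, 2)


# Sweden at 8/15 with bookmaker A, England at 9/4 with bookmaker B
print(arb_stakes("8/15", "9/4", 100))   # (47.18, 153.33, 6.15)
# The opposite pairing (Sweden 2/5 at B, England 12/7 at A) offers no arb
print(arb_stakes("2/5", "12/7", 100))   # None
```

Note how the balancing stake comes out as £47.18 rather than a round number, which is exactly the kind of oddly specific amount that makes arbers stand out.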
Arb-betting is not illegal and is even encouraged by some bookmakers.* However, most big-name bookmakers consider arbers to be a liability, and as such, it is an industry-wide problem.
How do arb-bots work?
Arb-bots bet on arbitrage opportunities in an automated fashion. When our client approached Netacea, arb-bots were costing them an estimated £1mn in losses per month. There are two ways to tackle the problem:
- The direct approach – To go after the actual arbitrage betting
- The indirect approach – To go after the information flow feeding the arbers
The direct approach to stopping arb-bots
Arb-bots have some easy-to-spot identifiers.
As seen in the example above, arbing often leads to betting strangely specific amounts and tends to involve betting on obscure events, because this is where the odds are most likely to be out of sync between the different bookmakers.
The obvious approach is to look out for these tell-tale signs. However, without access to the required data, we were initially reliant on the web logs alone and could not identify either the size or placement of the bet. We had labelled some examples of arb-bots, but not enough to train a classifier.
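Had bet-level data been available, the "strangely specific amounts" signal could be checked along the following lines. This is a minimal sketch, not the method used in the engagement: the 50p rounding step and the function names are assumptions chosen for illustration, and plenty of genuine customers also place non-round bets, so a check like this would only ever be one weak signal among several.

```python
from decimal import Decimal

# Assumption: human bettors mostly stake multiples of 50p.
ROUND_STEP = Decimal("0.50")


def is_oddly_specific(stake):
    """True if the stake (a string such as '47.18') is not a multiple of ROUND_STEP."""
    return Decimal(stake) % ROUND_STEP != 0


def specific_stake_ratio(stakes):
    """Fraction of an account's stakes that are oddly specific (0.0 for no bets)."""
    if not stakes:
        return 0.0
    flagged = sum(1 for s in stakes if is_oddly_specific(s))
    return flagged / len(stakes)


# Hypothetical stake history for one account
print(specific_stake_ratio(["47.18", "100.00", "23.07", "5.00"]))  # 0.5
```

Using `Decimal` rather than floats avoids rounding surprises when testing whether an amount is an exact multiple of the step.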
How clustering helped
To narrow down the problem, we decided to cluster all the sessions on our client’s website. We used an approach that we previously described in the blog, Using the Skip-Gram Model to Understand Web Traffic.
To sum up how the model works, we represent each request numerically in such a way that “similar” requests end up with numerical representations that are close to each other. These are then used to cluster the sessions so that similar sessions end up (hopefully) in the same cluster.
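To give a feel for the idea, here is a small Python sketch. It is not the skip-gram model itself: as a simple stand-in, each request path is represented by counts of the paths that appear near it in a session (a co-occurrence vector), and a session is represented by the sum of its request vectors. The paths, the window size and the function names are all made up for illustration, and the final clustering step is left out; the sketch only shows that similar sessions end up with similar vectors.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sessions, window=2):
    """Map each path to a Counter of the paths seen within `window` requests of it."""
    vectors = defaultdict(Counter)
    for session in sessions:
        for i, path in enumerate(session):
            lo, hi = max(0, i - window), min(len(session), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[path][session[j]] += 1
    return vectors


def session_vector(session, vectors):
    """Represent a session as the sum of its requests' vectors."""
    total = Counter()
    for path in session:
        total.update(vectors[path])
    return total


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical sessions: two human-like browsing sessions and one scraper-like one
sessions = [
    ["/home", "/football", "/football/odds", "/betslip"],
    ["/home", "/football", "/football/odds", "/login"],
    ["/api/odds/1", "/api/odds/2", "/api/odds/3", "/api/odds/4"],
]
vectors = cooccurrence_vectors(sessions)
browse_a, browse_b, scrape = (session_vector(s, vectors) for s in sessions)

print(cosine(browse_a, browse_b) > cosine(browse_a, scrape))  # True
```

A clustering algorithm run over vectors like these would tend to put the two browsing sessions together and the scraper-like session elsewhere, which is the property the approach relies on.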
Identifying clustered sessions in arb-betting
As it turns out, all the arbers clustered nicely into three clusters. The clustering allowed us to extract block-able patterns of behaviour from these clusters.
When clustering, we noticed that there were many clusters made up almost entirely of highly distributed automated traffic. Our client has many good bots, of course, but these were in a few separate clusters. Many of the remaining clusters contained bots that were unknown to, and unwanted by, our client. This leads us to the second prong of our approach!
Stop the scrapers, stop the arb-bots!
Since arbitrage betting relies on up-to-date odds from multiple bookmakers, arbitrage bots must automate quick and effective data gathering. Armed only with access logs, we decided it would be a good idea to go after this information gathering, aka web scraping.
Using our tried and tested scraper model, we quickly identified the worst scrapers. From the behaviour clusters, block-able behaviour patterns were extracted to better target massively distributed scraping.
In addition to stopping the arb-bots’ information flow, blocking scrapers had a positive impact on our client’s infrastructure and performance.
The scrapers were filling the cache with obscure pages that humans seldom visited. This significantly increased the load time for real users and increased the cost of retrieving popular pages that should have been cached but had been pushed out by scraping.
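The cache effect described above can be illustrated with a toy simulation. The sketch below assumes a least-recently-used (LRU) cache, which may not match the client's actual caching layer, and uses a deliberately tiny cache and made-up page names so that the effect is easy to see.

```python
from collections import OrderedDict


def hit_rate(requests, capacity, measured_pages=None):
    """Simulate an LRU cache and return the hit rate.

    If measured_pages is given, only requests for those pages count towards
    the hit rate (so we can measure the experience of human visitors only).
    """
    cache = OrderedDict()
    hits = total = 0
    for page in requests:
        measured = measured_pages is None or page in measured_pages
        if measured:
            total += 1
        if page in cache:
            cache.move_to_end(page)        # mark as most recently used
            if measured:
                hits += 1
        else:
            cache[page] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used page
    return hits / total if total else 0.0


popular = ["/home", "/football", "/racing"]
human_traffic = popular * 20

# Hypothetical scraper: fetches three distinct obscure pages after every human request
mixed_traffic = []
for i, page in enumerate(human_traffic):
    mixed_traffic.append(page)
    mixed_traffic.extend(f"/obscure/{i}-{k}" for k in range(3))

print(hit_rate(human_traffic, capacity=3))                                # 0.95
print(hit_rate(mixed_traffic, capacity=3, measured_pages=set(popular)))  # 0.0
```

The numbers are extreme by construction, but the mechanism is the same at scale: every obscure page a scraper pulls into the cache can push out a page that real users would have been served quickly.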
The outcome (so far)
So far, our blocking strategy targets arb-bots by identifying arbing patterns and distributed scraping. It also targets the scrapers causing the greatest damage to the infrastructure, which is a nice bonus.
In this instance, our client did not want to automate the blocking of all volumetric scrapers due to the risk of false positives. Instead, they opted to automatically block the worst 100 scrapers.
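Ranking clients by request volume and taking the top N can be sketched in a few lines of Python. The log format (pairs of client identifier and path), the allowlist of known good bots and the function name are assumptions for illustration; in practice the scraper model described earlier, rather than raw volume alone, decides which clients count as scrapers.

```python
from collections import Counter


def top_volumetric_clients(log_entries, n=100, allowlist=frozenset()):
    """Return the n clients with the most requests, skipping allowlisted good bots.

    log_entries is an iterable of (client_id, path) pairs.
    """
    counts = Counter(
        client for client, _path in log_entries if client not in allowlist
    )
    return [client for client, _count in counts.most_common(n)]


# Hypothetical access-log sample
entries = (
    [("bot-a", "/odds/obscure-1")] * 50
    + [("bot-b", "/odds/obscure-2")] * 30
    + [("human-1", "/home")] * 3
    + [("goodbot", "/")] * 40
)
print(top_volumetric_clients(entries, n=2, allowlist={"goodbot"}))  # ['bot-a', 'bot-b']
```

Capping the block list at a fixed N, as the client chose to do, limits how far a false positive can spread: only the heaviest offenders are blocked automatically.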
Since we began blocking the scrapers, our best estimate is that arb-bot activity has reduced by approximately 85%.
Blocking the top 100 volumetric scrapers is estimated to have reduced CPU usage by approximately 7%, bandwidth by approximately 5% and the number of requests by approximately 2%. Our client has also reported that the cache is now more relevant to humans, creating faster response times and lowering infrastructure costs.
How Netacea helps
If you’d like to know more about fighting back against bots and the threats faced by a range of industries, download our latest webinar to hear from cybersecurity researcher Scott Helme.
*Arb-betting can be positive for bookmakers if they are fast and/or accurate enough. If a bookmaker reacts quickly to betting trends, arb-betting could help adjust the odds correctly. If the bookmaker is better than its competitors at estimating the true probabilities for an event, and is adept at reacting to changes, it can even make money from arb-bots. Bookmakers such as these are sometimes referred to as sharp bookmakers, as opposed to soft bookmakers, a term sometimes used for slower, less accurate bookmakers.