An educational posting from me today on one of the most understated weapons ever unleashed in the counter-fraud world – Big Data Analytics. People often don’t realise that vast volumes of data (commonly in the region of terabytes) can these days be analysed in real time, or at the very least near real time. That’s understandable, given that the technology to do this has only been affordable for the past 12 months. However, as a Fraud Consultant, I feel compelled to shout about this from the rooftops, as it has opened up a whole new world of possibilities in fraud prevention and detection.
Not only does this stuff have its use in fraud, but there are also many other areas such as credit risk, stock management, or product marketing that are starting to benefit. Imagine you could load up your iPad and immediately see your company’s risk exposure,
and what effect a small fluctuation in the markets will have on your overall liquidity, or what will happen if there is a wave of negative sentiment on social media toward a particular product/company. This used to take hours or days, but it now takes seconds
and gives you the power to decide how to react more or less immediately.
So why is this approach so ground-breaking in fraud? The ability to compile, interrogate and comprehend such vast, continuous data sources means we can now ‘understand’ and make automated decisions based on the content of email communications, voice conversations,
social media postings or stock market activity in real-time to our benefit. I thought I would include a brief example below to try and explain the benefits of this approach in a real scenario:
- Let’s imagine we’re an Investment Bank, and we’re observing many people exiting their positions on a particular stock, along with a deluge of negative sentiment on Twitter, Facebook, blog postings etc. towards the company the stock is associated with. However, we now have a trader who takes up a large position in the stock – certainly counter-intuitive and unusual. We also notice an increase in email activity with a member of staff on another desk, containing particular words and phrases that don’t fit that trader’s ‘typical’ grammar profile. After 10 minutes, the stock unexpectedly jumps 10% in value and that trader makes a mountain of money. We then see further emails exchanged between the two traders which contain high levels of ‘congratulatory’ sentiment. Each of the ‘minor’ alerts generated for that trader could, in isolation, be explained away, but the system takes the view that so many of them occurring in combination for a single trader is highly suspicious. We then have a strong case, with all the requisite evidence, to allow the bank to decide whether this trader was very lucky or guilty of insider dealing – and they can assess this not the day/week/month after, but the minute after.
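To make the idea of combining ‘minor’ alerts a little more concrete, here is a minimal Python sketch. The alert names, the weights and the escalation threshold are all hypothetical values I have made up for illustration; a real surveillance system would derive them from its own models and data.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical alert types and integer weights, chosen purely for illustration.
ALERT_WEIGHTS = {
    "contrarian_position": 3,
    "unusual_email_language": 2,
    "email_volume_spike": 2,
    "profit_after_price_jump": 2,
    "congratulatory_sentiment": 1,
}

# Assumed cut-off: no single alert reaches it, only a combination does.
ESCALATION_THRESHOLD = 7


@dataclass(frozen=True)
class Alert:
    trader_id: str
    alert_type: str


def score_traders(alerts):
    """Sum the weights of the distinct alert types raised for each trader."""
    seen = defaultdict(set)
    for alert in alerts:
        seen[alert.trader_id].add(alert.alert_type)
    return {
        trader: sum(ALERT_WEIGHTS.get(alert_type, 0) for alert_type in types)
        for trader, types in seen.items()
    }


def traders_to_escalate(alerts, threshold=ESCALATION_THRESHOLD):
    """Return the traders whose combined score meets the threshold."""
    scores = score_traders(alerts)
    return sorted(t for t, score in scores.items() if score >= threshold)


if __name__ == "__main__":
    alerts = [Alert("trader_a", name) for name in ALERT_WEIGHTS]
    alerts.append(Alert("trader_b", "contrarian_position"))
    print(score_traders(alerts))
    print(traders_to_escalate(alerts))
```

Repeated alerts of the same type are counted once per trader here, so a single noisy rule cannot push someone over the threshold on its own; that is a design choice for this sketch, not a description of any particular product.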
Now, obviously we’re not constrained to only analysing data in Investment Banks – in fact we’ve recently seen that analysis and text mining of insurance claim descriptions (written & voice) provided by bogus claimants uncovered some very interesting facts. It turns out that certain phrasings (the use of –ed rather than –ing on the end of verbs, for instance) are strongly indicative of fraudulent claims. This is due to the different ways in which people relay stories they actually experienced versus those they concocted; for instance, ‘I was walking’ is indicative of someone recounting an actual experience, whereas ‘I walked’ turned out to be indicative of someone describing a fictitious event. Applying Big Data Analytics to Social Media and to data extracted from Web Crawling/Scraping can also uncover benefit claims abuse. We can identify people who are cohabiting while claiming single occupancy benefits, people claiming disability benefits whilst posting videos of themselves skiing on YouTube (yes, this did happen in the States!), or even tax evaders who are renting out or selling properties they haven’t declared to the tax authorities. All of this involves the collection, detailed analysis and intelligent reporting of terabytes and terabytes of data, which wasn’t quick enough a couple of years ago, wasn’t thinkable five years ago and certainly wasn’t possible 10 years ago! This technique isn’t applicable to every area of fraud given the varying quantities of data, but nearly every financial institution could reduce the impact of fraud by interrogating its data more thoroughly.
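As a rough illustration of the –ed versus –ing signal in claim descriptions, the Python sketch below counts the two verb forms in a piece of text. It is a deliberately crude regex heuristic of my own, not the method used in the analysis described above; a real system would use proper part-of-speech tagging and would treat this as one feature among many rather than a verdict.

```python
import re

# Crude patterns, assumed for illustration only:
# "was/were <word>ing" approximates the past progressive,
# any word ending in "ed" approximates the simple past.
# Words such as "bed" or "red" will be miscounted as simple past.
PROGRESSIVE = re.compile(r"\b(?:was|were)\s+\w+ing\b", re.IGNORECASE)
SIMPLE_PAST = re.compile(r"\b\w+ed\b", re.IGNORECASE)


def verb_form_counts(text):
    """Count approximate past-progressive and simple-past verb forms."""
    return {
        "progressive": len(PROGRESSIVE.findall(text)),
        "simple_past": len(SIMPLE_PAST.findall(text)),
    }


def simple_past_share(text):
    """Share of matched verb forms that are simple past, or None if no matches."""
    counts = verb_form_counts(text)
    total = counts["progressive"] + counts["simple_past"]
    if total == 0:
        return None
    return counts["simple_past"] / total


if __name__ == "__main__":
    for claim in (
        "I was walking to my car when it was raining.",
        "I walked to my car and slipped on the ice.",
    ):
        print(claim, verb_form_counts(claim), simple_past_share(claim))
```

A higher simple-past share would only ever be a prompt for a closer look at a claim, in the same way as the ‘minor’ alerts in the trading example.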
At the end of the day, detecting fraud is all about uncovering the true picture, and the more pieces of the jigsaw (data) you have, the less guesswork you have to do to fill in the gaps and the less chance you have of missing some of the vital details. To some, this may seem a little ‘Big Brother-esque’, but for me it’s a case of ‘while the fraudsters have been using Ferraris, we’ve been chasing them on bicycles’. It’s about time we changed the game and brought counter-fraud truly into the 21st century with our own supersonic jets.