How do we reimagine the regulatory framework?
In an increasingly interconnected world, financial services are rapidly blurring boundaries. Firms need to constantly keep track of changing regulatory obligations in different jurisdictions, with fragmented and differing rules written in different languages and using different taxonomies. One specific area of technology that is capable of making a huge impact on regulatory compliance, now and in the future, is AI.
Both regulators and financial institutions realise that the compliance function can be and should be reconceptualised, offering tremendous cost savings for financial institutions while providing regulators an opportunity for real-time enhanced oversight.
There is a huge amount of regulation globally. And now, with Covid and remote work, regulators are starting to understand that moving to the cloud is essential.
2020 has shaped up to be a big year for Data Science and SupTech, or supervisory technology. Regulators globally are seriously considering speeding up initiatives to digitise regulations. According to a survey carried out recently by the World Bank and the Cambridge Centre for Alternative Finance (CCAF), 72% of regulators said that they had either accelerated or introduced initiatives on digital infrastructure in 2020, 58% had either accelerated or introduced initiatives regarding RegTech or supervisory tech, and 56% had done so in regard to innovation offices. And in November, GFIN also announced global sandbox initiatives involving 23 regulators.
For firms supervised by multiple regulators, it's very hard to keep track of fast-changing rules because they are published in different formats and use different taxonomies. That means implementing changes pushed out by different regulators takes much longer. That’s why the most forward-thinking regulators are making serious moves towards digital regulation, making it machine-readable.
We often hear about digital, machine-readable regulation. What does it really mean in 2020, and what are the latest developments in this space? We’re excited to share the results of the joint project carried out by ClauseMatch and ADGM.
The Project by ADGM & ClauseMatch
At the end of 2019, ClauseMatch was tasked by the Financial Services Regulatory Authority (FSRA) of Abu Dhabi Global Market (ADGM) to fully digitise the ADGM rulebooks and express them as a set of APIs, published with innovative tools that firms can interact with dynamically. The main idea was to help regulated financial services firms achieve better compliance and risk management outcomes while reducing regulatory costs and burden.
Following the initial stages of collaboration, in April 2020, FSRA (ADGM) launched three proofs-of-concept including knowledge graphs and API-enabled rulebooks.
Here are the main goals that the joint team set and achieved in several phases:
- Taxonomise all the content
- Develop AI/ML to enable autotagging
- Create visual dynamic knowledge graphs
In essence, the team was expected to completely reimagine the regulatory framework. This was achieved by taking the content of regulatory requirements, automatically categorising it using advanced AI models, creating tags focused on regulatory concepts, obligations and expectations, and interlinking the content at a granular paragraph level based on the most common themes.
For various concepts, more than a thousand pages of ADGM regulations have been automatically analysed and correctly labelled by ClauseMatch. For most of the AI models, the accuracy score is higher than 90%. The trained AI models learned to understand financial concepts and, when applied to the whole ADGM corpus, detected hundreds of thousands of occurrences of thousands of entities across various concepts, many of which were not present in the training data. Yet the models were able to detect these unseen cases successfully.
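The tagging and paragraph-level interlinking described above can be illustrated with a minimal sketch. It is not the ClauseMatch implementation: a simple keyword lookup stands in for the trained AI models, and the concept vocabulary, paragraph identifiers and rule texts are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical concept vocabulary. In the real project trained AI models
# assign tags; a keyword lookup is used here only as a stand-in.
CONCEPT_KEYWORDS = {
    "reporting": ["report", "notify"],
    "capital": ["capital"],
    "record_keeping": ["record", "retain"],
}


def tag_paragraph(text):
    """Return the set of concept tags whose keywords occur in the text."""
    lowered = text.lower()
    return {
        concept
        for concept, keywords in CONCEPT_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    }


def build_tag_index(paragraphs):
    """Map each tag to the identifiers of the paragraphs that carry it."""
    index = defaultdict(set)
    for paragraph_id, text in paragraphs.items():
        for tag in tag_paragraph(text):
            index[tag].add(paragraph_id)
    return index


def link_paragraphs(index):
    """Link every pair of paragraphs that share at least one tag."""
    links = defaultdict(set)
    for tag, paragraph_ids in index.items():
        for a, b in combinations(sorted(paragraph_ids), 2):
            links[(a, b)].add(tag)
    return links


# Hypothetical paragraph identifiers and texts, not real ADGM rules.
paragraphs = {
    "GEN 1.1": "A firm must report any breach to the Regulator without delay.",
    "GEN 2.4": "A firm must retain records of each report for six years.",
    "PRU 3.2": "A firm must maintain adequate capital resources at all times.",
}

index = build_tag_index(paragraphs)
links = link_paragraphs(index)
print(dict(links))  # {('GEN 1.1', 'GEN 2.4'): {'reporting'}}
```

The tag index is also what makes navigation by topic possible: looking up a tag returns exactly the paragraphs where that topic is covered.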
This is the first step before creating dynamic interconnected knowledge graphs. Enabled by artificial intelligence and advanced NLP (Natural Language Processing) algorithms, the knowledge graphs are designed to represent regulatory data in a structured and
During the graph-building phase, we aimed to improve and advance the graph’s functionality linking it with the internal documentation not only for analytical purposes but also so that it could solve automatable and repetitive tasks for financial institutions.
Creating AI-based interconnected tags and exposing the rules and regulations as a knowledge graph has helped to connect all the context in and around words in the regulations and make every word function as a data point.
Clicking on a bubble in the graph takes you directly to the places where a given topic is covered in the regulation. More insights into the internal machine learning and knowledge graph (KG) architecture can be found in the article Unboxing Skynet for Regulatory Compliance.
Then, visual dynamic knowledge graphs are created for internal documentation such as policies, procedures and controls, to map them to the requirements in a visual, dynamic form.
We defined the concept and the obligation template, then tagged over a thousand pages to create a training dataset. Next, we trained the tagging models, evaluated the results with experts, extracted relations, constructed a knowledge graph over the regulatory documents, and built a live demo that includes real-time document-to-obligation comparison.
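One way to picture the document-to-obligation comparison is as a coverage check between the tags on regulatory obligations and the tags on internal documents. The sketch below assumes both sides have already been tagged; the function name, identifiers and tags are hypothetical, and the real comparison is likely to be considerably richer than a set difference.

```python
def coverage_gaps(obligation_tags, policy_tags):
    """Return, per obligation, the tags that no internal document covers."""
    # Union of all tags found anywhere in the internal documentation.
    covered = set().union(*policy_tags.values())
    gaps = {}
    for obligation_id, needed in obligation_tags.items():
        missing = needed - covered
        if missing:
            gaps[obligation_id] = missing
    return gaps


# Hypothetical obligations and internal documents with their tags.
obligation_tags = {
    "OBL-1": {"reporting", "breach"},
    "OBL-2": {"record_keeping"},
    "OBL-3": {"capital"},
}
policy_tags = {
    "Incident policy": {"reporting", "breach"},
    "Data retention procedure": {"record_keeping"},
}

gaps = coverage_gaps(obligation_tags, policy_tags)
print(gaps)  # {'OBL-3': {'capital'}}
```

Here the capital obligation has no matching internal document, which is the kind of gap a compliance team would want surfaced automatically.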
The same approach could be replicated for any regulator, guidance or framework, enabling an automated and consistent merge of all regulations from any jurisdiction.
Knowledge Graphs: Where are the roots coming from?
The general idea of putting information into a kind of knowledge graph in order to perform operations on it has been in the air for quite a long time, since the late 1980s. Serious developments in this area started at the beginning of the 2000s, and the first notable adoption of the technology was completed successfully by Google in 2012. Since then, knowledge graphs have clearly become a trend, as more and more companies like Airbnb, Uber, Facebook and Amazon have reported building variations of those graphs as part of their systems. Although reasoning over knowledge graphs remains a challenge for machines, the situation is changing rapidly thanks to recent advancements in Natural Language Processing (NLP) and language modelling.
In 2017, a new building block, the transformer, was developed. Its use of the attention mechanism allowed models to work with much more sophisticated concepts. The turning point for advances in NLP came in 2018 with the introduction of the BERT language model, which raised the quality of ML models to a new level. Transfer learning allowed models from various other sectors to be adapted effectively to compliance. Various benchmarks demonstrated that the new models can handle the necessary association inference at or above human level. Unfortunately, the same was not true of causal inference.
The race to stack transformers continues to the present day, delivering new state-of-the-art (SOTA) results at ever higher cost. Training models like GPT-3, at an estimated $4.6 million in computational power, has raised questions about cost-effectiveness and the lack of explainability of results. The deep learning research community is clear that the next advancements should come not from billions sunk into the language modelling stage, but from combining the power of transformer-based models with first-order logic over knowledge graphs, and from enriching machines with a human-like ability for fact reasoning using techniques like reinforcement learning.
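To make the idea of logic over a knowledge graph concrete, here is a minimal sketch of forward chaining with a single rule over subject-predicate-object triples. The rule, entity names and predicates are hypothetical, and the sketch covers only the symbolic half of the combination described above, with no transformer model involved.

```python
def infer(triples):
    """Apply one rule until no new facts appear (forward chaining).

    Rule: if (X, "is_a", Y) and (Y, "subject_to", Z),
          then (X, "subject_to", Z).
    """
    facts = set(triples)
    while True:
        new_facts = {
            (x, "subject_to", z)
            for (x, p1, y) in facts if p1 == "is_a"
            for (y2, p2, z) in facts if p2 == "subject_to" and y2 == y
        } - facts
        if not new_facts:
            return facts
        facts |= new_facts


# Hypothetical facts, written as (subject, predicate, object) triples.
triples = [
    ("Acme Bank", "is_a", "authorised_firm"),
    ("authorised_firm", "is_a", "regulated_entity"),
    ("authorised_firm", "subject_to", "capital_requirements"),
    ("regulated_entity", "subject_to", "reporting"),
]

inferred = infer(triples)
print(("Acme Bank", "subject_to", "reporting") in inferred)  # True
```

The last fact is never stated explicitly; it takes two rounds of rule application to derive it, which is why the loop runs until a fixed point is reached.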
The tagging and relation extraction stages of knowledge graph construction have often been the most challenging in terms of the cost of manual work, but the new models bring the possibility of automating the whole process. Causal reasoning over graphs is now at the cutting edge of research: see, for example, the DeepMind release of 23 October 2020 on causal reasoning (link removed due to Finextra policy).
All that progress already allows us to put regulation into code and even to automate judgment on top of it. Digitalisation, and the possibility of transforming all regulation into code, means that we can now write a request (in other words, a piece of code), input all the regulation, and get back a knowledge graph that can itself be viewed as code.
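As a rough illustration of what regulation viewed as code could look like, the sketch below represents obligations as structured records that can be queried and evaluated. The `Obligation` fields, rule identifiers and deadlines are hypothetical and are not taken from the ADGM rulebooks or the ClauseMatch API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obligation:
    """A hypothetical machine-readable obligation."""
    rule_id: str
    subject: str
    action: str
    deadline_days: int


def query(obligations, subject=None, action=None):
    """Return the obligations matching the given subject and/or action."""
    return [
        o for o in obligations
        if (subject is None or o.subject == subject)
        and (action is None or o.action == action)
    ]


def is_met(obligation, days_taken):
    """Check whether the action was completed within the deadline."""
    return days_taken <= obligation.deadline_days


# Hypothetical rules, for illustration only.
rulebook = [
    Obligation("R-001", "authorised_firm", "report_breach", 5),
    Obligation("R-002", "authorised_firm", "file_annual_return", 120),
    Obligation("R-003", "fund_manager", "report_breach", 10),
]

for obligation in query(rulebook, action="report_breach"):
    print(obligation.rule_id, is_met(obligation, days_taken=7))
```

Once rules are held as data in this way, the same request can be run unchanged against a rulebook from another jurisdiction, provided it is expressed in the same structure.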
Future vision: Regulation-as-a-Service
It’s clear that the digital transformation agenda has steadily become one of the most important focuses in the boardrooms of all Financial Institutions (FIs), and we are seeing the impact of that on all aspects of our lives. Newer frameworks such as PSD2 and Digital Banking, as well as guidance on topics such as ‘ethics in AI’ and ‘encryption and storage of virtual assets’, are a good indication of how regulation is reflecting these changes. However, the evolution in finance is fast making the current analogue regulatory system and frameworks obsolete.
Knowledge graphs are clearly the future of regulation. We now operate in a much more digital paradigm and are moving towards digital regulation. In fact, we’re witnessing the beginning of Regulation 2.0. After Software-as-a-Service, Cloud-as-a-Service and Banking-as-a-Service, we're now moving into Regulation-as-a-Service, which will truly usher in new developments.
The collaboration between ADGM and ClauseMatch on the regulatory knowledge graph is a really exciting piece of work. We have been using natural language processing, semantics and machine learning to not only identify relevant subjects, objects and concepts
but also the relationships between them. The ‘who’ and the ‘what’ and, more importantly, the ‘why’.
This same process of converting words and sentences into ‘data points’ also means we can see linked and associated words and concepts; that is, we can start to see context.
It's really exciting, as this work feels like the beginning of a very interesting journey for regulation and regulators. It enables us to infer new relationships, gain a deeper understanding and recognise patterns within the regulation that would not have been spotted otherwise, as well as to represent regulation dynamically and in a truly digital medium.
- Vladimir Ershov, Data Science and Machine Learning Lead, ClauseMatch
- Barry West, Head of Emerging Technology, Abu Dhabi Global Market (ADGM)