Interest in collecting digital footprints is growing rapidly across industries and businesses. According to the Wavestone report, 97% of Fortune 1000 companies will invest in data initiatives in 2022, and 91% in AI activities. Globally,
buy-side firms spent $1.71bn on alternative data during 2021. The number of alt-data providers is 20 times higher today than 30 years ago, according to the Alternative Investment Management Association. Demand will likely continue to grow as more
firms go digital and invest further in new technologies post-pandemic.
However, achieving data-driven leadership remains an elusive aspiration for most organisations. Only 39.7% of companies in the Wavestone research reported managing data as an enterprise business asset, and just 19.3% said they had created a data-driven culture.
The pressure that CEOs, marketers or risk professionals are under today to make decisions about adopting game-changing technologies is understandable. But this game is definitely worth playing. The latest survey from Splunk Inc. shows leaders in data innovation
are increasing gross margins by 9.5%.
Yet while the scope and intensity of alternative data use are expanding, businesses that apply it often still lack a deep understanding of its nature.
No one seems to have enough alternative data
In the most general sense, alternative data is data collected from non-traditional sources and used by companies to find advantages in their respective marketplaces. Suppliers constantly look for new, unused data streams, so the list of categories
is continually changing. Broadly speaking, examples of alt-data sets include credit card transaction data, mobile data, IoT sensor data, satellite imagery, social media or weather data, web traffic, app usage and ESG (environmental, social and corporate governance)
data. Some providers even track corporate jet flights, podcast and video streaming, government contracts and congressional commerce.
However, not all companies claiming to work with alternative data are actually using it, not even "digital-first" companies. In part, this is due to a misperception of this kind of data. For example, one major BNPL player in the US thought it was
already collecting an extensive digital footprint. Analysis of the actual data made it clear that the company was only 8% closer to its target. Moreover, it was collecting quite rudimentary information that was neither predictive for risk analysis nor
useful for fraud detection.
Equally problematic is unstable user activity. For example, one large neobank we spoke to recently stated that 65% of its customers are completely passive and do not conduct any transactions after downloading the app. The bank has a great app but a
lot of inactive customers. How can any app publisher expect to improve its performance with so many inactive customers? Super apps such as Uber, Lyft, Didi, Grab and GoJek, which have expanded into related services, including food delivery, payments, local
parcel delivery and, more recently, lending, face a similar problem. They don’t have enough data to depict a 360-degree view of all their users across all their services. Think New-to-Credit or Thin-File customers for financial services but applied to consumer
The problem of "too much data" is also worth mentioning. According to Splunk's latest report, The Economic Impact of Data Innovation 2023, 67% of data innovation leaders strongly agree that their data is growing faster than they can keep up with. Every organisation's
key challenge is meeting customer needs and expectations in an increasingly competitive marketplace with sophisticated technology. However, 'more data' is a problem that needs to be turned into a solution.
AI and Machine Learning as a solution
The development of AI and Machine Learning technologies enables us to deepen the usage of alternative data and find solutions to the above-mentioned problems.
The financial sector is a prime example of the fast-growing potential of these technologies. According to the latest survey from the Bank of England and the Financial Conduct Authority, the number of UK financial services firms using Machine Learning continues to grow. Overall,
72% of firms responding to the survey reported using or developing ML applications across all business areas. The total median number of ML applications is expected to increase by 3.5 times over the next three years. The insurance sector is expected to grow
the most in absolute terms, followed by the banking sector. Currently, the most frequently cited benefits are enhanced data analytics capabilities, improved operational efficiency, and fraud and money laundering detection.
Along with this trend, several global fintechs have already developed fundamentally new approaches to assessing creditworthiness, incorporating a wider variety of data sources offering supplemental information about applicants who lack sufficient traditional
data. ConfirmU, for example, has used selected psychometric traits that have shown a correlation of more than 80% with financial conscientiousness, with a significant lift for thin-file individuals. Fintechs like CreditLadder in the UK have built tools that allow customers
to provide their rent payment history to improve their credit score at the credit bureaus. Credit Kudos, recently acquired by Apple, enabled customers to utilise their open banking data to build an alternate credit report incorporating their day-to-day banking
and payment activity.
Fintechs like credolab use mobile metadata such as the number of events added to the calendar, new contacts added, types of apps used, and nearly 10 million similar micro-behavioural patterns to develop a predictive score without processing any personal
data. No images, no message content, no files. Metadata is, technically, data about data; here it is completely anonymised and protected from prying eyes.
For now, in most organisations, metadata tends to be an afterthought or of minor importance. Even companies that pay attention to it tend to focus only on definitions or maybe a little on data history. However, proper first-party metadata management allows
teams to get the most comprehensive results and models.
It is not just a business
Using alternative data with AI & ML algorithms can indeed promote greater financial inclusion and improve lenders' profitability by helping them understand their customers better. According to the IMF, access to non-traditional data can boost inclusion by alleviating
adverse selection problems that exclude disadvantaged populations from credit markets. The availability of more data about borrowers is a crucial element of the promise that technology offers to the provision of financial services. Information collected in
the context of online services, including social habits, payment of utility bills, and other traces of economic and social activity, may form the basis for evaluating the creditworthiness of a borrower who has not had previous interactions with any financial
services provider. For instance, Frost and others (2019) and Berg and others (2020) present evidence that non-traditional data collected online can predict creditworthiness more accurately than a traditional credit bureau score.
The lack of predictive data continues to be a significant pain point for financial institutions, whether in developing or developed countries. In emerging markets, people with no credit history are referred to as "unbanked" and in developed countries
as "credit invisibles": the same problem, different semantics. According to a US Government Accountability Office report, roughly 45 million US consumers lack a credit score from one of the three national credit bureaus (Experian, TransUnion
and Equifax), which limits their ability to qualify for a mortgage loan. Among the efforts being explored to address this issue is using alternative data to qualify consumers without credit scores or with limited traditional credit histories for loans.
The very mobile and web digital footprints we create every second by simply using our smartphone or laptop provide a wealth of behavioural metadata that, if understood and analysed using the right Machine Learning algorithms, enables faster and better credit
scoring decisions based on how people behave in today's world, not just what they look like on paper.
Clear regulatory rules for this alt-data game are needed
Despite such significant advances, market players still have concerns. According to the previously mentioned Bank of England and Financial Conduct Authority survey, the top risks identified for consumers are data bias and representativeness, while the top
risks for firms are considered to be the lack of explainability and interpretability of ML applications. However, the most significant constraint is legacy systems. Almost half of the UK firms that responded to the survey said Prudential Regulation Authority and/or
Financial Conduct Authority regulations constrain ML deployment. A quarter of firms said this is due to a lack of clarity within existing rules.
The situation in the US seems more promising. In December 2020, the Consumer Financial Protection Bureau (CFPB) issued rules that may facilitate the use of alternative data. For example, one rule changed the general qualified mortgage definition to give
lenders additional flexibility—which could include analysing alternative data such as cash flows—when assessing a consumer's ability to repay. In addition, lenders are protected from certain types of liability for loans meeting the definition.
Many countries are considering or have passed new laws governing how personal data can be collected, processed, used, and shared. As the staff discussion note of the IMF stated, national approaches differ widely. A large part of data regulation stems from
legal—and in some cases, constitutional—concerns regarding privacy. Rights-based approaches, like the European Union's approach in the General Data Protection Regulation of 2016, stand in contrast to the more activity-based regulation in the United States.
The US open transfer approach is mainly in the spirit of industry self-regulation, based on the concept of "notice and choice" (World Bank 2021), and data privacy protection laws in the United States are typically state-specific and apply to a relatively narrow
field of highly sensitive data such as health care or finance.
There is no standard and straightforward approach to a global data (and especially alternative data) governance framework. Effective data policy will require a coordinated approach that involves many actors in addressing the difficult trade-offs. Efforts
must start with clarifying the rules of the digital economy while ensuring its competitiveness and sustainability and, at the same time, the full protection of users’ data and their control over what can be used or not, shared or kept private.
Overall, there are encouraging signs that alternative data sources like digital footprints and smartphone metadata will gain more traction and become more mainstream in the future.
About the author
Michele Tucci is the Managing Director for North and Latin America and Chief Strategy Officer at credolab, a Singapore-based company and the largest developer of bank-grade digital scorecards and data enrichment solutions. The company provides lenders, risk officers, and
marketers with a previously untapped, highly predictive source of behavioural data: privacy-consented and anonymous smartphone metadata. Credolab analyses over 70,000 data points with a proprietary, AI-driven platform and converts digital footprints into highly
predictive scores. Prior to joining credolab in 2018 as Chief Product and Marketing Officer, Michele worked on international consulting assignments and in product management and business development roles with Capital One, MasterCard, Intesa Sanpaolo Bank, and
Telecom Italia Mobile.