Every use case that brings machine learning into play gets drenched in data. The industry is now waking up to the real challenge of deploying machine-learning-based models in traditional rule-based applications. Machine learning models are data-guzzling engines whose accuracy and performance depend on data quality and continuity. Think of them as high-performance F1 car engines that need clean, unadulterated fuel to perform at their best.
And here lies the big challenge: over time, many organizations have become lax about data capture and data management for their legacy applications. The trend worsened notably when data warehousing became popular in the early 2000s, as the big idea was to dump everything into the warehouse, flatten the schema to create a single long record, add a timestamp, and then mine the data for information and insights.
This is now coming back to haunt many such projects. When data quality is discussed, many managers complain about sparse datasets, blank columns, or even junk data. A lack of discipline in maintaining data dictionaries creates another problem: the relevance of the data gets lost. In many cases, standard fields were present only to support an off-the-shelf application’s analytics module so that it could produce structured results.
Over time, the knowledge of which fields are tool-specific and which hold actual data was lost, and the practice of lifting and dumping entire schemas continued. That leads us to where we are today, where everyone is afraid to make the call to drop data sets that most people feel are irrelevant. The risk of breaking a working system seems too big for anyone to find the courage to break with established practice.
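One way to make that call less frightening is to put numbers behind it. The sketch below is a minimal illustration, not a prescribed method: it measures how often each field in a set of records is blank and lists the fields that are blank almost everywhere. The field names, the blank markers, and the 0.9 threshold are all assumptions chosen for the example.

```python
# Assumed markers for "no data"; real systems may use others.
BLANK_MARKERS = {"", "N/A", "NULL"}


def is_blank(value):
    """Treat None and marker-only strings as blank."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip().upper() in BLANK_MARKERS
    return False


def blank_ratio_per_field(records):
    """Return {field: fraction of records where the field is missing or blank}."""
    fields = set()
    for record in records:
        fields.update(record)
    total = len(records)
    ratios = {}
    for field in sorted(fields):
        blanks = sum(1 for record in records if is_blank(record.get(field)))
        ratios[field] = blanks / total if total else 0.0
    return ratios


def drop_candidates(records, threshold=0.9):
    """List fields whose blank ratio meets the (assumed) threshold."""
    ratios = blank_ratio_per_field(records)
    return [field for field, ratio in ratios.items() if ratio >= threshold]


# Hypothetical sample records, invented for illustration.
sample_records = [
    {"account_id": "A1", "branch": "LON", "legacy_flag": ""},
    {"account_id": "A2", "branch": "", "legacy_flag": None},
    {"account_id": "A3", "branch": "NYC", "legacy_flag": "N/A"},
    {"account_id": "A4", "branch": "LON"},
]

print(blank_ratio_per_field(sample_records))
print(drop_candidates(sample_records))
```

A high blank ratio is evidence, not proof, that a field is irrelevant. It gives the people making the decision a shortlist to review rather than a list to delete.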
All this has caught banks in a catch-22 of sorts. So, what is the way ahead for us? Of course, you can take the radical approach that the respected financial services futurist Chris Skinner describes: get rid of legacy, legacy people, legacy mindset, legacy systems, even legacy customers. But is that so easy? He presents the approach as a simple two-step process, as confirmed to him by the numerous bankers with whom he discussed legacy issues:
- Start with bite-sized transformations
- Create a rolling snowball effect
But is it this easy? Of course, senior management has to be courageous enough to agree to radical changes to the core systems, but is this push that easy to make? Often, the story runs deeper than it looks at the outset. Someone will have to take the risk of making unpopular decisions. Someone has to be courageous enough to say: let’s drop these data sets, as we may not need them in our fresh start.
In our quest to become data-driven rather than process-driven, we are placing huge importance and value on data, and that is becoming part of the decision-making problem too. If we accept that the core of the strategy, i.e. the data itself, is faulty and full of garbage, then we may have to devise a new strategy, and many banks lack the creative intent to think outside the box.
Some banks are taking the brave leap of faith and immersing themselves in data seas and lakes, with filters that may clean the data on the way in. But it will be a long, time-consuming transformation, and we have to be as patient as we are ambitious to achieve the audacious.
Exciting times ahead, indeed.