The hype around big data is, ironically, huge, but there is more to big data technology than the marketing bandwagon. The difference that systems like Hadoop – the open-source platform based on Google’s storage model – can make when making sense of
lots of input at once is striking. Parallelised, decentralised processing is analogous to replacing an elephant in a wheel with 1,000 hamsters in wheels. The hamsters can generate the same cumulative power, each turning a smaller wheel simultaneously,
so the process is faster. They can also be split up to tackle problems of different sizes, rather than applying the elephant to each one in turn.
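The hamster principle can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Hadoop itself is implemented: the function names (split_into_chunks, count_words, parallel_word_count) are hypothetical, and a thread pool on one machine stands in for what a real cluster would spread across many machines. The input is split into chunks, each worker counts the words in its own chunk, and the partial counts are then added together.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, n_workers):
    """Deal the records out into at most n_workers roughly equal chunks."""
    chunks = [records[i::n_workers] for i in range(n_workers)]
    return [chunk for chunk in chunks if chunk]


def count_words(chunk):
    """Map step: one worker (one hamster) counts the words in its own chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(records, n_workers=4):
    """Split the input, count each chunk concurrently, then merge the results."""
    chunks = split_into_chunks(records, n_workers)
    if not chunks:
        return Counter()
    # Hypothetical stand-in for a cluster: threads on a single machine.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    # Reduce step: add the partial counts together into one total.
    return reduce(lambda a, b: a + b, partial_counts, Counter())
```

Because each chunk is counted independently, the number of workers can be raised or lowered to suit the size of the problem without changing the answer.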
Removing concerns around extracting, transforming and loading data from one structure into another, as these platforms are designed to do, makes an enormous difference to the way one can design data analysis.
Many more forms of data can be considered as part of a single calculation. This makes the older models, which check a set of defined characteristics against a historical template, look inflexible. There are some interesting use cases. Recently, firms have begun
to look at alternative ways of producing credit ratings.
Douglas Merrill, the former chief information officer of Google, has founded a company called ‘Zest Finance’, which professes to be able to analyse people’s online behaviour as a measure of credit risk. The stated principle behind the service is that it will enable
those without a credit history or rating to be evaluated for credit and therefore to be included in the credit-based world.
As a concept it has intriguing prospects, not least for other observers of risky behaviour, particularly regulators. If one could get credit risk ratings based on a search of publicly available data, could one also get a picture of unlikely scenarios that
skipped the profile of a ‘lucky winner’? What if the search got better with experience, so that one were able to ask more pointed questions and get more pointed answers?
Regulators are becoming far more adept at gathering data but not at using it. The US derivatives regulator, the Commodity Futures Trading Commission, found that processing data from complex derivatives trades crashed its systems, according to Commissioner
That makes big, flexible number-crunching systems just what the doctor ordered, as far as market supervisors go. Regulatory bodies have launched a number of interesting programmes in order to better understand what is out there – the SEC’s MIDAS website is
one – and in all likelihood, a new era of transparency will soon be upon us, as long as data is not hidden.
The only challenge then will be to determine what should and should not be private; if you can Google your credit score, what is to stop somebody else?