19 April 2014

The Big Data Blog

Amir Halfon - MarkLogic


Innovation in Financial Services

A discussion of trends in innovation management within financial institutions, and the key processes, technology and cultural shifts driving innovation.

NoSQL Use Cases

04 January 2014

In an earlier post I discussed the topic of NoSQL - what it is, what it isn't, and some of the misconceptions surrounding it.

I'd now like to turn to what NoSQL is good for - the actual use cases, especially those involving Enterprise NoSQL.

 

1. Operational Trade Store

Once a trade is made, it needs to be processed by the back office and reported to the regulators. Trade data is typically read off the message bus connecting the trading systems and persisted into a relational database, which becomes the system of record for post-trade processing and compliance. The original data formats are either XML (FpML, FIXML) or text based (FIX), and have to be transformed into a normalized relational representation. This may sound easy enough, but the front office innovates quickly and often introduces very complex instruments, so the task of stuffing them into a relational store becomes harder and harder. As a result, the back office takes longer to respond to the needs of the business. This is compounded by the need to create and maintain a fully normalized, "canonical" schema before any new data can be ingested, which can become quite onerous and leads to a proliferation of schemas and databases. Worse yet, workarounds are put in place to shove data into existing schemas (such as flags indicating that a record is of a different type than expected, or an empty shell into which any variable can be fitted).

These workarounds can create costly trade exceptions downstream, which need to be resolved manually. Together with the high maintenance costs of complex RDBMS installations, this drives up the cost per trade.

All of these ills can be addressed using NoSQL, by persisting trade messages as-is, without the need for transforming them into a normalized relational schema. Trade messages contain their own structure, and there's no need for an over-arching canonical data model in order to process them or report on them. Furthermore, this structure can be modified at the time of querying the data based on the actual usage, rather than trying to create a schema that will handle any foreseeable usage. This is an example of the notion of schema-on-read mentioned in earlier posts.
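To make the schema-on-read idea a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the XML messages are made up and heavily simplified (they are not real FpML or FIXML), a plain list stands in for the database, and the function names are hypothetical. The point is only that messages are kept as-is and structure is applied when a question is asked.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified trade messages (not real FpML or FIXML).
RAW_TRADES = [
    "<trade id='T1'><swap><notional ccy='USD'>5000000</notional></swap></trade>",
    "<trade id='T2'><option><premium ccy='EUR'>12000</premium></option></trade>",
    "<trade id='T3'><swap><notional ccy='EUR'>2000000</notional>"
    "<exotic barrier='1.2'/></swap></trade>",
]

# "Ingest": keep every message exactly as it arrived; no schema at write time.
store = list(RAW_TRADES)


def notional_by_currency(messages):
    """Apply structure at read time: sum swap notionals per currency."""
    totals = {}
    for raw in messages:
        doc = ET.fromstring(raw)
        for node in doc.findall("./swap/notional"):
            ccy = node.get("ccy")
            totals[ccy] = totals.get(ccy, 0) + int(node.text)
    return totals


def trade_ids_with(messages, path):
    """A second, independent view over the same raw messages."""
    ids = []
    for raw in messages:
        doc = ET.fromstring(raw)
        if doc.find(path) is not None:
            ids.append(doc.get("id"))
    return ids


print(notional_by_currency(store))
print(trade_ids_with(store, "./swap/exotic"))
```

Note that the third message carries an extra element the other swap does not have. Nothing had to change at ingest time to accommodate it; only the query that cares about it needs to know it exists.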

 

2. Reference Data

Another area where the current state of data management can seem abysmal is enterprise reference data - data about traded instruments and the legal entities related to them. Most banks have been through several rounds of M&A and other organizational changes that resulted in multiple reference data management systems across the firm. This introduces data inconsistencies (which lead to trade exceptions), complexity and costs. Many firms have been trying to rationalize their reference data systems into a single enterprise data management platform. This has usually been a Herculean task, however, for reasons similar to those mentioned above: namely, the level of effort involved in creating a single, unified data model that handles all the different incoming data-vendor feeds and addresses all the different concerns of the downstream data consumers. This is typical of Enterprise Data Management efforts of this scale that rely on a relational database as their core platform.

In this case as well, NoSQL provides an attractive alternative, allowing data vendor feeds to be persisted in their original format, without the need to transform them into a canonical data model. The data can then be fed to the consumers in the appropriate formats, with transformation occurring at the time it's needed rather than ahead of time based on assumptions - again, schema-on-read vs. schema-on-write.
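As a rough illustration of transforming at the time of consumption, the Python sketch below keeps two vendor records in their own shapes and maps them only when a consumer asks. The vendor names, field names, identifiers and mappings are all invented for the example, and a list stands in for the data store.

```python
import json

# Hypothetical vendor records, stored in each vendor's own shape.
FEED = [
    {"source": "vendor_a",
     "raw": json.dumps({"ISIN": "US0000000001", "Name": "ACME 5Y", "Cpn": "4.25"})},
    {"source": "vendor_b",
     "raw": json.dumps({"isin_code": "US0000000002",
                        "description": "Globex 10Y", "coupon_pct": 3.5})},
]

# Per-source field mappings, consulted only when data is read.
MAPPINGS = {
    "vendor_a": {"isin": "ISIN", "name": "Name", "coupon": "Cpn"},
    "vendor_b": {"isin": "isin_code", "name": "description", "coupon": "coupon_pct"},
}


def read_as(record, fields):
    """Parse one raw record and project it onto consumer-chosen field names."""
    doc = json.loads(record["raw"])
    mapping = MAPPINGS[record["source"]]
    return {field: doc.get(mapping[field]) for field in fields}


def risk_view(records):
    """A risk consumer wants the ISIN and a numeric coupon."""
    rows = [read_as(r, ["isin", "coupon"]) for r in records]
    return [{"isin": row["isin"], "coupon": float(row["coupon"])} for row in rows]


def sales_view(records):
    """A sales consumer wants the ISIN and a display name."""
    return [read_as(r, ["isin", "name"]) for r in records]


print(risk_view(FEED))
print(sales_view(FEED))
```

Adding a third vendor in this sketch means adding one mapping entry, not reworking a shared model that every consumer depends on.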

 

3. Customer Insight

Customer data is often dispersed across the organization, with diverse systems all having different notions of a customer and different data models to represent one. Obtaining the elusive 360-degree customer view, whether for revenue purposes, fraud prevention and risk mitigation, or regulatory reasons, has therefore been close to impossible. And the need to go beyond the firm's firewall and incorporate web and social media data is making this even tougher.

Again, the main culprit is the application of typical Enterprise Data Warehousing methodologies, which are relational in nature and depend on canonical models that are hard to change and evolve as more data becomes available and the needs of the business change.

In this case there's also an added wrinkle, as some of the data is much less structured than in the previous two use cases. And this is true for more than external web data: customer on-boarding documents, call center notes, web server logs, etc. are all internal to the firm, yet are just as challenging as social media data to incorporate into a coherent customer view. This aspect in particular is generating a lot of interest in NoSQL, as it makes much more sense to use a non-relational database for these types of data. But the advantages of NoSQL also become apparent when it comes to highly structured data, as it alleviates the need to harmonize and normalize the data before it can be aggregated. With NoSQL it's perfectly fine to have different representations of a customer, which can be unified based on certain attributes without needing to create a single, over-arching data model. Thus new data can be easily incorporated from disparate systems, and then linked and enhanced with non-relational text data.
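Unifying different customer representations on a shared attribute can be sketched in a few lines of Python. The records, system names and the choice of email address as the linking attribute are assumptions made purely for illustration; a real implementation would need far more careful matching.

```python
from collections import defaultdict

# Hypothetical records from three systems, each with its own shape.
RECORDS = [
    {"system": "crm", "full_name": "Jane Doe", "email": "Jane.Doe@example.com"},
    {"system": "cards", "holder": {"first": "Jane", "last": "Doe"},
     "contact_email": "jane.doe@example.com"},
    {"system": "call_center", "note": "Caller asked about wire limits.",
     "caller_email": "jane.doe@example.com "},
    {"system": "crm", "full_name": "John Roe", "email": "john.roe@example.com"},
]

# Each system names the linking attribute differently (assumed for the example).
EMAIL_KEYS = ("email", "contact_email", "caller_email")


def link_key(record):
    """Return a normalized email address to link on, or None if absent."""
    for key in EMAIL_KEYS:
        if key in record:
            return record[key].strip().lower()
    return None


def customer_views(records):
    """Group records by linking attribute; original shapes are left untouched."""
    views = defaultdict(list)
    for record in records:
        key = link_key(record)
        if key is not None:
            views[key].append(record)
    return dict(views)


views = customer_views(RECORDS)
for email, linked in views.items():
    print(email, [r["system"] for r in linked])
```

The structured CRM and cards records and the free-text call center note end up in the same customer view without any of them being reshaped into a common model first.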

 

4. Regulatory Compliance and Investigations

Whether it's Dodd-Frank, EMIR, FATCA, KYC or Basel III - most of today's regulations involve data. And the data needs to be obtained from disparate sources (including non-relational ones), and presented quickly, sometimes in an ad-hoc manner. Consider for instance Dodd-Frank Title VII, which requires reporting on all the phases of a swap transaction, eventually also including the pre-trade correspondence preceding it - obviously this would be quite hard to do in a relational database, which was not designed with text analytics in mind. Similarly, FATCA requires reporting on foreign accounts held by US persons, and the data identifying those account holders may be found in non-relational sources (such as the on-boarding documents mentioned above). Legal investigations represent a similar challenge, as they require combing through reams of documents and email messages in search of the ones relevant to the case at hand.

Furthermore, the need to constantly update internal procedures based on regulatory changes, including the mapping between the actual regulations and internally available data, can become quite onerous.

In all of these cases, a NoSQL database, particularly a document-oriented one, represents a superior solution to traditional relational technology.
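The kind of ad-hoc search across correspondence and documents described above can be illustrated with a toy inverted index in Python. This is only a sketch of the general technique, not how any particular document database implements search; the document ids and texts are made up.

```python
import re
from collections import defaultdict

# Hypothetical mixed corpus: emails, chat, and an on-boarding document.
DOCS = {
    "email-001": "Confirming the swap terms we discussed; notional is 5m USD.",
    "chat-017": "Lunch at noon?",
    "onboarding-042": "Client declares US citizenship and a foreign account.",
    "email-002": "Please amend the swap confirmation before booking.",
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(docs):
    """Map each term to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, *terms):
    """Return sorted ids of documents containing all of the given terms."""
    hits = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*hits)) if hits else []


index = build_index(DOCS)
print(search(index, "swap"))
print(search(index, "foreign", "account"))
```

A production system would add stemming, phrase queries, relevance ranking and security, but the basic shape is the same: the text is indexed as it arrives and the questions can be decided later.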

 

5. Pre-Trade Decision Support

One of the earlier use cases for looking beyond relational databases was sentiment analysis - mining the web for indications of public sentiment to inform trading decisions and risk calculations. This use case is related to, but also expands on, text mining based on news analysis. But pre-trade decision support involves the aggregation of many other sources of information - from highly structured market data to highly unstructured analyst research, to geospatial data (in the case of commodities), etc.

All this data needs to be visible on a trader's desktop, and there's a growing realization that rather than just presenting diverse data side by side, there's value in aggregating it into a more holistic view of a given instrument. Here again, a document-oriented NoSQL store is a natural fit, especially if it supports geospatial information, and has event-processing features that can be used to alert traders about significant changes.
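As a small illustration of an aggregated instrument view with geospatial and alerting logic, consider the Python sketch below. The instrument document, its field names, the coordinates and the thresholds are all hypothetical, and the haversine formula is used as a simple stand-in for whatever geospatial support a given database provides.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical aggregated view of one commodity instrument.
INSTRUMENT = {
    "symbol": "CRUDE-X",
    "prices": [100.0, 103.0],               # structured market data
    "research": "Analyst sees tight supply through Q2.",  # unstructured text
    "facility": (29.0, -90.0),              # (lat, lon) of a related facility
    "events": [
        {"headline": "Storm warning", "lat": 29.5, "lon": -90.0},
        {"headline": "Port strike", "lat": 35.0, "lon": -90.0},
    ],
}


def alerts(doc, radius_km=100.0, move_pct=2.0):
    """Flag large price moves and events located near the facility."""
    out = []
    previous, last = doc["prices"][-2:]
    change = (last - previous) / previous * 100
    if abs(change) >= move_pct:
        out.append(f"price moved {change:+.1f}%")
    lat, lon = doc["facility"]
    for event in doc["events"]:
        if haversine_km(lat, lon, event["lat"], event["lon"]) <= radius_km:
            out.append(f"nearby event: {event['headline']}")
    return out


print(alerts(INSTRUMENT))
```

Because market data, research text and location events live in one document, a single rule can look across all of them instead of joining several separate stores.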

 

These are some of the use cases where I've seen successful NoSQL implementations within the industry. I'm interested in your thoughts - are these representative of the ones you're targeting? Are there other prominent ones I've missed?

Tags: Risk & regulation, Innovation

