I have for some time been aware of companies increasingly submitting code to open source repositories, some quite openly like the hedge funds Man AHL and Two Sigma Investments. Others have jumped on the "cool open source" bandwagon. The subject is frequently
discussed on the conference circuit and blogosphere, particularly in data science and FinTech. However, does reality actually match the hype?
To validate, I spent a happy New Year's Day morning trawling GitHub and manually compiled the following chart.
Does this tell us anything? Overall, I was quite encouraged, but felt financial services firms in particular have more to do.
My first observation concerns the firms that do not feature on the list. Many Financial Services companies - banks, asset managers and other open source-consuming tier 1 hedge funds - are notable by their absence on GitHub, though in fairness some host repos elsewhere.
While Goldman Sachs, for a long time active with Java, and JP Morgan are readily findable, many of their rivals sadly barely register. Kudos to those that have contributed, particularly the likes of Two Sigma and Man AHL, who have truly put money, time and
effort where their mouths are.
Vendors like Bloomberg and Thomson Reuters have found repos to be useful for promoting APIs to their databases, not unlike some internet services firm submissions represented in the list. They're doing well.
Particularly pleasing for me were two "proprietary" software firms active in Financial Services, MathWorks and SAS, both releasing significant numbers of high calibre code repos, not least because I worked many years for one of them. Predictably, most repos
former competitors at SAS! It seems they understand the programming languages preferred by their staff's children and grandchildren, a bit like my using DuckDuckGo, chatting with gamers on Discord and shouting "boomer" at anyone over the age of 33.
So what does this all mean, if anything? On one hand, perhaps not much. As a former colleague critiqued, "not many companies will host their proprietary sources in public GitHub repos."
He is absolutely right. This is the fundamental flaw in my chart and my analysis.
Goldman Sachs, for one, have fought expensive and PR-busting court cases over proprietary trading strategy code.
There is also much more to software repo sharing than GitHub, though GitHub's symbolic status as the leading open repo-sharing site does make the analysis at least useful and interesting, if not statistically significant.
However, behind the numbers I see more dynamics at play than simply a split between proprietary "valuable" code and submittable, "safe", "un-valuable" code.
For one thing, some of the listed companies increasingly attract talent because of their community commitment. The listed hedge funds in particular have a brand reputation which supersedes that of rival "proprietary" hedge funds which use code but don't appear to
give back code. My middle son, a self-confessed computer geek and mathmo, is more likely to send his CV to Two Sigma when he's big enough.
One COO of a bank that does not feature on my list recently noted her company's ethos had not been to contribute. However, her bank's stance was changing, albeit not as fast as she liked. They had some non-proprietary APIs and semi-proprietary marketing
loss-leader applications lined up for some online repos, which after all is largely what the deep learning libraries sugar-parented by Facebook, Google and others are. She even showed them off at a conference where I saw her present. Her developers
were keen to contribute for the sake of their CVs, for the former university colleagues they had collaborated with, and for fellow researchers at other banks with similar interests who could add incremental capability. The bureaucratic delays of her bank's
legacy attitudes were, the COO admitted, impacting her team's retention. Sure, her company was changing, but the internal debates and discussions were taking time. She was, it's true, convinced that the code-sets would be released quite soon. She also knew
that her bank is not the only organization facing such issues.
Another factor: in financial services, as with the data scientists in Internet Services, the proprietary battleground driving differentiation and profit is tending towards data, with model frameworks in particular increasingly commoditized. What used to be
proprietary in terms of building fast, complex models quicker than anyone else has more or less been open sourced, a topic I've written on
in other Finextra blogs. In short, why not contribute more? Google have followed up their deep learning package TensorFlow with TensorFlow Quant Finance. I repeat, that's Google
doing that, not one of their Investment Bank counterparts, though some good quant libraries are available on GitHub.
Pythonista legend Wes McKinney was asked the
following question at a recent conference, "I see a lot of companies now they have developer advocates where they work on open source projects, do you think that’s a way to help the open source community, keep maintaining all the software and all the work?"
McKinney answered, "I think it does. When companies contribute to open source projects, it goes beyond just charity. I think there are other benefits in building your company’s technology brand. It’s part kind of feel good, we’re making the open source world
better, but it’s also marketing for your organization, that you’re doing good in the world and you’re supporting these projects. It’s a win-win. I think that developer advocates, just in terms of lobbying for … because engineers who want to contribute to open
source projects may not be the best advocates for themselves in kind of arguing to management about why those contributions matter and it will make them happier and less likely to churn and move to a different company. I think developer advocates do help a
In conclusion, there are increasing pressures on firms in Financial Services, particularly those who consume open source (many do, just look at the skillsets demanded by Quant, Developer, Data Scientist and Technologist recruiters), to contribute back.
The experience of the COO referenced earlier I see as symptomatic of the transformation taking place. If she's successful then 2020 will, I predict, be a breakthrough year as the early adopters will be joined by the mainstream.
However, can she and her colleagues - and those in other sell-side and buy-side institutions fighting their own bureaucracies - overcome the remaining proprietary edicts within their respective organizations?
It also remains to be seen whether firms who do already contribute, like Goldmans, JP and others, continue to up their game and truly rival the likes of Google in contributing back a commensurate quantity and quality of open source to compare with what
Here's hoping for a 2020 of many more code repositories posted by financial services firms!