
Science Says Paper Beats PDF: Can APIs Beat Them Both?

Remember when your utility bills landed on your doormat instead of in your inbox?

Remember when you paid for them with a cheque instead of a direct bank transfer?

It struck me recently how much paper has disappeared from modern life, replaced by the undisputed king of electronic paper – the PDF. All of my utility bills are now "paperless", meaning that I receive an email instructing me to log into a website using a password I've forgotten, and to download a PDF version of my bill which I will only glance at.

My supplier lowers their costs, I don't need to find lots of space to file all this paper (just some invisible storage space for all those 0s and 1s), and the planet has more trees. Everyone wins, right? Maybe not.

Studies in the 2000s found evidence that people read physical paper documents more thoroughly than their PDF equivalents, and comprehend them better. The scientists believe part of the reason is the way we have become accustomed to interacting with electronic screens - as windows into a system capable of searching and finding fast answers to questions that come into our minds - as opposed to books, which represent a complete body of knowledge or a story. The studies also suggest that being able to physically cross-reference between pages aids comprehension, and that the electronic equivalent - the ubiquitous search bar - may be abused to skip to certain sections without reading the whole document.

I certainly recognize these findings in my own PDF interactions. I often find myself rapidly scrolling through PDF magazines, for example, when I might have lingered longer over a print version. Once a document exceeds 20 pages or so, it is tempting to immediately open up the search function and ignore the rest.

So what's the problem here? For small documents like utility bills where you are looking for one key piece of data, I don't think there is a problem. For larger, more complex documents, there is a real risk of information being lost or simply not read.

Financial services is full of such complex, lengthy documents. Consider offering prospectuses, master agreements for OTC derivatives, KYC documentation or (in my niche market) connectivity interfaces. These are long, complex, technical documents which real people need to read fully and comprehend, both initially and as they change over time. The evidence suggests that PDF is a terrible format for this task, and yet it continues to be the default choice in the absence of a better alternative.

Thankfully, a range of start-ups and initiatives, such as ClauseMatch for master agreement negotiation, the KYC utilities from Thomson Reuters and SWIFT, and FixSpec, are offering technology solutions to the problem. Each solution serves a specific workflow, but the workflows share a common underlying pattern. Firms exchange strikingly similar - but not identical - information with each other in PDF or Word format via email, and it becomes the job of the recipient to read and interpret the document, understand how it maps to their business requirements, and identify how it differs from similar documents from other counterparties.

The common technology solution in each case is to flexibly extract and normalize data from these various documents, and to allow end users to tap into the accumulated knowledge database to retrieve, compare or (in the case of ClauseMatch) negotiate information in the way most appropriate for their intended business use. I believe these should be the modern goals of the "paperless" movement - not replacing paper with an electronic metaphor, but building systems and processes which let people access data in the most convenient and efficient way for them.
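To make the extract, normalize and compare pattern concrete, here is a minimal Python sketch. It is purely illustrative and does not describe how any of the products above actually work: the `FieldSpec` type, the `normalize` and `compare` helpers, and the sample rows for two imaginary brokers are all assumptions of mine, standing in for whatever a real document extractor would produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One normalized field definition taken from a counterparty document (hypothetical)."""
    name: str
    data_type: str
    required: bool


def normalize(raw_fields):
    """Map loosely formatted rows to FieldSpec objects keyed by a canonical name."""
    specs = {}
    for row in raw_fields:
        # Canonical name: trimmed, lower case, spaces replaced by underscores.
        name = row["name"].strip().lower().replace(" ", "_")
        specs[name] = FieldSpec(
            name=name,
            data_type=row["type"].strip().lower(),
            required=str(row["required"]).strip().lower() in {"y", "yes", "true"},
        )
    return specs


def compare(left, right):
    """Report fields present on only one side, and fields defined differently."""
    shared = left.keys() & right.keys()
    return {
        "only_left": sorted(left.keys() - right.keys()),
        "only_right": sorted(right.keys() - left.keys()),
        "changed": sorted(n for n in shared if left[n] != right[n]),
    }


# Made-up rows, as if extracted from two counterparties' documents.
broker_a = normalize([
    {"name": "Order Qty", "type": "Int", "required": "Y"},
    {"name": "Price", "type": "Decimal", "required": "N"},
])
broker_b = normalize([
    {"name": "order qty", "type": "int", "required": "yes"},
    {"name": "Price", "type": "Decimal", "required": "Y"},
    {"name": "Account", "type": "String", "required": "N"},
])

diff = compare(broker_a, broker_b)
print(diff)
```

Once both documents are reduced to the same structure, the question "how does this counterparty differ from the last one?" becomes a dictionary comparison rather than a side-by-side reading of two PDFs.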

Done correctly, I believe that APIs which free data from documents, allowing more direct search, comparison and negotiation, will unlock new levels of efficiency and significantly reduce the scope for human misunderstanding.

It took over a decade for PDF to evolve from a niche desktop publishing format to a de facto global standard. While that doesn't sound very long, my bet is that disruptive services like these will take significantly less time to revolutionize our industry.

What challenges or successes have you experienced moving towards a paperless environment, and where does the future lie? Are you a PDF lover or a hater and why? Leave a comment and let me know what you think.

