01 September 2014

Tony Ballardie - Capito Systems Limited

Innovation in Financial Services

A discussion of trends in innovation management within financial institutions, and the key processes, technology and cultural shifts driving innovation.

Contextualised Speech Recognition ready for prime time

12 June 2012  |  2057 views  |  2

The potential for speech recognition to augment and enhance mobile banking has been expressed several times in this group over the past few months. It makes sense. After all, m-banking apps can offer a vast array of options to search and navigate, which can result in a poor, time-consuming user experience, compounded by the fact that we all have fat fingers when it comes to mobile screens and keypads. For actions like checking an account balance, displaying recent transactions or the latest statement, or initiating payments, voice input is quick and easy. Contextualised speech recognition gives us a simple, fast and convenient method of interacting with our mobile apps, and what could be more intuitive than speaking to them! It might not be appropriate in every circumstance, but speech should be available as an additional modality for when the user wants a fast and easy way to search, navigate and initiate actions in a single step (utterance).

So what is the state of the art in contextualised speech recognition, and can it be made to work reliably given a wide range of accents and the multitude of ways a user might request an action? The short answer is ‘yes’! The “smarts” in this technology are twofold. The first is recognising spoken words, including domain jargon, given the variety of accents and regional dialects. This is a task that can be done very reliably provided the appropriate language modelling is done. The second, and more complex, is understanding a complete spoken utterance so it can be mapped to a specific and relevant action on the device. This is where natural language processing comes in, and even here significant advances have been made over the past few years, aided by greater computing power and speed. For a particular application domain such as m-banking, language models and interpretation components need to be developed and combined with sophisticated machine learning techniques so that a domain-specific natural language understanding system can learn and refine itself constantly as it’s used, leading to ever more reliable exploitation of the app’s context.
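To make the idea of mapping an utterance to an action concrete, here is a deliberately simple sketch. It assumes the speech recogniser has already produced a text transcript, and it uses hand-written keyword patterns rather than the trained language models and machine learning described above. The action names, the patterns and the `interpret` function are all hypothetical, chosen only for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Intent:
    """An app action plus any details (slots) pulled from the utterance."""
    action: str
    slots: dict = field(default_factory=dict)


# Hypothetical keyword patterns for a handful of m-banking actions.
# A real system would learn these from data instead of hard-coding them.
INTENT_PATTERNS = [
    ("recent_transactions", re.compile(r"\btransactions?\b")),
    ("show_statement", re.compile(r"\bstatement\b")),
    ("initiate_payment", re.compile(r"\b(pay|send|transfer)\b")),
    ("check_balance", re.compile(r"\bbalance\b")),
]
AMOUNT = re.compile(r"\b(\d+(?:\.\d{1,2})?)\s*(pounds|dollars|euros)\b")
PAYEE = re.compile(r"\bto\s+(\w+)")


def interpret(utterance: str) -> Optional[Intent]:
    """Map a transcribed utterance to a single app action, or None."""
    text = utterance.lower().strip()
    for action, pattern in INTENT_PATTERNS:
        if pattern.search(text):
            slots = {}
            if action == "initiate_payment":
                # Pull out the amount and payee if the user mentioned them.
                amount = AMOUNT.search(text)
                payee = PAYEE.search(text)
                if amount:
                    slots["amount"] = float(amount.group(1))
                    slots["currency"] = amount.group(2)
                if payee:
                    slots["payee"] = payee.group(1)
            return Intent(action, slots)
    return None
```

With this sketch, "Pay 50 pounds to Alice" resolves in one step to an `initiate_payment` action carrying the amount, currency and payee, which is the single-utterance navigation the post has in mind. Coping with accents, jargon and the many ways users phrase a request is exactly what the toy patterns cannot do and what the domain-specific models are for.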

So speech recognition and natural language understanding have come a long way, and Apple’s Siri has shown us how they can be applied in a general way and integrated with several applications on a device. For an application like m-banking, what’s needed is a domain-specific speech recognition and understanding capability, and the technology and expertise to develop and deploy it are already out there. It just needs the banks to embrace it and trial it, because it will greatly enhance the user experience on mobile devices.

Tags: Online banking, Mobile & online

Comments: (3)

Seeva Selliah - Royal Bank of Scotland - chennai | 12 June, 2012, 12:39

Nice thought. Would like to differ a bit though.

Speech recognition technology (at different maturity levels) has been around for a long time now. However, I have not heard of it being a mainstream success in any retail business. The key challenges that hinder its widespread growth are consistency & security.

From an m-banking perspective it is very risky to leverage speech recognition for key functionalities like login, payment initiation, payment authorization and change of personal details. Further, I doubt very much if banking regulations will accommodate this. For non-transactional functionalities in m-banking, speech recognition can mitigate the risk of 'fat finger' typos. However, the cost of adopting speech recognition only for this small set of m-banking features (which do not generate direct revenue) may not make a valid business case for banks.

Tony Ballardie - Capito Systems Limited - London | 12 June, 2012, 13:38

Seeva, thanks for your comment.

I should clarify – I wasn’t implying that speech recognition directly impacts the bottom line – it clearly does not, but I would argue that indirectly, it does. There are plenty of industries where speech recognition has been successfully deployed, primarily as a “value-add” to improve usability and convenience, particularly so on mobile devices. Healthcare, in-vehicle navigation, government, education... the list goes on. The size and success of Nuance is testimony to the success of speech recognition, and why it’s a key part of Apple’s and Google’s strategies.

As for security, there are two points: first, in any well designed speech app that involves a financial transaction or modification of personal data, or a payment initiation, the final execution step has to be a touch or click so the user can review and verify what’s on the screen before confirming. So, I don’t think it need impact “banking regulations” at all. Second, with regard to login or other types of authorisation or authentication, voice biometrics has now advanced to the point that it is actually more secure than a password or PIN. E.g. see http://goo.gl/spceU .
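The first point, that voice only proposes a transaction and a touch executes it, can be sketched as follows. This is an illustrative outline under assumed names (`PaymentSession`, `propose_from_voice` and `confirm_by_touch` are hypothetical), not a description of any bank's or vendor's implementation.

```python
from dataclasses import dataclass


@dataclass
class PendingPayment:
    payee: str
    amount: float
    confirmed: bool = False


class PaymentSession:
    """Voice input can stage a payment, but only a touch can execute it."""

    def __init__(self):
        self.pending = None
        self.executed = []

    def propose_from_voice(self, payee, amount):
        # Stage the payment and return the text shown on screen for review.
        self.pending = PendingPayment(payee, amount)
        return f"Pay {amount:.2f} to {payee}? Tap Confirm to proceed."

    def confirm_by_touch(self):
        # Called only from the on-screen Confirm button handler.
        if self.pending is None:
            raise RuntimeError("nothing to confirm")
        payment = self.pending
        payment.confirmed = True
        self.executed.append(payment)
        self.pending = None
        return payment
```

Because nothing reaches `executed` until `confirm_by_touch` runs, a misrecognised utterance can at worst put the wrong details on screen, where the user can see and reject them.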

So my argument is that speech recognition is an additional modality that represents a significant value-add in the form of simplicity, speed and convenience. No other modality can reduce a lengthy multi-level navigation/search into a single step. As I’m in the business of speech recognition, I know customers value this enough to want to pay for it. Here’s a practical example: think of in-play sports betting. Mobile sports betting apps are complex and offer many hundreds of markets and within each market sometimes hundreds of possible options. Navigating to a particular market to view odds or place a bet can be a lengthy process and usability is one of the more common gripes of users of betting apps. Voice input offers a tremendous improvement in convenience and speed, and that, in turn, can mean increased customer loyalty. You can view a speech-enabled sports betting app prototype in action in this short youtube video: http://goo.gl/cFV4S . It's easy to see how the voice feature offers a significant user benefit.

Ketharaman Swaminathan - GTM360 Marketing Solutions - Pune | 14 June, 2012, 10:54

You're probably right about the relevance of voice recognition in healthcare, government, education, sports gambling and other apps. But the functionality set currently offered in many mobile banking apps (a/c balance, last 5 transactions, mini statement, fund transfer, etc.) is so sparse that the icons are located comfortably apart on the screen, even on relatively small (2-3") screens, so fat finger is not a big problem. Of course, all that could change in the next generation of mobile banking apps, and voice recognition could become important for them.

