Contextualised Speech Recognition ready for prime time

The potential for speech recognition to augment and enhance mobile banking has been raised several times in this group over the past few months. It makes sense. After all, m-banking apps can offer a vast array of options to search and navigate, which can result in a poor, time-consuming user experience, compounded by the fact that we all have fat fingers when it comes to mobile screens and keypads. For actions like checking an account balance, displaying recent transactions or the latest statement, or initiating payments, voice input is quick and easy. Contextualised speech recognition gives us a simple, fast and convenient method of interacting with our mobile apps, and what could be more intuitive than speaking to them? It might not be appropriate in every circumstance, but speech should be available as an additional modality for when the user wants a fast and easy way to search, navigate and initiate actions in a single step (utterance).

So what is the state of the art in contextualised speech recognition, and can it be made to work reliably given a wide range of accents and the multitude of ways a user might request an action? The short answer is ‘yes’! The “smarts” in this technology are twofold. The first is recognising spoken words, including domain jargon, across a variety of accents and regional dialects. This can be done very reliably provided the appropriate language modelling is done. The second, and more complex, is understanding a complete spoken utterance so that it can be mapped to a specific and relevant action on the device. This is where natural language processing comes in, and even here significant advances have been made over the past few years, aided by greater computing power and speed. For a particular application domain such as m-banking, language models and interpretation components need to be developed and combined with sophisticated machine learning techniques, so that a domain-specific natural language understanding system can learn quickly and refine itself continually as it is used, making ever more reliable use of the app’s context.
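To make the second step concrete, here is a minimal sketch of mapping an already-transcribed utterance to an m-banking action. It is an illustration only: the action names, the patterns and the `interpret` function are hypothetical, and hand-written rules stand in for the trained language models and machine learning a real system would use.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Intent:
    action: str                      # hypothetical action name in the app
    slots: dict = field(default_factory=dict)  # details extracted from the utterance


# Hypothetical hand-written patterns, checked in order. A deployed system
# would learn these mappings from data rather than hard-code them.
PATTERNS = [
    ("make_payment", re.compile(
        r"\b(?:pay|send|transfer) (?P<amount>\d+(?:\.\d{1,2})?) "
        r"(?:pounds |dollars |euros )?to (?P<payee>[a-z ]+)$")),
    ("check_balance", re.compile(r"\bbalance\b|\bhow much (?:do i have|is in)\b")),
    ("show_transactions", re.compile(r"\b(?:recent|latest|last) (?:transactions|payments)\b")),
    ("show_statement", re.compile(r"\bstatement\b")),
]


def interpret(utterance):
    """Return the first matching Intent for the utterance, or None."""
    # Normalise case and drop trailing sentence punctuation.
    text = utterance.lower().strip().rstrip(".?!")
    for action, pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            return Intent(action, match.groupdict())
    return None


if __name__ == "__main__":
    for spoken in ["What's my account balance?",
                   "Show my recent transactions",
                   "Transfer 50 pounds to John Smith",
                   "Tell me a joke"]:
        print(spoken, "->", interpret(spoken))
```

The point of the sketch is the shape of the problem: many different phrasings collapse to a small set of actions, and a payment request also has to yield its amount and payee. Utterances that match nothing return `None`, which is where an app would fall back to its normal menus.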

So speech recognition and natural language understanding have come a long way, and Apple’s Siri has shown us how they can be applied in a general way and integrated with several applications on a device. For an application like m-banking, what’s needed is a domain-specific speech recognition and understanding capability, and the technology and expertise to develop and deploy it are out there. It just needs the banks to embrace it and trial it, because it will greatly enhance the user experience on mobile devices.

External

This content is provided by an external author without editing by Finextra. It expresses the views and opinions of the author.
