Contextualised Speech Recognition ready for prime time

The potential for speech recognition to augment and enhance mobile banking has been raised several times in this group over the past few months. It makes sense. After all, m-banking apps can offer a vast array of options to search and navigate, all of which can result in a poor, time-consuming user experience, compounded by the fact that we all have fat fingers when it comes to mobile screens and keypads. For actions like checking an account balance, displaying recent transactions or the latest statement, or initiating payments, voice input is quick and easy. Contextualised speech recognition gives us a simple, fast and convenient method of interacting with our mobile apps, and what could be more intuitive than speaking to them! It might not be appropriate for every circumstance, but speech should be available as an additional modality for when the user wants a fast and easy way to search, navigate and initiate actions in a single step (utterance).

So what is the state of the art in contextualised speech recognition, and can it be made to work reliably given a wide range of accents and the multitude of ways a user might request an action? The short answer is 'yes'! The "smarts" in this technology are twofold. The first is recognising spoken words, including domain jargon, across a variety of accents and regional dialects; this can be done very reliably provided the appropriate language modelling is in place. More complex is understanding a complete spoken utterance so it can be mapped to a specific, relevant action on the device. This is where natural language processing comes in, and even here, significant advances have been made over the past few years, aided by greater computing power and speed. For a particular application domain such as m-banking, language models and interpretation components need to be developed and combined with machine learning techniques, so that a domain-specific natural language understanding system can learn quickly and refine itself constantly as it is used, exploiting the app's context ever more reliably.
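To make the second step concrete, here is a minimal sketch of utterance-to-action mapping for an m-banking domain. The intent names and patterns are hypothetical illustrations; a real system would replace the hand-written rules below with trained language models and statistical intent classifiers, as the paragraph above describes.

```python
import re

# Hypothetical intent patterns for an m-banking domain. A production
# NLU system would learn these mappings from data rather than use
# hand-written rules; this only illustrates the utterance -> action step.
INTENT_PATTERNS = {
    "check_balance": r"\bbalance\b",
    "recent_transactions": r"\b(recent|last|latest)\b.*\b(transactions|payments|activity)\b",
    "show_statement": r"\bstatement\b",
    "make_payment": r"\b(pay|send|transfer)\b",
}

def interpret(utterance: str) -> str:
    """Map a recognised utterance to a banking action (intent name)."""
    text = utterance.lower()
    for intent, pattern in INTENT_PATTERNS.items():
        if re.search(pattern, text):
            return intent
    return "unknown"

# Differently phrased requests resolve to the same action in one step:
print(interpret("What's my account balance?"))      # check_balance
print(interpret("Show me my latest transactions"))  # recent_transactions
print(interpret("Pay fifty pounds to John"))        # make_payment
```

The point of the sketch is the single-utterance interaction the article argues for: one spoken request is resolved directly to an app action, with no menu navigation in between.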

So speech recognition and natural language understanding have come a long way, and Apple's Siri has shown how they can be applied in a general way and integrated with several applications on a device. For an application like m-banking, what's needed is a domain-specific speech recognition and understanding capability, and the technology and expertise to develop and deploy it already exist. It just needs the banks to embrace it and trial it, because it will greatly enhance the user experience on mobile devices.

External

This content is provided by an external author without editing by Finextra. It expresses the views and opinions of the author.

