Add Speech-to-Text to Your Chatbot (with recorder)
Beginner
20 min.
Let users talk with your SAP Conversational AI chatbot inside an SAPUI5 app using the speech-to-text APIs of the Web Client bridge, including the media recorder feature, and an external speech-to-text service.
You will learn
- How to add speech-to-text to your chatbot inside SAPUI5
- How to use the built-in media recorder functionality of the chatbot
Prerequisites
- Any SAPUI5 app (feel free to build the simple SAPUI5 app described in the 2 Minutes of SAPUI5 playlist)
- An SAP Conversational AI chatbot to deploy in your SAPUI5 app (you can simply use a new one with the Greetings skill)
- Knowledge of how to deploy a chatbot to a web page with the Web Client. The tutorial Deploy an SAP Conversational AI Chatbot on a Web Site describes a similar process for the Web Chat client.
- An IBM Cloud account with at least the free plan for the Speech to Text service
The speech-to-text capabilities of SAP Conversational AI include:
- Adding a microphone button and capturing the user’s click.
- Automatically capturing the user’s voice from the browser (Media Recorder)
- Creating a small area that shows the transcription of the voice in real time
- Enabling developers to add the interim transcription to this transcription area
- Adding hooks for, among other events, recognizing when the speech has stopped and adding the transcribed text as a message in the chatbot conversation
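The flow these capabilities describe can be sketched in plain JavaScript. This is only an illustrative model: createTranscriptionFlow and all of its method names are hypothetical and are not part of the Web Client bridge API. The real hooks are documented in the WebClientDevGuide.

```javascript
// Illustrative model only: every name here is hypothetical and is NOT
// the Web Client bridge API. See the WebClientDevGuide for the real hooks.
function createTranscriptionFlow(sendMessage) {
  let listening = false;
  let interimText = "";

  return {
    // The user clicks the microphone button.
    start() {
      listening = true;
      interimText = "";
    },
    // A partial transcription arrives and is shown in the transcription area.
    onInterim(text) {
      if (listening) interimText = text;
      return interimText;
    },
    // Speech has stopped: the final text becomes a message in the conversation.
    onFinal(text) {
      if (!listening) return false;
      listening = false;
      interimText = "";
      sendMessage(text);
      return true;
    },
    isListening: () => listening,
    transcriptionArea: () => interimText,
  };
}
```

In this sketch, sendMessage stands in for whatever adds a message to the chatbot conversation. Interim results only update the transcription area, and only the final result is sent.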

Ways to implement speech to text
There are four ways to implement speech to text for SAP Conversational AI:
1. Without using the speech-to-text features of the chatbot: handle the voice recognition outside the chatbot and send a message to the chatbot using the Web Client APIs.
2. Using the speech-to-text features, but without the Media Recorder and interim text features.
3. Using the speech-to-text features, but without the Media Recorder feature.
4. Using all the speech-to-text features.
This tutorial shows an example of option 4: all the features from SAP Conversational AI, including having the chatbot capture the audio via the browser.
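One way to compare the four options, in the order listed, is as combinations of three features. The sketch below is only a reading aid. The flag names (chatbotStt, interimText, mediaRecorder) and the helper whoCapturesAudio are hypothetical and are not actual Web Client configuration keys.

```javascript
// Hypothetical flags for illustration only; these are NOT real
// Web Client configuration keys.
const STT_OPTIONS = {
  1: { chatbotStt: false, interimText: false, mediaRecorder: false },
  2: { chatbotStt: true, interimText: false, mediaRecorder: false },
  3: { chatbotStt: true, interimText: true, mediaRecorder: false },
  4: { chatbotStt: true, interimText: true, mediaRecorder: true },
};

// Returns who is responsible for capturing audio from the browser.
function whoCapturesAudio(option) {
  const features = STT_OPTIONS[option];
  if (!features) throw new RangeError(`Unknown option: ${option}`);
  return features.mediaRecorder ? "chatbot" : "your code";
}
```

Only the fourth option hands audio capture to the chatbot. In the other three, your own code has to obtain the audio.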
The speech-to-text APIs are documented in the SAPConversationalAI / WebClientDevGuide GitHub repo.