Updated: Mar 29, 2019
Let us look at how speech recognition can be implemented on Office 365 SharePoint portals. This article contains an introduction to the speech recognition service (speech-to-text conversion), a detailed approach for SharePoint, and code and snapshots for easier implementation.
Speech recognition captures real-time audio from the microphone and converts it to the corresponding text. Interfaces of this kind help in building voice-triggered apps such as chat bots.
There are two approaches to implementing speech recognition on SharePoint.
The first uses the Speech Recognition interfaces.
The other uses the Azure Bing Speech API, which is built on top of the WebSockets API. The Speech SDK is available as extensions, which can be leveraged for development.
Let us go with the first approach in this post and see how the speech recognition interfaces are integrated into SharePoint applications. In my future articles, I will detail the integration of the Bing Speech API (second approach).
Note: The modules depend on WebRTC, so only modern browsers that can access the microphone, such as Chrome/Firefox, are supported. Support for the recognition interface itself varies by browser, which is why the script in this post also checks for the webkit-prefixed constructor.
Web Speech API Interfaces
The Web Speech API helps bring voice data into web apps. It has two parts.
Speech Recognition Interface – Converts speech into text. (This is what we are going to look into.)
Speech Synthesis Interface – Converts text into speech.
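Before relying on either part, it can help to check what the current browser exposes. The following is a minimal sketch, not part of the original sample; detectSpeechSupport is a hypothetical helper name, and it takes the window object as a parameter so it can be tried outside a browser.

```javascript
// Hypothetical helper: reports which Web Speech API parts a window-like object exposes.
function detectSpeechSupport(win) {
  // Some browsers expose the recognition constructor only with a webkit prefix.
  var Recognition = win.SpeechRecognition || win.webkitSpeechRecognition;
  return {
    recognition: typeof Recognition === 'function',
    synthesis: 'speechSynthesis' in win
  };
}

// In a browser page:
// var support = detectSpeechSupport(window);
// if (!support.recognition) { console.log('Speech recognition is not available.'); }
```

If recognition comes back false, the page can hide the microphone button instead of failing when the button is clicked.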
Integrating into SharePoint
Create a Content Editor web part and place the following HTML in it. The HTML contains a button for starting/stopping the recording and a span for displaying the spoken text.
<div style="margin: 20px;">
<input type="button" id="micBtn" value="Start Speaking" onClick="record()" style="display: inline-block;" />
<div style="display: inline-block;margin: 10px;">
<p>Detected Audio: <span class="userVoice" id="userVoice"></span></p>
</div>
</div>
Next, the required script logic has to be developed. The following details the interfaces and how they are integrated into SharePoint. Get the Speech Recognition interface and initialize an object. Then initialize the required properties. In our case, the main property to set is the following (the sample also sets maxAlternatives, to its default value).
lang – Language of the spoken text. For English, the language code is en-US.
The other properties that can be considered are:
interimResults – Boolean, defaults to false. Decides whether interim (not yet final) results are returned.
maxAlternatives – Integer, defaults to 1. Maximum number of alternative transcripts returned for each result.
continuous – Boolean, defaults to false. Sets whether results are returned continuously or only a single result is returned for each recognition.
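The properties above can be applied in one place. This is a small sketch under assumptions: configureRecognition and its options argument are hypothetical names, and the fallback values are the defaults listed above.

```javascript
// Hypothetical helper: applies the listed properties to a recognition object,
// falling back to the defaults described above when an option is not given.
function configureRecognition(recognition, options) {
  options = options || {};
  recognition.lang = options.lang || 'en-US';
  recognition.interimResults = options.interimResults === true;
  recognition.maxAlternatives = options.maxAlternatives || 1;
  recognition.continuous = options.continuous === true;
  return recognition;
}

// In a browser page:
// var Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
// var recognition = configureRecognition(new Recognition(), { lang: 'ta-IN' });
```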
Then add the event handlers for tracking the audio. The following event handlers are considered in the sample.
onspeechstart – Fired when some sound is detected by speech recognition service.
onspeechend – Fired when speech recognition service stops detecting sound.
onresult – Fired when speech recognition service returns the text after recognition and processing.
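The event passed to onresult carries a list of results, and each result is itself a list of alternatives ordered by rank. As a rough sketch (latestTranscript is a hypothetical helper name), the text of the most recent result can be read like this:

```javascript
// Hypothetical helper: pulls the top-ranked alternative out of the latest result.
function latestTranscript(event) {
  var last = event.results.length - 1;
  var result = event.results[last]; // the most recent result
  var best = result[0];             // its top-ranked alternative
  return {
    text: best.transcript,
    confidence: best.confidence,
    isFinal: result.isFinal
  };
}

// In a browser page:
// recognition.onresult = function (e) {
//   document.getElementById('userVoice').innerText = latestTranscript(e).text;
// };
```

The isFinal flag matters mainly when interimResults is enabled, since interim results can still change.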
The methods used in the sample are:
start – Start the speech recognition service listening to incoming audio
stop – Stop the speech recognition service from listening to incoming audio
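One way to wire these two methods to a single button is to track whether the service is currently listening. The sketch below is an alternative to comparing the button label and is not the sample's own approach; createToggle is a hypothetical name, and it assumes the service's onstart and onend events are used to keep the state in sync.

```javascript
// Hypothetical helper: returns a function that starts or stops recognition
// depending on whether the service is currently listening.
function createToggle(recognition, button) {
  var listening = false;
  // Update state and label from the service's own start/end events.
  recognition.onstart = function () {
    listening = true;
    button.value = 'Stop Speaking';
  };
  recognition.onend = function () {
    listening = false;
    button.value = 'Start Speaking';
  };
  return function toggle() {
    if (listening) {
      recognition.stop();
    } else {
      recognition.start();
    }
  };
}

// In a browser page:
// var record = createToggle(recognition, document.getElementById('micBtn'));
```

Because the label follows the events, it also resets when the service ends on its own, for example after a period of silence.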
More details on the SpeechRecognition interface can be found here.
Script File - speechrecognizer.js
The following code snippet shows how the speech recognition service is integrated into a SharePoint portal.
var userVoice = document.getElementById('userVoice');
// Use the standard constructor, or the webkit-prefixed one where only that exists
var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
var recognition = new SpeechRecognition();
// voice language - en-US for English (use ta-IN for Tamil)
recognition.lang = 'en-US';
// num of alternative text content for the spoken audio
recognition.maxAlternatives = 1;
// Detects speech/audio
recognition.onspeechstart = function() {
    console.log('Speech has been detected. Results will be displayed once the service stops detecting audio/speech');
};
// Shows back result as equivalent text from the service
recognition.onresult = function(e) {
    console.log('Result has been detected.');
    // Latest result, then its top-ranked alternative
    var last = e.results.length - 1;
    var text = e.results[last][0].transcript;
    document.getElementById('userVoice').innerText = text;
};
// Speech/Audio detection stops
recognition.onspeechend = function() {
    document.getElementById('micBtn').value = "Start Speaking";
};
// Start/Stop recording audio (called from the button's onClick)
function record() {
    if (document.getElementById('micBtn').value == "Start Speaking") {
        recognition.start();
        document.getElementById('micBtn').value = "Stop Speaking";
    } else {
        recognition.stop();
        document.getElementById('micBtn').value = "Start Speaking";
    }
}
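The sample does not handle failures such as a denied microphone permission. The SpeechRecognition interface also has an onerror event whose event object carries an error code. The following is a hedged sketch of how those codes could be turned into messages; describeRecognitionError and the message wording are assumptions, not part of the original script.

```javascript
// Hypothetical helper: maps a recognition error code to a readable message.
function describeRecognitionError(code) {
  var messages = {
    'no-speech': 'No speech was detected. Please try again.',
    'audio-capture': 'No microphone could be used for recording.',
    'not-allowed': 'Microphone access was denied.',
    'network': 'A network error interrupted speech recognition.'
  };
  if (Object.prototype.hasOwnProperty.call(messages, code)) {
    return messages[code];
  }
  return 'Speech recognition failed: ' + code;
}

// In a browser page:
// recognition.onerror = function (e) {
//   document.getElementById('userVoice').innerText = describeRecognitionError(e.error);
//   document.getElementById('micBtn').value = 'Start Speaking';
// };
```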
The following snapshot shows the recognized speech on the Office 365 SharePoint portal.
Note: The above logic is applicable to SharePoint 2013, SharePoint 2016, and SharePoint Online.
In my next article, you will see how the Azure Bing Speech API can be integrated into SharePoint.