Blog Source:

[To see links please register here]

GitHub Repo:

[To see links please register here]

Hello and welcome to another tutorial on Java. In this tutorial we'll be creating a voice command application using Java and the Sphinx4 speech recognition library.

If you are new to the term "voice command", there are many real-world apps that serve as examples. If you are an Android user, you have probably used the Google app, where you say "Ok Google" and it listens for your command; if you say something like "open google", it automatically launches Chrome and opens Google.com.

Now, when you speak into your mic, the computer cannot understand what you are saying on its own, so we'll give it the ability to recognize the words we say and convert them into a form it can understand; hence the term Speech Recognition. You might be wondering how in the world we are going to do that. Well, we don't have to worry, because we have a library called Sphinx4 that does all the complex work for us; we only have to call certain methods to create our voice command app.

Approach

So what is it that our app is going to do?

In short, it listens to the microphone, lets Sphinx4 recognize the spoken words, compares the recognized text against our list of commands, and runs the matching command.

That is our basic approach to creating the voice command application.

Requirements

For this app, you'll require the following:

1) [To see links please register here]
2) [To see links please register here] (download the latest Alpha 5 version). Go to [To see links please register here] and download the Alpha 5 sphinx4-core.jar and sphinx4-data.jar files.
3) [To see links please register here]
4) A good-quality microphone.

About Models

There are basically three models required for speech recognition in Sphinx4: the Acoustic Model, the Language Model and the Phonetic Dictionary.

The sphinx4-data.jar comes with the English Acoustic Model by default, so we will be using that. If you are using another language, you'll have to download its acoustic model separately.

Since we are creating a voice command app, we'll be creating our own Language Model and Phonetic Dictionary, because our vocabulary will be limited to our commands only. Now let's create the files we need.

Creating Language Model & Dictionary

As said above, our vocabulary is limited, so making the model and dictionary will be a breeze thanks to the Sphinx Online Base Generator. But first we have to make a corpus file (the data from which our Language Model is built) containing the commands for which we will create our Language Model and Dictionary. For this tutorial I'll be choosing 4 commands.
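As a rough illustration, the corpus is just a plain text file with one command per line. The sketch below writes such a file from Java. "open file manager" and "open browser" are the commands named in this tutorial; the other two lines and the class name CorpusWriter are placeholders assumed for illustration, not the original list.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: writes a four-line corpus file, one command per line.
class CorpusWriter {

    // Two commands come from the tutorial text; the last two are assumed placeholders.
    static List<String> corpusLines() {
        return Arrays.asList(
                "open file manager",
                "open browser",
                "close file manager", // assumption, not from the original post
                "close browser");     // assumption, not from the original post
    }

    // Writes the commands to the given file as UTF-8 text and returns the path.
    static Path writeCorpus(Path target) throws IOException {
        return Files.write(target, corpusLines(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path written = writeCorpus(Paths.get("corpus.txt"));
        System.out.println("Wrote " + corpusLines().size() + " commands to " + written.toAbsolutePath());
    }
}
```

Typing the same lines by hand in any text editor works just as well; the point is only that the file contains nothing but the commands themselves.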

Now type these commands into a text file and save it. Then navigate to the Sphinx Online Base Generator, click Choose File and select your corpus text file. In response, the site will give you a list of files. For now we are only interested in the files ending with the .dic and .lm extensions, so download them and place them in your project folder.

Importing Jar Files

We'll create a new Java project in NetBeans and then import the jar files that Sphinx4 requires. Once you have created your project, go to

Run > Set Project Configuration > Customize > Libraries > Add JAR/Folder

Now select the 2 jar files you downloaded earlier, sphinx4-core.jar and sphinx4-data.jar.

Press OK and you are all set. Now let's get to the coding part.

Coding the Application

Now that we are done creating and importing the required files, we have to create a Configuration object and pass it to the recognizer so that it can make use of them. Create a new class called voiceLauncher in the project, which will serve as our main class.

Replace PATH_TO_YOUR_.DIC_FILE with the path to the .dic file and PATH_TO_YOUR_.LM_FILE with the path to the .lm file you downloaded from the Sphinx Online Base Generator in the Creating Language Model & Dictionary section.

The configuration object is now set and we need to pass it to the recognizer. We also need the recognizer to use our microphone as its input source. Thankfully, the latest version (Alpha 5) makes this really easy: we just have to create a LiveSpeechRecognizer object, pass in the configuration and call the startRecognition method.

Now that the recognition process has started, the recognizer will capture your speech whenever you speak into the mic and process it. For the voice command app we need to know which command the user gave, so we need the recognizer to report what it recognized from the speech.

For that we will use the getHypothesis() method of the SpeechResult object. Using a while loop, we can keep fetching each utterance the recognizer picks up from the user.
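The three steps described so far (configuration, recognizer, result loop) might fit together roughly as in the sketch below. It is an outline and not the original code from this post. It assumes the sphinx4-core and sphinx4-data jars are on the classpath and that the default English acoustic model sits at the resource path shown; the class name is capitalized here per Java convention, and the two PATH_TO_YOUR_... strings are placeholders to replace with your own files.

```java
import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.LiveSpeechRecognizer;
import edu.cmu.sphinx.api.SpeechResult;

import java.io.IOException;

class VoiceLauncher {

    public static void main(String[] args) throws IOException {
        Configuration configuration = new Configuration();
        // Default English acoustic model shipped in sphinx4-data.jar (assumed resource path).
        configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
        // Placeholders: point these at the .dic and .lm files you downloaded.
        configuration.setDictionaryPath("PATH_TO_YOUR_.DIC_FILE");
        configuration.setLanguageModelPath("PATH_TO_YOUR_.LM_FILE");

        // LiveSpeechRecognizer reads audio from the default microphone.
        LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(configuration);
        // Passing true discards any previously cached microphone data.
        recognizer.startRecognition(true);

        SpeechResult result;
        // getResult() blocks until an utterance has been recognized.
        while ((result = recognizer.getResult()) != null) {
            String command = result.getHypothesis();
            System.out.println("Recognized: " + command);
        }
        recognizer.stopRecognition();
    }
}
```

Printing the hypothesis first is a convenient way to check that the microphone and models work before any commands are wired up.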

The command variable stores the recognized speech (the command that you speak) as a string, so we can check whether it matches any command in our list and then execute it. We will be using if conditions, but you can also use a switch statement.
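A minimal sketch of that comparison, kept free of Sphinx4 so it can be tried on its own, could look like this. The class name CommandResponder, the "Browser Opened" message and the fallback message are assumptions for illustration; only "File Manager Opened" comes from this tutorial.

```java
// Hypothetical helper that maps a recognized command to a status message.
class CommandResponder {

    // Returns the message to print for a recognized command.
    static String respond(String command) {
        if (command.equalsIgnoreCase("open file manager")) {
            return "File Manager Opened";
        } else if (command.equalsIgnoreCase("open browser")) {
            return "Browser Opened"; // assumed message, not from the original post
        } else {
            return "Unknown command: " + command;
        }
    }

    public static void main(String[] args) {
        // In the real app the argument would be result.getHypothesis().
        System.out.println(respond("open file manager"));
        System.out.println(respond("play music"));
    }
}
```

Keeping the matching in its own method means it can be tested with plain strings before the microphone is involved.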

Since the recognized speech is stored in our command variable, we can now compare it using ordinary String comparison. Now run the code and speak one of the 4 commands into your mic. If you say "open file manager", it should print "File Manager Opened".

After your testing is complete, it's time to add real commands, like the ones that'll open the file manager when you speak the "open file manager" command. We will store the command in a variable and then use Java's Process API to execute it.
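One possible shape for this step, using the standard ProcessBuilder class, is sketched below. The executable names "nautilus" and "firefox" are assumptions for a Linux desktop, and CommandRunner is a hypothetical class name; substitute whatever programs exist on your system.

```java
import java.io.IOException;

// Hypothetical launcher: turns a recognized command into an OS process.
class CommandRunner {

    // Returns a not-yet-started ProcessBuilder for the command, or null if it is unknown.
    // "nautilus" and "firefox" are assumed Linux executables.
    static ProcessBuilder processFor(String command) {
        String executable;
        if (command.equals("open file manager")) {
            executable = "nautilus";
        } else if (command.equals("open browser")) {
            executable = "firefox";
        } else {
            return null;
        }
        return new ProcessBuilder(executable);
    }

    public static void main(String[] args) {
        ProcessBuilder builder = processFor("open file manager");
        if (builder != null) {
            try {
                builder.start(); // launches the program
            } catch (IOException e) {
                // Thrown when the executable is missing or cannot be started.
                System.out.println("Could not launch: " + e.getMessage());
            }
        }
    }
}
```

Building the process separately from starting it makes it easy to check which program a command maps to without actually launching anything.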

Run this code and speak the "open file manager" command into your mic; it should open the file manager. Test all the other commands as well.

Adding more commands

In order to add more commands, just add them to your existing corpus.txt file and then repeat the steps from Creating Language Model & Dictionary.

If-Else Spaghetti

There might come a time when you have a lot of commands in your program, and putting them all in if-else branches would be an absolute mess. So what to do? The best thing would be to load all the commands from the corpus text file into a Hashtable (or HashMap) and map each speech command to its respective executable command. I'll add updated code to the GitHub repo of this tutorial in case you need it.
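Until then, here is a small sketch of the lookup-table idea, not the code from the repo. Since the corpus file only holds the spoken commands, it assumes a separate mapping in which each line has the form "spoken command=executable"; that line format, the class name CommandTable and the executable names are all hypothetical. It uses a HashMap, which serves the same purpose as a Hashtable here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical lookup table from spoken command to executable.
class CommandTable {

    // Parses lines of the assumed form "spoken command=executable" into a map.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> table = new HashMap<>();
        for (String line : lines) {
            int separator = line.indexOf('=');
            if (separator <= 0) {
                continue; // skip blank or malformed lines
            }
            String spoken = line.substring(0, separator).trim().toLowerCase();
            String executable = line.substring(separator + 1).trim();
            table.put(spoken, executable);
        }
        return table;
    }

    public static void main(String[] args) {
        // The lines could equally be read from a file with Files.readAllLines.
        Map<String, String> table = parse(Arrays.asList(
                "open file manager=nautilus",
                "open browser=firefox"));
        String executable = table.get("open browser");
        System.out.println(executable != null ? "Would run: " + executable : "Unknown command");
    }
}
```

With this in place, adding a command means adding one line to the mapping rather than another else-if branch.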

That's it for this tutorial, have fun with your voice command app :smile:
Regards,
Ex094

Very nice and detailed share, it was an interesting read for sure. Thank you for sharing this.

Detailed is somewhat of an understatement.

You've obviously put a lot of effort into documenting this tutorial.
As per usual, excellent work @"Ex094".

excellent work my teacher

Thanks for the share, will defo be using this tut when I get the time.
