Plugins - SpeechRecognition

Recognize simple keyword lists as well as more complex grammars using Windows 10 and Unity's speech recognition features. When I started this project, I assumed I could wrap the simple-looking API in a plugin and build a simple demo scene in just a few hours. However, building a reasonable plugin UI, implementing error handling and the necessary performance features, and working around VaM quirks with file dependencies took a bit longer.

(Watch video with sound!)

Features

This is not "AI"; it only understands the phrases you teach it. It simply triggers whatever you hook up to it when a phrase is recognized.
Keyword Phrase: Simple list of keyword variants. Works great for things like "Yes", "Yeah!", "Yep", ... or "No", "Nope", ...
Grammar Phrase: Provide an XML file that defines a grammar structure. This helps the system make more sense of what you are saying, lets you provide synonyms for individual words, etc. I recommend the Microsoft documentation for more details on what is possible.
The plugin allows you to listen for multiple phrases simultaneously. Each has its own trigger actions. To provide context, you can enable/disable phrases via triggers depending on what makes sense at that point in the scene.
The UI should be familiar if you have used LogicBricks before.
Simple Demo scene from the video.

Examples
Some sentences that should be recognized by the demo scene's grammar definitions:

"launch the terminator robots"
"deploy killer androids"
"release the mutant penguins"
"retrieve the sharks"
"recall ninja sharks"

In this case I stuck to simple sentences structured as "action - fluff - object". Of course, you could define variation not just in words but also in sentence structure.
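
To illustrate the "action - fluff - object" idea, here is a minimal sketch of how such a grammar might look in the W3C SRGS XML format that Windows speech recognition understands. This is NOT the actual grammar file from the demo scene; the rule names ("command", "action", "object") and the exact structure are assumptions made up for this example, and only the words are taken from the sentences above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Illustrative sketch only, not the demo scene's real grammar file. -->
<grammar version="1.0" xml:lang="en-US" root="command"
         xmlns="http://www.w3.org/2001/06/grammar">

  <!-- Top-level sentence: action, optional fluff, object. -->
  <rule id="command" scope="public">
    <ruleref uri="#action"/>
    <!-- "the" may be spoken or left out -->
    <item repeat="0-1">the</item>
    <ruleref uri="#object"/>
  </rule>

  <!-- Synonyms for the action part of the sentence. -->
  <rule id="action">
    <one-of>
      <item>launch</item>
      <item>deploy</item>
      <item>release</item>
      <item>retrieve</item>
      <item>recall</item>
    </one-of>
  </rule>

  <!-- Objects, each with an optional adjective in front. -->
  <rule id="object">
    <one-of>
      <item><item repeat="0-1">terminator</item> robots</item>
      <item><item repeat="0-1">killer</item> androids</item>
      <item><item repeat="0-1">mutant</item> penguins</item>
      <item><item repeat="0-1">ninja</item> sharks</item>
    </one-of>
  </rule>
</grammar>
```

A grammar written like this would also accept combinations that are not in the list above, such as "deploy the ninja sharks", since any action can be paired with any object. If you want different reactions for sending things out versus calling them back, you would probably split this into separate phrases with their own triggers.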

Notes / FAQ

This plugin is using your system default microphone, not your microphone setting in Oculus or SteamVR software! Make sure your mic is enabled and has a reasonable volume setting.
While SpeechRecognition works with other languages, to run the demo you need to go to your Windows 10 speech settings and make sure they are set to English (US). You may have to install the voice package as well. Note that other English variants, like UK or Australian, won't work. It has to be US! This is a different setting than your regular Windows region/language setting. You can try enabling "Recognize non-native accents", but at least for me it actually reduced recognition reliability. See screenshot for details:
WHY on earth does this use a ".json" file extension for ".xml" files??! Because....VaM. When creating a VAR package this allows dependencies to be handled automatically, making sure your scene does not break. As the list of extensions properly supported by VaM is fixed, ".json" is simply the least terrible choice.
If your XML files refer to other XML files using the ruleref tag, you can/should add a dependency hint for each file needed, so VaM can find them. For each file add a comment line like this:

Of course Windows 10 Speech Recognition does not know what a VAR package is or how to get files out of it. Therefore, when the plugin encounters XML files (".json") that are located inside a VAR package, all ".json" files in that package are extracted to "Custom\PluginData\SpeechRecognition" to make them accessible. Sadly, that includes other files like scenes that don't really need to be extracted. If you have lots of those, you might want to put your XML files into a separate VAR package to avoid extracting more than needed.
The problem with speech recognition is that a creator will have a hard time thinking of all the possible things a player could say and setting up recognition and reactions for those. There is simply no way to implement everything. Usually you will have to give the user some kind of hint about what types of actions can be done.

Credits for EvilCorpHQ scene

Speech synthesis for assistant voice by Amazon Polly (via ttsmp3.com)

Dependencies for EvilCorpHQ scene

All the needed dependencies can be found for free on the Hub, just use VaM's handy "Scan Hub For Missing Packages" function.

License

This was an EarlyAccess release! Download is now available for free under CC BY-SA license. You are allowed to reference this package in your own VAR packages, even if they are paid or use a different license. Links to my Patreon are always appreciated.


Latest updates

Version 3 (free)

Version 2 (free)

Version 1 (free) (forgot the demo scene)
