Hi,
I haven't read the whole thread, but some of it struck me as overkill, so here's how I would do it, in case it helps anybody here get there faster and simpler.
requirements
- install freeware xampp or any other local web server. Web browsers in vam will point to something like 127.0.0.1/vambot/*
- install freeware balabolka for TTS. It has a very useful balcon.exe tool that reads text from the command line with tons of options (pitch, voice, rate etc).
v0.1: web bare-bones (estimate 1-2h)
- create a php page like vambot/talk.php that reads a text input from a form, e.g. 127.0.0.1/vambot/talk.php
- in the script add a simple function like
function processMessage($message) { return "Received the following message: " . $message; }
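- in the script run a command like
exec('balcon.exe -t ' . escapeshellarg(processMessage($input)));
(escapeshellarg() keeps spaces and quotes in the text from breaking the command line). To prevent the page from waiting for balabolka to finish, on Linux you would use something like exec("nohup [command here] > /dev/null 2>&1 &"); on Windows, where balcon.exe runs, probably something like pclose(popen('start /B [command here]', 'r'));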
goal: if you go to 127.0.0.1/vambot/talk.php in a normal browser and type abc, you should hear "Received the following message: abc" etc
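Here is a rough sketch of what v0.1 could look like in one file. It assumes balcon.exe is on the PATH of the web server process and that the form field is called "message"; both are my assumptions, and renderForm() and buildSpeakCommand() are just helper names I made up.

```php
<?php
// talk.php - minimal sketch of v0.1. Field name "message" is hypothetical.

function processMessage(string $message): string
{
    return "Received the following message: " . $message;
}

// Build the balcon command line. escapeshellarg() quotes the text so that
// spaces and special characters are passed as one argument.
function buildSpeakCommand(string $text): string
{
    return 'balcon.exe -t ' . escapeshellarg($text);
}

// The form with a single text input.
function renderForm(): string
{
    return '<form method="post">'
         . '<input type="text" name="message" autofocus>'
         . '<button type="submit">Send</button>'
         . '</form>';
}

// Only handle a request when served by the web server, not from the CLI.
if (PHP_SAPI !== 'cli') {
    if (isset($_POST['message']) && $_POST['message'] !== '') {
        exec(buildSpeakCommand(processMessage($_POST['message'])));
    }
    echo renderForm();
}
```

As written the page waits until balcon has finished speaking; that is fine for a first test and can be made non-blocking later.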
v0.2: vam bare-bones commands (estimate 1-2h)
- add a few buttons to your page like "Pick a random number", "Say a joke", etc. Each one submits the form with the same input field filled in based on the button, e.g. "pick a number"
- in processMessage($message) add stuff like
if($message=="pick a number") return "I pick:".rand(0,100);
- in vam add a web panel that goes to 127.0.0.1/vambot/talk.php
- when you click a button you should hear the audio you set for that message
goal: simple vam command interaction/mini-game scene
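A possible shape for the v0.2 command handling, as a sketch only. The command table, the joke text and renderButtons() are placeholders I made up; I also lower-case and trim the input so that the button label and the command do not have to match exactly.

```php
<?php
// Sketch of v0.2: map known commands to replies, fall back to an echo.
function processMessage(string $message): string
{
    // Hypothetical command table: normalized command => reply generator.
    $commands = [
        'pick a number' => function (): string {
            return 'I pick: ' . rand(0, 100);
        },
        'say a joke' => function (): string {
            return 'Why did the chicken cross the road? To get to the other side.';
        },
    ];

    $key = strtolower(trim($message));
    if (isset($commands[$key])) {
        return $commands[$key]();
    }
    return 'Received the following message: ' . $message;
}

// One submit button per command; each posts its label as "message".
function renderButtons(array $labels): string
{
    $html = '<form method="post">';
    foreach ($labels as $label) {
        $safe = htmlspecialchars($label);
        $html .= '<button type="submit" name="message" value="' . $safe . '">'
               . $safe . '</button>';
    }
    return $html . '</form>';
}
```

Adding a new command is then one more entry in the table plus one more label passed to renderButtons().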
v1: vam chat (1h-infinity)
- the bot will work by typing the text into the page's input field. To dictate, at first you basically need to have the input field in the browser focused.
- you can format the page with css/js to display a message like "Listening..." when focused
- I haven't played with speech recognition, but any dictation software should work, even Microsoft's default one (e.g. the "Start listening" command)
- once you have this set up, in php you can update processMessage() to do anything you can imagine. I would do it from the ground up; all the bots suck anyway imo. You can start the way most text games did it: "[[Hmm... | Ok. | Sure. | No problem. |That's easy.| Numbers...| ]] [[The number I | I | What I]] [[choose|pick|select|want|like]] is $number", and process it in php to find the strings enclosed by [[ and ]], split them by "|" and pick a random value. Very quick and simple to do and you get lots of variety: 7 x 3 x 5 = 105 different messages just from that. Imo that's more than enough for VAM and even for assistant uses. You can also easily integrate an actual chatbot with AIML (e.g. https://github.com/Program-O/Program-O) or external services, but to me that's overkill.
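The [[a|b|c]] idea can be sketched in a few lines with preg_replace_callback(). The function names expandTemplate() and countVariants() are made up for this example, and collapsing the leftover whitespace is my own addition so that the empty option does not leave double spaces.

```php
<?php
// Replace every [[a|b|c]] group with one randomly chosen option.
function expandTemplate(string $template): string
{
    $expanded = preg_replace_callback(
        '/\[\[(.*?)\]\]/',
        function (array $match): string {
            $options = array_map('trim', explode('|', $match[1]));
            return $options[array_rand($options)];
        },
        $template
    );
    // Collapse the extra spaces left behind when an empty option is picked.
    return trim(preg_replace('/\s+/', ' ', $expanded));
}

// Count how many distinct combinations a template can produce.
function countVariants(string $template): int
{
    preg_match_all('/\[\[(.*?)\]\]/', $template, $matches);
    $count = 1;
    foreach ($matches[1] as $group) {
        $count *= count(explode('|', $group));
    }
    return $count;
}

// Single quotes keep $number literal; substitute it after expanding.
$template = '[[Hmm... | Ok. | Sure. | No problem. |That\'s easy.| Numbers...| ]] '
          . '[[The number I | I | What I]] [[choose|pick|select|want|like]] is $number';
```

To use it, something like str_replace('$number', (string) rand(0, 100), expandTemplate($template)) gives the final sentence to hand to the TTS step.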
v2: sexy lipsync magic (4-10h?)
- use this plugin: https://hub.virtamate.com/resources/realtime-lipsync.1286/
- update your 127.0.0.1/vambot/talk.php script to export the speech to an audio file instead of playing it (see the balcon options at http://www.cross-plus-a.com/bconsole.htm)
- always save the file under the same name in a vam folder, e.g. "/Audio/vambot/latest_response.mp3"
- add a vam button to the scene, "Talk!", that loads that audio and plays it when pressed. This might be a bit tricky but can be done
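For the export step, a sketch of how the command could be built in php. I am assuming balcon's -w option here (write the speech to a WAV file, as far as I understand the bconsole page), so check it against your balcon version; if you need mp3 you would have to add a conversion step. The function names and the output path are placeholders.

```php
<?php
// Sketch for v2: have balcon write the speech to a file instead of playing it.
// Assumption: "-w <file>" makes balcon write a WAV file; verify this on the
// bconsole page for your version.
function buildExportCommand(string $text, string $outputFile): string
{
    return 'balcon.exe -t ' . escapeshellarg($text)
         . ' -w ' . escapeshellarg($outputFile);
}

// Run the export and report whether the audio file exists afterwards.
function exportResponse(string $text, string $outputFile): bool
{
    exec(buildExportCommand($text, $outputFile), $output, $exitCode);
    return $exitCode === 0 && is_file($outputFile);
}
```

In talk.php you would then call exportResponse() with the processMessage() result and the fixed path inside your vam folder, so the file is overwritten on every reply.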
v3: improve flow
- to hide the talk button and play the response file automatically, you can use a vam script that checks for a flag (e.g. a timestamp file) and, when the flag changes, automatically does what the button did
- a vam button/UI like "TALK" that, when clicked, uses a script to automatically focus the web panel
- maybe use https://hub.virtamate.com/resources/speechrecognition.6865/ to get the focus, e.g. a "Hey VAM" command that uses this script to switch the focus to the web panel, and from there it's dictation mode
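The php half of the timestamp flag could look roughly like this. It is only a sketch: the function names and the flag file are made up, and the vam-side script that polls the flag is not shown. hasNewResponse() is just the same check written in php to make the logic clear.

```php
<?php
// Sketch for v3: after the audio file is written, drop a small flag file
// containing the current timestamp. A vam-side script (not shown) polls it.
function writeReadyFlag(string $flagFile, ?int $now = null): int
{
    $timestamp = $now ?? time();
    file_put_contents($flagFile, (string) $timestamp, LOCK_EX);
    return $timestamp;
}

// The check the poller performs: is the flag newer than the last one played?
function hasNewResponse(string $flagFile, int $lastPlayed): bool
{
    if (!is_file($flagFile)) {
        return false;
    }
    return (int) file_get_contents($flagFile) > $lastPlayed;
}
```

Write the flag only after the audio export has finished, so the poller never picks up a half-written file.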