
Suggestions for an AI-driven plugin

UnknownPeasant

Hi! I'm a long-time lurker but have finally decided to make some contributions.
I have been working on an AI-driven interaction plugin and am curious about what features the community would like to have in such a plugin.

These are the current features and requirements that I've been working towards (and that are now mostly working):

Requirements:
* 100% locally running
* Speech recognition
* Models can talk back
* Interaction with a local Ollama model
* Support for mid-range systems, AMD and NVIDIA
* Easy to set up
* Locally running server with the option to distribute processing to other devices (e.g. other computers on the network)
* Less than 2 second response times
* Ability to send commands that interact with VaM
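To give a feel for what "100% locally running" means in practice, here is a minimal sketch of the STT -> LLM -> TTS round trip, assuming whisper for speech recognition, Ollama's /api/generate endpoint for the LLM and pyttsx3 for the voice. This is illustrative only, not the actual plugin code, and the model names are just examples:

Code:
# Rough illustration of a local STT -> LLM -> TTS loop (not the actual plugin code).
# Assumes openai-whisper, requests and pyttsx3 are installed and Ollama runs locally.
import time
import requests
import whisper   # local speech-to-text
import pyttsx3   # simple offline text-to-speech

stt = whisper.load_model("base")   # a small model keeps latency low on mid-range GPUs
tts = pyttsx3.init()

def respond(wav_path: str) -> str:
    start = time.time()

    # 1) speech -> text
    text = stt.transcribe(wav_path)["text"]

    # 2) text -> local LLM (Ollama must be running on this machine)
    reply = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": text, "stream": False},
        timeout=30,
    ).json()["response"]

    # 3) text -> speech
    tts.say(reply)
    tts.runAndWait()

    print(f"round trip: {time.time() - start:.1f}s")   # target: < 2 s
    return reply
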

Implemented features:
* Speech-to-text integration (100%)
* Speech processing and filters (100%)
* Local LLM integration (Ollama, on the server) (100%)
* Text-to-speech integration (100%)
* Sending commands as 32-bit ints (100%) (see the sketch after this list)
* Accessing the built-in animations for a person (play anim/stop anim) (100%)
* Accessing available clothing (~75%)
* Accessing morphs (100%)
* GUI in VaM (~75%)
* Server running locally (100%)
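The idea behind the 32-bit command channel is simply packing a command id and a small parameter into one integer. A rough illustration; the real bit layout is an implementation detail and may differ:

Code:
# Illustrative packing of a command id + parameter into one 32-bit int.
# The real plugin's layout may differ; this just shows the idea.
CMD_CLOTHES_ON  = 1
CMD_CLOTHES_OFF = 2
CMD_PLAY_ANIM   = 3

def pack_command(cmd_id: int, param: int = 0) -> int:
    # high 16 bits = command id, low 16 bits = parameter (e.g. animation index)
    return ((cmd_id & 0xFFFF) << 16) | (param & 0xFFFF)

def unpack_command(value: int) -> tuple[int, int]:
    return (value >> 16) & 0xFFFF, value & 0xFFFF

# Example: ask VaM to play animation slot 4
packed = pack_command(CMD_PLAY_ANIM, 4)
assert unpack_command(packed) == (CMD_PLAY_ANIM, 4)
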

Left to implement:
* GUI in VaM needs some love
* Server GUI, or at least console commands
* Server options (set LLM paths / set TTS engine / more)
* Dynamic commands set from the VaM GUI to interact with other plugins/triggers


What's currently working:
The user can have a conversation with a VaM character and gets voice responses with basic lip sync (amplitude-based at the moment). Responses take between 0.5 and 3 seconds, and this delay actually doesn't feel that bad. The user can ask the model to do things, and the LLM will, if it decides to, perform those actions. The LLM currently has access to the following actions: clothes off, clothes on, play a specific animation depending on mood/requests, and talk back and reason with the user.
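The "if it decides to" part is handled by constraining the LLM: the server tells it which actions exist and asks for a reply plus at most one action name. A stripped-down version of that idea, assuming Ollama's /api/chat endpoint (not my exact prompt or parsing):

Code:
# Simplified illustration of letting a local LLM pick one of the allowed actions.
# Assumes Ollama's /api/chat endpoint; the real prompt and parsing are more robust.
import json
import requests

ACTIONS = ["clothes_on", "clothes_off", "play_anim_happy", "play_anim_dance", "none"]

SYSTEM = (
    "You are a character in a scene. Reply in JSON with two fields: "
    f"'say' (what you speak out loud) and 'action' (one of {ACTIONS})."
)

def chat(user_text: str) -> dict:
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama3",
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": user_text},
            ],
            "stream": False,
            "format": "json",   # ask Ollama for valid JSON output
        },
        timeout=30,
    ).json()
    data = json.loads(resp["message"]["content"])
    if data.get("action") not in ACTIONS:
        data["action"] = "none"   # never forward an unknown command to VaM
    return data

# e.g. chat("could you take your jacket off?") might return
# {"say": "Sure, one second.", "action": "clothes_off"}
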

What is essentially implemented is a way to talk to an LLM from within VaM and get an AI response back, with possible actions attached.

I do know that there are other solutions available out there for this, but from what I can tell my implementation is more straightforward and runs much better on mid-range systems. My solution is also made to run 100% locally, but it can easily be distributed to other devices on the network
(for example, an old laptop runs the server, a Raspberry Pi runs the TTS engine, another machine runs the LLM, and finally your gaming computer runs VaM; or you just run everything on your gaming computer).
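Distributing the work is mostly a matter of pointing each stage at a different host; the calling code stays the same. Something along these lines (the host names are made up):

Code:
# Example of splitting the pipeline across machines on the LAN (host names are made up).
# Pointing everything at "localhost" instead gives the single-machine setup.
import requests

SERVICES = {
    "stt": "http://old-laptop.local:9000",    # speech-to-text server
    "llm": "http://gpu-box.local:11434",      # Ollama on another machine
    "tts": "http://raspberrypi.local:9001",   # TTS engine on a Pi
}

def llm_generate(prompt: str) -> str:
    # the calling code does not care where the model actually runs
    r = requests.post(
        f"{SERVICES['llm']}/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=60,
    )
    return r.json()["response"]
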

What features would you like to see in this kind of plugin?

I will update this post with videos of it all running when I find the time to do so.
 
That sounds very exciting. Things I would celebrate:

1.) Respond as quickly as possible
2.) Setting options for SFW/NSFW answers
3.) Multilingual support
4.) Emotion in the voice (not a robot voice)
5.) A simple and easy UI
6.) Dialogue storage for further conversations

These are the things I have in mind when I want to connect AI with VaM. I'm really excited to see what you create, and it's great that you're working on it. Good luck.
 
Thanks! The focus is always on fast responses. The SFW/NSFW setting will depend heavily on the selected LLM and its configured context, so this is a user configuration. As of now there is support for 10+ languages, but this also depends heavily on the models used. I will do my best to make it easy for the user to configure.
Emotions are a hard criterion to tackle, as almost no existing models with emotion support also support streaming. One can, however, choose a model with trained voices to make it sound more human.
I'm a designer at heart and will not release an unusable UI.
Storing session data is something I'm planning to add, but it will not be part of my first release.

Thanks for your feedback!
 
Hello!
This sounds promising. Do you connect VaM to Kobold or another LLM engine?
I am also trying to make a similar plugin to connect to KoboldCpp. You can check what other members asked for in the discussion of my plugin: first of all, support for various TTS options, AI vision and advanced roleplay options. Also, connecting to AI Horde, because not everyone has a powerful PC, let alone two powerful PCs.

Function calling via the LLM (like changing clothes or poses via an LLM request) is also an interesting option. But IMHO, to work out of the box, this will need predefined collections supplied with the plugin ("action bundles"), like vaan20 provides.
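To make the "action bundles" idea concrete, it could be as simple as a named mapping from the actions the LLM is allowed to pick to the scene triggers they fire. Purely illustrative, none of these names exist in any plugin yet:

Code:
# Purely illustrative: predefined "action bundles" a plugin could ship with,
# mapping an action name the LLM may pick to concrete scene triggers.
ACTION_BUNDLES = {
    "undress":  ["clothing:remove_all"],
    "sit_down": ["animation:play:sit_idle"],
    "wave":     ["animation:play:wave", "expression:smile"],
}

def resolve(action: str) -> list[str]:
    # unknown actions resolve to nothing, so the LLM cannot trigger arbitrary things
    return ACTION_BUNDLES.get(action, [])
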
 
It would be interesting to see what state your plugin is at, to get a feeling for what could be improved. Also, what hardware do you run all this on?
On ONE machine, if I understand you correctly?
Curious to see what you did here :)
 
I'm only interested in TTS. I started working on one a few months ago and stopped, then started again after I completed voicemodvam, but stopped because I didn't want the headache at this time. My plan was to jack into VaM's speech and thought bubbles at first, and then see if I could get it to work with signs and vamstory afterwards. To keep it lightweight and easy, I was planning on using only Microsoft TTS to power it. Yeah, the voice won't be all that great, but that's why I made VMV. I'm glad someone is giving it a go. Wish you luck, and I can't wait to see it if you really do create it.
 
In my approach I'm setting up an external server (outside of VaM) and making requests from within VaM to this server. VaM is extremely limited in terms of functionality, so an external server is almost a requirement for this to work. My server then pieces everything together and sends a response back to VaM. I connect my server directly to the needed LLM/TTS/STT. In my response to VaM I also send commands that VaM (through my plugin) can act upon. It's a really straightforward design pattern that also removes most of the limitations imposed by VaM's outdated libs.
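As a bare-bones illustration of that pattern (not the real server code; Flask is used here only for brevity): the plugin POSTs the recognised text, the server asks the LLM, and the response carries both the speech text and a command value for VaM to act on.

Code:
# Bare-bones illustration of the external server VaM talks to (not the real code).
# The VaM plugin POSTs the recognised text here and gets speech text + a command back.
from flask import Flask, jsonify, request
import requests

app = Flask(__name__)

@app.post("/chat")
def chat():
    user_text = request.get_json()["text"]

    # hand the text to the local LLM (Ollama assumed here)
    reply = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": user_text, "stream": False},
        timeout=30,
    ).json()["response"]

    # a fuller server would also run TTS and decide on a command;
    # 0 is used here as "no command"
    return jsonify({"text": reply, "command": 0})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8008)
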
 
STT -> LLM -> TTS is already up and running. I'm not limiting myself to the MS service, but rather using some AI models to make it feel better.
 
Yes, I run it all locally. I'm at work at the moment but will post an update ASAP. My PC spec is nothing special, it's really a mid-range gaming rig, and it's on AMD without any CUDA cores.
 
Probably impossible with quality:

I would like AI-like features that work offline and don't need an account anywhere.
 
You are right about the external server. I also just adapted a Python HTTP proxy in the end, because in VaM most networking features are simply prohibited. On the other hand, things are really easy when using a regular Python server.
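For reference, such a proxy does not have to be anything fancy; the standard library is enough to forward VaM's requests to a local backend. A minimal sketch (the backend URL is just an example):

Code:
# Minimal sketch of a forwarding proxy between VaM and an LLM backend (illustrative only).
# Uses only the standard library; the backend URL is an example.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request as urlreq

BACKEND = "http://localhost:5001"   # e.g. a local KoboldCpp instance

class Proxy(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        fwd = urlreq.Request(
            BACKEND + self.path,
            data=body,
            headers={"Content-Type": "application/json"},
        )
        with urlreq.urlopen(fwd) as resp:
            data = resp.read()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), Proxy).serve_forever()
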
 