Pipeline for high quality viseme animation (question)

RMM · New member · Joined Dec 19, 2020 · Messages: 10 · Reactions: 5
One of the things I've found most interesting these last couple of years is how accessible AI models are becoming and how helpful they already are at generating immersive, custom voice audio. There are a few text-to-speech models out there that give surprisingly good baseline results, and also a few applications that can brilliantly change your own voice acting into a variety of same- or opposite-gender characters. Properly embedded, this kind of thing can really shift the immersion level in VaM (or anything, really).

I'm not a 3D dev/artist, but I'm trying to work out a fast, high-quality development path for matching viseme animation rigs to these audio clips. I've read from meshed that he's well ahead of the curve on this type of thing with 2.x, where these kinds of links and imports can be managed by all kinds of clever external models (ones that haven't been invented yet). But since I don't know when the beta for that kind of thing will be available, I'm playing around with it in 1.2x.

Right now I'm using an AI portal to generate videos of my characters speaking, with my custom audio as the soundtrack. D-ID's studio has an impressive service that does this, including the ability to upload a photo of your VaM character, so the result is useful and the animation is generally a very good baseline. I then go into Acid Bubbles' Timeline manually and build the animations using viseme/phoneme morphs. Since Timeline is so versatile, and has the clever ability to keep head audio timing and frames in sync regardless of machine performance, this actually produces a shockingly good result.
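To cut down on the manual keyframing step, the phoneme timings could be generated automatically from the audio (e.g. with a forced aligner such as the Montreal Forced Aligner or gentle) and then mapped onto viseme morphs. Here's a minimal sketch of that mapping step; the phoneme-to-viseme table and the morph names are illustrative assumptions, not VaM's actual morph set, and the aligned `(phoneme, start, end)` tuples are assumed to come from whatever aligner you use:

```python
# Sketch: convert aligned phoneme timings into viseme morph keyframes.
# Morph names below are placeholders -- swap in your character's actual
# viseme/phoneme morph names before importing anything into Timeline.

PHONEME_TO_VISEME = {
    "AA": "Mouth Open", "AE": "Mouth Open", "AH": "Mouth Open",
    "B": "Lips Closed", "M": "Lips Closed", "P": "Lips Closed",
    "F": "Lip Bite", "V": "Lip Bite",
    "OW": "Lips Pucker", "UW": "Lips Pucker",
    "S": "Mouth Narrow", "Z": "Mouth Narrow",
}

def viseme_keyframes(phonemes, peak=1.0):
    """Turn (phoneme, start, end) tuples into (time, morph, value) keyframes.

    Each phoneme ramps its viseme morph from 0 up to `peak` at the
    midpoint and back to 0 at the end -- a crude but serviceable envelope
    you could then hand-tune inside Timeline.
    """
    keys = []
    for ph, start, end in phonemes:
        morph = PHONEME_TO_VISEME.get(ph)
        if morph is None:
            continue  # unmapped phonemes (silence, etc.) get no keyframe
        mid = (start + end) / 2
        keys += [(start, morph, 0.0), (mid, morph, peak), (end, morph, 0.0)]
    return sorted(keys)

# Example: the word "map" aligned as M-AE-P
frames = viseme_keyframes([("M", 0.00, 0.08), ("AE", 0.08, 0.22), ("P", 0.22, 0.30)])
```

Getting the keyframes *into* Timeline would still require either its import format or a small companion plugin, but the timing data itself is straightforward to produce this way.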

All that said... it's really, really slow doing it that way. As I mentioned, I'm not a keen developer, so I wanted to ask the community for suggestions about how to take this further. The goal would be to take these well-centered, straight-on 2D videos of the avatars, generate animation rigs from them, and then port the facial animations directly into VaM. I know we have lfe's facial motion capture plugin for the iPhone, so maybe this idea of mine isn't too far off.

Does anyone know of a service that can do something similar to this, or have any suggestions about how to ramp up throughput on this kind of thing? I'm interested in eventually producing hundreds of lines of dialogue.
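At hundreds of lines, the bookkeeping alone (which audio clip pairs with which animation) becomes a bottleneck, so a small batch driver helps. This is a sketch under assumed conventions: the `audio/` and `anim/` folder layout and the `.wav`/`.json` pairing are just one way you might organize output for a Timeline-based workflow, not anything VaM prescribes:

```python
import re

def slugify(text, max_len=40):
    """Filesystem-safe name derived from a dialogue line."""
    slug = re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")
    return slug[:max_len]

def build_manifest(lines):
    """Plan one audio clip + one animation file per dialogue line.

    Feed each entry's text to your TTS tool to produce `audio`, and
    your viseme-keyframe step to produce `animation`; the manifest
    keeps the pairs matched when you're processing hundreds of lines.
    """
    return [
        {"id": i, "text": line,
         "audio": f"audio/{i:03d}_{slugify(line)}.wav",
         "animation": f"anim/{i:03d}_{slugify(line)}.json"}
        for i, line in enumerate(lines)
    ]

manifest = build_manifest(["Hello there!", "How are you today?"])
```

The numbered prefix keeps files sorted in dialogue order even when two lines would slugify to similar names.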
 
Do we not have a plugin that will generate audio from text? Something you could feed custom string/text info into? I thought we had this? I remember seeing a chatbot-type scene a few weeks ago but I didn't try it. I presumed it did this already (received a text line and fed it to the model).
 