Voice Model (Text-To-Speech, Neural Network based)

Thanks to some good feedback from @JimjackS0N, I have updated the instructions. This should avoid some of the installation problems that some people experienced.
Hi everyone,

It took a long time again, but I have been cleaning up the source audio with some custom scripts I wrote. Long story short, I fixed the "pitch" of the source material so that it is all in the same range, and removed the "breath" noises caused by inhaling before speaking. I then retrained the model on this improved source data, which improved its quality a lot.

Improvements:
  • Less raspy, cleaner audio
  • No more sudden pitch drops (which made the voice sound male)
  • Putting two sentences in one prompt (separated by a period) works a lot better (the old model would generate garbage if you did that)
Installation instructions:
  • See the original instructions
  • Download the "VAM Voice Model v2.zip" from Mega and unzip it in the "models" directory. You should end up with YOUR_PATH/data/models/VAM Voice Model v2.0/checkpoint_510426
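
If you want to double-check that the unzip landed in the right place, here is a small sketch in Python. It is not part of the app; the function name check_model_install is made up for illustration, and it only assumes the folder layout given in the instructions above (it does not care whether checkpoint_510426 is a file or a folder).

```python
from pathlib import Path


def check_model_install(base_path: str) -> bool:
    """Return True if the v2 checkpoint exists at the path from the instructions.

    base_path is whatever YOUR_PATH is on your machine (hypothetical helper,
    not something shipped with the app).
    """
    expected = (
        Path(base_path)
        / "data"
        / "models"
        / "VAM Voice Model v2.0"
        / "checkpoint_510426"
    )
    # exists() is true for both files and directories, so this works either way.
    return expected.exists()
```

Call it as check_model_install("YOUR_PATH"). If it returns False, the most likely cause is that the zip was extracted one folder too high or too deep.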