Yeah, those are really good! Especially the timing on some of her tweets. Like, she actually gets the timing for the punchline right. You can just keep on training and write down somewhere that you like the quality at this iteration. That way you have a checkpoint you like, but if you train it a bit more, it will probably sound a bit more crisp.
Ok, I'll probably kick the training back off tonight, but I've gotta say, it's pretty dang near perfect already. For most paragraphs, if I submit the exact same text a second time, it comes out perfectly. The first render always seems to be read really fast, and the second render comes out just right. I tried to find some good samples of text that would cover a wide variety of scenarios. First I tried using a Wikipedia entry about her, so that's there. Then I thought, "why not just use a bunch of her tweets?" So I found a Variety article with a bunch of their 'favorite tweets' of hers, and I had the AI read out her tweets. Pretty freaking perfect, listen for yourself. Attached.
By the way, before you continue training, you can also try some custom HiFi-GAN training (that is, train your own vocoder). This works the same way. When you start or continue training your model, click on "Train Custome one", select the base vocoder files, and the vocoder will start training. Vocoder training goes pretty fast: in half a day to a day you will probably have the vocoder trained. The effect for me was minimal, but it does make it sound more crisp.
Question: when you said that the score was wobbling around 67-68, did you mean the validation score? Because that's an insanely high validation score. Mine are between 0.3 and 1.0 for the voices I train.