CPU Performance Patch (Up to 30% faster physics, up to 60% more FPS)


turtlebackgoofy submitted a new resource:

CPU Performance Patch (Up to 30% faster physics, up to 60% more FPS) - A CPU Performance Patch without any downsides

As requested in this thread https://hub.virtamate.com/threads/benchmark-result-discussion.13131/page-37 here is a release of the cpu performance patch.
Please share before and after benchmarks with your settings using this scene: https://hub.virtamate.com/resources/benchmark.11336/
Also share if you encounter any bugs like skin flopping around or other plugins breaking.

Summary:
A native C implementation of CPU intensive functions, offloaded to a dll, which gets...

 
Thank you a lot!!!
I see that the version available for download is "patched9" and not "patched9_morph_clutter_fix3". Is the morph clutter fix included, or has it been put aside for the moment?
I hope you will/can continue to develop on this patch...we need every single bit of performance we can get ♥️ (y)
 
Some of my more intense scenes typically drag my i9 down to 30-32 fps on average. Now it's holding a steady 50+ with multiple pixel lights, detailed hair, and 8x antialiasing!!!

As a less-talented software developer myself, I can tell this was a lot of complex work, so hats off to you. Do you have a litecoin address we can tip you at man? this is dope.
1707185564756.png
 
Hi, amazing work, ty very much for this (VAM needed this for a long, long time, imagine this coupled with something like FSR 2, oh boy).

Anyways, I just wanted to say, however, that the readme is a little weirdly-worded when it comes to adjusting the .INI file values.

Here, I quote what it says:

computeColliders: should be 75% of the amount of threads your CPU has (including hyperthreads)
skinmeshPart: should be 1 if you have a fast CPU or up to 4 if you have a slow one
CCD: Intel users set it to 0, AMD users who have a ryzen with multiple CCDs set it to 1 or 2, depending which CCD is faster
IterateCCD: leave at 0 or set to 1 if you want to try switching CCDs between every frame, new X3D ryzens might benefit from it. Requires CCD=0 if you want to use it.

So I just wanted to point out the bits that confuse me a little.

1) "75% of the amount of threads your CPU has (including hyperthreads)".

So, my 7900X3D has 12 cores, 24 threads. So... I have 24 threads. If I'm not mistaken, each physical core has two threads (hence the "multithread" part, so indeed 12x2 = 24). The wording may confuse some people into thinking there are "more" threads than their CPU is advertised with, along the lines of "Wait, do I have hyperthreads on top of my normal threads?".

Maybe it's just how I read it at first. But basically my number should be 18, since 75% of 24 is 18. In the readme, I would recommend simply stating the actual number to enter for this value based on the number of cores the CPU has (since most CPUs have been hyperthreaded for many years anyway).

2) SkinmeshPart description is clear... until you start wondering what actually is considered a "Fast" and a "Slow" CPU? Obviously, the absolute best and most expensive ones are the "best", but where do we draw the line and start saying "Slow" for VAM? (IMO as far as VAM is concerned, there's no such thing as "Fast" really, if I could I would go into the future by +50 years, take a CPU there, and come back to the present to just brute force +300 FPS into this thing). I would recommend for this section to give some examples of AMD / Intel CPU models that would correspond to applying a Value of 1, 2, 3 and 4.

3) As an AMD user, choosing between 1 and 2 based on "which CCD is faster" is a no-go. For us it's either CCD 0 or CCD 1; there's no CCD 2 (unless there's a model with a CCD 2 that I'm not aware of, but my 7900X3D only has CCD 0, which contains the X3D cores, and CCD 1, which contains the non-X3D cores). So if I enter a value of 1, it leads me to think that it refers to the non-X3D CCD, which is not the CCD I want VAM to focus on (I want it to focus on CCD 0, the X3D cores).

4) The description for IterateCCD is also weird to me. To "leave at 0 or set to 1, if you want to try switching..."

It's either I want to try the 'switching' thing, or not. I cannot leave it at either 0 or 1 and expect to try switching. The way it's worded it basically says that the switch thing will happen at both a value of 0 and 1. Either it happens, or it doesn't, lol. This part just made me scratch my head. So as you say though, it requires CCD 0 if I want to use this feature, which I do. But the setting above this one tells me to set the value to either 1 or 2 for CCDs which still confuses me since I don't have CCD 2. So does this setting here (IterateCCD) actually depend on the value we set above? Or is this feature independent to the value above?

Anyways...

I do genuinely thank you very much for this. But the readme is a bit of a head-scratcher to me.
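
For readers following point 1, here is a minimal Python sketch of the arithmetic behind the readme's "75% of threads" rule. The function name is hypothetical, and rounding down is an assumption, since the readme does not say how to round. It reproduces the 24 -> 18 example above.

```python
import os


def suggested_compute_colliders(logical_threads: int, fraction: float = 0.75) -> int:
    """Return 75% of the logical thread count, as the readme suggests for computeColliders.

    Assumption: fractional results are rounded down and the value never drops below 1.
    """
    if logical_threads < 1:
        raise ValueError("logical_threads must be at least 1")
    return max(1, int(logical_threads * fraction))


if __name__ == "__main__":
    # os.cpu_count() reports logical processors (hyperthreads included) and may return None.
    threads = os.cpu_count() or 1
    print(f"{threads} logical threads -> computeColliders={suggested_compute_colliders(threads)}")
```

A 12-core/24-thread CPU such as the 7900X3D gives 18, and a 6-core/12-thread CPU gives 9.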
 
Thanks for the feedback.
CCD=0 is for Intel, which doesn't have CCDs.
The settings were mostly meant for experimenting and feedback in the other thread, but then users reminded me to release it as a resource.
I will soon release an easier version with a few example configs. IterateCCD, by the way, turned out to be useless.

Currently breaking other records by optimizing for Ryzens lol

Vanilla out of the box experience:
vanilla.png


Vanilla limiting to one CCD and no HT threads:
vanilla_ccd1_no_HT.png


Patched version limiting to one CCD and no HT threads:
Benchmark-20240206-030259.png
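
The post does not say how the "one CCD and no HT threads" limit was applied. As a rough illustration only, the Python sketch below builds a CPU affinity bitmask that selects the first logical processor of each core on one CCD. It assumes logical processors are numbered core by core with the two SMT siblings adjacent, which is common on Windows but not guaranteed. The function name and the zero-based `ccd_index` are hypothetical and unrelated to the patch's own `CCD` ini value.

```python
def one_ccd_no_smt_mask(cores_per_ccd: int, ccd_index: int = 0, threads_per_core: int = 2) -> int:
    """Build an affinity bitmask with one logical processor per core on a single CCD.

    Assumption: logical processors are enumerated core by core, SMT siblings adjacent,
    and CCDs follow each other in order. Verify this on your own system before relying on it.
    """
    if cores_per_ccd < 1 or ccd_index < 0 or threads_per_core < 1:
        raise ValueError("arguments must be positive (ccd_index may be 0)")
    first_logical = ccd_index * cores_per_ccd * threads_per_core
    mask = 0
    for core in range(cores_per_ccd):
        # Pick only the first SMT sibling of each physical core.
        mask |= 1 << (first_logical + core * threads_per_core)
    return mask


if __name__ == "__main__":
    # Example: a CPU with 6 cores per CCD, such as the 7900X3D mentioned in this thread.
    print(hex(one_ccd_no_smt_mask(6, ccd_index=0)))
    print(hex(one_ccd_no_smt_mask(6, ccd_index=1)))
```

A mask like this is the kind of hexadecimal value that Windows `start /affinity` accepts, if you want to try the vanilla comparison yourself.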
 

Holy mother of all that is made of God, what magic is this?!

Then I would recommend doing this for your next release, once your Ryzen-specific optimizations are done: simply make two separate files (I mean two downloads), one for Intel users, and one for AMD / Ryzen users, and in each one, in the respective Readme files, just specify what sort of values we should be using (based on either being Intel, or AMD). Just so that your releases are very clear and concise for easier User experience. Please. And again, thank you very much for doing this magic stuff.
 
14900KF and 4090: which settings should I use? 😄
I'd suggest publishing a list of CPU models with the matching install package for each.
 
Woah, not bad and ALSO, she looks gorgeous :love:
Is she available somewhere to download? thank you ♥️
Sorry for hijacking the thread with a non relevant request
 
I think my 4770k is too old to make use of this. My performance on the benchmark got worse for everything except physics and there the average went up only slightly.

Top = unpatched, bottom with patch. There was a slight increase in physics but a drop for everything else.
1707236817236.png

Not sure if I should try increasing the skinmeshPart setting or not, but I get the feeling I just don't have enough cores/threads to throw at VAM.

edit: Tested skinmesh 2, which worked better than 4, going to test 3 next.
edit 2: 3 ended up being worse than 2 and 4 (forgot to save results).

Benchmark-20240206-160836.png
Benchmark-20240206-155134.png

Skinmeshpart = 2
Benchmark-20240206-164409.png

Skinmeshpart = 4
Benchmark-20240206-170021.png
 

1707243440756.png

You halved your physics time and doubled your FPS in the most demanding benchmark lol
The other benchmarks are probably limited by something else going on in Unity, or even by background Windows processes, since your CPU is very old.
 

Yes, after finding what seems to be ideal setting (skinmesh 2) it does provide a bump, though the starting point is so low it isn't game changing but does smooth things out a bit. It also does seem to sort some other issues like skin spazzing out when changing timeline poses so I'll definitely keep running it.
 
Anyone else have this oddity occur using this dll where some morphs do not show up in the menu anymore?

Example without the dll:
1707246970772.png


When using the dll I am missing a tremendous amount of loose morphs, and I don't know what else it could be:
1707247013784.png


Once I remove the dll I get all my morphs accessible from the menu back. I know these aren't morphs, but categories, but even the categories go away.

Any ideas what the cause and fix is?
 
Are those local morphs? It will be fixed in the next version.
 
In my case I don't see an improvement in the MacGruber benchmark (most likely my GPU is bottlenecking my CPU :ROFLMAO:), but in the CPU High Physics benchmark that ships with VAM, I went from 145 FPS to 135 FPS with the fix.
Correction before posting: I had made the mistake of not changing CCD. With CCD=1 the FPS was ~130, and when I changed it to 3/4 the FPS went up to ~160. Since my GPU is bottlenecking my CPU, I will not run the benchmark again.
Edit 1: Changed skinmeshPart from 1 to 4 and my FPS went from 160 to 190 o_O🤯
My ini settings:
[threads]
computeColliders=9
skinmeshPart=4 (was 1 in the benchmark runs)
CCD=4 (was 1 in the benchmark runs)
IterateCCD=0

Before:
Benchmark-20240207-114737.png

After IterateCCD=0
Benchmark-20240207-123143.png

After IterateCCD=1
Benchmark-20240207-124722.png
Built in CPU benchmark
CCD=1
1707311948013.png

CCD=3/4
1707312925000.png

@turtlebackgoofy If possible, can you make a chart of CPUs and matching ini settings? This would help other users who do not understand much about the subject (myself included :ROFLMAO:).
Edit 2: Changed the ini settings
CPU: Ryzen 5600X
[threads]
computeColliders=9
skinmeshPart=4
CCD=4
IterateCCD=0
1707314609454.png
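
For anyone who wants to sanity-check a config like the one above before launching, here is a minimal Python sketch using the standard `configparser` module. The checks only encode what the quoted readme says (skinmeshPart between 1 and 4, IterateCCD 0 or 1, IterateCCD=1 requires CCD=0). The function names are hypothetical, and the patch itself may accept or reject other values.

```python
import configparser

# The [threads] section as posted above (Ryzen 5600X).
INI_TEXT = """
[threads]
computeColliders=9
skinmeshPart=4
CCD=4
IterateCCD=0
"""

KEYS = ("computeColliders", "skinmeshPart", "CCD", "IterateCCD")


def load_thread_settings(text: str) -> dict:
    """Parse the [threads] section into a dict of integers."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["threads"]
    return {key: section.getint(key) for key in KEYS}


def check_settings(settings: dict) -> list:
    """Return a list of problems, based only on the rules quoted from the readme."""
    problems = []
    if settings["computeColliders"] < 1:
        problems.append("computeColliders should be at least 1")
    if not 1 <= settings["skinmeshPart"] <= 4:
        problems.append("skinmeshPart should be between 1 and 4")
    if settings["IterateCCD"] not in (0, 1):
        problems.append("IterateCCD should be 0 or 1")
    if settings["IterateCCD"] == 1 and settings["CCD"] != 0:
        problems.append("IterateCCD=1 requires CCD=0")
    return problems


if __name__ == "__main__":
    settings = load_thread_settings(INI_TEXT)
    print(settings)
    print(check_settings(settings) or "no problems found")
```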
 
@meshedvr I wonder if this would be worth baking into VAM once it's mature and if the author consents.
I already offered to give him the source for free. I can confirm, however, that most of the performance upgrades will PROBABLY come from the newest Unity engine and selective IL2CPP on their own.
MeshedVR will probably include the small fix that loads data from .zips faster. But that won't increase FPS in all scenes.
 
Will it work with VAM renamed to Shadow of the Tomb Raider (sotr.exe), which is done so that VAM can use NVIDIA tools?
 