CPU Performance Patch (Up to 30% faster physics, up to 60% more FPS)

Looking through the VaM code, I wonder if there is even a need for Unity lol. Unity seems to just be there for running Mono, displaying the UI, keeping a list of game objects, forwarding positions to PhysX, and kicking off rendering. Everything else was written by @meshedvr from scratch.

You are not entirely wrong, and VaM2 will be similar in some regards. I do very much rely on things like Unity's asset management, UI system, VR integration, and rendering system. But where Unity falls short, I try to go further. Part of what made VaM different from others is that I didn't rely completely on Unity. I had to do a lot of stuff myself to push the limits of what was possible (within this game engine and C#). I do know it could be better not to rely on Unity and C#, but that was not something I could really take on at the time, or even now. Unity's improvements with C#/Jobs/Burst have me convinced I can eke out a lot more from the engine than previously. I can also get around PhysX limitations with custom-written physics where needed.
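For context, a toy sketch (not VaM2 code) of what the Jobs/Burst combination buys: a plain struct job that Burst compiles to tight native code and that the job system spreads across worker threads, with no GC allocations in the hot path.

C#:
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Integrate positions from velocities across all cores.
[BurstCompile]
struct IntegrateVelocitiesJob : IJobParallelFor
{
    public float deltaTime;
    [ReadOnly] public NativeArray<Vector3> velocities;
    public NativeArray<Vector3> positions;

    public void Execute(int i)
    {
        positions[i] += velocities[i] * deltaTime;
    }
}

// Usage (e.g. from a MonoBehaviour's Update):
//   var job = new IntegrateVelocitiesJob { deltaTime = Time.deltaTime, velocities = v, positions = p };
//   job.Schedule(p.Length, 64).Complete();   // 64 = batch size per worker thread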
 
Can I just say that reading the back and forth between turtle and meshed is super fascinating and I'm loving it. Watching two obviously very smart dudes and talented coders discuss how this stuff functions under the hood is crazy interesting. You're both huge nerds and I'm so here for it. Haha.
 
Indeed fascinating to see two 10x developers go at it lol. This memory optimization reminds me of the days when coding was actually difficult and you had to care about memory, before the era of installing some frameworks and Stack Overflow copy-pasting your way to a lumbering deliverable written at the highest level of abstraction possible.
 
I went ahead and ran it via admin cmd like you said and didn't see any noticeable increase, but I also bypass the SteamVR and Oculus runtimes, which usually helps performance by like 10-15% for me. Either way, still seeing gains over vanilla, so we're still winning. Thanks so much for putting so much energy into working on performance. It's the main pain point imo, especially in VR. Love my 5800X3D, but it's hard not to consider high-clocked Intel for my next build.
Don't do it! Remember, the X3D chips' 3D cache makes up for the higher clocks. The 7800X3D is equal to the more expensive 13900K in VaM benchmarks, and uses much less power doing it. The next-gen AMD chips are supposed to be a huge jump over that this year. I don't think Intel will compete with the 9800X3D (late 2024) any time soon for sim gaming (VaM).

Or do it, whatever floats your boat :)
 
Alright, stutters got better due to less garbage being generated for collection, but there is still stutter if you have lots of morph deltas that need to be demand-loaded while a scene is playing. Nothing I can do without breaking things. I guess @Acid Bubbles' Timeline could add an option to preload all the morph deltas it will use in a scene to remove the last stutter (hint: dazmorph.LoadDeltas()).
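For illustration, a minimal sketch of such a preload as an MVRScript plugin. Everything here except DAZMorph.LoadDeltas() (named above) is my assumption about VaM's plugin API and may not match the real signatures:

C#:
// Hedged sketch: force-load deltas for every morph the scene currently
// drives, so none of them is demand-loaded (and stutters) mid-playback.
public class PreloadMorphDeltas : MVRScript
{
    public override void Init()
    {
        var geometry = containingAtom.GetStorableByID("geometry") as DAZCharacterSelector;
        if (geometry == null) return; // not a Person atom

        foreach (DAZMorph morph in geometry.morphsControlUI.GetMorphs())
        {
            if (morph.morphValue != 0f)
                morph.LoadDeltas();
        }
    }
}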

I think I got all obvious things that could be optimized now.
[screenshot: 1707774211853.png]

Now you just need to beta-test it.
 
I think I found the cause of the regular garbage-collection stutter: VaM creates 2926 strings per second, which then need to be garbage collected. A fix is easy.
Edit: done. I had 1-2 stutters in the 70 seconds of the baseline3 benchmark, now I have 0 lol. Will be released in the next version.
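For illustration only (not necessarily what the patch does): the usual fix for a hot path that mints thousands of identical strings per second is to build each distinct string once and reuse it. The "morph" naming below is hypothetical:

C#:
using System.Collections.Generic;

// Hedged sketch: cache per-index names instead of concatenating
// a new string on every physics step.
static class MorphNameCache
{
    static readonly Dictionary<int, string> byIndex = new Dictionary<int, string>();

    public static string Get(int i)
    {
        string s;
        if (!byIndex.TryGetValue(i, out s))
        {
            s = "morph" + i;   // allocates only on first use
            byIndex.Add(i, s);
        }
        return s;              // every later call is allocation-free
    }
}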
Dude, I'm not sure if people will realize how much of a game changer this 'simple' thing will be.

I cannot count the number of scenes (pretty much 100% of them) where, at one point or another, my fun was interrupted by a stutter every 2 minutes. It was also greatly limiting me in recording dancing animations, since it would bring the animation to a complete halt while the music kept on going. Same goes for basically anything in motion in a scene.

I'm SO happy that finally this problem will go away (to me it's even better than the extra FPS at this point). Ty very much!
 
Interesting, for some reason Unity allocates a lot of heap memory for an ENUM lol (it should not allocate heap memory for what is basically an int)
0 | 187088 | 3741760 | PositionState | FreeControllerV3:set_currentPositionState | MotionAnimationControl:ApplyStep
0 | 170080 | 3401600 | RotationState | FreeControllerV3:set_currentRotationState | MotionAnimationControl:ApplyStep
0 | 23386 | 467720 | Int32 | FreeControllerV3:set_currentPositionState | MotionAnimationControl:ApplyStep
0 | 23386 | 467720 | Int32 | FreeControllerV3:set_currentRotationState | MotionAnimationControl:ApplyStep
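A guess at what's happening, as a minimal repro (this is not VaM's actual source, just the classic Mono boxing pitfall that would produce exactly this kind of profile):

C#:
using System;

// Each boxed enum or Int32 costs ~20 bytes of GC heap on Unity's old
// Mono runtime, thousands of times per second in a setter like this.
class BoxingDemo
{
    enum PositionState { On, Comply, Off }

    PositionState current;
    event Action<object> onStateChanged;   // object-typed callback

    public PositionState CurrentPositionState
    {
        get { return current; }
        set
        {
            // BAD:  if (!value.Equals(current)) ...
            // Enum.Equals takes object, so the enum is boxed on every call.

            // GOOD: enums compare as plain integers, no allocation.
            if (value != current)
            {
                current = value;
                if (onStateChanged != null)
                    onStateChanged(value);  // this still boxes; reusing one
                                            // pre-boxed object per enum member
                                            // would avoid it
            }
        }
    }
}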
 
Just so you know, the new patch 12 somehow interferes with Passthrough in VR, introducing a lot of artifact snow. I can fix it slightly by changing chroma settings, but it's still there a little. Kind of a random, far-out use-case by-product whose correlation I can't even begin to understand. But it still gave me 26% more FPS than patch 11 in my two-person, morph-heavy test scene!

7800X3D(SMT off)/4090/Quest3

[threads]
computeColliders=8
skinmeshPart=8
applyMorphs=8
skinmeshPartMaxPerChar=2
applyMorphMaxPerChar=2
affinity=1,2,3,4,5,6,7,8

[threadsVR]
computeColliders=6
skinmeshPart=6
applyMorphs=6
skinmeshPartMaxPerChar=2
applyMorphMaxPerChar=2
affinity=1,2,3,4,5,6,7,8

[profiler]
enabled=0
 
Just so you know, the new patch 12 somehow interferes with Passthrough in VR, introducing a lot of artifact snow.
I get that snow in VD every time I push the settings to the maximum and don't get enough FPS. It's possible that using more threads for VaM makes it harder for VD to compute fast enough to prevent the artifacts.
 
Just so you know, the new patch 12 somehow interferes with Passthrough in VR, introducing a lot of artifact snow. I can fix it slightly by changing chroma settings, but it's still there a little. Kind of a random, far-out use-case by-product whose correlation I can't even begin to understand. But it still gave me 26% more FPS than patch 11 in my two-person, morph-heavy test scene!

7800X3D(no SMT)/4090/Quest3

[threads]
computeColliders=6
skinmeshPart=1
applyMorphs=6
skinmeshPartMaxPerChar=6
applyMorphMaxPerChar=6
affinity=1,2,3,4,5,6,7,8

[threadsVR]
computeColliders=6
skinmeshPart=1
applyMorphs=2
skinmeshPartMaxPerChar=4
applyMorphMaxPerChar=2
affinity=1,2,3,4,5,6,7,8

[profiler]
enabled=0
I actually get this also, but it isn't too bad. It almost looks like GPU artifacting.
 
Interesting, for some reason Unity allocates a lot of heap memory for an ENUM lol (it should not allocate heap memory for what is basically an int)
0 | 187088 | 3741760 | PositionState | FreeControllerV3:set_currentPositionState | MotionAnimationControl:ApplyStep
0 | 170080 | 3401600 | RotationState | FreeControllerV3:set_currentRotationState | MotionAnimationControl:ApplyStep
0 | 23386 | 467720 | Int32 | FreeControllerV3:set_currentPositionState | MotionAnimationControl:ApplyStep
0 | 23386 | 467720 | Int32 | FreeControllerV3:set_currentRotationState | MotionAnimationControl:ApplyStep
I just found another bit of weirdness that might be worth looking into. First of all, the work you are doing is simply incredible, thank you so much. The improvement has been so dramatic that a benchmark wasn't even necessary to confirm it (i9-13900K with an RTX 4090).

So here's the weirdness: with your patch (version 12), in desktop mode, I can run the default scene at a performance-monitor average of 405 fps. It varies depending on physics rate and update cap, of course, but the baseline was 72 Hz and a cap of 2. Soft body physics on.

The very complex scene I'm working on was running at an average of 190 fps. Same settings, just one additional character (male, for VR, advanced colliders disabled). Aside from that, loads of plugins and "helper atoms" that I use for distance calculations via Scripter. CPU and GPU temperatures and loads were consistently well below their limits, no bottlenecks going on.

I managed to bring this scene up from 190 fps to an average of 240 fps. Same camera angle, same settings, no movement except for the head sway of the Gaze plugin. My optimization method? I decreased the scale of six collision triggers from 10 (the maximum, which I had set that high by mistake) to 0.010 (the minimum). They are parented to the pelvis (object, not control) of the person atom, but changing the parent or switching their status to "on" didn't make a difference. I also broke the Scripter script they were connected to and then deleted it. No difference. The only thing necessary to make the scene go from 190 to 240 fps on average is simply: load it, decrease the scale of the six collision triggers to the minimum, done. Increasing them again lowers performance back to 190. And yes, I know the UI tanks FPS, so I made sure it wasn't active when I measured the performance-monitor averages.

This is not as easy to reproduce as one might expect. I added all-new collision triggers and increased them to a scale of 10. Positioning seems to affect the FPS drop, and those new triggers brought the scene down to 220 fps. Still significant, but clearly the six original ones are more draining for some reason. I don't know if it's unique to my setup or the scene. I tested a bunch of collision triggers in the default scene and measured a similar performance drop, but afterwards I did it again with different positioning and it wasn't as evident. In any case, in the scene I'm working on it's a massive improvement, so I'm mentioning it in case it might hint at some deeper optimization issue in VaM.
 
Patch 12 with HT off is more stable (140~160 FPS) than with HT on (110~140 FPS).

INI:
[threads]
computeColliders=6
skinmeshPart=6
applyMorphs=6
skinmeshPartMaxPerChar=6
applyMorphMaxPerChar=6
#affinity=1,3,5,7,9,11
affinity=1,2,3,4,5,6

HT off
[screenshot: Benchmark-20240213-081302.png]

HT on
[screenshot: Benchmark-20240213-084226.png]

Patch 9, HT off
[screenshot: 3080ti 5600 patch9 关HT.png]
 
There is no reason to have skinmeshPartMaxPerChar > skinmeshPart; it will still run at most skinmeshPart threads per character. In the baseline3 scene there are 3 characters, so 6/3 = 2 is the number of threads that will run per character in that regard. You could also try

[threads]
computeColliders=6
skinmeshPart=6
applyMorphs=6
skinmeshPartMaxPerChar=1
applyMorphMaxPerChar=1
affinity=1,3,5,7,9,11

and see what happens; it could improve performance if the 3 CPU cores then clock higher with fewer threads running. It really depends on your memory clock and latency. If you have fast memory, fewer threads are sometimes faster.
Also check if you have other processes running in the background like Chrome, Steam, or Discord. That baseline3 is really sensitive to other processes stealing CPU time.
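In other words (roughly — my paraphrase of the above, not a documented formula, and the exact scheduling may differ):

threadsPerChar ≈ min(skinmeshPartMaxPerChar, skinmeshPart / numCharacters)

so with skinmeshPart=6 and 3 characters, each character gets 2 threads whether MaxPerChar is set to 2 or 6.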

Your max1% times in baseline3 are pretty much exactly the same as mine on my 5950x.

View attachment 334244
This means that your physics happen as fast as mine. As the scene has 3 characters, my CPU runs 8/3 = 2 threads per character and yours runs 6/3 = 2 threads per character, so we should have the exact same benchmark results... You can see that for me 1000/9.00 = 111 fps, which is spot on my avg FPS, while in your benchmark

View attachment 334245

1000/10.20 should equate to 98 FPS. Something is stealing CPU time after VaM has stopped recording the time it needed for work and is waiting for the next frame to start.
Hey, sorry for the delay, just got around to trying this out again.

So, I closed out of Discord, which was the only thing open when I did the other benchmarks, and set MaxPerChar back to 6 instead of 8. These were the results:
[screenshot: 1707819764876.png]
And this was the result of the benchmark with:
[threads]
computeColliders=6
skinmeshPart=6
applyMorphs=6
skinmeshPartMaxPerChar=1
applyMorphMaxPerChar=1
affinity=1,3,5,7,9,11
[screenshot: 1707819830166.png]

From the looks of things, 6 seems better. I'm gonna try 3 out here and see how it goes. One thing to note, however: when MaxPerChar was set to 1, the program's load times seemed about 2x longer. From starting up VaM to loading the benchmark scene, it was all at the very least 2x slower. However, CPU utilization on any single core peaked at 70%.

As a side note, I do have multiple displays, which could be lowering my FPS. I haven't gone out of my way to temporarily disable them for the benchmark, as I would likely never use VaM without them on. So I suppose my FPS may differ from yours to a certain degree. Also, you have a 4xxx-series card and mine's a 3xxx; there is on average a base 25-30% performance difference between our cards. That may cause our FPS to differ as well?

EDIT:
Alright, MaxPerChar set to 3 got me this:
[screenshot: 1707821503246.png]
 
Total noob here. I'm curious about this, and I already dumped all the files into the VaM folder... then I don't know what to do. Is there a video on how to do this?

I don't know what THIS step even does


EDIT: OK, I found where to edit and add these values in.
Now testing.
[screenshot: 1707820783970.png]
 
I just found another bit of weirdness that might be worth looking into. First of all, the work you are doing is simply incredible, thank you so much. The improvement has been so dramatic that a benchmark wasn't even necessary to confirm it (i9-13900K with an RTX 4090).

So here's the weirdness: with your patch (version 12), in desktop mode, I can run the default scene at a performance-monitor average of 405 fps. It varies depending on physics rate and update cap, of course, but the baseline was 72 Hz and a cap of 2. Soft body physics on.

The very complex scene I'm working on was running at an average of 190 fps. Same settings, just one additional character (male, for VR, advanced colliders disabled). Aside from that, loads of plugins and "helper atoms" that I use for distance calculations via Scripter. CPU and GPU temperatures and loads were consistently well below their limits, no bottlenecks going on.

I managed to bring this scene up from 190 fps to an average of 240 fps. Same camera angle, same settings, no movement except for the head sway of the Gaze plugin. My optimization method? I decreased the scale of six collision triggers from 10 (the maximum, which I had set that high by mistake) to 0.010 (the minimum). They are parented to the pelvis (object, not control) of the person atom, but changing the parent or switching their status to "on" didn't make a difference. I also broke the Scripter script they were connected to and then deleted it. No difference. The only thing necessary to make the scene go from 190 to 240 fps on average is simply: load it, decrease the scale of the six collision triggers to the minimum, done. Increasing them again lowers performance back to 190. And yes, I know the UI tanks FPS, so I made sure it wasn't active when I measured the performance-monitor averages.

This is not as easy to reproduce as one might expect. I added all-new collision triggers and increased them to a scale of 10. Positioning seems to affect the FPS drop, and those new triggers brought the scene down to 220 fps. Still significant, but clearly the six original ones are more draining for some reason. I don't know if it's unique to my setup or the scene. I tested a bunch of collision triggers in the default scene and measured a similar performance drop, but afterwards I did it again with different positioning and it wasn't as evident. In any case, in the scene I'm working on it's a massive improvement, so I'm mentioning it in case it might hint at some deeper optimization issue in VaM.
I didn't quite get it, did my patch make something worse than vanilla? Maybe it's just a fluke:
From 240 fps to 220 fps, the difference is 0.38 ms more render time.
The same 0.38 ms of extra render time turns 190 fps into 177 fps.
It looks like enabling those collisions is more draining because they "steal" more FPS (20 fps vs 13 fps earlier), but that's just how FPS works. At 400 fps they would turn it into 347 fps, while having the same 0.38 ms cost.
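Spelled out (frame time in ms is 1000/fps):

1000/220 − 1000/240 ≈ 4.55 − 4.17 ≈ 0.38 ms
1000 / (1000/190 + 0.38) ≈ 1000 / 5.64 ≈ 177 fps
1000 / (1000/400 + 0.38) ≈ 1000 / 2.88 ≈ 347 fps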
 
Hey, sorry for the delay, just got around to trying this out again.

So, I closed out of Discord, which was the only thing open when I did the other benchmarks, and set MaxPerChar back to 6 instead of 8. These were the results:
And this was the result of the benchmark with:
[threads]
computeColliders=6
skinmeshPart=6
applyMorphs=6
skinmeshPartMaxPerChar=1
applyMorphMaxPerChar=1
affinity=1,3,5,7,9,11

From the looks of things, 6 seems better. I'm gonna try 3 out here and see how it goes. One thing to note, however: when MaxPerChar was set to 1, the program's load times seemed about 2x longer. From starting up VaM to loading the benchmark scene, it was all at the very least 2x slower. However, CPU utilization on any single core peaked at 70%.

As a side note, I do have multiple displays, which could be lowering my FPS. I haven't gone out of my way to temporarily disable them for the benchmark, as I would likely never use VaM without them on. So I suppose my FPS may differ from yours to a certain degree. Also, you have a 4xxx-series card and mine's a 3xxx; there is on average a base 25-30% performance difference between our cards. That may cause our FPS to differ as well?

EDIT:
Alright, MaxPerChar set to 3 got me this:
Your last benchmark should display 104 fps and 162 fps according to the TotalTime. Very weird... Everyone else's benchmarks are pretty much 1000/TotalTime = fps, but not yours.
Does the benchmark run visually look like 104 fps and 162 fps, or is it obviously 66 fps?
Do you have HDR enabled in Windows?
Do you have Vsync disabled in the Nvidia panel?
[screenshot: 1707823918619.png]
 
Your last benchmark should display 104 fps and 162 fps according to the TotalTime. Very weird... Everyone else's benchmarks are pretty much 1000/TotalTime = fps, but not yours.
Does the benchmark run visually look like 104 fps and 162 fps, or is it obviously 66 fps?
Do you have HDR enabled in Windows?
Do you have Vsync disabled in the Nvidia panel?
View attachment 334612
I actually do have Vsync disabled in VaM, and Nvidia has it set to application-controlled. Cuz Vsync doesn't benefit me in like 90% of games, so I always turn it off, as it uses up some resources. I'll enable it here and see if my FPS goes up in the benchmark?

However, it doesn't "look" like 60 fps, it looks over 100, so maybe that's it? And no, I don't have HDR enabled. My monitor supports HDR-1000, but more often than not it looks like crap due to W10's poor HDR support.
 
I actually do have Vsync disabled in VaM, and Nvidia has it set to application-controlled. Cuz Vsync doesn't benefit me in like 90% of games, so I always turn it off, as it uses up some resources. I'll enable it here and see if my FPS goes up in the benchmark?

However, it doesn't "look" like 60 fps, it looks over 100, so maybe that's it? And no, I don't have HDR enabled. My monitor supports HDR-1000, but more often than not it looks like crap due to W10's poor HDR support.
"Fast" Vsync in nvidia panel is actually "do not limit the rendering, but only take a frame from the rendered frames every time the monitor refreshes", basicaly disabling any limits on FPS.
 
Just curious, speaking of the NVIDIA Control Panel stuff: I think that setting the Threaded Optimization option to Off actually helps a bit. Maybe; not 100% sure on that, still need to do more testing.
 
"Fast" Vsync in nvidia panel is actually "do not limit the rendering, but only take a frame from the rendered frames every time the monitor refreshes", basicaly disabling any limits on FPS.
So I've run the benchmark again with Vsync enabled, I also disabled my other monitors, and still got virtually the exact same results. Set Vsync to Fast, still the same results. I'm not really sure what's up :p My FPS in everything else often seems to be actually higher than what most people report in most games I've played. When I'm running the benchmark my GPU is only at 61 degrees, so there is still a bunch of thermal headroom for it to work harder. So I guess it's gotta be something to do with my CPU or RAM? RAM's at 3600 MHz with EXPO enabled. I AI-overclocked my CPU to a base 4.55 GHz; basically everything was running quite a bit faster after that. I even saw an FPS increase in VaM at that point in time as well. I've never crashed or had any issues across the board. Maybe something funky is going on with my VaM or something. It's like a 400 GB folder, so who knows lol.

Also, my CPU was only like 59 degrees near the end of the baseline3 benchmark.
 
So I've run the benchmark again with Vsync enabled, I also disabled my other monitors, and still got virtually the exact same results. Set Vsync to Fast, still the same results. I'm not really sure what's up :p My FPS in everything else often seems to be actually higher than what most people report in most games I've played. When I'm running the benchmark my GPU is only at 61 degrees, so there is still a bunch of thermal headroom for it to work harder. So I guess it's gotta be something to do with my CPU or RAM? RAM's at 3600 MHz with EXPO enabled. I AI-overclocked my CPU to a base 4.55 GHz; basically everything was running quite a bit faster after that. I even saw an FPS increase in VaM at that point in time as well. I've never crashed or had any issues across the board. Maybe something funky is going on with my VaM or something. It's like a 400 GB folder, so who knows lol.

Also, my CPU was only like 59 degrees near the end of the baseline3 benchmark.
Yeah, it's something with the benchmark. Did you try loading another scene and running perfmon with vanilla and the patched VaM? Maybe there you'll see a difference in FPS.
 
Yeah, it's something with the benchmark. Did you try loading another scene and running perfmon with vanilla and the patched VaM? Maybe there you'll see a difference in FPS.
I also see this problem; the relationship between FPS and TotalTime is not very clear.
[screenshot: Benchmark-20240211-140531.png]
 
Patch 12 with HT off is more stable (140~160 FPS) than with HT on (110~140 FPS).

INI:
[threads]
computeColliders=6
skinmeshPart=6
applyMorphs=6
skinmeshPartMaxPerChar=6
applyMorphMaxPerChar=6
#affinity=1,3,5,7,9,11
affinity=1,2,3,4,5,6

HT off
View attachment 334585
HT on
View attachment 334587
Patch 9, HT off
View attachment 334588
Hello, I'm also using the 5600. What should I configure to run with HT off? (V12)
My SkinMeshPartDLL.ini settings are these:
[threads]
first core = 1
computeColliders=6
skinmeshPart=6
applyMorphs=6
skinmeshPartMaxPerChar=6
applyMorphMaxPerChar=6
affinity=1,3,5,7,9,11
 