I tried your latest patch 9, but it crashes on my really old CPU (i7-3770K) while opening the benchmark.
Error log:
That CPU doesn't support AVX2, so it crashes. You wouldn't get much benefit without AVX2 anyway.
I think the best path is probably for you to keep developing this and work out the kinks, and possibly at some future point I can roll it into the application if you are willing to let me do that. If not, it can live on as a side patch.
Seriously great work and I'm very impressed!
Please at least fix the unneeded decompression bug, so the full zip isn't loaded; it's just 2 lines. I proposed one adjustment and one fix; please do the small fix.
I'll try. I have another pending request from another user I would like to put in as well.
Knowing this is how the zip reader works, I'm tempted to review the rest of the binary reads I do. But I suppose the worst offender is the morphs, since there can be a lot of them.
They also happen in fancy strip scenes when the animation gets loaded while the scene runs, noticeable as a short stutter.
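To illustrate the point about not decompressing more than needed, here is a minimal Python sketch (not the patch's or VaM's actual code, which isn't shown in this thread). It uses the standard `zipfile` module to read a single member; the other members stay compressed. The member names and the helpers `read_member` and `build_demo_archive` are made up for the example.

```python
import io
import zipfile


def read_member(archive, member_name):
    """Return the bytes of one archive member.

    ZipFile reads only the central directory when it opens the archive;
    open() then seeks to the requested member, so the other members
    are never decompressed.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member_name) as member:
            return member.read()


def build_demo_archive():
    """Build a small in-memory zip with hypothetical member names."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("morphs/smile.bin", b"\x01\x02\x03")
        zf.writestr("textures/big.bin", b"\x00" * 100_000)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    data = read_member(build_demo_archive(), "morphs/smile.bin")
    print(len(data), "bytes read from one member")
```

The same idea applies to any zip reader that exposes per-entry streams: open the entry you need instead of extracting the whole archive first.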
At first I thought something went really, really wrong for Intel this time.
V10 - HT Disabled - Suggested config from the resource page
View attachment 332634
So I started the test a 2nd time:
View attachment 332635
So... it was terrible, like 5 times worse than the base game QQ
So I copied my own config that I've been using:
View attachment 332636
Better, but not even close to my previous results.
So...
Let's try to turn on HT:
V10 - HT Enabled - My settings
View attachment 332637
Better, but still missing something.
Maybe suggested settings?
View attachment 332638
Still not quite...
Rolling back to 9 fix 3.
9 fix 3 - HT Enabled - Suggested config:
View attachment 332639
Uh...
Maybe my own?
View attachment 332640
Still bad.
Disabling HT.
9 fix 3 - HT disabled - My settings:
View attachment 332641
Yup, that works.
So... for Intel users: if you're going to update the patch to v10, make sure to ENABLE HT in the BIOS.
Also, it seems the previous version is somehow faster for us [stock CPU settings, no cores tuned in any way, with just HT turned off].
Here is how the cores are numbered with and without HT enabled in BIOS:
HT enabled in BIOS:
Core1 - real
Core2 - HT core
Core3 - real
Core4 - HT core
and so on...
which means
affinity=1,3,5,7,9,...
HT disabled in BIOS:
Core1 - real
Core2 - real
Core3 - real
and so on...
which means
affinity=1,2,3,4,5,6,...
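For anyone who wants to generate these lists instead of typing them, here is a minimal Python sketch of the numbering described above. It assumes the 1-based numbering used in this post and that, with HT enabled, every odd index is a real core. `physical_core_affinity` is a made-up helper name, not part of the patch.

```python
def physical_core_affinity(physical_cores, ht_enabled):
    """Build an affinity string that selects only the real cores.

    With HT enabled the real cores are assumed to sit at 1, 3, 5, ...
    With HT disabled every index is a real core: 1, 2, 3, ...
    """
    step = 2 if ht_enabled else 1
    indices = range(1, physical_cores * step + 1, step)
    return ",".join(str(i) for i in indices)


if __name__ == "__main__":
    print("affinity=" + physical_core_affinity(8, ht_enabled=True))
    print("affinity=" + physical_core_affinity(8, ht_enabled=False))
```

This only covers CPUs where all cores are the same kind; hybrid CPUs with separate performance and efficiency cores may be numbered differently.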
My theory was to run VaM on all REAL cores and skip the HT cores.
That worked best on my AMD. Could you please try the v10 and these configs?
HT enabled in BIOS:
Code:
[threads]
computeColliders=6
skinmeshPart=1
affinity=1,3,5,7,9,11,13,15
HT enabled in BIOS:
Code:
[threads]
computeColliders=12
skinmeshPart=1
affinity=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
HT enabled in BIOS:
Code:
[threads]
computeColliders=18
skinmeshPart=1
affinity=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24
HT enabled in BIOS:
Code:
[threads]
computeColliders=24
skinmeshPart=1
affinity=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24
HT enabled in BIOS:
Code:
[threads]
computeColliders=24
skinmeshPart=1
affinity=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32
HT enabled in BIOS:
Code:
[threads]
computeColliders=32
skinmeshPart=1
affinity=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32
I wanna find out how exactly those "efficient cores" are connected.
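A quick way to sanity-check one of these config blocks before dropping it in, sketched in Python. The section and key names are the ones from the configs above. `parse_threads` is a hypothetical helper, and treating the patch's file as standard INI syntax is my assumption; the patch's own parser may be stricter or looser.

```python
import configparser

SAMPLE = """
[threads]
computeColliders=6
skinmeshPart=1
affinity=1,3,5,7,9,11,13,15
"""


def parse_threads(text):
    """Parse a [threads] block into typed Python values."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case (default would lowercase keys)
    parser.read_string(text)
    section = parser["threads"]
    return {
        "computeColliders": section.getint("computeColliders"),
        "skinmeshPart": section.getint("skinmeshPart"),
        "affinity": [int(x) for x in section["affinity"].split(",")],
    }


if __name__ == "__main__":
    settings = parse_threads(SAMPLE)
    print(settings)
```

A typo such as a stray letter in the affinity list raises `ValueError` here, which is easier to spot than a silently misbehaving config.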
Sorry, I won't be able to run this many tests at the moment, at least for the next 48 hours.
But I just made one quick test.
I did test the 'disable efficiency cores' route in the past, and the results were worse than with them enabled. But forcing VaM to use only the 'performance' ones for the physics was a good idea.
I'm pretty sure the 'performance' cores are the 'very first' ones on Intel. So it's 0-7 [or 1-8], not 1,3,5 etc. With that in mind I changed the config, and here are the results:
View attachment 332707
Holy crap!!
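To make the 0-7 vs 1-8 question concrete: operating systems usually express affinity as a bitmask over zero-based logical processors. The Python sketch below converts an `affinity=` list into such a mask. It assumes the patch's list is 1-based, which is not confirmed anywhere in this thread, and `affinity_to_mask` is a hypothetical helper, not something the patch exposes.

```python
def affinity_to_mask(affinity, one_based=True):
    """Convert a comma-separated core list into an affinity bitmask.

    Assumption: with one_based=True, entry 1 maps to bit 0
    (the first logical processor), entry 2 to bit 1, and so on.
    """
    mask = 0
    offset = 1 if one_based else 0
    for token in affinity.split(","):
        index = int(token) - offset
        if index < 0:
            raise ValueError("core index below the first core: " + token)
        mask |= 1 << index
    return mask


if __name__ == "__main__":
    print(hex(affinity_to_mask("1,2,3,4,5,6,7,8")))
    print(hex(affinity_to_mask("1,3,5,7,9,11,13,15")))
```

Comparing the two masks shows why the lists are not interchangeable: the first selects eight consecutive logical processors, the second every other one.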
Nice, that is the current record for baseline3, I think. What happens if you enable HT in BIOS and use 1,3,5,7,9,11,13,15? In theory they should be the same.
I made that one yesterday, 5th image in my previous post [1.56 physics time].
Ah, now I see. Very peculiar. My only explanation is that Intel downclocks its cores if HT is enabled, which doesn't happen with AMD. Also, the 4th and 5th images are pretty much identical performance-wise; a few fps in such a high-FPS scenario don't matter. So my guess is that on Intel disabling HT for VaM doesn't matter, but disabling HT system-wide does. Maybe the Intel core scheduler is already smart enough not to schedule VaM on HT cores?
So my guess is that on Intel disabling HT for VaM doesn't matter, but disabling HT system-wide does.
Yeah, it seems so.
Patch10 with the following settings (9700k has no hyperthreading)
[threads]
computeColliders=6
skinmeshPart=1
affinity=1,2,3,4,5,6,7,8
View attachment 332744
Here's the plot for that
View attachment 332746
Damn, the 9700k is faster than my 5950x in the baseline3 benchmark. But it makes sense, since the 9700k has ~50ns memory latency while the 5950x has about 65ns. Mind sharing the whole .csv? I need a zoomed-in view of only a few frames while the benchmark is running.
Rename the attached file as ThreadProfile.zip (vamhub hates zips)
I've overclocked my 9700k to 4.9GHz all cores with no AVX offset - which helps
Just as I suspected, the Unity engine itself runs better on your 9700k than on my 5950x, and better times in CharacterRun might give you more average fps. Try skinmeshPart=2 or 3 in the thread settings.
Yeah, I tried 2 and it improved things a little further - will try 3
Results as follows:
skinmeshPart=1 - 128fps
skinmeshPart=2 - 133fps - winner
skinmeshPart=3 - 130fps
Not much in it - but when loading a single character in an empty scene with skinmeshPart=2 I was getting 350fps - with skinmeshPart=1 only 250fps - that's quite a difference.
One question: can these parameters be changed on the fly? Could they be hooked into via a plugin?
The difference between 250fps and 350fps is 1.14ms. Multithreaded skinmeshPart barely makes a difference once you have a complex scene, and it even lowers performance on AMD. So in my experience it only helped in static scenes where just one character is shown without movement, and there you get 250+ fps anyway; who needs 350fps? I will think about making it switch to multithreaded when only 1 character is registered in the scene.
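The 1.14ms figure for 250fps vs 350fps is just the difference in per-frame time (1000 ms divided by fps). A small Python sketch of that arithmetic, with helper names of my own choosing:

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def frame_time_saved_ms(fps_before, fps_after):
    """Per-frame time saved when going from fps_before to fps_after."""
    return frame_time_ms(fps_before) - frame_time_ms(fps_after)


if __name__ == "__main__":
    # Single character in an empty scene: 250fps -> 350fps
    print(round(frame_time_saved_ms(250, 350), 2), "ms saved per frame")
    # Benchmark scene: 128fps -> 133fps
    print(round(frame_time_saved_ms(128, 133), 2), "ms saved per frame")
```

Looking at milliseconds per frame rather than fps makes gains at very different frame rates comparable.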
Didn't have enough time to run the full MacGruber Benchmark with small changes.
So I tried with the Cyber Striptease CuddleMocap scene. I saw it was CPU limited, with ~50-60% GPU usage at 1440p. Didn't see much FPS change with this CPU patch in scenes that were GPU limited (>95% GPU usage).
Tested a couple of different ways to boost FPS for CPU limited scenes.
Seeing large gains (15-20%) from all three methods. Combine for maximum effect: go from 90 -> 177 FPS.
- Apply turtlebackgoofy's CPU Patch
- Turn off Glute Physics
- CPU OC all-core by +0.3 GHz
View attachment 333005
The hybrid Intel CPUs might have weird behavior in Win 10 on disabling e-cores or HyperThreading. I saw 5% less performance with HyperThreading disabled.
And a super-inconsistent frame rate when disabling e-cores altogether via CPU process affinity in the CPU patch. Sometimes FPS was 50% lower, sometimes on par with all-core affinity.
Best Setting (at least for my CPU-GPU in Win 10):
- HyperThreading ON + All cores affinity for Intel Hybrid CPUs
- Use turtlebackgoofy's CPU patch
- Turn off glute physics if possible
- OC core clocks within safe temperatures
As expected, my patch is just a flat increase in all cases, and the more FPS you have, the less FPS difference it makes. However, another user saw very slight gains from disabling e-cores and hyperthreading on his 13900k: https://hub.virtamate.com/threads/performance-patch-up-to-30-more-physics-speed.49679/post-148760