Is this the updated fixed one you posted in the benchmark thread, or the original one?
When the number goes up, it's a new version.
Maybe even simply changing the condition could help a lot (without going into two lists straight away)?
from
if(!morph.disable)
to
if(morph.active)
Added a patch for it, please test.
Just a question, since I haven't tested v9 yet (the last version I tested was v7, back in the benchmark thread): I see you started messing with morphs, and it makes me wonder whether plugins will still be able to use morphs that weren't loaded on scene launch, like Naturalis, Shake it, etc.?
I just noticed it also makes my local morphs disappear. I worked on a model over the past few days, creating a few versions of her morph, and when I start VaM it shows only the active morph with her name. I need to rescan morphs to see the other versions, and no, they're named differently, like 'character name moph', then 'character_m_v2', then 'character_final' etc., so it's not VaM 'versioning'.
All of the morphs were installed from the beginning, of course.
Maybe it is on purpose...?
What version did you use? My patch doesn't remove the morphs, it just prevents the engine from rendering the ones that have a value of 0.0. Does it only happen to local morphs? Can you still see morphs from vars?
I confirm too: unless they are used in the scene (so with a value different from 0), the local morphs don't appear at all in the morphs list. A "reload custom morphs" will make them appear as usual.
Naturalis works as it should. Once a morph has been changed from 0.0 once in the scene, it gets rendered until scene reload. All installed morphs are available in the UI, they are just not applied to the model until they are changed from 0.0 once.
Try it with the newest patched9_morph_clutter_fix2.zip on the first post.
Okay, I actually installed v8 yesterday, according to the file dates. Sorry, I thought I was still on v7.
It hides pretty much all of my local morphs, and as for vars I have no idea what criteria it uses.
My install is very custom: I manually edited and created two versions of the 'morphs' packages and I switch between them for 'play' and 'work' sessions, one with preload turned off and the other with preload on.
Still, in my 'play session' I have 950 pages of female morphs (mostly because of expressions, which are lighter than fully rigged morphs), but with the patch I see only 237 pages of totally random morphs.
Well, as long as plugins can 'hot-load' the missing morphs during scene play, it doesn't bother me much.
/ Edit /
Installed v9_fixed
and it's the same. I can see only part of my morphs from vars and nothing from local files. The reload button in the morphs tab loads the local ones, but I still can't see the rest of the morphs from vars; I guess the purpose is to disable them when they aren't needed. It might be problematic for scene content creators if they can't see expression morphs etc.
Like I said, it's not a problem for me if it stays that way. It seems I already have 100+ pages of local morphs (mostly models I made or ported, lol), so it might be better if their morphs are disabled until I actually need them. And since I can quickly load them via the in-game reload button, for example when starting a new model with brand new morphs, it's all fine.
How do I create local morphs so I can test the bug?
You can just take a morph from a .var and copy it into the appropriate folder in \Custom\Atom\Person\Morphs\Female (or one of the others if it is a genital or male morph)...
It will appear in your morphs list with a blue background after a "reload custom morphs" or after restarting VaM.
Oops, I made all morphs on-demand loaded in version 5 for testing, although it didn't give any performance benefit, and accidentally released the change. I updated the first post, so update to patched9_morph_clutter_fix3 https://pixeldrain.com/u/XPexV2Rf and it's fixed again.
So... uhm, I'm sorry, but it seems I was blind, lol.
Like I said, my install is pretty well tuned 'for play' without the morph bloat effect and... I have only 350 pages, not 950 as I said before.
I'm using a 4K TV and started VaM windowed at 1080p for testing, and it seems I can't even read properly, lol. Sorry.
So the 'issue' was only with local morphs, which is now fixed with v9_fix3.
And now it makes me wonder: is it possible to make it toggleable in the ini? I know you just reverted it, but having local morphs 'on demand' might actually benefit people like me who have many of them. I thought it affected morphs in vars too, and that's why I even started this; if I had known it was really just the local ones I wouldn't have said anything about it QQ
It does not help to make morphs on-demand loaded; it was a red herring when I tested it. All known morphs except local morphs were loaded at scene load anyway, so the only thing it did was introduce the bug, lol.
However, what DOES help is not RENDERING morphs while they are loaded but haven't been changed yet. This behaviour is now included in the newest version you just ran.
So that means local morphs are visible, but not really 'loaded' on the model until they are changed? That's super cool. [reverting back from v9fix2 to v9fix3]
What are you guys setting in the INI file for an i9-13900K?
Based on raw thread count I set mine like this:
[threads]
computeColliders=24
skinmeshPart=1
CCD=0
IterateCCD=0
Not seeing any gains on my system...
Maybe try setting it lower? If your CPU is very fast and computeColliders finishes very quickly, waiting for the threads takes longer than the work itself. Also try
Can I set CCD to 0 in skinmeshpartdll.ini if I have an AMD 5600X with two CCDs?
One is disabled and only CCD2 is functional.
Also, what skinmeshPart value should I use?
If it's disabled in the BIOS or Ryzen Master:
Thanks for doing this!
I mentioned this on the other thread as well, before I realized you had a dedicated one. We reached out to ask you to move this to resources, where we can flag it for providing a replacement dll.
I considered rewriting the morph iteration code you mention, but unfortunately there were some complexities in doing so that I couldn't easily address with the time I had. VaM2 has already fixed this by having morphs register changes to their values on demand, i.e. event driven. That way the iteration is only done on morphs that changed that frame. Also, VaM2 uses Unity's new Burst and Jobs system with better memory structs for the code that is most demanding. VaM2 also doesn't support morph-controlled morphs or formula morphs, which are the cause of some of the extra possible iterations in the morph code.
I appreciate your effort to make the original VaM more performant. In the other thread I also mentioned that you are sometimes fixing things that run on a side thread and would typically complete before the main thread needs the data. But I suppose on some CPUs the side thread can't finish in time, so the main thread waits on it and overall FPS drops. I spent most of my time trying to optimize main thread code, as that is generally where the bottleneck is.
Wow, the main dev noticed. If you want, I can send you cleaned-up source code so you can integrate it into the main code. It's really not that hard to integrate it into the main development. I moved it to "other" and removed the external links.
1) (side threads) Doing floating-point math like in SkinMeshPart is a lot faster if you compile the same operations natively with SSE2/AVX2 instead of running them as C# IL.
2) Disabling your threaded skin meshing and instead letting the .dll do the thread management natively is faster.
3) If skin meshing is already very fast you can skip the threading altogether. This also fixes the bug where skin meshes flop around for a few frames because the vertices were processed in the wrong order. You already tried to mitigate it by splitting the threads by vertex IDs instead of by bones, but some vertices are shared between bones and those can spaz out.
4) (main thread 0) If you call a lot of Unity engine methods like getPosition or getTransform, it's faster to call them directly native->native using their native names internalcall_XXXX instead of C#->native, because every C#->native transition costs a lot of CPU. ComputeColliders did this 100k times per frame; this was the biggest performance gain.
5) (all threads) Setting core affinity for the process makes sure the code stays on one CCD, so all threads share a high-speed CPU cache. This is basically free performance on top and should be configurable for advanced users.
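As a rough illustration of point 5, here is a minimal C# sketch of pinning the current process to a block of logical cores. It is not the patch's actual code: the class and method names are hypothetical, and which logical core numbers belong to which CCD is machine-specific, so the numbers passed in are an assumption the user has to check for their own CPU.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of VaM or the patch.
public static class AffinityHelper
{
    // Builds a bitmask with `count` consecutive bits set,
    // starting at logical core index `firstCore`.
    public static long BuildMask(int firstCore, int count)
    {
        if (firstCore < 0 || count <= 0 || firstCore + count > 64)
            throw new ArgumentOutOfRangeException();
        long mask = 0;
        for (int i = 0; i < count; i++)
            mask |= 1L << (firstCore + i);
        return mask;
    }

    // Restricts the current process to the given logical cores.
    // Assumption: the caller knows which logical cores sit on one CCD.
    public static void PinCurrentProcess(int firstCore, int count)
    {
        using (Process process = Process.GetCurrentProcess())
        {
            process.ProcessorAffinity = (IntPtr)BuildMask(firstCore, count);
        }
    }
}
```

For example, `AffinityHelper.PinCurrentProcess(0, 8)` would keep all threads on logical cores 0 to 7, which only helps if those cores really share one cache.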
I know it will be fixed in VaM2, but it only took half an hour to patch it using dnSpy.
Another performance patch was to mark DAZMorphs as "touched" once appliedValue or morphValue was set via the setters, and to add them to a separate list in their corresponding morph bank, so that on every frame ApplyMorphsThreadedFast() iterates only those in the "touched" list. Otherwise you iterate over all installed morphs even though they are at 0.0, and while that looks harmless in terms of wasted operations, it pollutes the CPU cache heavily.
Oh, and in DAZMorph.LoadDeltasFromBinaryFile() you need to load all deltas in one go instead of reading them float by float, like this:
public void LoadDeltasFromBinaryFile(string path)
{
    try
    {
        using (FileEntryStream fileEntryStream = FileManager.OpenStream(path, true))
        {
            using (BinaryReader binaryReader = new BinaryReader(fileEntryStream.Stream))
            {
                this.numDeltas = binaryReader.ReadInt32();
                this.deltas = new DAZMorphVertex[this.numDeltas];
                // Each delta is 16 bytes: an int32 vertex index plus three 4-byte floats.
                int bytesPerDelta = 16;
                // Read the whole delta block from the file in a single call...
                using (MemoryStream memoryStream = new MemoryStream(binaryReader.ReadBytes(this.numDeltas * bytesPerDelta)))
                {
                    // ...and decode the individual fields from memory.
                    using (BinaryReader binaryReader2 = new BinaryReader(memoryStream))
                    {
                        for (int i = 0; i < this.numDeltas; i++)
                        {
                            DAZMorphVertex dazmorphVertex = new DAZMorphVertex();
                            dazmorphVertex.vertex = binaryReader2.ReadInt32();
                            Vector3 vector;
                            vector.x = binaryReader2.ReadSingle();
                            vector.y = binaryReader2.ReadSingle();
                            vector.z = binaryReader2.ReadSingle();
                            dazmorphVertex.delta = vector;
                            this.deltas[i] = dazmorphVertex;
                        }
                    }
                }
            }
        }
    }
    catch (Exception ex)
    {
        Debug.LogError(string.Concat(new object[] { "Error while loading binary delta file ", path, " ", ex }));
    }
}
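Otherwise you call the zip library four times per delta (once for the vertex index and once for each of the three floats), numDeltas times over, which results in about 100k calls and 100k small reads from the disk. Loading everything in one go removes lag in scenes with prerecorded animations.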
// Per-field version: one ReadInt32 and three ReadSingle calls per delta.
public void LoadDeltasFromBinaryFile(string path) {
    //Debug.Log("Loading deltas for morph " + morphName + " from "+path);
    try {
        using (FileEntryStream fes = FileManager.OpenStream(path, true)) {
            using (BinaryReader binReader = new BinaryReader(fes.Stream)) {
                numDeltas = binReader.ReadInt32();
                deltas = new DAZMorphVertex[numDeltas];
                for (int ind = 0; ind < numDeltas; ind++) {
                    DAZMorphVertex dmv = new DAZMorphVertex();
                    dmv.vertex = binReader.ReadInt32();
                    Vector3 v;
                    v.x = binReader.ReadSingle();
                    v.y = binReader.ReadSingle();
                    v.z = binReader.ReadSingle();
                    dmv.delta = v;
                    deltas[ind] = dmv;
                }
            }
        }
    }
    catch (System.Exception e) {
        Debug.LogError("Error while loading binary delta file " + path + " " + e);
    }
}
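To make the difference in call counts between the two versions concrete, here is a small standalone sketch. It does not use VaM's FileManager or the zip library; instead a hypothetical CountingStream wrapper stands in for the underlying file stream and counts how many Read calls reach it. The file layout (int32 count, then 16 bytes per delta) follows the code above; everything else is an assumption for illustration.

```csharp
using System;
using System.IO;

// Hypothetical wrapper: counts how many Read calls reach the inner stream.
public sealed class CountingStream : Stream
{
    private readonly Stream inner;
    public int ReadCalls;
    public CountingStream(Stream inner) { this.inner = inner; }
    public override int Read(byte[] buffer, int offset, int count)
    {
        ReadCalls++;
        return inner.Read(buffer, offset, count);
    }
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

public static class DeltaReadDemo
{
    // Builds a file image: int32 count, then (int32 vertex, 3 floats) per delta.
    public static byte[] BuildFile(int numDeltas)
    {
        using (MemoryStream ms = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(ms))
        {
            writer.Write(numDeltas);
            for (int i = 0; i < numDeltas; i++)
            {
                writer.Write(i); writer.Write(1f); writer.Write(2f); writer.Write(3f);
            }
            writer.Flush();
            return ms.ToArray();
        }
    }

    // Decodes the file and returns how many reads hit the underlying stream.
    // `checksum` only exists to show that both paths decode the same data.
    public static int CountReads(byte[] file, bool bulk, out float checksum)
    {
        checksum = 0f;
        CountingStream counting = new CountingStream(new MemoryStream(file));
        using (BinaryReader reader = new BinaryReader(counting))
        {
            int n = reader.ReadInt32();
            if (bulk)
            {
                // One large read, then decode from memory.
                // BitConverter uses machine byte order (little-endian assumed).
                byte[] block = reader.ReadBytes(n * 16);
                for (int i = 0; i < n; i++)
                {
                    int o = i * 16;
                    checksum += BitConverter.ToInt32(block, o) + BitConverter.ToSingle(block, o + 4)
                              + BitConverter.ToSingle(block, o + 8) + BitConverter.ToSingle(block, o + 12);
                }
            }
            else
            {
                // Four small reads per delta.
                for (int i = 0; i < n; i++)
                    checksum += reader.ReadInt32() + reader.ReadSingle()
                              + reader.ReadSingle() + reader.ReadSingle();
            }
        }
        return counting.ReadCalls;
    }
}
```

Calling `CountReads` on the same buffer with `bulk` set to false and then true should show the per-field path hitting the stream roughly four times per delta, while the bulk path needs only a couple of reads in total.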
Yeah, I can imagine. This is essentially what VaM2 does with its new skinning and morph engine (using the Unity Jobs system). I would have loved to do that for VaM, but I had to move on to VaM2.
I actually tried the Job System first (without the Burst compiler) and it made things even worse, because there was a lot of additional memory copying, and in the end the Unity jobs were just glorified C# threads like the ones you already use in skin meshing. The newest version of Unity jobs allows you to access transforms without the C#->native transition. I think if you also use IL2CPP for those special jobs (is that even possible without making the whole game IL2CPP and breaking all third-party scripts?) you might see some benefit, but it will not match a .dll compiled for AVX2 and optimized with clang.
I'm not quite sure what you mean there. The subthreads of the skinning all write to a consistent vertex array, and that array is not used until all subthreads are complete. Something doesn't sound right here.
I'm curious what you used for profiling. VaM was only optimized using Unity's profiler on several different test cases, and without moving to native calls there was a limited number of things that could be done. Unity doesn't allow accessing native methods without making a C-based plugin. It is a bit of a pain. I guess you did that though.
Yeah, I've got to look it up again, I think I mixed something up. Please correct me if I didn't understand it correctly: when you process skin meshing you iterate over all bones and apply the weights (I'm not an expert on game engines), and as a result you get morphed vertices that get rendered with the skin. What I learned was that you need to morph the vertices in the order of the bones, otherwise you get a wrong result and the skin is all over the place. Basically you need to process every DAZSkinV2VertexWeights of every bone one by one and, in theory, can't multithread it. You, however, did a trick where each thread processes each bone in the same order, but each thread has a range of final vertices that it is allowed to touch.
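The "each thread owns a vertex range" idea described in this exchange can be sketched roughly as follows. This is only an illustration of the description, not VaM's skinning code: the types, the single float per vertex, and the use of Parallel.For are all assumptions made to keep the example small.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for one bone's influence on one vertex.
public struct SketchWeight
{
    public int Vertex;
    public float Weight;
    public SketchWeight(int vertex, float weight) { Vertex = vertex; Weight = weight; }
}

public static class RangeSkinningSketch
{
    // bones[b] lists the vertices influenced by bone b;
    // boneOffsets[b] is a made-up scalar "transform" for that bone.
    public static float[] Skin(SketchWeight[][] bones, float[] boneOffsets,
                               int vertexCount, int threads)
    {
        if (threads < 1) throw new ArgumentOutOfRangeException("threads");
        float[] result = new float[vertexCount];
        int chunk = (vertexCount + threads - 1) / threads;

        Parallel.For(0, threads, t =>
        {
            // Each worker owns the vertex range [lo, hi) and writes nowhere else.
            int lo = t * chunk;
            int hi = Math.Min(lo + chunk, vertexCount);

            // Every worker walks the bones in the same order, so each vertex
            // accumulates its bone contributions in bone order.
            for (int b = 0; b < bones.Length; b++)
            {
                SketchWeight[] weights = bones[b];
                for (int i = 0; i < weights.Length; i++)
                {
                    int v = weights[i].Vertex;
                    if (v >= lo && v < hi)
                        result[v] += weights[i].Weight * boneOffsets[b];
                }
            }
        });
        return result;
    }
}
```

Because a vertex shared between two bones is still written by only one worker, and in bone order, the result in this sketch is the same as running with a single thread. Whether that matches what the real skinning code does is exactly the open question in the posts above.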
That is how VaM2 works, but in VaM we have morphs that can modify other morphs' values, and I believe it doesn't work correctly with what you describe.
In ApplyMorphsThreadedFast() you iterate over all morphs and only continue with a morph if (dazmorph.morphValue != 0.0 || dazmorph.appliedValue != dazmorph.morphValue). My solution caches the morphs for which this condition can be true at all in a _touchedMorphs list. The condition can only ever be true if appliedValue or morphValue has been changed from its default value of 0.0. Once one of those two setters has been called, the morph is marked as "touched" and put into the _touchedMorphs list. Both lists of course just hold references to the same objects, since morphs aren't structs. So I don't see any reason why that would break things.
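A minimal sketch of the "touched list" idea, assuming simplified stand-in classes: SketchMorph and SketchMorphBank are hypothetical and only mirror the names morphValue, appliedValue and the touched list from the description. The real DAZMorph and morph bank classes have far more state, and the real patch may differ in detail.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a morph; only the two values discussed above.
public sealed class SketchMorph
{
    private readonly SketchMorphBank bank;
    private float morphValue;
    private float appliedValue;
    private bool touched;

    public SketchMorph(SketchMorphBank bank)
    {
        this.bank = bank;
        bank.All.Add(this);
    }

    // Both setters mark the morph as touched the first time they are called.
    public float MorphValue
    {
        get { return morphValue; }
        set { morphValue = value; MarkTouched(); }
    }

    public float AppliedValue
    {
        get { return appliedValue; }
        set { appliedValue = value; MarkTouched(); }
    }

    private void MarkTouched()
    {
        if (touched) return;
        touched = true;
        bank.Touched.Add(this); // same object reference as in bank.All
    }
}

// Hypothetical stand-in for a morph bank.
public sealed class SketchMorphBank
{
    public readonly List<SketchMorph> All = new List<SketchMorph>();
    public readonly List<SketchMorph> Touched = new List<SketchMorph>();

    // Visits only touched morphs and returns how many were visited.
    public int ApplyTouched()
    {
        int visited = 0;
        for (int i = 0; i < Touched.Count; i++)
        {
            SketchMorph m = Touched[i];
            visited++;
            // Same condition as quoted above from ApplyMorphsThreadedFast().
            if (m.MorphValue != 0f || m.AppliedValue != m.MorphValue)
            {
                // The real engine would apply vertex deltas here.
                m.AppliedValue = m.MorphValue;
            }
        }
        return visited;
    }
}
```

In this sketch a morph whose value is changed by another morph would also end up in the touched list, as long as the change goes through the same setter. Whether every such path in VaM really goes through the setters is the point the two posts above disagree on.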