Surprising effect of hair curve density on framerate

everlaster

I did some framerate comparisons with different hair curve densities, and noticed that the relationship wasn't simply "more density = less fps". In fact, I got better performance with a high density than a low density, which was pretty weird to notice.

First, I tested with the Simone hairstyle by RenVR which I happen to use regularly, with sim on/off and collision on/off, and with vertex versus pixel lighting:

haircurvedensity-styles.jpg
curvedensity_comparison.jpg

  • The screenshots show how close the camera was to the head - the hair filled up the height of the window.
  • Hair multiplier was 64 to emphasize the impact on fps.
  • The lights were normal InvisibleLight atoms.

Vertex lighting seemed to exaggerate the difference in framerate between low-mid and high-mid curve densities.

Toggling hair collision while sim was enabled didn't seem to matter a lot: in both cases (red and yellow lines) the frame rate was highest around density 32 and lowest around 12 or 16.

With hair sim disabled, the frame rate scaled more linearly with curve density. Since the GPU load from the hair physics sim was removed, all of the difference was purely down to rendering. This implies that the physics sim itself is responsible for the ups and downs in the mid range, at least with this particular hairstyle.

Next, I wanted to figure out if other factors would affect the result. The below tests were done in a different VAM session than the above tests, but in the same scene. Sim and collision were enabled and lighting was pixel.

Hair multiplier

multiplier_vs_density.jpg

  • Hair multiplier has no effect beyond lowering the overall performance: curve density has the same effect at each multiplier.
  • It's odd that there was less of an improvement in the midrange this time, just a small spike at 24.
Hairstyle

For contrast with the Simone hairstyle, I picked a short straight hair - the short4 base style from short hair4 by @ddaamm .

hairstyle_vs_density-styles.jpg

hairstyle_vs_density.jpg


With the short4 hair, the optimal density for performance was 16 instead of 24 (disregarding the ugly very low densities).

Camera position

Continuing with short4, I moved the camera closer to test whether the result differed from Simone because of how much hair was being rendered per area of screen.

camerapos_vs_density-styles.jpg

camerapos_vs_density.jpg


Inconclusive result... Moving the camera this close, curve density didn't really matter.

Curl scale and frequency

curliness_vs_density-styles.jpg

curliness_vs_density.jpg


I thought increasing curliness could bring out differences between different curve densities, but it's likely that the camera being too close was preventing that.

Conclusions...

Based on these initial results, the optimal curve density seems to depend most on the hairstyle and the viewpoint's distance from the hair.

In general, low densities below 24 seem to be not worth using. 24 to 32 is probably always good, and in some cases, high values of 40 and above can perform as well or better than values below 24.

However, the results in the first graph with the Simone hairstyle, which show a significant improvement in fps above density 24, need reproducing.

Many more hairstyles could be tested and collected into a single graph. I'll try to find time to do that since it'd be interesting to get a clearer picture of how much the effect of curve density on framerate varies, and if there's a pattern that emerges.

Setup

Hardware: Ryzen 7 3700X, 32GB DDR4-3200, RTX 3080 10GB

The framerate was recorded as a rolling average over 15 seconds, waiting at least 20-25 seconds for the average to fully stabilize after changing settings. The built-in performance monitor isn't really optimal for quickly getting averaged readings like this, so I used my own plugin for that (Performance Overlay, paid).
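
For anyone who wants to take similar readings without the plugin, here is a rough Python sketch of a time-windowed rolling average. It is not the plugin's actual code: the `RollingFps` class and its method names are made up for illustration, and it assumes you can get a timestamp and a duration for each frame.

```python
from collections import deque


class RollingFps:
    """Average framerate over the most recent `window` seconds (hypothetical helper)."""

    def __init__(self, window=15.0):
        self.window = window
        self.frames = deque()  # (timestamp, frame_duration) pairs inside the window
        self.total = 0.0       # sum of the durations currently in the window

    def add_frame(self, timestamp, duration):
        """Record one frame that finished at `timestamp` and took `duration` seconds."""
        self.frames.append((timestamp, duration))
        self.total += duration
        # Drop frames that finished more than `window` seconds ago.
        while self.frames and self.frames[0][0] <= timestamp - self.window:
            _, old_duration = self.frames.popleft()
            self.total -= old_duration

    def fps(self):
        """Frames in the window divided by the time they took."""
        if not self.frames or self.total <= 0:
            return 0.0
        return len(self.frames) / self.total
```

Right after a settings change, the window still holds frames from the old setting, so the reading sits somewhere between the old and the new framerate. Only once a full window has passed does it reflect the new setting alone, which is why waiting longer than the 15 second window matters.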



Your thoughts?

:coffee:
 
In case anyone wants to try to test for themselves, I've attached a cleaned up version of the scene I used. Basically just 2 lights, default atom in T-pose, front and back camera angles (which are close to but not exactly what were used in the original post).

Dependency: Spawn Point
 

Attachments

  • hairperfscene.json
    73.5 KB · Views: 0
Honestly, I had trouble finding this post without going again to Discord as I never go to the lower sections. They do make sense to exist, but they're seldom used.

It's a pity I don't care much for baldies, the FPS increase is humongous.
Trying with Simone, my results match your conclusions, and it's very surprising that values below 24 are so much worse than 24-40.

Tried a couple of single item hairs from Roac and Prestigities, and while the Curve Density changes were not as dramatic as with Simone, it still shows that going under 24 not only makes the hair look worse, it also lowers FPS.
 
Interesting find!

I'm by no means an expert, but my best guess is that this might at least in part be due to shader divergence.

Please note that I only have a tenuous grasp on the concepts and mechanics behind GPU optimization (just enough to be dangerous I guess), so take everything I say with a grain of salt.

Hair has sim particles and render particles. Once a frame of simulation is complete, render particles are computed from sim particles and they don't need to match 1:1. When you change the curve density you change the number of render particles to create from the sim. At least this is what I've gathered from my work on AttachToVertex (which uses render particles), and CUAClothing (which uses both).

The render particle buffer is organized by strand and there's no 'padding', i.e. if you have a curve density of 15, particle 0 will be the root of the first strand and particle 15 will be the root of the second strand and so on. This buffer gets sent to the hair rendering shader, and IIRC rendering lighting/shadows is the most expensive part of hair and is likely what has the most impact on framerate. It might be interesting to try the experiment again with no lights in the scene.
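
To make the layout concrete, here is a tiny Python sketch of the index arithmetic implied by that description. It assumes the buffer really is packed strand by strand with no padding, as described above; `strand_and_offset` is a made-up helper name, not anything from VaM.

```python
def strand_and_offset(particle_index, curve_density):
    """Map a flat render particle index to (strand index, position along the strand).

    Assumes a tightly packed buffer: `curve_density` particles per strand, no padding.
    """
    return divmod(particle_index, curve_density)


# With a curve density of 15, particle 15 is the root (offset 0) of the second strand.
print(strand_and_offset(15, 15))
```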

GPU threads are organized into groups of 32 (or maybe 64 on some hardware?). When you have a curve density other than 32, some of the threads will be working on one strand and some on another. A GPU thread group only has one instruction pointer, so branches in the code need to be executed one at a time (with some of the threads sleeping). People who know what they're talking about call that execution divergence. Depending on how the shader is written, working on different hair strands in a single group might cause execution divergence. Similarly, the groups share a cache and shaders work best when all group threads access similar memory, so memory divergence is also a thing.
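
As a rough way to see how strands line up with thread groups, the Python sketch below counts how many different strands each group of 32 consecutive render particles touches. It assumes one thread per render particle and a tightly packed buffer, neither of which is confirmed for the actual hair shader, and `strands_per_group` is a hypothetical helper. It says nothing about how much divergence actually costs.

```python
def strands_per_group(curve_density, num_strands, group_size=32):
    """Return, for each thread group, how many distinct strands its particles belong to.

    Assumes one thread per render particle and no padding between strands.
    """
    total_particles = curve_density * num_strands
    counts = []
    for start in range(0, total_particles, group_size):
        end = min(start + group_size, total_particles) - 1  # last particle in this group
        first_strand = start // curve_density
        last_strand = end // curve_density
        counts.append(last_strand - first_strand + 1)
    return counts


for density in (12, 16, 24, 32, 40, 64):
    counts = strands_per_group(density, num_strands=96)
    print(density, sum(counts) / len(counts))
```

Under these assumptions, every group touches exactly one strand at densities 32 and 64, and exactly two at 16. Densities that don't divide evenly into 32 give a mix, with some groups straddling a strand boundary.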

All that being said, the only thing this theory would explain is why 32 is an efficient number. According to this theory a curve density of 8 should be more efficient than 6, but maybe the additional work of a buffer that's 33% larger washes out the gain there. There could also be more complicated things going on.
 