Issues with VAT implementation

Member
3 posts
Joined: March 2018
Hi Folks,

I've noticed a few issues with the implementation of VATs for rigid bodies:

1. The function provided seems super expensive (656 vertex instructions). I was able to write a function to read the same data with interpolation for almost a third of that (256 vertex instructions). I don't understand why you are interpolating the frames in the shader when you can just use the texture filtering and get it for free?

2. Why are you encoding the quaternions? I've been able to store unencoded quaternions in a texture and read them without issue. I want to be able to use Houdini VATs and read them in Niagara and our own VAT readers, but the encoding adds a lot of unnecessary overhead and complexity to reading them. UE5 has a very solid quaternion reader that does not require any decoding (see attached).

3. There doesn't appear to be any protection against zero values, which causes pieces to disappear and reappear sometimes.

There might be some things in there that are relevant for animations that don’t have consistent topology.

Perhaps we could get a pipeline that is more focused on consistent topologies, so that we can strip out a lot of the complexity?

Attachments:
UE5 Quaternion reader.JPG (160.9 KB)

Staff
99 posts
Joined: Feb. 2021
In reply to jrs300's original post above:

Hey, thanks for your questions. I wanted to address them both from a technical standpoint and a development one.

1. Yes, the RBD MF is expensive if all the expensive features are enabled, but the cost is scalable. If vertex count × instruction count becomes a bottleneck, you can turn off Smoothly Interpolated Trajectories, Support Surface Normal Maps, Interpolate Color, and Interpolate Spare Color, and turn on Allow Exporting Real-Time Data JSON File in Houdini and Support Legacy Parameters and Instancing in UE. Turning off Auto Playback and setting Display Frame directly further reduces the instruction count. Finally, in Houdini, set Pivot Accuracy to "High".

If you do all that and only leave Interframe Interpolation on, the VS instruction count is ~347. It will say ~386, but that's misleading, as the Pivot decoding part of the graph doesn't actually branch when Pivot Accuracy is not "Very High", so it's effectively ~347.

If you only care about interpolating positions, then we can add a static switch to skip interpolating rotations, in which case, the count falls to ~252. If you skip Interframe Interpolation entirely, the count can go as low as ~240.

As for texture filtering, it only really works in limited circumstances. One studio posted an example of using filtering with their version of VAT, but that worked only because they made it possible for their engine to filter vertically only. It doesn't make sense to filter the pixels horizontally, as the data belong to discrete points in that dimension. Furthermore, vertical filtering, even if it's implemented in the engine, still imposes a strict limit: each vertical pair of pixels must represent the same point in two consecutive frames. Therefore the number of pieces you can export is limited by the maximum width of the texture, in many cases 8K at most.
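
To make the width limit concrete, here is a minimal Python sketch. It assumes a purely hypothetical layout in which each frame wraps across several texture rows once the point count exceeds the texture width (this is not necessarily the layout the Labs VAT ROP writes, and all names are invented for illustration). With such a layout, the texel directly below a point belongs to a different point in the same frame, so hardware vertical filtering would blend the wrong data, while two point-sampled fetches plus a manual lerp still work.

```python
import math

def texel_for(point, frame, num_points, tex_width):
    """Map (point, frame) to (column, row) when a frame wraps over several rows."""
    rows_per_frame = math.ceil(num_points / tex_width)
    return point % tex_width, frame * rows_per_frame + point // tex_width

def owner_of(column, row, num_points, tex_width):
    """Inverse mapping: which (point, frame) a given texel stores."""
    rows_per_frame = math.ceil(num_points / tex_width)
    frame, row_in_frame = divmod(row, rows_per_frame)
    return row_in_frame * tex_width + column, frame

def sample_position(texture, point, frame_float, num_points, tex_width):
    """Two point-sampled fetches plus a manual lerp between consecutive frames."""
    f0 = math.floor(frame_float)
    t = frame_float - f0
    c0, r0 = texel_for(point, f0, num_points, tex_width)
    c1, r1 = texel_for(point, f0 + 1, num_points, tex_width)
    a, b = texture[r0][c0], texture[r1][c1]
    return tuple(x + (y - x) * t for x, y in zip(a, b))

# Toy data: 6 points, a texture only 4 texels wide, 3 frames.
NUM_POINTS, TEX_WIDTH, NUM_FRAMES = 6, 4, 3
rows = NUM_FRAMES * math.ceil(NUM_POINTS / TEX_WIDTH)
texture = [[(0.0, 0.0, 0.0)] * TEX_WIDTH for _ in range(rows)]
for f in range(NUM_FRAMES):
    for p in range(NUM_POINTS):
        c, r = texel_for(p, f, NUM_POINTS, TEX_WIDTH)
        texture[r][c] = (float(p), 10.0 * f, 0.0)

# The texel directly below point 1 / frame 0 holds point 5 / frame 0.
below = owner_of(1, 1, NUM_POINTS, TEX_WIDTH)
# Manual interpolation still finds point 1 halfway between frames 0 and 1.
halfway = sample_position(texture, 1, 0.5, NUM_POINTS, TEX_WIDTH)
```

If the point count never exceeds the texture width, every frame is exactly one row and vertical neighbours do line up, which is the restricted case described above.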

Texture filtering also only works for position, as quaternions can't be linearly interpolated like positions or colors.
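
A small Python sketch of why that is. It compares a component-wise lerp (what a texture filter would effectively compute) with a textbook shortest-path slerp for a rotation about Z. The helper names are made up for this example and the slerp here is the standard one, not the Labs shader code.

```python
import math

def normalize(q):
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def lerp(q0, q1, t):
    """Component-wise blend, as a bilinear texture filter would do."""
    return tuple(a + (b - a) * t for a, b in zip(q0, q1))

def slerp(q0, q1, t):
    """Textbook shortest-path slerp for unit quaternions (x, y, z, w)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:  # take the shorter arc
        q1, dot = tuple(-c for c in q1), -dot
    if dot > 0.9995:  # nearly parallel: normalized lerp is accurate enough
        return normalize(lerp(q0, q1, t))
    theta = math.acos(dot)
    s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))

def z_rotation(degrees):
    half = math.radians(degrees) / 2.0
    return (0.0, 0.0, math.sin(half), math.cos(half))

def z_angle(q):
    """Rotation angle in degrees of a quaternion that rotates about +Z."""
    return math.degrees(2.0 * math.atan2(q[2], q[3]))

q0, q1 = z_rotation(0.0), z_rotation(170.0)
raw = lerp(q0, q1, 0.25)
raw_length = math.sqrt(sum(c * c for c in raw))  # about 0.81: no longer a unit quaternion
lerp_angle = z_angle(normalize(raw))             # about 35.8 degrees, even after renormalizing
slerp_angle = z_angle(slerp(q0, q1, 0.25))       # 42.5 degrees, a quarter of the way
```

Renormalizing the filtered value fixes the length but not the uneven angular speed, and the error grows with the angle between the two frames.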

2. The reason I'm encoding quaternions is so that we can support proper rotation interpolation. Standard quaternion slerp breaks down if a piece rotates more than 180° between two consecutive frames (which actually happens a lot). In that case the pieces will start to flip-flop between rotation directions from frame to frame, as they don't know in which of the two directions to interpolate towards the next frame. We implemented a special quaternion slerp called Multi-Revolutions-Per-Frame Slerp that can deal with high-speed rotations. It requires encoding the number of revolutions completed by each piece within each frame, so we throw away one of the components of the quaternion in order to store the revolution count. The screenshot of the UE function rotates a vector by a quaternion. The nodes in our graph used to accomplish the same task are actually very simple (see the attached screenshot).
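
Going back to the more-than-180° problem in this point, the following Python sketch shows the ambiguity for a piece spinning about a single fixed axis. The `revolutions` parameter is a hypothetical signed full-turn count invented for this toy; it only illustrates why some extra per-frame information is needed and is not the actual Multi-Revolutions-Per-Frame Slerp or its encoding.

```python
import math

def z_rotation(degrees):
    """Unit quaternion (x, y, z, w) for a rotation about +Z."""
    half = math.radians(degrees) / 2.0
    return (0.0, 0.0, math.sin(half), math.cos(half))

def shortest_z_angle(q0, q1):
    """Signed angle in [-180, 180) that a shortest-path slerp sweeps from q0 to q1."""
    a0 = 2.0 * math.atan2(q0[2], q0[3])
    a1 = 2.0 * math.atan2(q1[2], q1[3])
    delta = math.degrees(a1 - a0)
    return (delta + 180.0) % 360.0 - 180.0

def interpolate_angle(q0, q1, t, revolutions=0):
    """Swept angle at time t; `revolutions` is the hypothetical extra full-turn count."""
    return t * (shortest_z_angle(q0, q1) + 360.0 * revolutions)

# The piece really spins +270 degrees within one frame.
start, end = z_rotation(0.0), z_rotation(270.0)

# +270 and -90 degrees produce the same rotation (q and -q), so the two
# keyframes alone cannot tell the directions apart.
same_rotation = all(abs(a + b) < 1e-12 for a, b in zip(end, z_rotation(-90.0)))

naive_mid = interpolate_angle(start, end, 0.5)                 # -45: spins the wrong way
fixed_mid = interpolate_angle(start, end, 0.5, revolutions=1)  # +135: follows the real motion
```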



3. There is protection against near-zero values in multiple places. That type of protection is part of the reason why we don't always throw away a fixed component of the quaternion, as mentioned in the previous point. We throw away the maximum component, so as to ensure we are not using sqrt() on a tiny number. The disappearing and reappearing issue is an Unreal bug rather than a built-in "feature" (based on the reported and solved cases so far). If you observe it, restarting the Unreal Editor after the initial VAT asset import typically fixes the problem. If not, consider caching your geometry before the VAT ROP and follow the other debug steps mentioned on the "VAT3.0: Debugging (Unreal)" slide in this doc/PDF (https://www.artstation.com/artwork/zOyke6). So far I have never seen the issue persist after trying all 5 debugging options.
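
The idea of dropping the maximum component can be sketched in Python as follows. This is the generic "drop the largest component" scheme with made-up function names; how the index of the dropped component (and the revolution count) is actually packed in the Labs textures is not covered here.

```python
import math

def encode_drop_largest(q):
    """Drop the largest-magnitude component of a unit quaternion (x, y, z, w)."""
    index = max(range(4), key=lambda i: abs(q[i]))
    # q and -q are the same rotation, so flip signs to make the dropped value positive.
    sign = 1.0 if q[index] >= 0.0 else -1.0
    kept = tuple(sign * q[i] for i in range(4) if i != index)
    return index, kept

def decode_drop_largest(index, kept):
    """Rebuild the dropped component from the unit-length constraint."""
    # The largest of four squared components is at least 1/4, so the sqrt
    # argument never gets close to zero (the clamp only guards rounding).
    missing = math.sqrt(max(0.0, 1.0 - sum(c * c for c in kept)))
    q = list(kept)
    q.insert(index, missing)
    return tuple(q)

# A rotation whose w is exactly zero: always dropping w would mean
# reconstructing it with sqrt() of a value at or near zero.
q = (0.6, 0.0, 0.8, 0.0)
index, kept = encode_drop_largest(q)
restored = decode_drop_largest(index, kept)

# A quaternion whose largest component is negative comes back as -q,
# which is the same rotation.
q_neg = (0.0, -1.0, 0.0, 0.0)
restored_neg = decode_drop_largest(*encode_drop_largest(q_neg))
```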



From the development perspective, we have indeed given careful thought to every trade-off we needed to make. The reality is that a lot of customers want VAT to be changed in a specific way to meet their needs, so we really tried to be inclusive and equip the tool with a lot of advanced features. In particular, people from virtual production and TV/film are now actively using VAT, and for that user group, fidelity is more important than performance. So I'm using static feature switches to keep things scalable, up to the point where there is just no easy way to build a shader switch, such as creating two different ways of exporting/reading quaternions, or creating two versions of the VAT shaders of drastically different complexity.

So unfortunately, among the different axes of objectives, 1) ease of use, 2) visual fidelity & rich features, 3) data compression, and 4) low internal complexity / ease of replication, the 4th axis is the one that had to be sacrificed.

I can't conclusively say we will never support a stripped-down version of VAT, but it basically means creating/maintaining 8 more shaders (4 for UE, 4 for Unity) plus another set of export configurations inside the VAT ROP. So far we have received only two requests to do this, out of the tens of VAT requests we get. The current VAT is also scalable and adaptable; for instance, you can export @orient as color or spare color if you don't want the quaternion to be encoded for your custom shader.

Everything considered, adding a stripped-down VAT version is just not very feasible, resource-wise, at the present moment. We are hoping the customers who have unique needs can spare some of their resources to build their unique setups, and we are always willing to provide information and assistance.

Thanks again for your feedback. I realize my answer may not be exactly what you are looking for, but I will take a closer look at whether there is still room to cut down instructions or add better scalability switches within the current framework. I will also add a section to an upcoming tutorial that focuses on how to configure settings for maximum performance, as discussed in the first point.
Edited by MaiAo - Oct. 30, 2021 15:30:23

Attachments:
UE4Editor_MFObeNEJFN.png (145.3 KB)
sidefx-labs-powerpnt-f0dskcruwu.jpg (702.0 KB)

Mai Ao
Senior Technical Lead of SideFX Labs
youtube.com/@notverydarkmagic
Member
3 posts
Joined: March 2018
Thanks for the detailed and well explained response Mai Ao!

Also, apologies if the tone of my original question was a little intense. I'm trying to get a lot of stuff done quickly, and sometimes I find it hard to conceal my frustrations.

Sounds like a lot of thought has gone into this system and I really appreciate you taking the time to explain your choices.

It's definitely a tricky thing to balance, and it's easy for people like myself to focus narrowly on our own needs.

I'm looking to use VATs to drive Niagara as this will allow it to be compatible with Nanite particles.

I should be able to build what I need now with the information you've given me.

I think there are definitely some optimizations we can do for the specific use case of performance-sensitive real-time applications.

I'll do some experiments and post my results in case it's useful for future development of your tools.

Thanks again!
Member
3 posts
Joined: March 2018
Just an additional follow up to this.

What do you feel would be the issue with using Euler rotations or something similar that might work better for linear interpolation between frames?
Staff
99 posts
Joined: Feb. 2021
In reply to jrs300's two posts above:

Oh, don't worry about it. Game dev is not easy and things get frustrating sometimes.

As for the question of rotation interpolation, quaternion slerp is the well-established best solution. Quaternions are more compact, computationally efficient, and numerically stable compared to Euler angles or rotation matrices. Linearly interpolating the values of any form of representation just doesn't work well. Euler angles are susceptible to gimbal lock and consequently strange interpolated paths. This video is a good demo of why that happens (https://www.youtube.com/watch?v=zc8b2Jo7mno). If the rotation is represented using Euler angles, rotating vertices also involves trig functions, whereas quaternions only need the much cheaper cross products, mul and add.
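
For reference, a minimal Python sketch of rotating a vector by a unit quaternion with nothing but cross products, multiplies and adds. It uses the standard identity v' = v + 2 u × (u × v + w v), where u is the quaternion's xyz part and w its scalar part; the function names are made up for this example and it is not a transcription of the Labs material graph.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate(v, q):
    """Rotate vector v by unit quaternion q = (x, y, z, w) without any trig."""
    u, w = q[:3], q[3]
    inner = tuple(c + w * vi for c, vi in zip(cross(u, v), v))  # u x v + w * v
    outer = cross(u, inner)                                     # u x (u x v + w * v)
    return tuple(vi + 2.0 * oi for vi, oi in zip(v, outer))

# A 90-degree rotation about +Z; the sqrt is only used to build the test
# quaternion, the per-vertex work above is cross/mul/add only.
s = math.sqrt(0.5)
rotated = rotate((1.0, 0.0, 0.0), (0.0, 0.0, s, s))  # approximately (0, 1, 0)
```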

In short, as explained here (https://youtu.be/C7JQ7Rpwn2k?t=2705), quaternions are the way to go.
Edited by MaiAo - Nov. 2, 2021 07:11:36