WIP | Vision ML tools [Utility]

Member
69 posts
Joined: Jan 2014
btw you guys might want to try this new refactored version of vml-streamer [github.com]



changes:
- interface is now Qt-based (better integrated with your OS, also uses a Houdini stylesheet)
- improved performance (faster webcam reads, no crashes when changing stream types)
- device names/resolutions available directly in the UI (no more "device #0" kind of thing). This also helps improve performance a lot since I'm reading resolutions directly from hardware (no OpenCV resizes!)
- improved MediaPipe tracker previews over video (which you can turn on/off as needed)
- tooltips! (explaining some of the stream settings)
- for Python people: adding custom stream types is now much nicer and simpler. I've included documented examples.

And here's the new GitHub page [github.com] if you want to follow the next developments. I've already stopped working on the previous version (but will leave it up until the end of the challenge).

** please note that this new version still has problems with:
- playback of a video file
- mediapipe body tracker (for video and webcam)

I'm working on fixes for these already.
Edited by fabriciochamon - Feb 22, 2024 02:21:54

Attachments:
vml_streamer.png (187.8 KB)

Member
13 posts
Joined: Apr 2017
Just updated to the newest version. Works for me. Thank you Fabricio!

I was able to fix the orients by using a rest skeleton and retargeting the point positions with an IK solver. First I get the hand orientation with a look-at on the roots of the index and pinky fingers, using the root of the middle finger as an up vector.
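
The look-at step can be sketched in plain Python. This is only an illustration under stated assumptions, not the setup from the hip file: it assumes the 21-point MediaPipe hand layout (wrist = 0, index root = 5, middle root = 9, pinky root = 17), aims one axis from the index root to the pinky root, and measures the up hint from the wrist to the middle root. The wrist reference and the name `hand_frame` are assumptions made for the example.

```python
import math

# MediaPipe hand landmark indices used in this sketch.
WRIST, INDEX_ROOT, MIDDLE_ROOT, PINKY_ROOT = 0, 5, 9, 17


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    length = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if length < 1e-9:
        raise ValueError("cannot normalize a zero-length vector")
    return (v[0] / length, v[1] / length, v[2] / length)


def hand_frame(landmarks):
    """Return an orthonormal (aim, up, normal) basis for the palm.

    Hypothetical helper: aim runs index root -> pinky root, and the
    up hint runs wrist -> middle root (the wrist is an assumption).
    """
    aim = normalize(sub(landmarks[PINKY_ROOT], landmarks[INDEX_ROOT]))
    up_hint = sub(landmarks[MIDDLE_ROOT], landmarks[WRIST])
    normal = normalize(cross(aim, up_hint))
    # Re-derive up so the three axes are exactly perpendicular.
    up = cross(normal, aim)
    return aim, up, normal
```

The three axes can be packed into a 3x3 matrix and used as the hand's orientation before the finger IK runs.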

It sort of works. The main issue is that the length of the fingers changes, so the IK sometimes has difficulty reaching the target.


white is rest
red is just copying the point positions of the animation, with broken orients
green is after the finger IK with fixed orientations but slightly different pose due to different bone scale.
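
One way to deal with the changing finger lengths is to rescale each captured bone back to its rest length before it becomes an IK target. The sketch below is a plain-Python illustration of that idea, not code from the hip file; `enforce_bone_lengths` and its arguments are hypothetical names.

```python
import math


def enforce_bone_lengths(chain, rest_lengths):
    """Rebuild a joint chain so every bone has its rest length.

    chain: list of (x, y, z) joint positions from root to tip.
    rest_lengths: one length per bone, so len(chain) - 1 values.
    Bone directions are kept, only the lengths change.
    """
    if len(rest_lengths) != len(chain) - 1:
        raise ValueError("need exactly one rest length per bone")
    fixed = [tuple(chain[0])]
    for i in range(1, len(chain)):
        parent, child = chain[i - 1], chain[i]
        d = (child[0] - parent[0], child[1] - parent[1], child[2] - parent[2])
        length = math.sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2)
        base = fixed[-1]
        if length < 1e-9:
            # Degenerate bone with no direction: collapse it onto its parent.
            fixed.append(base)
            continue
        scale = rest_lengths[i - 1] / length
        fixed.append((base[0] + d[0] * scale,
                      base[1] + d[1] * scale,
                      base[2] + d[2] * scale))
    return fixed
```

Because the chain is rebuilt from the root outwards, the fingertip ends up somewhere other than the captured tip, but it is always reachable by a skeleton with the rest proportions.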

Edited by luisguggenberger - Feb 26, 2024 04:21:29

Attachments:
hand_retarget.png (69.1 KB)
Untitled-2.jpg (207.3 KB)

Member
21 posts
Joined: Feb 2016
Awesome stuff! I'm still keeping an eye on this, and great work on the new version - it looks very polished now! I'll test it out as best I can and report back here.

Again, thanks so much for the development, it is very inspiring!


Very cool tests, Luis! I was wondering if throwing CHOPs on it would help with the slight jitter. I must say it is pretty amazing to be able to do this from just a webcam, what a time to be alive!

Would you be willing to share the hip so I can dissect what you did with the joints to make them work?
Edited by olivetty - Feb 23, 2024 20:32:36
Mage level 2
www.headofoliver.com
Member
13 posts
Joined: Apr 2017
Hi Oliver,

Attached is the hip file. I cleaned it up a bit. Left hand only at the moment. You can crank up the "Smooth Factor" in VML Streamer to get less jitter. I had another idea: forcing the finger bone lengths in VEX before feeding them into the IK solver.
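
For anyone wondering what a smoothing factor typically does: a common approach is exponential smoothing, where each new sample is blended with the previous smoothed value. The sketch below shows that generic technique in plain Python. It is not taken from VML Streamer, whose actual smoothing may work differently, and `PointSmoother` is a made-up name.

```python
class PointSmoother:
    """Exponential smoothing for a list of (x, y, z) points.

    factor = 0 passes samples through unchanged; values close to 1
    smooth heavily but add lag. Hypothetical helper, not VML Streamer code.
    """

    def __init__(self, factor):
        if not 0.0 <= factor < 1.0:
            raise ValueError("factor must be in [0, 1)")
        self.factor = factor
        self.state = None

    def update(self, points):
        if self.state is None or len(self.state) != len(points):
            # First sample (or point count changed): nothing to blend with.
            self.state = [tuple(p) for p in points]
        else:
            f = self.factor
            self.state = [
                tuple(f * s + (1.0 - f) * p for s, p in zip(prev, new))
                for prev, new in zip(self.state, points)
            ]
        return self.state
```

Calling `update` once per incoming frame returns the smoothed points. Higher factors trade jitter for latency, which matters for live capture.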


Here is a reference of how to match the neutral pose. The hand should face the camera with fingers stretched out.
Edited by luisguggenberger - Feb 26, 2024 04:21:12

Attachments:
mediapipe_hands_03.hiplc (2.1 MB)
screenshot.png (306.4 KB)

Member
21 posts
Joined: Feb 2016
A scholar and a gentleman! Thanks so much! I'll be sure to play around with it, seems like a great start into making use of this fantastic tool by Fabricio!
Member
69 posts
Joined: Jan 2014
Hi everyone,

thank you Luis for the hip!! I'll have a look.

Today I was able to do a deeper test, started from scratch but looks like I ended up with the same as you: rest skeleton -> IK solvers to match animation.

This kind of works (some trashy tendon blendshapes just for fun):


Hip file attached and fully commented if you want to check. It comes with a built-in recorded animation (the one from the video above, with a CHOP filter applied), but it's also prepared for live capture if you want to try.

The challenges are definitely:
1- hand rotations: palm facing away from webcam. The 2d->3d projection from MediaPipe is really a hack, not sure how much I can improve this
2- joint jumps during realtime captures due to finger occlusion.
3- skeleton distortions (varying bone lengths / joints with physically incorrect rotations / etc)

For (3), I was able to alleviate this by solving IK individually for each finger in a for loop. I added a line from the root to the tip of the finger and repositioned the middle joint (used as twist vector) so that it stays on the same z axis as that line.
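
One possible reading of that middle-joint fix, sketched in plain Python: remove the sideways component of the middle joint so it lies in the plane that contains the root-to-tip line and a chosen bend direction. This is an interpretation for illustration, not the node setup from the hip file; `flatten_middle_joint` and `bend_hint` are hypothetical names.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def flatten_middle_joint(root, middle, tip, bend_hint):
    """Project `middle` into the plane spanned by root->tip and bend_hint.

    bend_hint is an assumed direction the finger is allowed to bend in,
    for example the palm normal. If it is parallel to the root->tip line
    there is no unique plane, so the joint is returned unchanged.
    """
    side = cross(sub(tip, root), bend_hint)
    side_len = math.sqrt(dot(side, side))
    if side_len < 1e-9:
        return tuple(middle)
    side = (side[0] / side_len, side[1] / side_len, side[2] / side_len)
    # Signed distance of the middle joint from the bend plane.
    offset = dot(sub(middle, root), side)
    return (middle[0] - offset * side[0],
            middle[1] - offset * side[1],
            middle[2] - offset * side[2])
```

Running this per finger before the IK solve keeps the twist target from drifting sideways, which is one source of physically incorrect rotations.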

The thumb also proved to be difficult; it's easy to break it when spreading the fingers past a relaxed pose.

I'll keep trying more stuff, keep you guys posted.
---

Adding other sample hips here [github.com]
Edited by fabriciochamon - Feb 28, 2024 00:22:06

Attachments:
mp_hands_kinefx.hiplc (2.6 MB)

Member
21 posts
Joined: Feb 2016
This looks really good and promising! I can definitely see this come in handy! Thanks for sharing the hipfile as well!
Member
69 posts
Joined: Jan 2014
Another update being pushed soon:

Android Sensors

Member
21 posts
Joined: Feb 2016
Another great update, and the UI is starting to look really nice as well! Do you foresee this becoming something like a poor man's virtual camera in the future if we can get more sensors in there? Like what Unreal Live Link does with iPhones, making the phone the camera, so to speak.

Nevertheless, looks great and good luck with the rest of implementation. Following this thread with great interest!
Member
69 posts
Joined: Jan 2014
Thanks Oliver! I think I'm heading more for a middle ground here: maybe provide a more robust AR-like movement than raw sensors alone, but without the bi-directionality of Live Link (no viewport showing on the phone).

I don't want to dig into the rabbit hole of a mobile app and its own complex domain, even more when performance is crucial.

Hopefully VisionML becomes this sort of robust middleware that users can build Houdini animation tools on top of. Let's see.
Member
21 posts
Joined: Feb 2016
Ah, yes, the bidirectionality is overrated, I always watch the screen anyways! This already looks really nice!
Member
13 posts
Joined: Apr 2017
Hi guys,

Attached is my new hip file. I had best results using two full body IKs (first for the palm, second for the fingers), since I could limit joint rotations to prevent unnatural motion. I couldn't figure out a way to limit axis rotations with the vop ik solver. I also found a way to prevent bone scaling with rig vops. Thank you Fabricio for your awesome input and all your work! I used some of your nodes in my hip file, I hope you don't mind! I apologize in advance for the messy hip file.



Best,
Luis
Edited by luisguggenberger - Mar 1, 2024 10:12:31

Attachments:
mediapipe_hands_11.hiplc (13.0 MB)

Member
69 posts
Joined: Jan 2014
Woah Luis that looks really good!! thank you for the hip!
I can see you also found a way to manage hand rotations as well, excellent.

Do you mind if I add your hip (with credits of course) to the github sample scenes folder ?
Member
13 posts
Joined: Apr 2017
fabriciochamon
Do you mind if I add your hip (with credits of course) to the github sample scenes folder ?

Sure, please do! If I find time this week I will upload a cleaner hip file, which you could update on github.

Best,
Luis
Member
69 posts
Joined: Jan 2014
Great, thank you Luis, much appreciated!


@everyone: here's a new release [github.com], including:

Changelog Mar-04-2024:

- Added new "Android Sensors" stream type
- Added new "Rotation from device" hda (rotates an object using sensor data from Android devices)
- Fixed transform issues with MediaPipe Body and MediaPipe Face hda decoders
- Exposed more settings for MediaPipe Body:
  • Num bodies to track
  • Min body presence confidence
  • Min landmark trackers confidence
  • 3D Coordinates
- Overall UI polishes
- Overall documentation polishes (HDAs / Streamer tooltips)
- Added sample hip files for MediaPipe Hands -> Kinefx workflows
Edited by fabriciochamon - Mar 4, 2024 19:36:50
Member
21 posts
Joined: Feb 2016
Amazing! I am deep down in AI world for now but can't wait to jump on and test this one! Wohoo!
Member
13 posts
Joined: Apr 2017
One thing I noticed: using "Flip image" in VML Streamer also flips the hand assignment. So when flipped, I'm controlling the right hand using my left hand. It's an easy fix in Houdini obviously, I just noticed this behavior changed from the previous version and was wondering if it was intentional.
Member
69 posts
Joined: Jan 2014
This is intentional. Having "flip image" change the hand assignment helps make two-hand motions quickly accessible with a single hand, so you don't have to let go of the mouse.

At least that was my experience when using the tool, but of course I'm open to suggestions. I can add a toggle on the HDA (or streamer) to prevent the behavior, what do you think?
Member
13 posts
Joined: Apr 2017
Yes, it's certainly a good feature to have, being able to still use the mouse. When others in my studio tested it, they felt it was a bit unintuitive that the webcam image doesn't behave like a mirror. When one hand accidentally got out of frame, they moved the wrong hand first to bring it back into frame. (Hope that description makes sense.)

If it were me, I would probably have either two toggles (flip image/flip hands) or have the mirrored image as the default without actually mirroring the hands.
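
The two-toggle idea can be sketched like this in plain Python. It is only an illustration of keeping the two flips independent, not code from VML Streamer or the HDA. It assumes each hand is a dict with a "Left"/"Right" label and points whose x coordinate is normalized to the 0..1 image range; `apply_flips` is a made-up name.

```python
def apply_flips(hands, flip_image=False, flip_hands=False):
    """Mirror coordinates and/or swap hand labels as independent options.

    hands: list of {"side": "Left" or "Right", "points": [(x, y, z), ...]}
    with x normalized to the 0..1 image range (an assumption).
    """
    swap = {"Left": "Right", "Right": "Left"}
    result = []
    for hand in hands:
        points = hand["points"]
        if flip_image:
            # Mirror horizontally around the image centre.
            points = [(1.0 - x, y, z) for x, y, z in points]
        side = swap[hand["side"]] if flip_hands else hand["side"]
        result.append({"side": side, "points": list(points)})
    return result
```

With `flip_image=True` and `flip_hands=False` the picture behaves like a mirror while each physical hand keeps driving its own side. Turning both on reproduces the single-toggle behavior discussed above.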

That said it's an easy fix inside Houdini for anyone who wants a different behavior.

I made a first test today adding finger motion to a body mocap and it worked out pretty well! It still needs cleanup for final animation, but for previs it does a great job. It was a bit of a struggle to record the motion capture while playing back the body animation live, so that body and hands match timewise. I don't quite understand the "Time Dependant" toggle on Mocap Stream. It helps in terms of performance when the real time toggle is switched on, but for some reason I had to timeshift to get the finger animation into place.
Member
69 posts
Joined: Jan 2014
(New release with all updates below: VisionML 1.0.1 [github.com])

Good suggestion, thanks Luis. I've removed the default image flip from the VML Streamer hands stream and added a preferred orientation mode in the hands decoder HDA. We now have:

First person (palms facing Z-) or
Mirror video (palms facing Z+).

The hand sides now align with real world, so it should be more user friendly. I still left "first person" as the default choice so I don't break previous hips (hopefully, haven't checked!).

---

Other than that, I've reworked the pose compare HDA to be more robust. It now considers hand local space and joint angles (more accurate), and supports point weights. You can use the weight attribute to, for example, give more "comparison" priority to a single finger or a couple of joints, disregarding other areas where you don't need to match as much. This alleviates jitter errors and improved the pose matching in my tests.
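
To illustrate the general idea of a weighted, angle-based comparison, here is a minimal plain-Python sketch. It is not the HDA's implementation: `pose_distance`, the joint triples and the weighting scheme are assumptions made for the example. Because joint angles don't change when the whole hand moves or rotates, comparing them behaves much like comparing poses in hand local space.

```python
import math


def joint_angle(a, b, c):
    """Angle in radians at joint b, between the bones b->a and b->c."""
    u = (a[0] - b[0], a[1] - b[1], a[2] - b[2])
    v = (c[0] - b[0], c[1] - b[1], c[2] - b[2])
    lu = math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)
    lv = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if lu < 1e-9 or lv < 1e-9:
        return 0.0
    cos_angle = (u[0] * v[0] + u[1] * v[1] + u[2] * v[2]) / (lu * lv)
    # Clamp to guard against rounding just outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def pose_distance(pose_a, pose_b, triples, weights):
    """Weighted mean absolute joint-angle difference (0 = same pose).

    triples: (parent, joint, child) point indices, one per measured joint.
    weights: one non-negative weight per triple.
    """
    total = sum(weights)
    if total <= 0.0:
        return 0.0
    error = 0.0
    for (i, j, k), w in zip(triples, weights):
        angle_a = joint_angle(pose_a[i], pose_a[j], pose_a[k])
        angle_b = joint_angle(pose_b[i], pose_b[j], pose_b[k])
        error += w * abs(angle_a - angle_b)
    return error / total
```

Setting a joint's weight to zero removes it from the comparison, so a noisy or irrelevant finger no longer affects the match.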


There's a new sample scene to demo the pose compare HDA changes, running inside a solver to create a minimal drawing "app". I used Android sensors so my phone can control the brush size (rotating front/back), which is also a nice example of multiple streams being used together.
Edited by fabriciochamon - Mar 7, 2024 17:35:09