WIP | Vision ML tools [Utility]

fabriciochamon
I'm building a set of HDAs to bring computer vision and machine learning algorithms into Houdini.

It uses the SideFX Labs "MoCap Stream" node as a generic UDP client, receiving data from a program I created called VML Streamer.

VML Streamer runs on both Windows and Linux, uses OpenCV, and for now it can stream the webcam image plus Google's MediaPipe hand/body/face real-time trackers from a simple webcam (body still needs work!).

The code is designed so that new sources can be plugged in easily and streamed into Houdini as a Python dictionary, with very little effort on the UI and data-handling side.
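As a rough illustration of that design (a minimal sketch, not VML Streamer's actual code; the port number and payload layout here are assumptions), streaming a dict boils down to serializing it and sending UDP packets to whatever port the MoCap Stream node listens on:

```python
# Minimal sketch only: serialize a dict of tracker data and send it over UDP.
# The port number and payload format are illustrative assumptions.
import pickle
import socket

def send_tracker_data(data: dict, host: str = "127.0.0.1", port: int = 8888) -> None:
    payload = pickle.dumps(data)  # any picklable dict works
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))

# e.g. 21 hand landmarks as (x, y, z) tuples
send_tracker_data({"hand_left": [(0.1, 0.2, 0.3)] * 21})
```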

In Houdini, a number of HDAs decode the received data into geometry, using OpenCL nodes where possible for better performance. For real-time trackers, users can snapshot poses into a pose library and create triggers anywhere based on pose matching.
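Pose matching, in spirit, can be as simple as thresholding the distance between the live landmarks and a stored snapshot. A hypothetical sketch (not the HDA's actual logic; names and threshold are made up):

```python
# Hypothetical pose-matching sketch: the trigger fires when the mean per-joint
# distance between the live pose and a snapshotted pose drops below a threshold.
import numpy as np

def pose_matches(live_pose, snapshot, threshold=0.05):
    live = np.asarray(live_pose, dtype=float)  # (n_joints, 3)
    snap = np.asarray(snapshot, dtype=float)   # (n_joints, 3)
    return float(np.linalg.norm(live - snap, axis=1).mean()) < threshold
```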

There are also a couple of ONNX helper nodes that facilitate converting image grids to and from tensors.
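The idea behind those helpers, sketched in plain NumPy (the layout is my assumption, not the nodes' real interface): a Houdini grid stores one color per point, while most ONNX models expect NCHW float tensors.

```python
# Sketch of the grid <-> tensor conversion the helpers perform (assumed layout).
import numpy as np

def grid_to_tensor(cd, width, height):
    """Flat (width*height, 3) point colors -> (1, 3, H, W) NCHW tensor."""
    img = np.asarray(cd, dtype=np.float32).reshape(height, width, 3)
    return img.transpose(2, 0, 1)[None, ...]

def tensor_to_grid(tensor):
    """(1, 3, H, W) NCHW tensor -> flat (H*W, 3) point colors."""
    return np.asarray(tensor)[0].transpose(1, 2, 0).reshape(-1, 3)
```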

If time allows, I'd like to implement an OpenPose binding (to get better full-body tracking, since MediaPipe is not really suited to 3D world coordinates, as stated by the devs), and also OpenCV ArUco markers, which can help with rigid object tracking and camera tracking.
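For reference, detecting the markers themselves is only a few lines in OpenCV (4.7+ API shown below; older versions use cv2.aruco.detectMarkers instead); the harder part is the camera calibration needed for pose estimation:

```python
# ArUco marker detection with the OpenCV >= 4.7 API (sketch).
import cv2

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

frame = cv2.imread("frame.jpg")  # or a webcam frame
corners, ids, rejected = detector.detectMarkers(frame)
print(ids)  # marker ids found in the frame, or None
```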

Here's VML Streamer (Python); it needs a bit of testing on different environments, as I was mainly focusing on the alpha features for now: VML Streamer GitHub [github.com]

A speed-up demo (no sound):


Playground:


I'll be posting any updates here before the final submission.
Edited by fabriciochamon - Dec. 11, 2023 08:52:41
fabriciochamon
Jan-10 update:

VML Streamer:
- added support for video playback
- fixed crashes on windows

Houdini HDAs:
- polished all node interfaces
- added help cards and hda icons

---

I think this will be the last major update before release. I tried to implement OpenCV ArUco markers, but it will take a bit more work since it heavily depends on camera calibration.

Help cards got some love, examples attached.

Video playback demo:


Puppet rig demo:
(As MediaPipe inference on Linux is GPU accelerated, there are a lot of dropped frames due to the screen capture, sadly.)


Help card examples:


All nodes:
Edited by fabriciochamon - Jan. 10, 2024 17:36:16

Attachments:
help_cards_sample.jpg (240.8 KB)
sops.jpg (141.8 KB)

fabriciochamon
A quick face rig test.

Google's MediaPipe won't export true 3D world coordinates, as it is mostly targeted at mobile stuff (filters, overlays, etc.), so there are some distortions (think of it as a 2.5D tracker). Also, this is definitely not a replacement for other facial mocap solutions.

But I believe in its potential as a region-based driver, like adding some life to the eyes, etc.

It is also real-time, easy to set up, and doesn't rely on fancy hardware (just a simple monocular webcam), which makes it a nice environment for quickly prototyping stuff.

The example below was captured on a Logitech C270 (720p) webcam, and the motion is recorded using the MoCap Stream built-in recording system.

The output of the recording is a MotionClip, but since it is still a byte array (not yet decoded by any VisionML node), it can't make use of the nice MotionClip nodes (retime, etc.). But as soon as the MotionClip is evaluated and converted into points, one can use timewarp or CHOP nodes to massage the data.

fabriciochamon
Here's one more experiment.

The left hand drives camera rotation; closing the hand zooms in.
The right hand uses pose matching to switch between 3 different models.

The whole experience feels much nicer without screen capture, since it causes a lot of dropped frames!
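For anyone curious how the zoom could be wired up, here's a hypothetical sketch (landmark indices follow MediaPipe's hand model: 0 = wrist, 12 = middle fingertip; the remap range is made up, not what my setup actually uses):

```python
# Map hand "openness" (wrist-to-middle-fingertip distance) to a zoom factor.
import numpy as np

def hand_zoom(landmarks, closed=0.1, open_=0.4):
    wrist = np.asarray(landmarks[0], dtype=float)
    tip = np.asarray(landmarks[12], dtype=float)
    spread = np.linalg.norm(tip - wrist)
    t = np.clip((spread - closed) / (open_ - closed), 0.0, 1.0)
    return 1.0 + 2.0 * (1.0 - t)  # fully closed hand -> 3x zoom
```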

olivetty
This is absolutely amazing, I have been searching for so long for something like this. Today, again, I thought I would give it a go, and after 7h at the computer, looking at maybe doing some crazy hack between 3 different repos to get this working at all, I found your thread! Is there anywhere I can donate or anything to keep this project alive? I don't have much money, but I will throw whatever I can at this, because it is essential to get these tools into artists' hands for even better and more streamlined workflows. Plus, your interface and implementation is just amazing!

I am about to test the alpha right now, so cool to see this and again - THANK YOU!!!

You say MediaPipe doesn't do inference with GPU on Windows, how come? I think I've seen other projects using GPU on Windows? :O

EDIT: I can't seem to find where the HDAs shown in the videos are? I've got VML Streamer up and running!
EDIT2: I tried this video just for testing ( https://www.pexels.com/video/a-woman-dancing-at-home-6003986/ [www.pexels.com] ) and downscaled it to 540x960 and tried running that. It played fine in the Streamer, but then choosing face or body didn't give anything, and choosing hands crashed the Streamer! I will try with the webcam tomorrow instead, and also see if I can find the HDAs!
EDIT3: I found the other thread where the HDAs are available. I will test it out!
EDIT4: Sorry for all the edits, BUT! I have installed it, but there are two problems. My main webcam shows up interlaced, extremely stretched, and mirrored on top of itself. I did overcome this by using an OBS Virtual Camera instead, but the aspect ratio is wrong. Next: when selecting anything in the Type dropdown, nothing happens on the video in VML Streamer. I see no MediaPipe overlay or anything like in your demos. Houdini gives a generic Python error when laying down a MoCap Stream and then a VideoCapture.

VML shows MT: 142.0 CV: 60.0 and I'm streaming in 720p.

Great potential in this - so amazing and I am so happy if I can get this to work!


-
Oliver
Edited by olivetty - Jan. 17, 2024 20:02:08
fabriciochamon
Thank you for the kind words Oliver! I'm also excited about the possibilities and I'm glad this looked useful to you too!

Please bear in mind that this is, like you mentioned, an alpha release. I was the only tester, so there will definitely be a few bugs. Please feel free to report them on the GitHub repo. Some things that might get in the way:

- VML Streamer needs a working webcam to start! (I haven't implemented a video-only solution yet.)
- Video playback might crash depending on which file you load (due to video size/codec type/randomness). Prefer smaller videos, ~300 px wide.
- Avoid adding too many streams at once (especially for MediaPipe; try working with one MP stream at a time).
- Sometimes the body tracker stops working for no apparent reason; only a program restart will fix it.

As for mocap data, MP does NOT produce true 3D coordinates, unfortunately, so the use case is limited (i.e. not a replacement for body/facial mocap). Hands work well enough for more things, hence more examples in the video. On top of that, **MP is not GPU accelerated on Windows** (so if you can, use Linux!).

I like to do all the disclaimers so people don't make too many assumptions about the tool in its current state.

Anyway, that said, the main plugin design is done now. I tried to pave the way for future tech, so if a better mocap solution comes up (and it's free) I can plug it into the tool relatively easily.

But right now, just having bare hands/body/face as an interface for anything inside Houdini is too much fun! I can't stop thinking of ways I can drive stuff: a camera, rigs, geo deformations, a virtual theremin where hand trackers control the pitch and volume a CHOP node generates... (oh sorry, too specific! :lol)

Regarding donations, don't worry! If enough people use it, I'll definitely keep improving it! If you'd still like to contribute though, google my name and "patreon" (I don't want to spam this thread with the link). Alternatively, just email me (fabricio.chamon@gmail.com) with any suggestions and questions, I'll be happy to help!!
Edited by fabriciochamon - Jan. 17, 2024 19:42:44
olivetty
fabriciochamon
Thank you for the kind words Oliver! I'm also excited about the possibilities and I'm glad this looked useful to you too! [...]


Awesome, thanks for the fast response! Yeah, I have been able to get the VideoDecoder to work (it shows my webcam in the Houdini viewport), but anything with MediaPipe doesn't work at all. I can't see anything with it. I have it installed per the requirements; it shows up when queried on the command line and works for other apps. But maybe I am doing something wrong? Any suggestions?

Oh yeah, I would like to play with the hands as well. I have some simulation stuff I want to try out, that's why I got so excited, so many opportunities for sure! That theremin would be such a fun experiment!

I'll check out your Patreon, thanks so much again!!

Hope
fabriciochamon
olivetty
This is absolutely amazing, I have been searching for so long for something like this. [...]


Answering your questions:

"You say Mediapipe doesn't do inference with GPU on windows, how come?"
When I run MP detectors with the GPU delegate I get this error "NotImplementedError: GPU Delegate is not yet supported for Windows". Also I've read more than once in their github issues page that GPU was not yet implemented. Those messages were from around mid last year. Maybe that's a limitation of the python bindings only?
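For context, the delegate is chosen when building the detector options in the MediaPipe Tasks Python API; something like this is the call that raises that error on Windows (the model path is illustrative):

```python
# Where the CPU/GPU delegate is chosen in the MediaPipe Tasks Python API.
# On Windows, Delegate.GPU raises NotImplementedError; use Delegate.CPU there.
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

base = python.BaseOptions(
    model_asset_path="hand_landmarker.task",   # illustrative model path
    delegate=python.BaseOptions.Delegate.GPU,  # GPU works on Linux only
)
options = vision.HandLandmarkerOptions(base_options=base, num_hands=2)
landmarker = vision.HandLandmarker.create_from_options(options)
```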

"EDIT: I can't seem to find where the HDAs shown in the videos are"
"EDIT3: I found the other thread were the HDAs are available. I will test it out!"
Yeah sorry, all important links are consolidated in the main entry post.

"EDIT2: I tried this video just for testing and downscaled it to 540x960 and tried running that. It played fine in the Streamer but then when choosing face or body didn't give anything and choosing hands crashed Streamer!"
Hm, interesting. From my tests, most crashes happen due to video resolution. I've been cropping and resizing videos to very small sizes (or to the specific region I want to capture). For body: don't be afraid to use small videos (MP usually does a good job on small videos anyway!).

Here's a test where I've scaled down to 200x354 for the body tracker:


And another test with hands, where I cropped and resized to 320x544:


So I guess the limitation is probably on the UI side; I'll investigate. But for now, yeah, use small videos.
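If you want to pre-shrink a source yourself before feeding it in, a small OpenCV helper like this works (just an illustration, not part of the app):

```python
# Cap the longest side of a frame before tracking (illustrative helper).
import cv2

def shrink(frame, max_side=320):
    h, w = frame.shape[:2]
    scale = max_side / max(h, w)
    if scale < 1.0:
        frame = cv2.resize(frame, (int(w * scale), int(h * scale)),
                           interpolation=cv2.INTER_AREA)
    return frame
```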

"EDIT4: Sorry for all the edits BUT! I have installed it but there are two problems. My main webcam shows up as interlaced, extremely stretched and is mirrored on top of itself."
Can you tell me what is your webcam model and maybe send a screenshot of the stretched UI please?

Attachments:
screenshot.jpg (41.3 KB)
screenshot2.jpg (46.4 KB)

olivetty
fabriciochamon
[...] Can you tell me your webcam model and maybe send a screenshot of the stretched UI, please?


Ah, yes, it seems that MediaPipe is not initializing at all. No landmarks show up in the video like they do for you, even after cropping and resizing. The video plays fine in your streamer, but no landmarks whatsoever.

Sure, here is a screenshot of how my webcam looks. It's a 720p stream from my phone to Camo Studio. It does work when I use it as an OBS Virtual Camera instead, but the direct feed doesn't work. I tried different resolutions but got the same result. OBS worked but was stretched a little vertically, so it looks like there is a hardcoded aspect ratio somewhere?



And here is a little GIF showing how it looks if I wave my hand



And here is the output if I do it as an OBS Virtual Camera - notice the wrong aspect ratio!

Edited by olivetty - Jan. 18, 2024 07:44:17

Attachments:
fU0mbdpiZw.jpg (674.2 KB)
python_2TfPsGUqxz.gif (1.4 MB)
obs64_Px2kKTmheL.png (498.4 KB)

fabriciochamon
olivetty
Camo Studio

Thank you for all the info and screenshots, Oliver! I'm already working on an update to better acquire the webcam resolution at the start. Will hopefully push an update in a few hours.
Rob Chapman
Oh, I forgot to post! I got mine working with the hands but not the body! Thank you Fabricio.


Webcam is a Logitech BRIO.
Edited by Rob Chapman - Jan. 18, 2024 12:40:31
fabriciochamon
Hi Oliver and Rob,

I just pushed a big update to the video player; could you guys please re-download the plugin and test? Links remain the same:

VisionML for Windows [fabriciochamon.com]
VisionML for Linux [fabriciochamon.com]

I believe this will fix all the incorrect image aspect/interlaced issues.

@Oliver: A bit of tech info: the OpenCV video capture APIs are very picky about different video sources. On Windows, DirectShow is faster for some webcams like my Logitech C270 (dshow was the default API until now!). The downside is that dshow does not work in some cases, like phone > Camo Studio (OBS virtual camera kind of works, but not always). So I reverted the default API to "first available", and now it all seems to work nicely, including loading videos from file.
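In OpenCV terms, the backend switch looks like this (device index 0 assumed):

```python
# CAP_DSHOW forces DirectShow on Windows (fast for some webcams, broken for
# some virtual sources); CAP_ANY picks the first backend that works.
import cv2

cap_dshow = cv2.VideoCapture(0, cv2.CAP_DSHOW)  # previous default
cap_any = cv2.VideoCapture(0, cv2.CAP_ANY)      # new default: first available
```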

@Rob: Can you retry and see if this update fixes the body trackers for you? Also, does it work in a brand-new VML Streamer session with just a single stream set to MP body? I believe there might be bugs when you add/switch stream types many times. Try to choose one (or a few) stream types and stick to them for an entire session for now. I'll investigate.

Also changelog Jan-18-2024:

- Available webcam device ports are now fetched at the start
- VML Streamer now starts in "no video" mode. It's faster if all you need are video files, since no webcam is loaded.
- The video capture API is exposed in the main menu. (Windows: DirectShow is faster, but doesn't work with all video sources!)
- Added a "Video Size" control, so you can resize video before sending data through streams (and trackers). (This can help with general performance and avoid crashes at large resolutions.) *Note: always prefer downsampling directly at the webcam driver if possible (for example, Camo Studio has controls for setting resolution); this helps even more with performance.
- Removed the FPS info overlay from the video feedback and added it as simple UI text together with the current stream's video resolution. This also helps with performance, and the info doesn't get in the way.





Thank you guys for your feedback! Let me know how it goes now.
Edited by fabriciochamon - Jan. 18, 2024 18:56:19

Attachments:
vml1.jpg (29.5 KB)
vml2.jpg (20.4 KB)

Rob Chapman
Hey! New version tested on Windows. The body tracker now works!

I was originally setting/testing it on one single stream (first load, no other streams) on the previous version and was getting a weird signal like what Oliver had, but now it works! It takes longer for the default first-instance video to kick in now, maybe over 10 seconds once I've selected the webcam; previously it used to be instant. Not actually a problem though: if I set 'Direct Show' before selecting the webcam, it appears instantly as before.

The video window is bigger now though, which is good!

Also, I forgot to mention previously that adding the face tracker (again, no other streams and the first time per instance) just closed down VML Streamer, and it still does the same thing with this version. I see in your demo you used a video file rather than the webcam; maybe I should record a video and test, as before when I selected a video file it crashed too. Aaand, just tried on this version: I record a video with my webcam (MP4), select it with the file browser in VML Streamer, and then it just closes down the streamer.
Edited by Rob Chapman - Jan. 19, 2024 10:02:39

Attachments:
Capturebodytrack.JPG (112.6 KB)

fabriciochamon
Hi Rob, thanks for the feedback!

Good to know video is more reliable now. (And yes, DirectShow is much faster! If it works for your webcam device, turn it on at the start.) Note that dshow can crash while loading video files though; that's why I didn't make it the default anymore!

For the face tracker, can you please try recording a quick 5-10 sec video and resizing it to something small, like 300 px wide? See if that works. (I'm just trying to rule out that the crash is related to your specific OS/webcam versions or something.)

If that works, I probably need to touch the code for performance once again (even more so on Windows, since the trackers are not GPU accelerated). There might just be too much data being passed around at the moment. For now, the workaround is to simply use smaller videos (or the video resize box).

As a benchmark, the tests in my release video were made on Linux, reading from a Logitech C270 webcam directly at 320x240 px (i.e. no video resize, just a plain smaller resolution directly from the device). That's the best approach, but rather complex to support across all available resolutions from most drivers on Windows/Linux. But it's on my radar for future updates.
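For anyone testing along: asking the driver for a smaller resolution (instead of resizing frames afterwards) looks like this in OpenCV. Note the driver is free to grant a different size, which is exactly what makes supporting every device tricky:

```python
# Request 320x240 from the device itself, then verify what was actually granted.
import cv2

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 240)
print(cap.get(cv2.CAP_PROP_FRAME_WIDTH), cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
```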
olivetty
fabriciochamon
For the face tracker, can you please try recording a quick 5-10 sec video and resizing it to something small, like 300 px wide? See if that works. [...]

Hi again! Thanks for the update, Camo now works beautifully, BUT I cannot track the face. Body works fine and hands work fine, but when selecting MP Face the image from my cam turns black and stays that way, or other times it just crashes VML!

Again, thanks so much for bringing us these tools. I know time is hard to find, so I am really appreciative of all that you have already done! Any idea what it might be? Maybe I can check something on my end?
Rob Chapman
Hey Fabricio, yes: if I set the video settings on the BRIO capture down to 120P 30fps (640 x 480), it still crashes. BUT if I set the video size in your VML window settings to 0.5 or 0.25 (320x240 or 160x120), it does not crash! Live video is working at this size with face tracking enabled. While I do get video working (even when my face goes out of camera it freezes the video, but bringing my face back in, the video restarts fine), I'm not getting any tracking data or face-lines overlay like it does with body and hands. So indeed the smaller size is helping it not crash, but maybe my face (and beard) is too tricky for it to compute, perhaps. I'll try the missus, she doesn't have a beard.


UPDATE: it didn't work with my better half's face either... BUT I looked at your setup video, and it wasn't overlaid on your VML Streamer window either! So I did the MoCap Stream > MediaPipe face decoder node setup in Houdini and, et voila, it works beautifully with live video! This was at 160 x 120: a really good and fast update, and not too taxing on my RTX 4070 GPU either. Just tried 0.5 (320 x 240); the refresh is not as quick as at 0.25, but it's still quick, not too taxing, and also works! Nice one Fabricio!
Edited by Rob Chapman - Jan. 21, 2024 09:38:44

Attachments:
Captureface.JPG (184.0 KB)

fabriciochamon
Hey guys, just a quick update to let you know that I've probably found the culprit of so many of the crashes. It likely comes from bugs in the UI module itself (it's based on a Python port of imgui), so I'm rewriting the entire app in Qt now, with some added features (auto-detection of webcam device names and resolutions, much faster webcam start times, etc.).

Also, in this new version we can get frames at different resolutions directly from the driver (i.e. no post-capture image resize!), which also improves performance a lot! Another performance gain comes from the fact that Qt is signal based, as opposed to the previous UI, which is event-loop based; it means there is zero chance that MediaPipe runs detection over the same frame twice.
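The gist of the signal-driven capture, as a minimal sketch (assuming PySide2; the class and signal names are mine, not the app's):

```python
# Each grabbed frame is emitted exactly once, so a slot running MediaPipe
# can never process the same frame twice (unlike polling in a render loop).
from PySide2.QtCore import QObject, Signal
import cv2

class CameraWorker(QObject):
    frame_ready = Signal(object)  # emitted once per new frame

    def run(self):
        cap = cv2.VideoCapture(0)
        while cap.isOpened():
            ok, frame = cap.read()
            if ok:
                self.frame_ready.emit(frame)  # connected slot runs detection
```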

Anyway, this is looking much better and more responsive overall (I tested with a simple webcam, Camo Studio + phone cam, and OBS virtual cam). But since it's a complete rewrite, I'll need a couple more days to finish implementing everything.

I'll send new links when done. Again thank you for all the feedback, much appreciated!
olivetty
fabriciochamon
Hey guys, just a quick update to let you know that I've probably found the culprit of so many of the crashes. [...]

It's you who should have the thanks! This is opening up so many possibilities, I feel; even if it isn't state-of-the-art mocap, it can still probably get the job done for smaller tasks, which is still many more opportunities than we had before! Can't wait for the update!
luisguggenberger
Hi guys,

I found out about this tool yesterday and installed it straight away. It works like a charm, amazing work Fabricio! I was trying to retarget the hand mocap points to a KineFX rig and had moderate success with it. I have difficulty keeping the orientations stable. Has anyone tried this yet? I will post my progress next week.

All the best,
Luis
fabriciochamon
luisguggenberger
[...] I was trying to retarget the hand mocap points to a KineFX rig and had moderate success with it. I have difficulty keeping the orientations stable. [...]

Hi Luis, great to have more people trying this, thanks for your feedback!!

Regarding your question, short answer: no. Long answer: having the decoders output an actual KineFX skeleton has been on my radar for a long time, but since webcam/video player performance got unexpectedly difficult, I haven't had time to look into it yet. You can see the current "skeleton" is more of a visual reference (lines/joints geo) than something truly usable.

For stable orientations, you'd need to estimate joint angles, which is no easy task. Maybe record the motion and work inside a solver to increment rotations based on previous frames, then build a more stable motion frame by frame?

Another alternative is to cancel the hand transform and build the KineFX rig with the fingers in hand space (get 3 joints that roughly form the palm, extract a transform from a stashed neutral pose, and multiply the tracked hand positions by the inverse of this transform); that way you'd at least get more consistent up vectors, with hands facing a frontal axis only. Then you can reapply the hand transform in a wrangle after the rig is created.
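Here's a sketch of that "cancel the hand transform" idea in NumPy (which three joints form the palm is up to you; wrist/index-base/pinky-base would be candidates):

```python
# Build an orthonormal palm frame from 3 joints and move landmarks into it.
import numpy as np

def palm_frame(wrist, index_base, pinky_base):
    wrist, index_base, pinky_base = (
        np.asarray(p, dtype=float) for p in (wrist, index_base, pinky_base))
    x = index_base - wrist
    x /= np.linalg.norm(x)
    n = np.cross(x, pinky_base - wrist)  # palm normal
    n /= np.linalg.norm(n)
    y = np.cross(n, x)                   # completes the orthonormal basis
    m = np.eye(4)
    m[:3, 0], m[:3, 1], m[:3, 2], m[:3, 3] = x, y, n, wrist
    return m

def to_palm_space(landmarks, frame):
    pts = np.c_[landmarks, np.ones(len(landmarks))]  # homogeneous coords
    return (pts @ np.linalg.inv(frame).T)[:, :3]
```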

I'll think about more elegant solutions regarding rigs for the next updates.
Edited by fabriciochamon - Feb. 22, 2024 02:18:47