Work Item Indices

- gyltefors

Please do not set all work item indices to one and then introduce redundant attributes like csv_rowindex; this just makes life harder. You can't use "sort by work item index" in Wait for All, because the index has been renamed. You have to add a separate Sort node that does nothing but restore the work item index to what it was supposed to be, and that Sort node can't do much else, such as reversing the order, the way a SOP Sort can.

Remember, each dot on the node is supposed to represent one work item. The index of an item should of course be zero for the upper-left item, one for the next, and so on. This is totally inconsistent now: some nodes index items this way, others start with one, and others set all items to one.

If you argue that a CSV file is just one item, then there should be only one dot on the node. If you argue that it is a partition, then make it a real partition and display it as a rectangle. But whatever you do, please be consistent, and don't introduce strange exceptions where there are several items but they all have index one.
- chrisgreb
gyltefors: "Please do not set all work item indices to one, and introduce redundant attributes like csv_rowindex."
If your upstream item has an index of 1, then each downstream item will also have index 1, which I think is what you are seeing. If not, please let us know how to repro.
The reasoning for the csvinput index change is that it should match the convention of all the other processors, which by default derive each work item's index from the index of its upstream item. However, it should not have changed the csvinput default behavior, so that will be fixed in the next build (index = CSV row index by default).
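For illustration, here is a minimal sketch of that convention as the generate callback of a Python Processor TOP. The onGenerate signature and the item_holder.addWorkItem() arguments follow the PDG Python API as documented, but treat the exact argument names as assumptions that may differ between builds:

```python
# Sketch of a Python Processor TOP's generate callback (assumed PDG
# API; keyword argument names may differ between Houdini builds).
import pdg

def onGenerate(self, item_holder, upstream_items, generation_type):
    for upstream_item in upstream_items:
        # One downstream item per upstream item; the index is copied
        # from the parent, which is why an upstream index of 1 makes
        # every downstream item report index 1 as well.
        item_holder.addWorkItem(parent=upstream_item,
                                index=upstream_item.index)
    return pdg.result.Success
```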
- gyltefors
I have used TOPs for a few different use cases, and I have to disagree with the reasoning of trying to match the upstream work item index. When I need to do matching, I always create my own attribute, for clarity. Let's say I have a point cloud in SOPs. I want to know the index of each point, which is $PT or @ptnum. If I need to match two different point clouds, I'll create explicit indices like @my_point_id = @ptnum + 1. This is clear and leaves no room for misunderstanding. Just imagine if point indices were unpredictable, and @ptnum were sometimes 1 for every point in the cloud just because that would somehow match the upstream node, for some particular use case, for some particular user. It would be horrible. It is the same with work item indices. If a node has 10 work items, just index them 0-9, and if some user wants some particular matching, let them create an attribute @my_work_item_id = 1, or whatever. Clarity is always better in the end.
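As a concrete illustration of the explicit-id approach, here is a sketch for a Python SOP; my_point_id is just the arbitrary attribute name from the post, and the VEX one-liner @my_point_id = @ptnum + 1 in an Attribute Wrangle does the same thing:

```python
# Python SOP sketch: give every point an explicit, predictable id
# instead of relying on an index that an upstream node might remap.
import hou  # implicit inside Houdini; shown for self-containment

node = hou.pwd()
geo = node.geometry()

# Integer point attribute with a default of 0
geo.addAttrib(hou.attribType.Point, "my_point_id", 0)

for point in geo.points():
    # Same effect as the VEX @my_point_id = @ptnum + 1 above
    point.setAttribValue("my_point_id", point.number() + 1)
```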
Edited by gyltefors - 2019年4月16日 00:04:41
- gyltefors
Btw, talking about the CSV input node: it does not support Unicode. That is a big issue if you are not living in the US. And CSV output splits a string attribute containing commas into several columns, but only when specifying tab as the delimiter. (Please also check this thread: https://www.sidefx.com/forum/topic/56964/?page=1#post-277657)
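Until the nodes handle this natively, one workaround is to read and write the files yourself with Python's standard csv module, which copes with both UTF-8 and quoted commas (a sketch; the file names and columns are made up):

```python
import csv

# Read a UTF-8 CSV; utf-8-sig also strips a BOM if one is present.
# csv handles quoted fields, so a value like "Tokyo, Japan" stays
# one column instead of being split at the comma.
with open("points.csv", newline="", encoding="utf-8-sig") as f:
    for row in csv.reader(f):
        print(row)

# Writing: QUOTE_MINIMAL wraps any field containing the delimiter in
# quotes, so commas inside strings survive a round trip.
with open("out.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(["id", "address"])
    writer.writerow([0, "Tokyo, Japan"])
```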
Edited by gyltefors - Apr 17, 2019 07:06:28
- kenxu
Hi gyltefors, firstly we acknowledge that some of these peripheral nodes have not received quite the level of production testing that they need, as we were more focused on the core FX workflows that would impact many more people out of the gate. However, we are working to close the gap as quickly as possible. For the first production build, both the CSV nodes and JSON nodes have received a significant amount of attention. If you take a look at the changelog for csv or json, you'll see that a significant number of issues have already been resolved:
https://www.sidefx.com/changelog/?journal=17.5&categories=54&body=&version=17.5&build_0=173&build_1=234&show_versions=on&show_compatibility=on&items_per_page=
While that is part of the problem, the flip side of the coin is that your specific use case is not at all a simple one. WRT the JSON node, we have taken a detailed look at your use case. Part of the issue there, at least, is that the JSON file in that case is a hierarchy that is being flattened. Entries are heterogeneous, with ids that point to each other to reconstruct the hierarchy. Even if one were to write code to solve the problem, it would not be trivial. That said, we are doing all we can to help. In addition to the hierarchical array retrieval we have already added, we are planning to add:
1) A resolved path. If you make a query like "carparks/*/address", we'll attach a resolved path to each work item, so that it reads carparks/1/address, carparks/2/address, etc. This will help you put things back together (see the sketch after this list).
2) Support for sub-tree queries, where the result of a query is itself a JSON blob representing a sub-portion of the original JSON. This would allow you to hierarchically pick apart a JSON file.
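To make the resolved-path idea concrete, here is a rough pure-Python sketch of resolving a wildcard query like carparks/*/address against parsed JSON. The data shape is invented for illustration, and plain list indices come out 0-based here:

```python
def resolve(data, pattern, prefix=""):
    """Yield (resolved_path, value) pairs for a '/'-separated pattern
    in which '*' matches every key or list index at that level."""
    head, _, rest = pattern.partition("/")
    if isinstance(data, dict):
        items = data.items()
    elif isinstance(data, list):
        items = enumerate(data)
    else:
        return
    for key, value in items:
        if head != "*" and str(key) != head:
            continue
        path = f"{prefix}/{key}" if prefix else str(key)
        if rest:
            yield from resolve(value, rest, path)
        else:
            yield path, value

# Invented data shape, loosely following the carparks example
doc = {"carparks": [{"address": "1 Main St"}, {"address": "2 High St"}]}
for path, value in resolve(doc, "carparks/*/address"):
    print(path, "->", value)  # carparks/0/address -> 1 Main St, etc.
```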
So these would fall more into the RFE bucket than the BUG bucket, and they should help you get further. However, we also recommend restructuring the data on your end to make it a little easier to digest. Finally, WRT CSV, one of the issues that was blocking you - that the Table SOP did not support CSV files written by csvoutput - has been solved and made it into the production build.
If there are any further specific problems we can help you with, please let us know.
Edited by kenxu - Apr 22, 2019 17:23:32
- Ken Xu
- jason_iversen
In my opinion, for complex cases it's (arguably) simpler to write the loader yourself. There is only so much you can expect from a UI-driven interface to a format that can return such hugely varying data topologies.
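In that spirit, a hand-rolled loader for the hierarchy-with-ids case described above might start out like this (a sketch; the "entries", "id", and "parent_id" field names are hypothetical):

```python
import json

# Sketch of a hand-rolled loader for a flattened hierarchy in which
# entries reference each other by id (field names are hypothetical).
def load_tree(path):
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)["entries"]
    # Copy each entry and give it a children list, keyed by id
    by_id = {e["id"]: dict(e, children=[]) for e in entries}
    roots = []
    for entry in by_id.values():
        parent = by_id.get(entry.get("parent_id"))
        if parent is not None:
            parent["children"].append(entry)
        else:
            roots.append(entry)
    return roots
```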
Jason Iversen, Technology Supervisor & FX Pipeline/R+D Lead @ Weta FX
also, http://www.odforce.net
- gyltefors
jason_iversen: "In my opinion, for complex cases it's (arguably) simpler to write the loader yourself. There is only so much you can expect from a UI-driven interface to a format that can return such hugely varying data topologies."
Learning Python is next on my list. I first learned HAPI and wrote my pipeline using that. Hearing about TOPs, it seemed to be a better long-term solution, so I am switching over to that. While I will eventually need to write some custom integration using Python, there are some general issues with the JSON/CSV nodes that kept popping up across different use cases, so having those ironed out while TOPs is still in early development would be very nice. Also, for standalone PDG, I expect users will start to pipe in many different kinds of data, so having a basic set of flexible data retrieval nodes will likely become even more important.
Edited by gyltefors - Apr 23, 2019 01:23:28
- jason_iversen
Python should be a cakewalk compared to C++
Did you give 17.5.234+ a whirl? Did it help?
Jason Iversen, Technology Supervisor & FX Pipeline/R+D Lead @ Weta FX
also, http://www.odforce.net
- kenxu
gyltefors: "there are some general issues with the JSON/CSV nodes that kept popping up across different use cases, so having those ironed out while TOPs is still in early development would be very nice. Also, for standalone PDG, I expect users will start to pipe in many different kinds of data, so having a basic set of flexible data retrieval nodes will likely become even more important."
Could not agree more. It's a repeated exercise at this point: look at use cases we may still not be addressing well, iron out the wrinkles there, rinse and repeat. If you're open to it, we'd be up for a periodic call to see where the remaining issues are for you and see what we could do about it. Please message me if you're interested.
- Ken Xu
- anon_user_40689665
gyltefors: "The JSON/CSV related nodes are, honestly speaking, totally utterly broken. They have been for weeks. And it has kept me frustrated with TOPs for weeks. And it has prevented any kind of progress in my project for… WEEKS."

Try doing it with VEX in SOPs. I'm getting a 3-second cook time when loading, reformatting, combining, and filtering four CSV files with 10,000 lines each… and I can also see what's happening via the spreadsheet.
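For comparison, that same load-combine-filter pass is only a few lines of standalone Python as well (a sketch; the file pattern and the "score" column are invented):

```python
import csv
import glob

# Load and combine several CSVs -- a rough Python parallel to the
# VEX-in-SOPs approach above. File names and the "score" column are
# made up for illustration.
rows = []
for path in glob.glob("data_*.csv"):
    with open(path, newline="", encoding="utf-8") as f:
        rows.extend(csv.DictReader(f))

# Keep only rows whose (hypothetical) score exceeds a threshold
kept = [r for r in rows if float(r["score"]) > 0.5]
print(len(kept), "rows kept of", len(rows))
```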
- gyltefors
kenxu: "Could not agree more. It's a repeated exercise at this point: look at use cases we may still not be addressing well, iron out the wrinkles there, rinse and repeat. If you're open to it, we'd be up for a periodic call to see where the remaining issues are for you and see what we could do about it. Please message me if you're interested."
I tried out the latest stable release, and it broke my TOPs setup. I am going back to an older release for now, and will contact you directly regarding the various issues with these nodes.
Edited by gyltefors - Oct 17, 2023 00:59:29
- jason_iversen
Was your setup relying on buggy behavior, perhaps? I've had that situation before, where a bug fix actually broke me. An ironic bug fix, in actuality.
Edited by jason_iversen - May 1, 2019 16:28:52
Jason Iversen, Technology Supervisor & FX Pipeline/R+D Lead @ Weta FX
also, http://www.odforce.net