Superset Operator Tips
Hi all,
I have a process where I'm using the Superset operator to add attributes to my dataset (around 25,000 empty attributes each time).
However, the operator is quite a bottleneck in the process.
Does anyone have any best practices for adding large numbers of attributes to a dataset efficiently?
Thanks,
JEdward
Have you tried using the Radoop extension with Hive for this? The process is not exactly the same, but the speed Radoop provides at least gave me the feeling that I was not waiting longer than I can normally tolerate. I don't think you need to build your entire process externally. Ideally, the time-consuming parts of the process would run externally and the rest on your local machine.
Cheers
Sven
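For anyone who can do this step in a script instead, here is a minimal sketch of the same idea in pandas: add every empty attribute in a single operation, not one column at a time. This is not the Superset operator or anything Radoop-specific. The function name `add_empty_attributes` and the `attr_` column names are made up for illustration, and whether this would actually beat Superset inside a given process is untested.

```python
import pandas as pd


def add_empty_attributes(data: pd.DataFrame, names: list[str]) -> pd.DataFrame:
    """Return a copy of `data` with `names` added as all-missing columns."""
    # Drop duplicates and names that already exist, keeping the given order.
    existing = set(data.columns)
    new_names = [n for n in dict.fromkeys(names) if n not in existing]
    # reindex creates all new columns in one step and fills them with NaN,
    # which avoids inserting thousands of columns one by one.
    return data.reindex(columns=list(data.columns) + new_names)


# Hypothetical example: 3 rows, 25,000 new empty attributes.
data = pd.DataFrame({"id": [1, 2, 3]})
wide = add_empty_attributes(data, [f"attr_{i}" for i in range(25000)])
print(wide.shape)  # (3, 25001)
```

The point of the sketch is only the batching: building the full list of missing attribute names first and applying it once tends to scale better than a per-attribute loop.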