I have had a quick look, and it could work if the list of words (and therefore the columns/attributes) stayed the same. However, the list of words is already large, and having to set up the attributes in the De-Pivot operator would take a very long time each time the job was run.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.017">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.017" expanded="true" name="Process">
    <process expanded="true" height="224" width="681">
      <operator activated="true" class="retrieve" compatibility="5.1.017" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
        <parameter key="repository_entry" value="//Samples/data/Market-Data"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="5.1.017" expanded="true" height="76" name="Generate Attributes" width="90" x="179" y="30">
        <list key="function_descriptions">
          <parameter key="AMOUNT" value="1"/>
        </list>
      </operator>
      <operator activated="true" class="pivot" compatibility="5.1.017" expanded="true" height="76" name="Pivot" width="90" x="313" y="30">
        <parameter key="group_attribute" value="TID"/>
        <parameter key="index_attribute" value="ITEM"/>
        <parameter key="skip_constant_attributes" value="false"/>
      </operator>
      <operator activated="true" class="de_pivot" compatibility="5.1.017" expanded="true" height="76" name="De-Pivot" width="90" x="447" y="30">
        <list key="attribute_name">
          <parameter key="AMOUNT" value="AMOUNT.*"/>
        </list>
        <parameter key="index_attribute" value="ITEM"/>
      </operator>
      <connect from_op="Retrieve" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_op="Pivot" to_port="example set input"/>
      <connect from_op="Pivot" from_port="example set output" to_op="De-Pivot" to_port="example set input"/>
      <connect from_op="De-Pivot" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
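The `AMOUNT.*` entry in the De-Pivot operator above is a regular expression, so the pivoted columns are picked up by pattern rather than listed one by one. As a rough illustration of that idea outside RapidMiner, here is a minimal plain-Python sketch. The function name `de_pivot`, the column names such as `AMOUNT_bread`, and the sample rows are all made up for the example, and skipping empty cells is a choice of this sketch, not a statement about how the operator behaves.

```python
import re


def de_pivot(rows, id_key, pattern, index_name, value_name):
    """Turn wide rows into long rows, selecting value columns by regex.

    Illustrative only: loosely mirrors a de-pivot step where the series
    columns are chosen with a pattern such as "AMOUNT.*".
    """
    regex = re.compile(pattern)
    long_rows = []
    for row in rows:
        for column, value in row.items():
            # The whole column name must match the pattern.
            if column == id_key or not regex.fullmatch(column):
                continue
            if value is None:
                continue  # this sketch drops empty cells
            long_rows.append({
                id_key: row[id_key],
                index_name: column,
                value_name: value,
            })
    return long_rows


# Hypothetical wide table: one row per transaction, one column per item.
wide = [
    {"TID": 1, "AMOUNT_bread": 1, "AMOUNT_milk": 1, "AMOUNT_eggs": None},
    {"TID": 2, "AMOUNT_bread": None, "AMOUNT_milk": 1, "AMOUNT_eggs": 1},
]
long_rows = de_pivot(wide, "TID", r"AMOUNT.*", "ITEM", "AMOUNT")
for r in long_rows:
    print(r)
```

Because the columns are matched by pattern, adding a new word (a new `AMOUNT_...` column) needs no change to the call.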
I have had a quick look at the Cut Document operator, and it would appear to do what I want, except that it does not allow any other metadata to be passed through, so I cannot tell which document the words relate to.
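To make the requirement concrete, here is a minimal sketch in plain Python, not RapidMiner. Each word comes out as its own row, with the metadata of its source document copied alongside. The field names `doc_id`, `label` and `text`, the function `words_with_metadata`, and the sample documents are all hypothetical.

```python
import re


def words_with_metadata(documents):
    """Yield one row per word, copying the document's metadata onto it."""
    for doc in documents:
        # Everything except the text is treated as metadata to carry along.
        meta = {k: v for k, v in doc.items() if k != "text"}
        words = re.findall(r"\w+", doc["text"].lower())
        for position, word in enumerate(words):
            yield {**meta, "position": position, "word": word}


# Hypothetical input documents.
docs = [
    {"doc_id": "a.txt", "label": "spam", "text": "Buy now"},
    {"doc_id": "b.txt", "label": "ham", "text": "See you at noon"},
]
rows = list(words_with_metadata(docs))
for r in rows:
    print(r)
```

With the document identifier on every row, the words can later be grouped or pivoted back by `doc_id`.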