Looping Clusters and store them in Repository

User: "flo"
New Altair Community Member
Updated by Jocelyn

Hi everybody,

My dataset consists of 4000 examples with 4 special attributes (ID, cluster, text and outlier) and 570 regular attributes from text processing. So far I have only clustered the data. I now have 37 clusters, and I want to store one example set per cluster in my repository.

That's where my problem is: I think it should be possible with macros, the "Loop Clusters" operator and the "Store" operator, but I can't figure out how to set the parameters correctly.

I have attached a snippet of the data.

And here is the XML of my process so far:

<?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="8.2.000" expanded="true" height="68" name="Retrieve Daten KAM clustered (opt.)" width="90" x="112" y="34">
<parameter key="repository_entry" value="//Datenbearbeitung MA/Filter Outliers/Daten KAM clustered (opt.)"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="8.2.000" expanded="true" height="82" name="Select Attributes" width="90" x="246" y="34">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="ID|label|text"/>
</operator>
<operator activated="true" class="set_role" compatibility="8.2.000" expanded="true" height="82" name="Set Role" width="90" x="380" y="34">
<parameter key="attribute_name" value="label"/>
<parameter key="target_role" value="cluster"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="loop_clusters" compatibility="8.2.000" expanded="true" height="82" name="Loop Clusters" width="90" x="648" y="34">
<process expanded="true">
<operator activated="true" class="filter_examples" compatibility="8.2.000" expanded="true" height="103" name="Filter Examples" width="90" x="179" y="34">
<list key="filters_list">
<parameter key="filters_entry_key" value="label.equals.%{myMacro_0}"/>
</list>
</operator>
<operator activated="true" class="store" compatibility="8.2.000" expanded="true" height="68" name="Store" width="90" x="648" y="34">
<parameter key="repository_entry" value="999TEST"/>
</operator>
<connect from_port="cluster subset" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Store" to_port="input"/>
<connect from_op="Store" from_port="through" to_port="out 1"/>
<portSpacing port="source_cluster subset" spacing="0"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="set_macros" compatibility="8.2.000" expanded="true" height="68" name="Set Macros" width="90" x="313" y="136">
<list key="macros">
<parameter key="myMacro_0" value="&quot;cluster_0&quot;"/>
</list>
</operator>
<connect from_op="Retrieve Daten KAM clustered (opt.)" from_port="output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Loop Clusters" to_port="example set"/>
<connect from_op="Loop Clusters" from_port="out 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

My goal is to apply the "Extract Topics from Document (LDA)" operator to every cluster with number of topics = 1, so that I can see the top words for each cluster.

Thank you all in advance,

flo
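
One possible direction, given only as a hedged sketch and not a tested solution: in the process above, "Set Macros" is not connected to anything, so `%{myMacro_0}` is never defined, and "Store" always writes to the same entry `999TEST`, so each iteration would overwrite the previous one. A loop that sets a macro per cluster value avoids both issues. The fragment below assumes the "Loop Values" operator (class `concurrency:loop_values`, inner ports `input 1` and `output 1`, written from memory of RapidMiner 8.x, so please verify by dragging the operator into the process), its default iteration macro `loop_value`, and that the cluster attribute is still named `label` as in the process above. It would take the place of "Loop Clusters" and "Set Macros"; the outer connection from "Set Role" would then go to the loop's `input 1` port.

```xml
<!-- Sketch only: operator class and port names are assumptions, check them in your RapidMiner version -->
<operator activated="true" class="concurrency:loop_values" compatibility="8.2.000" expanded="true" height="82" name="Loop Values" width="90" x="648" y="34">
<!-- iterate over every distinct value of the cluster attribute -->
<parameter key="attribute" value="label"/>
<!-- in each iteration the current value (e.g. cluster_0) is available as %{loop_value} -->
<parameter key="iteration_macro" value="loop_value"/>
<process expanded="true">
<operator activated="true" class="filter_examples" compatibility="8.2.000" expanded="true" height="103" name="Filter Examples" width="90" x="179" y="34">
<list key="filters_list">
<!-- keep only the examples of the current cluster -->
<parameter key="filters_entry_key" value="label.equals.%{loop_value}"/>
</list>
</operator>
<operator activated="true" class="store" compatibility="8.2.000" expanded="true" height="68" name="Store" width="90" x="380" y="34">
<!-- the macro in the entry name gives one repository entry per cluster -->
<parameter key="repository_entry" value="999TEST_%{loop_value}"/>
</operator>
<connect from_port="input 1" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Store" to_port="input"/>
<connect from_op="Store" from_port="through" to_port="output 1"/>
</process>
</operator>
```

If this works as assumed, the repository would end up with entries such as `999TEST_cluster_0`, `999TEST_cluster_1` and so on, and the LDA operator could later be placed inside the same loop after "Filter Examples".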
