How can I build a melting function in RapidMiner?
I am a beginner in data mining.
I have a list of 10000 rows and about 200 columns like this:
look,1,2,3,4,5,6,7,8
book,4,5,6,7,8,102,104,107
look,6,7,8,9
hook,100,101,102
cook,7,8,9
build,102,103,104,107
hook,103,104,105
...
First I need to make a unique list of words, merging the values of repeated words:
look,1,2,3,4,5,6,7,8,9
book,4,5,6,7,8,102,104,107
hook,100,101,102,103,104,105
cook,7,8,9
build,102,103,104,107
Now I need to find lines with at least 3 (or n) values in common and generate a new list:
look,1,2,3,4,5,6,7,8,9
book,4,5,6,7,8,102,104,107
cook,7,8,9
*************
book,4,5,6,7,8,102,104,107
build,102,103,104,107
*************
hook,100,101,102,103,104,105
build,102,103,104,107
*************
Please help me in any way you can.
Thank you

So that's a fun puzzle. I would begin like this (you will need @land's Statistics Extension to run this process):
<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve smmsamm" width="90" x="45" y="85">
<parameter key="repository_entry" value="smmsamm"/>
</operator>
<operator activated="true" class="de_pivot" compatibility="7.6.001" expanded="true" height="82" name="De-Pivot" width="90" x="179" y="85">
<list key="attribute_name">
<parameter key="foo" value="att[2-9]"/>
</list>
<parameter key="index_attribute" value="bar"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="85">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="bar"/>
<parameter key="invert_selection" value="true"/>
</operator>
<operator activated="true" class="numerical_to_polynominal" compatibility="7.6.001" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="447" y="85">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="foo"/>
</operator>
<operator activated="true" class="rmx_stat:cross_table" compatibility="1.3.000" expanded="true" height="82" name="Extract Cross Table" width="90" x="581" y="85">
<parameter key="group_attribute_a" value="att1"/>
<parameter key="group_attribute_b" value="foo"/>
</operator>
<connect from_op="Retrieve smmsamm" from_port="output" to_op="De-Pivot" to_port="example set input"/>
<connect from_op="De-Pivot" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
<connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Extract Cross Table" to_port="example set input"/>
<connect from_op="Extract Cross Table" from_port="cross table output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
That said, I am certain there is a cleverer way to do this!
Scott
Hmm, I'm not sure the extension in the marketplace is up to date (Sebastian?). I would go directly to the website: https://oldworldcomputing.com/products/statistics-extension-for-rapidminer
Scott
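For anyone who wants to see the idea behind the process without installing the extension, here is a rough sketch in plain Python of the same two steps: de-pivot the wide rows into (word, value) pairs, then count each combination as a cross table would. This is only an illustration under assumptions, not what the RapidMiner operators do internally; the function names `de_pivot` and `cross_table` are made up for this example, and missing cells are assumed to be `None`.

```python
from collections import Counter

def de_pivot(rows):
    """Turn wide rows (word, v1, v2, ...) into long (word, value) pairs, skipping missing cells."""
    return [(word, v) for word, *values in rows for v in values if v is not None]

def cross_table(pairs):
    """Count how often each (word, value) combination occurs."""
    return Counter(pairs)

# A few rows from the sample data, padded with None where a row is shorter.
rows = [
    ("look", 1, 2, 3, 4, 5, 6, 7, 8),
    ("look", 6, 7, 8, 9, None, None, None, None),
    ("cook", 7, 8, 9, None, None, None, None, None),
]

table = cross_table(de_pivot(rows))

# Values that two words have in common are the values with a non-zero count for both.
look_values = {v for (w, v) in table if w == "look"}
cook_values = {v for (w, v) in table if w == "cook"}
shared = look_values & cook_values
```

Because the counts are stored per (word, value) pair, duplicate rows for the same word collapse automatically, which is the "unique list of words" step from the question.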
Oh thank you sir, you are the master.
But this was only sample data for testing.
My real data has about 100000 different values. With this method, will I end up with about 100000 columns?
Is it possible to convert the list to my wanted list?
look,1,2,3,4,5,6,7,8,9
book,4,5,6,7,8,102,104,107
cook,7,8,9
*************
book,4,5,6,7,8,102,104,107
build,102,103,104,107
*************
hook,100,101,102,103,104,105
build,102,103,104,107
*************
Your flattery is noted and not deserved. There are many here who are far more masterful than I. That said, at this point I would recommend getting more knowledgeable with RapidMiner Studio before moving forward with large data sets like the one you describe. Actions such as renaming attributes and so forth are the beginning of a long journey. I would highly recommend starting with the "Getting Started with RapidMiner" YouTube playlist. The whole beauty of RapidMiner is that you can learn to create your own processes and be a master yourself!
Scott
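On the worry about ending up with roughly 100000 columns: the wanted list can also be built without a wide table, by indexing which words contain each value and counting shared values per pair of words. The sketch below is plain Python, not a RapidMiner process, and it rests on an assumption about the wanted output: each group is read as "a word, followed by the later words that share at least n values with it", which reproduces the three groups in the question. The names `merge_rows` and `groups_sharing` are invented for this example.

```python
from collections import defaultdict
from itertools import combinations

def merge_rows(rows):
    """Union the values of rows with the same word, keeping first-seen word order."""
    merged = {}
    for word, *values in rows:
        merged.setdefault(word, set()).update(values)
    return merged

def groups_sharing(merged, n=3):
    """Map each word to the later words sharing at least n values with it."""
    # Inverted index: value -> words containing it (in merged order).
    index = defaultdict(list)
    for word, values in merged.items():
        for v in values:
            index[v].append(word)
    # Count shared values per pair; each pair is (earlier word, later word).
    shared = defaultdict(int)
    for words in index.values():
        for a, b in combinations(words, 2):
            shared[(a, b)] += 1
    order = {w: i for i, w in enumerate(merged)}
    groups = defaultdict(list)
    for (a, b), count in shared.items():
        if count >= n:
            groups[a].append(b)
    return {a: sorted(groups[a], key=order.get) for a in merged if a in groups}

rows = [
    ("look", 1, 2, 3, 4, 5, 6, 7, 8),
    ("book", 4, 5, 6, 7, 8, 102, 104, 107),
    ("look", 6, 7, 8, 9),
    ("hook", 100, 101, 102),
    ("cook", 7, 8, 9),
    ("build", 102, 103, 104, 107),
    ("hook", 103, 104, 105),
]

merged = merge_rows(rows)
result = groups_sharing(merged, n=3)
for word, partners in result.items():
    print(word, "->", ", ".join(partners))
# look -> book, cook
# book -> build
# hook -> build
```

Memory grows with the number of word pairs that share at least one value, not with the number of distinct values, so a value that appears in a very large share of the words would still be expensive and might need to be filtered out first.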