One Problem
kinkounio
New Altair Community Member
I have a file with many rows of data, and I want to compare it to a file with a single row. The result should be one row of the first file: the row closest to the data in the second file.
How can I do this?
Answers
Hi,
this question has been asked a few times during the last few days. Here are the answers:
You have two options.
1. Load the data sets and merge them. Calculate a similarity measure for the merged data set. Filter out the combinations that do not include your single example. Sort the rest. Use the one with the highest similarity. All the necessary operators are part of RapidMiner.
2. If the amount of data is rather large, calculating the full similarity matrix is probably not feasible. In that case, you have to iterate over the examples: for each one, calculate its similarity to your single example of interest and store it via ProcessLog. Afterwards you can convert the process log back to a data set, sort it, etc.
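Outside RapidMiner, the iterate-and-sort idea of option 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the distance is plain Euclidean (lower distance meaning higher similarity), the `log` list merely stands in for ProcessLog, and the sample rows are the ones posted later in this thread.

```python
import math

# Rows of the larger file: (id, label, feature values).
# Sample values are the ones posted later in this thread.
historik = [
    (1, 73, [15, 16, 13, 14, 15]),
    (2, 123, [25, 26, 23, 24, 25]),
    (3, 173, [35, 36, 33, 34, 35]),
    (4, 224, [45, 46, 43, 44, 46]),
    (5, 274, [55, 56, 53, 54, 56]),
]

# The single example of interest (the one-row file).
query = [25, 26, 23, 24, 25]


def euclidean(a, b):
    """Euclidean distance between two equally long feature lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Iterate over the examples one at a time and log the distance of each
# to the query. This list is an assumed stand-in for ProcessLog.
log = [
    (euclidean(features, query), row_id, label)
    for row_id, label, features in historik
]

# Sort the log; the first entry is the closest example.
log.sort()
best_distance, best_id, best_label = log[0]
print(best_id, best_label, best_distance)
```

Only one distance per example is held at a time, so the full similarity matrix is never built.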
Cheers,
Ingo
Good morning.
Where is the similar post?
Thanks.
Hi.
I want to compare 2 files.
historik.txt
1 73 15 16 13 14 15
2 123 25 26 23 24 25
3 173 35 36 33 34 35
4 224 45 46 43 44 46
5 274 55 56 53 54 56
dades.txt
25 26 23 24 25
The correct result would be the second row of the first file. Value: 123
With this code the result is not correct: it gives 73. What am I doing wrong?
<operator name="Root" class="Process" expanded="yes">
    <parameter key="resultfile" value="/home/rm_workspace/p2/resultat.res"/>
    <operator name="InputHistorik" class="ExampleSource">
        <parameter key="attributes" value="/home/rm_workspace/p2/historik.aml"/>
    </operator>
    <operator name="FeatureRangeRemoval" class="FeatureRangeRemoval">
        <parameter key="first_attribute" value="1"/>
        <parameter key="last_attribute" value="1"/>
    </operator>
    <operator name="NearestNeighbors" class="NearestNeighbors">
    </operator>
    <operator name="Diari" class="ExampleSource">
        <parameter key="attributes" value="/home/rm_workspace/p2/dades.aml"/>
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
    </operator>
</operator>
The AML files:
dades.aml
<?xml version="1.0" encoding="UTF-8"?>
<attributeset default_source="dades.dat">
<attribute
name = "dades.txt (1)"
sourcecol = "1"
valuetype = "integer"/>
<attribute
name = "dades.txt (2)"
sourcecol = "2"
valuetype = "integer"/>
<attribute
name = "dades.txt (3)"
sourcecol = "3"
valuetype = "integer"/>
<attribute
name = "dades.txt (4)"
sourcecol = "4"
valuetype = "integer"/>
<attribute
name = "dades.txt (5)"
sourcecol = "5"
valuetype = "integer"/>
</attributeset>
historik.aml
<?xml version="1.0" encoding="UTF-8"?>
<attributeset default_source="historik.dat">
<attribute
name = "historik.txt (1)"
sourcecol = "1"
valuetype = "integer"/>
<label
name = "historik.txt (2)"
sourcecol = "2"
valuetype = "integer"/>
<cluster
name = "historik.txt (3)"
sourcecol = "3"
valuetype = "integer"/>
<attribute
name = "historik.txt (4)"
sourcecol = "4"
valuetype = "integer"/>
<attribute
name = "historik.txt (5)"
sourcecol = "5"
valuetype = "integer"/>
<attribute
name = "historik.txt (6)"
sourcecol = "6"
valuetype = "integer"/>
<attribute
name = "historik.txt (7)"
sourcecol = "7"
valuetype = "integer"/>
</attributeset>
How can I do it?
Thanks.
Hi,
The answer to your problem is that for some reason only known to yourself you call column three a cluster!
<cluster
name = "historik.txt (3)"
sourcecol = "3"
valuetype = "integer"/>
I've laid out the data in one file like this...
1 73 15 16 13 14 15
2 123 25 26 23 24 25
3 173 35 36 33 34 35
4 224 45 46 43 44 46
5 274 55 56 53 54 56
6 ? 25 26 23 24 25
and made the necessary code changes to this...
<operator name="Root" class="Process" expanded="yes">
    <parameter key="resultfile" value="/home/rm_workspace/p2/resultat.res"/>
    <operator name="InputHistorik" class="ExampleSource">
        <parameter key="attributes" value="C:\Program Files (x86)\Rapid-I\RapidMiner-4.3\historik"/>
    </operator>
    <operator name="NearestNeighbors" class="NearestNeighbors">
    </operator>
    <operator name="InputHistorik (2)" class="ExampleSource">
        <parameter key="attributes" value="C:\Program Files (x86)\Rapid-I\RapidMiner-4.3\historik"/>
    </operator>
    <operator name="ExampleFilter" class="ExampleFilter">
        <parameter key="condition_class" value="missing_labels"/>
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
    </operator>
</operator>
and rather unsurprisingly the correct answer emerges.
So the answer to "How can I do it?" is: with more care!
Hi, haddock.
Your code is not the solution. I would like to compare attributes 3-7 of file 1 with the attributes of file 2; the result should be attribute 2 of file 1.
The "cluster" column was a mistake on my part.
I want to obtain one value from the second column of file 1: the value from the row of file 1 whose attributes match those of file 2.
In my example, comparing the 2 files should give the second column of the second row of file 1.
Thanks.
"The correct result would be the second row of the first file. Value: 123"
To make it even easier for you to comprehend I've put the data into CSV form, then we don't need AML files at all. So here is the data...
1, 73, 15, 16, 13, 14,15
2, 123, 25, 26, 23,24, 25
3, 173, 35, 36, 33, 34, 35
4, 224, 45, 46, 43, 44, 46
5, 274, 55, 56,53, 54, 56
6, , 25, 26, 23, 24, 25
For the same reason I've taken out the second data read and replaced it with a datacopy, like this...
<operator name="Root" class="Process" expanded="yes">
    <operator name="CSVExampleSource" class="CSVExampleSource" breakpoints="after">
        <parameter key="filename" value="C:\Users\CJFP\Documents\rm_workspace\historik.txt"/>
        <parameter key="read_attribute_names" value="false"/>
        <parameter key="label_column" value="2"/>
        <parameter key="id_column" value="1"/>
    </operator>
    <operator name="IOMultiplier" class="IOMultiplier">
        <parameter key="io_object" value="ExampleSet"/>
    </operator>
    <operator name="ExampleFilter" class="ExampleFilter">
        <parameter key="condition_class" value="missing_labels"/>
        <parameter key="invert_filter" value="true"/>
    </operator>
    <operator name="NearestNeighbors" class="NearestNeighbors">
    </operator>
    <operator name="IOSelector" class="IOSelector">
        <parameter key="io_object" value="ExampleSet"/>
    </operator>
    <operator name="ExampleFilter (2)" class="ExampleFilter">
        <parameter key="condition_class" value="missing_labels"/>
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
    </operator>
</operator>
If I run this I get "123" as the answer, just like before, so I'm puzzled as to what you mean by the following
"Your code is not the solution. I would like to compare attributes 3-7 of file 1 with the attributes of file 2; the result should be attribute 2 of file 1."
Perhaps you could enlighten us?
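For anyone who wants to check the logic of the single-CSV process without RapidMiner, here is a rough Python sketch of the same steps: read the rows, train on those with a label, and predict for those without. It assumes one nearest neighbour and plain Euclidean distance, which may not match the defaults of the NearestNeighbors operator; the function names `parse` and `predict_missing` are made up for this illustration.

```python
import csv
import io

# The combined CSV from the post above; row 6 has an empty label.
DATA = """\
1, 73, 15, 16, 13, 14,15
2, 123, 25, 26, 23,24, 25
3, 173, 35, 36, 33, 34, 35
4, 224, 45, 46, 43, 44, 46
5, 274, 55, 56,53, 54, 56
6, , 25, 26, 23, 24, 25
"""


def parse(text):
    """Return (id, label or None, features) tuples; column 1 is the id, column 2 the label."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        cells = [cell.strip() for cell in record]
        if not any(cells):
            continue  # skip blank lines
        row_id = int(cells[0])
        label = int(cells[1]) if cells[1] else None
        features = [int(cell) for cell in cells[2:]]
        rows.append((row_id, label, features))
    return rows


def squared_distance(a, b):
    """Squared Euclidean distance; it ranks neighbours the same as the true distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_missing(rows):
    """Give every unlabeled row the label of its closest labeled row."""
    labeled = [row for row in rows if row[1] is not None]
    unlabeled = [row for row in rows if row[1] is None]
    predictions = {}
    for row_id, _, features in unlabeled:
        nearest = min(labeled, key=lambda row: squared_distance(row[2], features))
        predictions[row_id] = nearest[1]
    return predictions


predictions = predict_missing(parse(DATA))
print(predictions)
```

The id column is kept out of the distance on purpose, so only columns 3-7 are compared and the value returned comes from column 2.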
Hi,
Thanks, haddock.
I will try it.