Constructing an Attribute from Another (Taxonomy)
choose_username
New Altair Community Member
Hi there,
I want to construct a new attribute from an existing one. I have a large data set with a column named 'A' containing, for example, apple, orange, tomatoes and carrots, or something like that.
I want to extract association rules, and therefore I need another column: it should contain either the item 'fruit' or 'vegetable', depending on the content of column 'A'.
apple, orange -> fruit
tomatoes, carrots -> vegetable
Is there an operator that can fill in 'vegetable' or 'fruit' depending on the other column? I know there is an operator that can generate an empty column, but can I fill it in the way described above?
greetings
User
Answers
-
I have tried the following approach:
"my Data Set" -> "Generate Empty Attribute" -> "Branch" -> result
The "Branch" contains, on the Then side, a Replace operator which maps apple or orange to fruit,
and on the Else side, a Replace operator which maps tomatoes and carrots to vegetable.
I always get the error:
The setup does not seem to contain any obvious errors, but you should check the log messages or activate the debug mode in the settings dialog in order to get more information about this problem.
What have I done wrong?
Greetings
User
0 -
maybe the log entry helps:
Jun 1, 2010 4:56:39 PM SEVERE: Process failed: operator cannot be executed. Check the log messages...
Jun 1, 2010 4:56:39 PM SEVERE: Here: Process[1] (Process)
subprocess 'Main Process'
+- Read ARFF[1] (Read ARFF)
+- Generate Empty Attribute[1] (Generate Empty Attribute)
==> +- Branch[1] (Branch)
subprocess 'Then'
| +- Replace (2)[0] (Replace)
subprocess 'Else'
+- Replace (3)[0] (Replace)
Jun 1, 2010 4:56:39 PM SEVERE: java.lang.NullPointerException
0 -
Hi,
first of all: please post new threads only in the correct board of this forum. This will also increase the probability that a specialist can answer you. The forum "General" --> "Data Mining" is only for generic data mining discussions (like your performance criterion question we have discussed there) and not for specific details of how to set up a process for a specific analysis problem with RapidMiner. For this type of question, the forum "Data Mining / ETL / BI Processes" is the most appropriate, which is why I moved this thread here. Please try to directly find the best board for further questions.
The solution to your problem is the operator "Generate Attributes". With this operator, you can easily write "if - then - else" conditions where the values of the new attribute depend on the values of others (or in fact on arbitrary calculations of them). The notation of the "if-then" conditions is described in the operator description of "Generate Attributes". The basic idea is
if (condition, result, else-result)
and an example for you could be
if (att1 == "value0" || att1 == "value1", "T1", "T2")
You can find a complete process example under the name "Creation of New Attribute Depending on Values of Nominal Attribute" in our Community Extension / on myExperiment.org.
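For readers who want to check the logic outside RapidMiner, the same if-then-else idea can be sketched in plain Python. This is only an illustration and not part of the "Generate Attributes" operator: the names FRUITS, categorize and column_a are made up for this example, and the item values are taken from the question above.

```python
# Illustrative sketch only: mirrors if (condition, result, else-result)
# with a Python conditional expression. All names here are hypothetical.

FRUITS = {"apple", "orange"}  # values assumed from the original question


def categorize(item: str) -> str:
    """Return 'fruit' if the item is a known fruit, otherwise 'vegetable'."""
    # Equivalent in spirit to:
    # if (A == "apple" || A == "orange", "fruit", "vegetable")
    return "fruit" if item in FRUITS else "vegetable"


# Example column 'A' and the derived column.
column_a = ["apple", "tomatoes", "orange", "carrots"]
new_column = [categorize(value) for value in column_a]
print(new_column)
```

Note that, as with the two-branch expression, every value that is not listed as a fruit ends up in the else branch.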
Cheers,
Ingo
0 -
Hello,
sorry for the inconvenience about the posting (it's a little bit confusing for a beginner).
I downloaded the workflow from myexperiment.org. I found out that I have to change the file extension to .zip.
I think it was removed due to the long name or something (maybe a hint about that would be nice for other people, just a suggestion to make it easier for beginners; maybe I didn't see it).
___________________________________________________________
The problem is, your solution is not extensible to a mapping of
three cases (OK, I didn't mention it, I have to apologize), for example:
apple, orange -> fruit
tomatoes, carrots -> vegetable
chicken, pork -> meat
Instead I use the following: Generate Copy -> Map -> result
Generate Copy just copies the column, and the values of the
copied column are mapped to their more general categories (generalisation).
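The copy-then-map idea can be pictured as a simple lookup table. The following Python sketch is only an analogy and does not use or describe the RapidMiner operators themselves: TAXONOMY and generalize are hypothetical names, and leaving unmapped values unchanged is an assumption made for this example.

```python
# Illustrative sketch only: a lookup table playing the role of a value mapping.
# TAXONOMY and generalize are hypothetical names for this example.

TAXONOMY = {
    "apple": "fruit",
    "orange": "fruit",
    "tomatoes": "vegetable",
    "carrots": "vegetable",
    "chicken": "meat",
    "pork": "meat",
}


def generalize(values, mapping):
    """Return a new list in which each value is replaced by its category.

    Values without an entry in the mapping are kept unchanged
    (an assumption made for this sketch).
    """
    return [mapping.get(value, value) for value in values]


column_a = ["apple", "pork", "carrots", "rice"]
# The original column stays intact; the result is a separate, mapped copy.
category_column = generalize(column_a, TAXONOMY)
print(category_column)
```

Adding a further category then only means adding entries to the table, which is why this approach scales more comfortably than nested conditions.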
Thank you for your help, because I wouldn't have gotten that far on my own. I really appreciate it.
greetings
User
0 -
Hi,
> I downloaded the workflow from myexperiment.org. I found out that I have to change the file extension to .zip.
> I think it was removed due to the long name or something (maybe a hint about that would be nice for other people, just a suggestion to make it easier for beginners; maybe I didn't see it).
This is not necessary if you just install and use the Community Extension as I have suggested :P
Search in the forum for "Community Extension" or read our latest blog entry for more information.
By the way: the if-statement would also work for three or more cases:
if (att1 == "value0" || att1 == "value1", "T1", if (att1 == "value2" || att1 == "value3", "T2", "T3"))
Cheers,
Ingo
0 -
OK, now you owned me ::)
But thanks a lot for your help
greetings
User