Constructing an Attribute from Another (Taxonomy)

choose_username
choose_username New Altair Community Member
edited November 5 in Community Q&A
Hi there,

I want to construct a new attribute from an existing one. I have a large data set with a column named 'A' containing, for example, apple, orange, tomatoes, and carrots, or something like that.

I want to extract association rules, and therefore I need another column. It should contain either the item 'fruit' or 'vegetable', depending on the content of column 'A':
apple , orange -> fruit
tomatoes, carrots -> vegetable

Is there an operator that can fill in 'vegetable' or 'fruit' depending on the other column? I know there's an operator that can generate an empty column, but can I fill it in the way described above?


greetings

User

Answers

  • choose_username
    choose_username New Altair Community Member
    I have tried the following approach:

    "my Data Set" -> "Generate Empty Attribute" -> "Branch" -> result


    The "Branch" contains on the Then side a Replace operator which maps apple or orange to fruit,

    and on the Else side a Replace operator which maps tomatoes and carrots to vegetable.



    I always get the error:
    The setup does not seem to contain any obvious errors, but you should check the log messages or activate the debug mode in the settings dialog in order to get more information about this problem.


    What have I done wrong?




    Greetings

    User
  • choose_username
    choose_username New Altair Community Member
    Maybe the log entry helps:


    Jun 1, 2010 4:56:39 PM SEVERE: Process failed: operator cannot be executed. Check the log messages...
    Jun 1, 2010 4:56:39 PM SEVERE: Here:          Process[1] (Process)
              subprocess 'Main Process'
                +- Read ARFF[1] (Read ARFF)
                +- Generate Empty Attribute[1] (Generate Empty Attribute)
          ==>  +- Branch[1] (Branch)
              subprocess 'Then'
                    |  +- Replace (2)[0] (Replace)
              subprocess 'Else'
                      +- Replace (3)[0] (Replace)
    Jun 1, 2010 4:56:39 PM SEVERE: java.lang.NullPointerException
  • IngoRM
    IngoRM New Altair Community Member
    Hi,

    First of all: please post new threads only in the correct board of this forum. This will also increase the probability that a specialist can answer you. The forum "General" --> "Data Mining" is only for generic data mining discussions (like your performance criterion question we have discussed there) and not for specific details of how to set up a process for a specific analysis problem with RapidMiner. For this type of question, the forum "Data Mining / ETL / BI Processes" is the most appropriate, which is why I moved this thread here. Please try to find the best board directly for further questions.

    The solution to your problem is the operator "Generate Attributes". With this operator, you can easily write "if - then - else" conditions where the values of the new attribute depend on the values of others (or in fact on arbitrary calculations of them). The notation for the "if-then" conditions is described in the operator description of "Generate Attributes". The basic idea is

    if (condition, result, else-result)
    and an example for you could be

    if (att1 == "value0" || att1 == "value1", "T1", "T2")
    You can find a complete process example under the name "Creation of New Attribute Depending on Values of Nominal Attribute" in our Community Extension / on myExperiment.org.

    Cheers,
    Ingo
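
The if-then-else idea from the post above can be sketched outside RapidMiner as well. The following is only a minimal Python illustration, not the "Generate Attributes" operator itself: the list `rows`, the key `category`, and the function `category_of` are hypothetical names chosen for this example, and the values are taken from the original question.

```python
# Hypothetical stand-in for the data set; only column "A" comes from the question.
rows = [{"A": "apple"}, {"A": "orange"}, {"A": "tomatoes"}, {"A": "carrots"}]


def category_of(item):
    # Mirrors the pattern if (condition, result, else-result):
    # if (A == "apple" || A == "orange", "fruit", "vegetable")
    return "fruit" if item in ("apple", "orange") else "vegetable"


# Fill the new column from the existing one, row by row.
for row in rows:
    row["category"] = category_of(row["A"])

for row in rows:
    print(row["A"], "->", row["category"])
```

Note that with a single condition every value that is not apple or orange ends up as "vegetable", so this form only fits a two-way split.
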
  • choose_username
    choose_username New Altair Community Member
    Hello,

    Sorry for the inconvenience with the posting (it's a little confusing for a beginner).

    I downloaded the workflow from myexperiment.org. I found out that I have to change the file ending to .zip.
    I think it was removed due to the long name or something (maybe a hint about that would be nice for other people, just a suggestion to make things easier for beginners  :D , maybe I didn't see it).

    ___________________________________________________________

    The problem is that your solution does not extend to a mapping of
    three cases (OK, I didn't mention it, I apologize), for example:
    apple, orange -> fruit
    tomatoes, carrots -> vegetable
    chicken, pork -> meat


    Instead I use the following:  Generate Copy -> Map -> result
    Generate Copy just copies the column, and the values of the
    copied column are mapped to the more general ones (generalisation).

    Thank you for your help; I wouldn't have gotten that far on my own. I really appreciate it.

    greetings

    User
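
The copy-then-map approach described in the post above amounts to a lookup table applied to a copy of the column. Here is a rough Python sketch of that idea under stated assumptions: `taxonomy`, `items`, and `generalised` are hypothetical names, "rice" is an invented unmapped value, and leaving unmapped values unchanged is a choice made for this sketch rather than a statement about the Map operator.

```python
# Lookup table from specific item to general category (the three cases from the post).
taxonomy = {
    "apple": "fruit",
    "orange": "fruit",
    "tomatoes": "vegetable",
    "carrots": "vegetable",
    "chicken": "meat",
    "pork": "meat",
}

# Hypothetical values of column "A"; "rice" has no entry in the table.
items = ["apple", "carrots", "pork", "rice"]

# Build the new column from the original one; the original list is left untouched.
# Values without a mapping are kept as they are (an assumption of this sketch).
generalised = [taxonomy.get(item, item) for item in items]

for original, general in zip(items, generalised):
    print(original, "->", general)
```

A table like this grows by one entry per item, so adding a fourth or fifth category does not change the structure of the process.
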

  • IngoRM
    IngoRM New Altair Community Member
    Hi,

    "I downloaded the workflow from myexperiment.org. I found out that I have to change the file ending to .zip.
    I think it was removed due to the long name or something (maybe a hint about that would be nice for other people, just a suggestion to make things easier for beginners  :D , maybe I didn't see it)."
    This is not necessary if you just install and use the Community Extension, as I suggested  :P

    Search in the forum for "Community Extension" or read our latest blog entry for more information.


    By the way: the if-statement would also work for three or more cases:

    if (att1 == "value0" || att1 == "value1", "T1", if (att1 == "value2" || att1 == "value3", "T2", "T3"))
    Cheers,
    Ingo
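
For three or more cases, one way to read the nested if is that the second test sits in the else position of the first. The Python sketch below illustrates that reading with the values from this thread; `category_of` is a hypothetical name, and treating everything that is neither fruit nor vegetable as "meat" is an assumption of the sketch.

```python
def category_of(item):
    # First test: fruit. Otherwise fall through to the second test.
    if item in ("apple", "orange"):
        return "fruit"
    # Second test, nested in the else position of the first: vegetable.
    if item in ("tomatoes", "carrots"):
        return "vegetable"
    # Final else-result: anything not matched above (an assumption of this sketch).
    return "meat"


for item in ["orange", "tomatoes", "chicken"]:
    print(item, "->", category_of(item))
```

Each additional category adds one more nesting level, which is why a lookup table tends to be easier to maintain once the taxonomy grows.
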
  • choose_username
    choose_username New Altair Community Member
    OK, now you owned me  ::)

    But thanks a lot for your help.

    greetings

    User