[SOLVED] Example Set Transformation - Performance Problem

User: "South2wood"
New Altair Community Member
Updated by Jocelyn
Hi,

I have a large data set (10 million lines) where each line/example of the table is identified by two ID attributes, for example like this:

domain (id), week (id), value (attribute)
facebook.com; 1; 0,5
facebook.com; 2; 0,6
google.com; 1; 0,9
google.com; 2; 0,4
...

Now I want to transform the table into a time series, like this:
domain (id), value_week1 (attribute), value_week2 (attribute)
facebook.com; 0,5; 0,6
google.com; 0,9; 0,4

I tried to solve it by importing my data into a MySQL database and doing the transformation with a PHP script, but it has been running for hours and still hasn't finished. Is there a transformation operator in RapidMiner that can do the job?

Another question: if I define two attributes as ID, does RapidMiner understand that each line is identified by both IDs together? I tried some set operations (Set Minus, Join) and they don't seem to work.

Happy New Year by the way ;-)

Best regards,
Matthias




    User: "earmijo"
    New Altair Community Member
    For your first question: use the Pivot operator (with domain as the group attribute and week as the index attribute).

    Regards,

    E
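
For readers who want to see what this reshaping does, here is a minimal sketch in plain Python rather than RapidMiner. It is only an illustration of the long-to-wide transformation from the question; the function name `pivot` and the tuple layout are assumptions, and the values use a decimal point instead of the decimal comma in the original data.

```python
from collections import defaultdict

# Toy rows in the long format from the question: (domain, week, value).
rows = [
    ("facebook.com", 1, 0.5),
    ("facebook.com", 2, 0.6),
    ("google.com", 1, 0.9),
    ("google.com", 2, 0.4),
]

def pivot(rows):
    """Group rows by domain and spread the weeks into value_week<N> fields."""
    wide = defaultdict(dict)
    for domain, week, value in rows:
        # domain plays the role of the group attribute, week the index attribute.
        wide[domain][f"value_week{week}"] = value
    return dict(wide)

wide = pivot(rows)
for domain, values in wide.items():
    print(domain, values)
```

The sketch makes a single pass over the rows and looks each domain up in a dictionary, so the work grows linearly with the number of lines.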
    User: "South2wood"
    New Altair Community Member
    OP
    Thanks, I don't know why I didn't see it. Exactly what I needed!
    User: "misanthropic789"
    New Altair Community Member
    I was unable to get it to recognize more than one ID field. Not sure if I was doing something wrong or if it's a limitation of the software.
    User: "MariusHelf"
    New Altair Community Member
    If you need one unique ID, you could use the Generate Attributes operator to create a new attribute from the two ID attributes.

    Best, Marius
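
The idea of combining two ID attributes into one can be sketched in plain Python as follows. This is not RapidMiner syntax; the helper `composite_id`, the `|` separator, and the toy tables are assumptions made for illustration. It shows how a single combined key lets a set-minus style operation treat (domain, week) as one identifier.

```python
def composite_id(domain, week, sep="|"):
    """Join the two ID parts into one string key (separator is an arbitrary choice)."""
    return f"{domain}{sep}{week}"

# Two toy tables with rows of the form (domain, week, value).
table_a = [
    ("facebook.com", 1, 0.5),
    ("facebook.com", 2, 0.6),
    ("google.com", 1, 0.9),
]
table_b = [
    ("facebook.com", 2, 0.6),
]

# Build the set of combined keys present in table_b.
keys_b = {composite_id(domain, week) for domain, week, _ in table_b}

# Keep the rows of table_a whose combined key does not occur in table_b.
a_minus_b = [row for row in table_a if composite_id(row[0], row[1]) not in keys_b]
print(a_minus_b)
```

The separator should be a character that cannot appear in either ID, so that two different ID pairs never collapse into the same combined key.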