[SOLVED] Pivoting several lines into one single field

MacPhotoBikerUser: "MacPhotoBiker"
New Altair Community Member
Updated by Jocelyn
Hi guys,
this is my first post in this forum, I hope it´s on the right spot.

I´m using currently RapidMiner mainly for ETL processes, and it´s truly impressive what RM is capable of. I´m now looking for a particluar feature which I could not find, and I´d appreciate somebody could put me on the right track. Let me make a very simple example.

Let´s say there are currently two columns, Column 1 shows several cheque numbers, column two shows which invoices were paid with each cheque. A cheque can pay one or several invoices (i. e. a typical payment allocation table).

CHEQUE     INVOICE_PAID
C1               Inv1
C1               Inv2
C2               Inv3
C3               Inv4
C3               Inv5
C3               Inv6

What I try to achieve is a table with only two columns, where the first column is grouped by Cheque number, and the second column contains ALL invoices paid by a particular cheque:


CHEQUE     INVOICES_PAID
C1              Inv1, Inv2
C2              Inv3
C3              Inv4, Inv5, Inv6

It´s very important for me that all paid invoices are contained in one single field per record because I want to use it on a report (a statement, to be precise). Unfortunately, I wasn´t able to figure it out myself.
Any help would be greatly appreciated.

Find more posts tagged with

Sort by:
1 - 8 of 81
    Anyone?
    Andrew2User: "Andrew2"
    New Altair Community Member
    Hello

    Something like this would work...

    Loop values for the cheque attribute

    Inside the loop, filter examples where cheque = the current value

    Create a macro containing null

    Start another loop for all filtered examples

    Inside this loop update the macro with the value of the invoice attribute for the current example

    Regards,

    Andrew
    Hi Andrew,

    thank you very much for your answer.

    Frankly, something really hilarious just happened! Based on your answer I wanted to comment that I believe this functionality should be built into the "aggregate" functionality. I wanted to use the correct terminology, so I looked up the operator in RapidMiner, and - shame on me - I actually figured that it is already there, it´s the aggregation function "concatenation".

    That´s precisely what I was looking for, and SO simple!

    Again Andrew, I do thank you for your answer, after all you pointed me in the right direction, and you also gave me a glimpse of a more flexible approach. (Not that I could do that (yet), but down the road... who knows? ;) )
    Andrew2User: "Andrew2"
    New Altair Community Member
    Hello MacPhotoBiker

    How cool - I've learnt something new too.

    regards

    Andrew
    rowan_gUser: "rowan_g"
    New Altair Community Member
    Thanks for that solution MacPhotoBiker.
    Did you get the " |" between your values? For some reason I've got those characters and I can't get rid of them. Any ideas?

    Cheers,
    Hi rowan.g,

    "|" is  simply a field delimiter, and RM decided to choose one that usually does not occur in any other context, which is a very good thing in order to keep the different values (Inv1, Inv2 etc. in my example) clearly separated.

    It's very easy to change those, simply do the following:

    - After the aggregate function, add a "Generate Attribute" operator
    - "Attribute name" is the field you just generated, e.g. "MyListInOneField"
    - "function expression" is "replace(MyListInOneField","|",",")

    This replaces all occurrences of "|" with "," in the field "MyListInOneField".

    In my example, I want one field that reads "Inv. # 4, 5, 6", here's how I do that:

    - My initial "ListOfInvoices" is "4|5|6" -> replace(ListOfInvoices,"|",", ") -> "4, 5, 6"

    - To add "Inv. # " I use the "concat" function: concat("Inv. # ", replace(ListOfInvoices,"|",", ")) -> "Inv. # 4, 5, 6"

    That did the trick for me, I hope it helps.
    rowan_gUser: "rowan_g"
    New Altair Community Member
    Thanks MacPhotoBiker. Worked perfectly.
    Glad to hear that it worked.