Excel column/row transform - unpivot?

Jamie_21251
Jamie_21251 Altair Community Member
edited January 2021 in Community Q&A

Hello

I'm trying to transform an excel sheet using Monarch, so that I can take columns and transpose them to rows.

Essentially I'd like to take the Writer and Publisher columns below and split them out into rows. It's much easier if I showed an example to highlight what I'm looking to achieve rather than try to describe it, so here is a before and after of a small section of the data file:

BEFORE

image

..and what I'm looking to achieve using Monarch:

AFTER

image

I've also attached the excel file with both Before and After tabs so that you can see it, as I know the BEFORE screenshot above has some of the columns cut off!

I've tried using the unpivot function within Monarch, but I just can't get to the right result.

Once I can get from this BEFORE to the AFTER above, I'll then be able to use the join function within the software to join the linked publisher code to the correct publisher code to bring the publisher info up next to the respective composer rows, however I need to get through this first hurdle first!

Any help or suggestions would be much appreciated.

Thanks!

Answers

  • CPorthouse
    CPorthouse
    Altair Employee
    edited January 2021

    I believe you were on the right path.  My solution to your problem would be as follows:

    1. Load your file in to DPS (book1.xls).  Depending on the version you are using, select "No" to open the Excel file as a database.
    2. Unpivot the columns that repeat (last name, first name, etc) and select the other columns (ID for import, Title, Original Title ID).
    3. I then did some cleaning of the data by removing the numeric characters from the attribute column and collapsing the internal spaces.
    4. Add a row number column.
    5. Duplicate your "Unpivot" table.  For one of the "Unpivot" tables, filter on the attribute column to just include "Writer Last Name".
    6. Join your two unpivot tables as a "Lookup".  Your left table should be the unfiltered unpivot table, and the right table is the filtered unpivot. Use the following columns from both tables as key pairs:
    • ID for import
    • Title
    • Original Title ID
    • Row Number
  • For the right table, deselect all columns and then only select Row Number.
  • This shows where each new information starts.  Now ditto the row number(1) column.  This is going to allow you to correctly use the pivot transformation to group on each unique work.
  • Pivot the table on the "Attribute" column and group on:
    • ID for Import
    • Title
    • Original Title ID
    • Row number(1)
      The only output column you should have to select is the "Value" column and for the Aggregate type, select "First"

     

    I have included the workspace and the sample workbook so you can follow along.  Hope this helps.  Let me know if you have any questions.

  • Jamie_21251
    Jamie_21251 Altair Community Member
    edited January 2021

    Hey Chris, thank you very much for the detailed steps, really appreciate the fast and helpful response!

    I’ve tried to replicate your steps a few times in the last couple of days and I’ve gotten close, however mine is not bringing over all composer and publisher info like yours does. I’m only getting one composer row for each title, it's missing out the others (as well as some of the publisher info too):

    image

     Going through the steps from your file, the only part of the process where I get different outcome to you, is step 8 when using Ditto on the “row number (1)” column, there aren’t any blank cells to Ditto at my end! Here’s a screenshot of what it looks like for me at that stage, and attached is the workspace file I have in case you wanted to compare.

     image

    That said, when looking at the solution in your workbook, although it’s very close, it's not actually quite what I was looking for. It’s only bringing over one set of publisher info, and that is actually incorrectly matched to the wrong composer! For example, “The Last Resort” publisher should not be on the “Sam Hastings” row, if anything it should be on the “Terry Coltrane” row as described by the “linked publisher” column. Matching publisher to composer rows isn’t an issue at this stage though, I can easily do that at a later stage using a join, once we have the composers and publisher info unpivoted and then repivoted correctly. In the first instance for this particular query, how can we ensure both composer and publisher info gets unpivoted and then repivoted so that all composer and publisher info is on separate rows, looking just like the “AFTER” example screenshot:

    image

    If we can get the data looking like that, I'll then be able to bring over the publisher info to its respective boxes on the correct composer lines using a join matching the "Linked Publisher" column to the "Publisher Code" column.

    Thanks again for your help, we’re so close, I’m sure we can get there!
    Jamie

  • CPorthouse
    CPorthouse
    Altair Employee
    edited January 2021

    Hey Chris, thank you very much for the detailed steps, really appreciate the fast and helpful response!

    I’ve tried to replicate your steps a few times in the last couple of days and I’ve gotten close, however mine is not bringing over all composer and publisher info like yours does. I’m only getting one composer row for each title, it's missing out the others (as well as some of the publisher info too):

    image

     Going through the steps from your file, the only part of the process where I get different outcome to you, is step 8 when using Ditto on the “row number (1)” column, there aren’t any blank cells to Ditto at my end! Here’s a screenshot of what it looks like for me at that stage, and attached is the workspace file I have in case you wanted to compare.

     image

    That said, when looking at the solution in your workbook, although it’s very close, it's not actually quite what I was looking for. It’s only bringing over one set of publisher info, and that is actually incorrectly matched to the wrong composer! For example, “The Last Resort” publisher should not be on the “Sam Hastings” row, if anything it should be on the “Terry Coltrane” row as described by the “linked publisher” column. Matching publisher to composer rows isn’t an issue at this stage though, I can easily do that at a later stage using a join, once we have the composers and publisher info unpivoted and then repivoted correctly. In the first instance for this particular query, how can we ensure both composer and publisher info gets unpivoted and then repivoted so that all composer and publisher info is on separate rows, looking just like the “AFTER” example screenshot:

    image

    If we can get the data looking like that, I'll then be able to bring over the publisher info to its respective boxes on the correct composer lines using a join matching the "Linked Publisher" column to the "Publisher Code" column.

    Thanks again for your help, we’re so close, I’m sure we can get there!
    Jamie

    Sorry for the delay.  I accidentally left out part of step 6.  Edit your join and add row number to row number as a key pair.  

    image

    It is in my original workspace provided, but the way the Edit Join wizard is displayed, you don't see unless you scroll the key pairs up a bit.  I am going to edit my original post to add the missing piece.

     

    Good luck and let me know if you have any other questions

  • Jamie_21251
    Jamie_21251 Altair Community Member
    edited January 2021

    Hey Chris,

    Thank you so much for identifying that and correcting it in the steps! I've now replicated it successfully, thank you! This is great.

    To get to the exact "AFTER" screenshot output that I was looking for, I ended up having to split the original sheet into two parts, one tab for the composer info and one tab for the publisher info. I then followed your steps to unpivot and repivot the composer info sheet first, then did the same unpivotting and repivotting for the publisher info sheet. Once I had those done, I then combined both vertically via "append", which gave me the final grouped sheet, unpivotted and all under one table.

    Would you say that's the best way to get to the "AFTER" solution, or is there a way of doing this without having to separate the composer and publisher columns into two separate sheets to begin with?

    Thanks so much for your help with this :)

    Jamie