Column order problem after pivoting

sharmar6
sharmar6 New Altair Community Member
edited November 2024 in Community Q&A

Hi, 

I am facing an issue of ordering the cloumns after pivoting. The index attribute is Week no. Now when I pivot the data, the columns go like : Week-1, week-10, Week-11......Week-2, Week-20,.....Week-3, Week-30,....

Whereas I need it to be :  Week-1, Week-2, Week-3.....

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="generate_macro" compatibility="7.5.001" expanded="true" height="82" name="Generate Macro" width="90" x="112" y="85">
<list key="function_descriptions">
<parameter key="end_date" value="date_str_custom(date_add(date_now(), -1, DATE_UNIT_DAY), &quot;yyyy-MM-dd&quot;)"/>
</list>
</operator>
<operator activated="true" class="subprocess" compatibility="7.5.001" expanded="true" height="82" name="Subprocess" width="90" x="246" y="85">
<process expanded="true">
<operator activated="true" class="jdbc_connectors:read_database" compatibility="7.5.001" expanded="true" height="68" name="Read Database (2)" width="90" x="45" y="34">
<parameter key="connection" value="DVDH_zDMaqsud"/>
<parameter key="query" value="SELECT `Candidate`&#10;FROM `Candidate_Name_List`&#10;-- where Candidate like '%cliff%'"/>
<enumeration key="parameters"/>
</operator>
<operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Execute R" width="90" x="179" y="34">
<parameter key="script" value="# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;rm_main = function(input_data)&#10;{&#10; library(gtrendsR)&#10; for (i in 1:nrow(input_data)) {&#10; &#9;&#9;trend &lt;- data.frame(gtrends(input_data$Candidate[i], time = &quot;2017-01-01 %{end_date}&quot;)$interest_over_time)&#10; &#9;&#9;if ( i == 1) {&#10; &#9;&#9;&#9;all_trends &lt;- trend&#10; &#9;&#9;} else {&#10; &#9;&#9;&#9;all_trends &lt;- rbind(all_trends, trend)&#10; &#9;&#9;}&#10; }&#10; all_trends$date &lt;- as.character(all_trends$date)&#10; return(all_trends)&#10;}&#10;"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="7.5.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="313" y="34">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="date|keyword|hits"/>
</operator>
<operator activated="true" class="nominal_to_date" compatibility="7.5.001" expanded="true" height="82" name="Nominal to Date (2)" width="90" x="447" y="34">
<parameter key="attribute_name" value="date"/>
<parameter key="date_format" value="yyyy-MM-dd"/>
</operator>
<connect from_op="Read Database (2)" from_port="output" to_op="Execute R" to_port="input 1"/>
<connect from_op="Execute R" from_port="output 1" to_op="Select Attributes (2)" to_port="example set input"/>
<connect from_op="Select Attributes (2)" from_port="example set output" to_op="Nominal to Date (2)" to_port="example set input"/>
<connect from_op="Nominal to Date (2)" from_port="example set output" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="date_to_numerical" compatibility="7.5.001" expanded="true" height="82" name="Date to Numerical" width="90" x="380" y="85">
<parameter key="attribute_name" value="date"/>
<parameter key="time_unit" value="week"/>
<parameter key="keep_old_attribute" value="true"/>
</operator>
<operator activated="true" class="sort" compatibility="7.5.001" expanded="true" height="82" name="Sort" width="90" x="514" y="85">
<parameter key="attribute_name" value="date_week"/>
</operator>
<operator activated="true" class="rename" compatibility="7.5.001" expanded="true" height="82" name="Rename" width="90" x="648" y="85">
<parameter key="old_name" value="keyword"/>
<parameter key="new_name" value="Candidate"/>
<list key="rename_additional_attributes"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="7.5.001" expanded="true" height="82" name="Generate Attributes (3)" width="90" x="782" y="85">
<list key="function_descriptions">
<parameter key="Scaled_Value" value="((hits)*100)"/>
</list>
</operator>
<operator activated="true" class="select_attributes" compatibility="7.5.001" expanded="true" height="82" name="Select Attributes" width="90" x="916" y="85">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="Candidate|date_week|hits|Scaled_Value"/>
</operator>
<operator activated="true" class="aggregate" compatibility="7.5.001" expanded="true" height="82" name="Aggregate" width="90" x="1050" y="85">
<list key="aggregation_attributes">
<parameter key="Scaled_Value" value="sum"/>
</list>
<parameter key="group_by_attributes" value="date_week|Candidate"/>
</operator>
<operator activated="true" class="pivot" compatibility="7.5.001" expanded="true" height="82" name="Pivot" width="90" x="1184" y="85">
<parameter key="group_attribute" value="Candidate"/>
<parameter key="index_attribute" value="date_week"/>
</operator>
<operator activated="true" class="rename_by_replacing" compatibility="7.5.001" expanded="true" height="82" name="Rename by Replacing" width="90" x="1318" y="85">
<parameter key="include_special_attributes" value="true"/>
<parameter key="replace_what" value="(sum\(Scaled_Value\)_)"/>
<parameter key="replace_by" value="Week-"/>
</operator>
<connect from_port="input 1" to_op="Generate Macro" to_port="through 1"/>
<connect from_op="Generate Macro" from_port="through 1" to_op="Subprocess" to_port="in 1"/>
<connect from_op="Subprocess" from_port="out 1" to_op="Date to Numerical" to_port="example set input"/>
<connect from_op="Date to Numerical" from_port="example set output" to_op="Sort" to_port="example set input"/>
<connect from_op="Sort" from_port="example set output" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_op="Generate Attributes (3)" to_port="example set input"/>
<connect from_op="Generate Attributes (3)" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Aggregate" to_port="example set input"/>
<connect from_op="Aggregate" from_port="example set output" to_op="Pivot" to_port="example set input"/>
<connect from_op="Pivot" from_port="example set output" to_op="Rename by Replacing" to_port="example set input"/>
<connect from_op="Rename by Replacing" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

It seems it is sorting out based on first digit of the week no.

Please advise.

 

 

Tagged:

Best Answer

  • Telcontar120
    Telcontar120 New Altair Community Member
    Answer ✓

    Another solution is to modify the values of week BEFORE the pivot.  If you use "Format Numbers" you can specify that week number should always have 2 digits by using the pattern "00" (assuming you have fewer than 100 weeks, if not just expand accordingly).  That will force a leading zero into week number where needed, so then after you Pivot, it will appear in the correct order.

     

Answers

  • sgenzer
    sgenzer
    Altair Employee

    hello @sharmar6 - yes that's a very common annoyance.  The quick-and-dirty fix is to use "Reorder Attributes" and then select via "user specified".  The long-term fix is to post the idea in the Product Ideas forum and hope the dev team picks it for revision.  :)

     

    Scott

     

  • Telcontar120
    Telcontar120 New Altair Community Member
    Answer ✓

    Another solution is to modify the values of week BEFORE the pivot.  If you use "Format Numbers" you can specify that week number should always have 2 digits by using the pattern "00" (assuming you have fewer than 100 weeks, if not just expand accordingly).  That will force a leading zero into week number where needed, so then after you Pivot, it will appear in the correct order.

     

  • sgenzer
    sgenzer
    Altair Employee

    @Telcontar120 why, after almost 5 years of using RM, have I never seen this operator?  Does this happen to other users or is it just me?  Thanks.

     

    Scott

     

  • Telcontar120
    Telcontar120 New Altair Community Member

    Haha, @sgenzer it's definitely not just you.  I'm still finding new things all the time (or creative misuse of existing operators), and it's going on 7 years using RapidMiner for me!

  • sharmar6
    sharmar6 New Altair Community Member

    Thanks a lot.

  • BetsyC
    BetsyC New Altair Community Member
    I'm seeing the posts here and wondering if you are correct, @sgenzer .  I can't reorder the attributes after I pivot because they show as a cluster and not as separate attributes.  Sorry, I'm kind of a beginner - can someone help me understand how to reorder after a pivot?  Thanks a million.
  • Telcontar120
    Telcontar120 New Altair Community Member
    You may not be able to do it inside the wizard, which it sounds like is what you are using.  But if you open the actual process and then add the Reorder Attributes operator and select user specified order you should be able to put them in any order you want.  Try that and see if it solves your problem.