Rankings Attribute is being tagged by RapidMiner as Real data type?

roras
roras New Altair Community Member
edited November 5 in Community Q&A

Hi guys! 


I'm new to RapidMiner and I need your expert opinion regarding my problem. 

 

I have a dataset with Ratings attributes (1-5 and 1-10), which I consider to be ordinal data; however, RapidMiner recognizes them as the Real data type. I checked the CSV file that I uploaded and the ratings are whole numbers, but every time I import it into RM using the Import Configuration Wizard (I tried the Add Data option as well), they show up as decimals (e.g. 1.0, 7.0, 9.0, 5.0), so RM treats them as Real. See the screenshots below, or you can try importing the dataset yourself (just make sure to choose UTF-8 as the file encoding when you import it into RM). :)

 

Excel.JPG

 

RM.JPG

 

Is it just me or is there something wrong with my laptop? Can you please help? I attached the dataset together with this post. 

 

P.S. Can you help me with the date as well? In the CSV the date format is dd/mm/yyyy, but once it is imported into RM it is shown as yyyy-mm-dd, and I can't find a date format like that in RM.

 

Thanks in advance! 

Rem

Answers

  • sgenzer
    sgenzer
    Altair Employee

    hi @roras - welcome to the community.

     

    First, it's pretty easy to override the import so that a column comes in as integer rather than real. Just drill down on this menu:

     

    Screen Shot 2018-03-27 at 9.00.07 AM.png

     

    I don't see your data set attached but if you are able to upload it, I can show this to you again with your data.

     

    As for dates, they're always tricky because there are so many ways to write them. You must use "MM" for months, not "mm" (that stands for minutes, not months!).
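To make the "MM" versus "mm" point concrete, here is a minimal Java sketch. It assumes RapidMiner's date patterns follow the same letters as Java's `SimpleDateFormat`; the class name `DatePatternDemo` and the helper `toIso` are made up for illustration and are not part of RapidMiner.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class DatePatternDemo {
    // Hypothetical helper: parse `text` with `inputPattern`,
    // then render the result as yyyy-MM-dd.
    public static String toIso(String text, String inputPattern) {
        SimpleDateFormat in = new SimpleDateFormat(inputPattern);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd");
        try {
            return out.format(in.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        String raw = "27/03/2018";
        // "MM" is month-of-year, so March is read as the month.
        System.out.println(toIso(raw, "dd/MM/yyyy"));
        // "mm" is minute-of-hour, so "03" is read as minutes and the
        // month is never set from the input.
        System.out.println(toIso(raw, "dd/mm/yyyy"));
    }
}
```

With the lowercase pattern the parse does not fail, it just produces a different date, which is why this mistake is easy to miss.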

     

    Scott

     

  • roras
    roras New Altair Community Member

    Hi @sgenzer!


    Thanks for responding! Oh no! The dataset is too large to upload (33 MB) :(

     

    Anyway, I tried tagging the ratings as "Polynominal", but when I checked the charts, the order of the x-axis was messed up (please check the screenshot below). If I use the "Real" data type, the values are in order, but I feel it is the wrong data type for rankings since it disregards the order (1 is the lowest and 5 is the highest).

     

    Please let me know if I am doing the right analysis with this dataset or not. It would be very much appreciated. :)

     

    And regarding the date: yes, you are right, it should be "MM" instead of "mm". My bad. But the format YYYY-MM-DD is still not listed in RM.

    Capture.JPG

     

    Once again, thanks in advance for your help! :)

     

    Rem

     

     

  • Telcontar120
    Telcontar120 New Altair Community Member
    Answer ✓

    You need to Sort your dataset first by Ranking in order to get it to display in the correct order in the graph.
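A rough Java sketch of why sorting first can help. It assumes the chart lists category values in the order they first appear in the data, which is an assumption about the charting behaviour rather than something confirmed in this thread. The class `RatingOrderDemo` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

public class RatingOrderDemo {
    // Distinct values in the order they first appear in the data
    // (assumed to mimic how category labels end up on the axis).
    public static List<Integer> firstSeenOrder(List<Integer> ratings) {
        return new ArrayList<>(new LinkedHashSet<>(ratings));
    }

    // Same idea, but the data is sorted by rating beforehand.
    public static List<Integer> firstSeenOrderAfterSort(List<Integer> ratings) {
        List<Integer> copy = new ArrayList<>(ratings);
        Collections.sort(copy);
        return firstSeenOrder(copy);
    }

    public static void main(String[] args) {
        List<Integer> ratings = Arrays.asList(7, 1, 10, 3, 1, 7);
        System.out.println(firstSeenOrder(ratings));          // unsorted data
        System.out.println(firstSeenOrderAfterSort(ratings)); // sorted data
    }
}
```

Sorting while the ratings are still numeric matters here: sorted as text, "10" would come before "3".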

    In terms of whether you want the decimal or not for numerical attributes, you can use the Format Numbers operator to remove it if it bothers you.  RapidMiner is set to display it by default.
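For the display side, a small Java sketch of dropping the trailing ".0" with a format pattern. It assumes the Format Numbers operator accepts patterns in the style of Java's `DecimalFormat`; that is an assumption, so check the operator's help text. `RatingFormatDemo` and `withoutDecimals` are illustrative names only.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class RatingFormatDemo {
    // Format a value with the pattern "0": at least one integer digit,
    // no fraction digits. Locale.ROOT keeps the digits locale-neutral.
    public static String withoutDecimals(double value) {
        DecimalFormat fmt =
            new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return fmt.format(value);
    }

    public static void main(String[] args) {
        double[] ratings = {1.0, 7.0, 9.0, 5.0};
        for (double r : ratings) {
            System.out.println(r + " is shown as " + withoutDecimals(r));
        }
    }
}
```

This only changes how the value is rendered; the underlying number is the same either way.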

  • sgenzer
    sgenzer
    Altair Employee

    Hi @roras, yes, exactly what @Telcontar120 said. :)

     

    And just to clarify, there is no fixed set of date formats pre-programmed into RapidMiner. You can use the yyyy, MM, dd, etc. symbols to describe ANY date/time format you like. Those listed in the pull-down menu are examples only.


    Scott