Rankings Attribute is being tagged by RapidMiner as Real data type?
Hi guys!
I'm new to RapidMiner and I need your expert opinion regarding my problem.
I have a dataset with rating attributes (1-5 and 1-10), which I consider to be ordinal data; however, RapidMiner recognizes them as the Real data type. I checked the CSV file that I uploaded and the ratings are whole numbers, but every time I import it into RM using the Import Configuration Wizard (I tried the Add Data option as well), I see them as decimals (e.g. 1.0, 7.0, 9.0, 5.0). Thus, RM recognizes them as Real data. See the photos below, or you can try helping me by importing the dataset yourself (just make sure to choose UTF-8 as the file encoding whenever you import it into RM).
Is it just me or is there something wrong with my laptop? Can you please help? I attached the dataset together with this post.
P.S. Can you help me with the date as well? On the csv the format of the date is dd/mm/yyyy but when it goes to RM, the format turns into yyyy-mm-dd and there's no date format like that in RM.
Thanks in advance!
Rem
Answers
-
hi @roras - welcome to the community.
So first it's pretty easy to override an import to integer rather than real. Just drill down on this menu:
I don't see your data set attached but if you are able to upload it, I can show this to you again with your data.
As for dates, they're always tricky because there are so many ways to write them. You must use "MM" for months, not "mm" (that stands for minutes, not months!).
Scott
0 -
Hi @sgenzer!
Thanks for responding! Oh no! The dataset is too large to be uploaded (33 MB). Anyway, I tried to tag the ratings as "Polynominal", but when I checked the charts, the order of the x-axis was messed up (please check the screenshot below). If I use the "Real" data type, the values are in order, but I feel that it is the wrong data type to use for rankings since it disregards the order (1 is the lowest and 5 is the highest).
Please let me know if I am doing the right analysis with this dataset or not. It would be very much appreciated.
And regarding the date: yes, you are right, it should be "MM" instead of "mm". My bad. But the format YYYY-MM-DD is still not listed in RM.
Once again, thanks in advance for your help!
Rem
0 -
You need to Sort your dataset first by Ranking in order to get it to display in the correct order in the graph.
In terms of whether you want the decimal or not for numerical attributes, you can use the Format Numbers operator to remove it if it bothers you. RapidMiner is set to display it by default.
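To illustrate why sorting first can help, here is a minimal Java sketch. It assumes that a nominal attribute's categories are laid out in the order in which the values are first encountered in the data; that is consistent with the advice above but is an assumption, not something confirmed in this thread. The class and method names are hypothetical and are not part of RapidMiner.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

public class CategoryOrderDemo {
    // Collect each rating as a text category, keeping first-appearance order.
    // ASSUMPTION: this mimics how a nominal axis might order its categories.
    static List<String> categoriesInEncounterOrder(List<Integer> ratings) {
        LinkedHashSet<String> seen = new LinkedHashSet<>();
        for (int rating : ratings) {
            seen.add(String.valueOf(rating));
        }
        return new ArrayList<>(seen);
    }

    // Return a numerically sorted copy, leaving the input untouched.
    static List<Integer> sortedCopy(List<Integer> ratings) {
        List<Integer> copy = new ArrayList<>(ratings);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> ratings = List.of(7, 10, 2, 9, 1, 10, 2);
        // Unsorted rows: categories appear in a jumbled order.
        System.out.println(categoriesInEncounterOrder(ratings));
        // Rows sorted numerically first: categories come out in rating order.
        System.out.println(categoriesInEncounterOrder(sortedCopy(ratings)));
    }
}
```

The point of sorting while the attribute is still numeric is that 10 lands after 9; sorting the values as text would put "10" before "2".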
2 -
hi @roras yes exactly what @Telcontar120 said.
And just to clarify, there is no fixed set of date formats pre-programmed into RapidMiner. You can use the yyyy, MM, dd, etc. symbols to describe ANY date/time format you like. Those listed in the pull-down menu are examples only.
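As a rough illustration of how such pattern symbols behave, here is a small sketch using Java's SimpleDateFormat. It assumes that the pattern letters in RapidMiner follow the same conventions as Java's; the class name, the helper method and the sample date are made up for this example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DatePatternDemo {
    // Parse text with one pattern and write it back out with another.
    static String reformat(String text, String inPattern, String outPattern) {
        try {
            SimpleDateFormat in = new SimpleDateFormat(inPattern);
            in.setLenient(false); // reject impossible dates such as 31/02/2019
            Date parsed = in.parse(text);
            return new SimpleDateFormat(outPattern).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        // "MM" means month, so the date survives the round trip.
        System.out.println(reformat("25/12/2019", "dd/MM/yyyy", "yyyy-MM-dd")); // 2019-12-25

        // "mm" means minutes: the 12 is read as minutes and the month is
        // never set, so the result is NOT the date that was intended.
        System.out.println(reformat("25/12/2019", "dd/mm/yyyy", "yyyy-MM-dd"));
    }
}
```

So a CSV column written as dd/MM/yyyy can be read with that pattern even if it is not in the pull-down list, and a yyyy-MM-dd display is just another pattern over the same underlying date.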
Scott