Which operator?
Hello,
I am considering using RapidMiner for a piece of PhD research on web forums, and I'm still feeling my way around the program.
What I want to do is use RapidMiner to analyse a large data set drawn from web forum databases to see three things:
a) how often certain phrases that I am interested in appear;
b) whether this frequency decreases over time (based on the date of posting in the forum);
c) and whether references to these phrases are favourable.
My dataset is several CSV files that contain 7 columns and thousands of rows. Each row holds the details of one forum posting, including the complete text of that posting, so the "Message" field can be hundreds of words long. The columns are: "MessageID" "ThreadID" "ThreadName" "MemberID" "MemberName" "P_Date" "Message".
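To make the layout and goals (a) and (b) concrete, here is a minimal sketch in Python rather than RapidMiner. The sample rows, the phrase "open source", the helper name `count_phrase_by_month`, and the ISO date format for "P_Date" are all assumptions for illustration only; the real files may be delimited or dated differently. Goal (c), whether the references are favourable, is not covered.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# The seven columns named in the post.
COLUMNS = ["MessageID", "ThreadID", "ThreadName", "MemberID",
           "MemberName", "P_Date", "Message"]

# Hypothetical sample rows; the real files hold thousands of postings.
SAMPLE_CSV = '''"MessageID","ThreadID","ThreadName","MemberID","MemberName","P_Date","Message"
"1","10","Welcome","100","alice","2009-01-15","I think open source is great, really great."
"2","10","Welcome","101","bob","2009-01-20","Not sure about open source, to be honest."
"3","11","Tools","100","alice","2009-02-03","Has anyone tried it? Open source again."
'''


def count_phrase_by_month(csv_file, phrase, date_format="%Y-%m-%d"):
    """Count case-insensitive occurrences of `phrase` in the Message
    column, grouped by posting month (YYYY-MM).

    `date_format` is an assumption; adjust it to match the real P_Date values.
    """
    reader = csv.DictReader(csv_file)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    counts = Counter()
    needle = phrase.lower()
    for row in reader:
        posted = datetime.strptime(row["P_Date"], date_format)
        month = posted.strftime("%Y-%m")
        # Quoted fields keep embedded commas, so long messages stay intact.
        counts[month] += row["Message"].lower().count(needle)
    return dict(counts)


monthly = count_phrase_by_month(io.StringIO(SAMPLE_CSV), "open source")
print(monthly)
```

Passing an open file instead of the `io.StringIO` sample would apply the same count to one of the real CSV files, provided the header row and date format match these assumptions.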
My question is, which operator should I use to load this kind of CSV that would allow me to use all seven columns?
I am using both RapidMiner 4.6 and 5 to see which is easier to learn, and would appreciate any guidance members have on this.