"How to do text classification using only Java code"

lucky_q New Altair Community Member
edited November 5 in Community Q&A
Hi all,

I am new to the RapidMiner community. I am planning to use RapidMiner for text classification, and I want to build a small demo system purely in Java (that is, without writing an XML process file) so that I can get familiar with the RapidMiner source code. I tried RapidMiner 5.0 first, but since there is not enough documentation or sample code for 5.0, I decided to use 4.6 instead. Unfortunately, I still do not know how to do this using only Java code.

I have run into two problems:

1: Which operator can transform all the original messages stored in a particular folder into a single file that contains the word vectors (feature vectors)? I know about the Text Processing plugin, but I am not sure how to use it from Java code alone, starting from the original files. Could anybody show me how to do that?

2: For training on the feature vectors, what is the easiest approach if I want to use only Java code? Is there any sample code that shows how to read a feature vector file and generate a model file (as one would with Weka)?
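To make clear what I mean by a word vector in question 1, here is a minimal plain-Java sketch. It does not use any RapidMiner classes; the class name WordVectorSketch and its methods are made up for illustration, and reading the messages from a folder is left out. It only shows the output I am after: one term-count vector per message over a shared vocabulary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical illustration only: plain Java, no RapidMiner API involved.
public class WordVectorSketch {

    // Split a message into lower-case tokens made of letters and digits.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Collect the alphabetically sorted set of distinct words over all messages.
    public static List<String> buildVocabulary(List<String> messages) {
        TreeSet<String> vocabulary = new TreeSet<>();
        for (String message : messages) {
            vocabulary.addAll(tokenize(message));
        }
        return new ArrayList<>(vocabulary);
    }

    // Count how often each vocabulary word occurs in a single message.
    public static int[] toVector(String message, List<String> vocabulary) {
        int[] counts = new int[vocabulary.size()];
        for (String token : tokenize(message)) {
            int index = vocabulary.indexOf(token);
            if (index >= 0) {
                counts[index]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-ins for the files in the message folder.
        List<String> messages = Arrays.asList(
                "Cheap pills, cheap offer",
                "Meeting moved to Monday");

        List<String> vocabulary = buildVocabulary(messages);
        System.out.println(vocabulary);
        for (String message : messages) {
            System.out.println(Arrays.toString(toVector(message, vocabulary)));
        }
    }
}
```

What I would like to know is which RapidMiner 4.6 operators produce this kind of table from a folder of files, and how to call them from Java.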

I know these are very basic questions, but I have no idea how to do this. I would really appreciate it if somebody could give me some sample code (for RapidMiner 4.6) that shows how the whole process works. Thanks.

Answers