"getting a text file into a text object loader"
sheridany
New Altair Community Member
I am new to RapidMiner and want to use it for text mining. I have an ASCII text file, and when I create the text object loader and select the file, it runs for a while and then seems to hang or takes a very long time. The number of records is about 30K.
Am I doing something wrong?
A later update: I have spent the better part of the day trying to get RapidMiner to load the text file, which contains the bodies of about 30K emails.
My latest attempt has resulted in a memory failure.
Root[0] (Process)
+- TextInput[0] (TextInput)
+- TextObjectLoader[0] (TextObjectLoader)
P Aug 11, 2009 5:00:10 PM: [Warning] TextInput: Warning: Encoding unknown. Using default.
G Aug 11, 2009 5:00:44 PM: [Fatal] OutOfMemoryError occured in 53rd application of TextObjectLoader (TextObjectLoader)
G Aug 11, 2009 5:00:44 PM: [Fatal] Process failed: Java heap space
Root[1] (Process)
+- TextInput[1] (TextInput)
here ==> +- TextObjectLoader[53] (TextObjectLoader)
This machine has at least 2 GB of memory. Is there a workaround?
I currently have two operators: TextInput and TextObjectLoader.
Answers
Hi,
there is no workaround for your problem, but it isn't really a problem at all, since this is not the way things work. Your process setup simply doesn't make sense. Please take a look at the sample process of the text plugin to get insight into how things work. I think this will help you a lot in understanding what to do, and will save you the better part of the next day.
Greetings,
Sebastian
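For readers who arrive here with the same kind of input (one large file holding many email bodies), here is a minimal Python sketch of one possible preprocessing step: splitting the file into one small file per record before handing anything to a text mining tool. It is only an illustration and rests on assumptions the thread does not confirm: that each record sits on its own line, and that one document per file is a useful input layout. The function name `split_records` and the output file naming are hypothetical. The sample process mentioned above remains the reference for how the text plugin expects its input.

```python
import os


def split_records(src_path, out_dir, encoding="ascii"):
    """Write each non-empty line of src_path to its own file in out_dir.

    Assumption: one record (email body) per line; the thread does not say so.
    The source is read line by line, so the whole file is never held in memory.
    Returns the number of record files written.
    """
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    with open(src_path, encoding=encoding, errors="replace") as src:
        for line in src:
            text = line.strip()
            if not text:
                continue  # skip blank lines between records
            count += 1
            # Hypothetical naming scheme: record_00001.txt, record_00002.txt, ...
            out_path = os.path.join(out_dir, f"record_{count:05d}.txt")
            with open(out_path, "w", encoding=encoding, errors="replace") as out:
                out.write(text + "\n")
    return count
```

Calling `split_records("emails.txt", "emails_split")` would then produce one file per line in the `emails_split` directory; both names are placeholders.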