"Text mining ( creat a bag of word)"
nabilophone11
New Altair Community Member
Hi everybody,
Can you please tell me how to insert a list of words (about 600) automatically (as attributes) in RapidMiner 5.1.11? I have only found a manual way, with "Generate Attributes".
Thanks for your help
N
Answers
-
Hi,
this all depends on how the data is available. You probably have the words in some kind of list (Excel, CSV, Database)!? You can use the import wizard or the import operators like "Read Database" to import the data into RapidMiner. If you have the words as plain text you need operators from the text processing extension like "Read Document" or "Process Documents from Data". But you will have to provide more details for further help.
Regards
Matthias
-
Thanks Matthias,
Actually, I have an Excel file with 30,000 rows and 3 attributes: id, a text attribute (address), and a label (yes/no). I have about 600 words that help me say yes for an address, so I want to create 600 attributes automatically. I have no problem importing the data; my problem is how to create the 600 attributes (I didn't find a way to create a list of words...).
Thanks
N
-
Please, I need help.
-
Hi,
I'm not really sure about your intention. Do you need the RapidMiner wordlist format? In that case I don't know how to create one other than by creating word vectors with one of the "Process Documents" operators. What do you mean by "help me to say yes"? What classifier do you want to use?
Regards
Matthias
-
Hi,
I want to create a matrix with these 600 attributes: if one of them is true, my class (label) is positive, otherwise negative. Sorry about my English. I didn't find a way to insert all these attributes automatically. I think the RapidMiner word list format can help, so I tried to install WVTOOL, but it doesn't work with RapidMiner 5.1.11?
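The rule described here (positive if any listed word occurs in the address, negative otherwise) can be sketched outside RapidMiner as a plain lookup. This is only a minimal Python illustration, not a RapidMiner process; the function name `label_address`, the sample keywords, and the choice to tokenize on word characters are all assumptions.

```python
import re

def label_address(address, keywords):
    """Return "yes" if any keyword occurs as a whole token in the address, else "no"."""
    # Lowercase and split into word tokens so "Street," matches "street".
    tokens = set(re.findall(r"\w+", address.lower()))
    return "yes" if tokens & keywords else "no"

# Hypothetical keyword set; the real one would hold the ~600 words.
keywords = {"avenue", "boulevard", "street"}

print(label_address("12 Main Street", keywords))
print(label_address("PO Box 44", keywords))
```

If the label really depends only on this lookup, a rule like this may already reproduce it without a learned classifier.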
regards,
N
-
Hi,
Creating a wordlist for these words should be possible by writing them into a single document (e.g. one word per line, or separated by some other whitespace), importing this into RapidMiner, and creating a word vector using "Process Documents" (with tokenization inside). The "Process Documents" operator should deliver the desired wordlist. But I doubt this will really help you, since your classifier seems to depend only on the word lookup. I'm not sure which approach would make sense and my time is limited at the moment... sorry.
Perhaps someone else may help?
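As a rough illustration of the idea above (one document holding the words, split on whitespace, reduced to a list of distinct words), here is a minimal Python sketch. It is not how RapidMiner builds its wordlist internally; the function name `build_word_list` and the sample document are assumptions.

```python
def build_word_list(text):
    """Split text on any whitespace and return the distinct lowercased words in first-seen order."""
    seen = {}
    for word in text.split():
        # dict keys keep insertion order, so duplicates are dropped without reordering.
        seen.setdefault(word.lower(), None)
    return list(seen)

# Hypothetical document: one word per line or separated by spaces.
document = "street\navenue boulevard\nStreet"
word_list = build_word_list(document)
print(word_list)
```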
Regards
Matthias
-
Thanks Matthias,
After creating the word list, I'm thinking about using an SVM model for learning. I will let you know about the result.
Best,
N
-
Now I have my 800 attributes (bag of words). Success, but I'm not finished yet, because I have to find a way to get a matrix with 0/1 for every attribute of my bag of words.
Do you have an idea of the best way to get this result?
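For reference, the 0/1 matrix asked for here can be sketched in a few lines of Python: one row per address, one column per word, 1 if the word occurs and 0 if not. This is a hedged illustration outside RapidMiner; the function name `binary_matrix`, the sample addresses, and the word-character tokenization are assumptions.

```python
import re

def binary_matrix(texts, word_list):
    """Return one row per text, with 1 where the word occurs as a token and 0 otherwise."""
    rows = []
    for text in texts:
        tokens = set(re.findall(r"\w+", text.lower()))
        rows.append([1 if word in tokens else 0 for word in word_list])
    return rows

# Hypothetical data standing in for the address column and the bag of words.
addresses = ["12 Main Street", "PO Box 44"]
words = ["street", "box", "avenue"]

for row in binary_matrix(addresses, words):
    print(row)
```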
Best,
N
-
Hi everybody,
I get the result with 10% error. I'm trying to improve my model. Do you have any suggestions?
I want to know how to get a new attribute that lists, for each row, all the attributes with value = true. Is that possible?
Thank you for your help
Best,
N
-
Hi,
Is it possible in RapidMiner to create a result attribute that lists all the attributes with value = Yes?
Example:
        ID  Label  AT1  AT2  AT3  AT4  ...  (what I need)
row1:   1   Yes    Yes  No   Yes  No   ...  AT1, AT3
row2:   2   Yes    No   No   Yes  No   ...  AT3
...
Please, I need your help! Thank you very much!
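To make the request concrete, here is a minimal Python sketch of the desired result attribute, using the two example rows above. It is not a RapidMiner process; the function name `true_attributes`, the dictionary row layout, and the new column name are assumptions.

```python
def true_attributes(row, attribute_names):
    """Join the names of the attributes whose value is "yes" (case-insensitive)."""
    return ", ".join(
        name
        for name in attribute_names
        if str(row.get(name, "")).strip().lower() == "yes"
    )

attribute_names = ["AT1", "AT2", "AT3", "AT4"]
rows = [
    {"ID": 1, "Label": "Yes", "AT1": "YES", "AT2": "NO", "AT3": "Yes", "AT4": "NO"},
    {"ID": 2, "Label": "Yes", "AT1": "NO", "AT2": "NO", "AT3": "Yes", "AT4": "NO"},
]

# Add the hypothetical result column to every row.
for row in rows:
    row["true_attributes"] = true_attributes(row, attribute_names)
    print(row["ID"], row["true_attributes"])
```

The label column is deliberately left out of `attribute_names`, so only the word attributes are listed.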