Document Classification by single word and its position?
Axel_Borowitsch
New Altair Community Member
I would like to get the class of a document out of the document structure consider single words and their position.
I have Training-Data within an Excel-Document:
Training-data:
I would like to use Rapidminer to identify different document-classes in a medical environment. I thought about a neuronal network but I can't handle the polynominal attribute WORD.
How can I use the string WORD as an attribute? I assume the nominal2numeric conversion would provide too many rows.
The training has to consider all lines of one DOC_ID and its entries as one document. The occurrence and position of the words in one document have to be part of the neuronal network. How can I deliver this information to the training process, keeping the information together?
Thank you very much in advance!
All the best
Axel
I have Training-Data within an Excel-Document:
Training-data:
I would like to use Rapidminer to identify different document-classes in a medical environment. I thought about a neuronal network but I can't handle the polynominal attribute WORD.
How can I use the string WORD as an attribute? I assume the nominal2numeric conversion would provide too many rows.
The training has to consider all lines of one DOC_ID and its entries as one document. The occurrence and position of the words in one document have to be part of the neuronal network. How can I deliver this information to the training process, keeping the information together?
Thank you very much in advance!
All the best
Axel
Tagged:
0