Greetings - My question concerns what I imagine is something very simple that, as a newbie, I am merely overlooking. However, after reading the manual and similar posts (like this one:
http://rapid-i.com/rapidforum/index.php/topic,5518.0.html), I am still at a loss.
I am reading data from a DB with the following columns:
My current process grabs each row, turns it into a document, processes the document, then attempts to cluster the documents using the k-means clustering operator. My goal is to have the documents clustered, but to show the entity_id value instead of the id value generated by RapidMiner. I have attempted the following with no luck:
- Add a Set Role operator that sets the attribute entity_id to the id role, after the Process Documents From Data operator
  - This doesn't work: for entity_id to show up after the Process Documents From Data operator, it appears I need to check the "Add meta information" box. If I do this, the k-means clustering operator complains about the non-nominal values, specifically attributes such as title, language, etc. These attributes do not exist in my data and appear to be added by the Process Documents From Data operator.
- Add a Set Role operator that sets the attribute entity_id to the id role, before the Process Documents From Data operator
  - Same issue as above: entity_id doesn't make it through without checking the "Add meta information" box. As a result, the k-means clustering operator complains about the title, language, and robots attributes that I did not create.
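In case it helps make the goal concrete, here is a rough sketch in plain Python (not RapidMiner) of the behaviour I am after. The sample rows, the `vectorize` and `kmeans` helpers, and the way the centroids are seeded are all made up for illustration; the only point is that entity_id travels alongside the word vectors as a label and is never fed into the clustering as a feature.

```python
from collections import Counter

# Hypothetical rows standing in for the DB table: (entity_id, text).
rows = [
    (101, "red apple fruit"),
    (102, "green apple fruit"),
    (205, "fast car engine"),
    (206, "car engine repair"),
]

def vectorize(texts):
    """Turn each text into a term-count vector over a shared vocabulary."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    return [[Counter(t.lower().split())[w] for w in vocab] for t in texts]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=10):
    """Tiny k-means with deterministic seeding (an arbitrary choice here)."""
    step = len(vectors) // k
    centroids = [list(vectors[i * step]) for i in range(k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# entity_id is kept aside as the identifier; only the text becomes features.
entity_ids = [eid for eid, _ in rows]
vectors = vectorize([text for _, text in rows])
labels = kmeans(vectors, k=2)

# Report clusters keyed by entity_id rather than by a generated row id.
clusters = dict(zip(entity_ids, labels))
for eid, cluster in clusters.items():
    print(f"entity_id={eid} cluster={cluster}")
```

In RapidMiner terms, I assume this corresponds to entity_id having the id role (a special attribute) so that the clustering operator ignores it, with the extra meta attributes excluded from the regular attributes.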
Many thanks in advance for helping me through what I imagine is a total noob oversight.