[SOLVED] Select attributes only shows metadata and no variables?
kasper2304
Hi out there.
I am working on a text mining project where I need to create a subset of variables for further dimensionality reduction before training my model. Having watched the videos online, I have come to the conclusion that the "select attributes" node is the one I have to use.
Here is what I have done so far.
I have created two folders on my hard drive, one containing positive cases and the other containing negative cases, giving me a total of 300 cases. Somehow RapidMiner ends up with two extra cases, which I believe are the "folders" themselves and which I will have to remove, but first things first.
I used "Process documents from files" and loaded the two directories with class names "1" and "0". Within the "process documents from files" node I have "transform cases", "tokenize", "filter stop words", "extract token number", "extract length", "aggregate token length", "stem snowball" and "filter tokens".
The settings of "process documents from files" node are:
use file extension as type = TRUE
create word vector = TRUE
add meta information = TRUE
prune method = PERCENTUAL
This gives me around 150 variables, some of which I need to kick out before doing dimensionality reduction. As an example, "names" does not make much sense to analyze in my case.
THE PROBLEM:
The problem arises when I use the "select attribute" node. In my world it should be straightforward to attach the node to my "process documents from files" node and then simply select/de-select the variables I want to continue with. BUT the only variables that are displayed when I try to use the subset option are four metadata attributes... In my world all 150 variables should be displayed... So is this a bug, or do I have some settings wrong somewhere?
Best
Kasper
Andrew2
Hello
The attribute names are determined from the data at run time, so the metadata can't get hold of them at design time. A workaround is to store the example set in the repository using "Store" and fetch it again using "Retrieve".
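To illustrate why the word attributes are unknown until the process has run, here is a minimal sketch in plain Python. It is not RapidMiner code: the function names `build_word_vectors` and `select_attributes` and the sample documents are made up for the illustration. It only shows that the column names of a word vector come out of the documents themselves, so they can be listed and selected only once the data has been materialized.

```python
from collections import Counter


def build_word_vectors(documents):
    """Return (attribute_names, rows); the names exist only after the data is read."""
    counts = [Counter(text.lower().split()) for text in documents]
    # The vocabulary, and hence the attribute list, is derived from the documents.
    attribute_names = sorted(set().union(*counts))
    rows = [[c[name] for name in attribute_names] for c in counts]
    return attribute_names, rows


def select_attributes(attribute_names, rows, keep):
    """Keep only the named columns, preserving their original order."""
    idx = [i for i, name in enumerate(attribute_names) if name in keep]
    kept_names = [attribute_names[i] for i in idx]
    kept_rows = [[row[i] for i in idx] for row in rows]
    return kept_names, kept_rows


# Hypothetical sample documents, one per case.
docs = ["good film good cast", "bad film"]

# Before this call there is no way to know which word columns will exist.
names, rows = build_word_vectors(docs)

# Once the result exists (the analogue of a stored example set), a subset can be chosen.
selected_names, selected_rows = select_attributes(names, rows, {"good", "bad"})
```

In this analogy, storing and retrieving the example set plays the role of having `names` available before the selection step is configured.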
regards
Andrew
kasper2304
Thanks Andrew.
I was actually just about to try that workaround.
Best
Kasper