Two simple questions
I teach Data Mining at a Business School and I'm considering using RapidMiner as the official software (last year I used XLMiner and Rattle/R). I'm translating everything I did with those two packages to RapidMiner. I have two very simple questions.
1) After running a cluster algorithm (say k-means), I'd like to get some basic stats (means, medians, st devs) BY cluster membership. Can I do that?
2) Suppose I have a set of variables (beer=label, income, education, age, woman, etc = attributes) and I want to run a simple linear regression. I want to be able to manually leave some variables out. For instance, I want to omit "age" and "woman". How could I do that? I've tried to use FeatureNameFilter but I can only list one of the two. (I've tried to separate the list of variables I want to omit with commas, semi-colons, etc with no success).
Thanks in advance for any help,
E.
Thanks, Tobias, for your quick response. I teach at the Rotterdam School of Management in Europe and INCAE Business School in Latin America. The answer about filtering solved my problem perfectly, but I couldn't get the clustering one to work. Here it is applied to one of the sample programs. The program complains that 'cluster' is not a valid variable (but that's the name given by the program to the cluster_id).
<operator name="Root" class="Process" expanded="yes">
    <parameter key="logverbosity" value="warning"/>
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="../data/iris.aml"/>
    </operator>
    <operator name="KMeans" class="KMeans">
        <parameter key="k" value="3"/>
    </operator>
    <operator name="Aggregation" class="Aggregation">
        <list key="aggregation_attributes">
            <parameter key="a1" value="average"/>
        </list>
        <parameter key="group_by_attributes" value="cluster"/>
    </operator>
</operator>
Hi,
The problem here is that the [tt]Aggregation[/tt] operator does not consider special attributes when matching the names given as parameters. Hence, you have to turn the special cluster attribute (named cluster) into a regular attribute. You can do this by placing a [tt]ChangeAttributeRole[/tt] operator between the clustering operator and the aggregation operator. You can use this code:
<operator name="ChangeAttributeRole" class="ChangeAttributeRole">
    <parameter key="name" value="cluster"/>
</operator>
Hope that solves the problem.
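For reference, here is a sketch of how the sample process posted above might look with [tt]ChangeAttributeRole[/tt] placed between the clustering and the aggregation. It only combines operators and parameters already shown in this thread, it assumes the operator's default target role is regular (as the snippet implies), and it has not been checked against a specific RapidMiner version.

```xml
<operator name="Root" class="Process" expanded="yes">
    <parameter key="logverbosity" value="warning"/>
    <!-- Load the iris sample data, as in the process posted earlier -->
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="../data/iris.aml"/>
    </operator>
    <!-- Cluster into three groups; the result carries a special attribute named "cluster" -->
    <operator name="KMeans" class="KMeans">
        <parameter key="k" value="3"/>
    </operator>
    <!-- Assumption: with no target role given, "cluster" becomes a regular attribute -->
    <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
        <parameter key="name" value="cluster"/>
    </operator>
    <!-- Average of attribute a1, grouped by the now-regular cluster attribute -->
    <operator name="Aggregation" class="Aggregation">
        <list key="aggregation_attributes">
            <parameter key="a1" value="average"/>
        </list>
        <parameter key="group_by_attributes" value="cluster"/>
    </operator>
</operator>
```

Further entries in the [tt]aggregation_attributes[/tt] list would follow the same pattern as the a1 line; which aggregation functions are available depends on the installed version.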
Regards,
Tobias
Now back to your questions, they are actually ... well .. quite simple!
Hope that helps,
Tobias