"Applying Feature Selection on text input"

User: "jebadiah"
Updated by Jocelyn
Hello. I am new to RapidMiner, so please excuse my ignorance.

I am trying to perform K-Means clustering on a set of text files. I have downloaded and installed the plug-in needed to read text files, and now I want to apply Feature Selection to the result. However, when I try, the operator seems to need an ExampleSet as input before it can run. Is there a way to apply Feature Selection to text input?
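As a rough illustration of the intended order (text files to a term-frequency table, then feature selection, then clustering), here is a minimal sketch in plain Python. It is not RapidMiner code: the function names `tokenize`, `term_frequency_table`, and `select_features` are hypothetical, the sample documents and stop words are made up, and the document-frequency cutoff is only an assumed stand-in for whatever selection criterion would actually be used.

```python
from collections import Counter

def tokenize(text, stopwords):
    # Lowercase, split on whitespace, strip surrounding punctuation, drop stop words.
    tokens = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [t for t in tokens if t and t not in stopwords]

def term_frequency_table(docs, stopwords):
    # Build one row of raw term counts per document over a shared, sorted vocabulary.
    counts = [Counter(tokenize(d, stopwords)) for d in docs]
    vocab = sorted(set().union(*counts))
    rows = [[c[term] for term in vocab] for c in counts]
    return vocab, rows

def select_features(vocab, rows, min_docs=2):
    # Assumed criterion: keep only terms that occur in at least `min_docs` documents.
    keep = [j for j in range(len(vocab))
            if sum(1 for r in rows if r[j] > 0) >= min_docs]
    return [vocab[j] for j in keep], [[r[j] for j in keep] for r in rows]

# Made-up sample data, purely for illustration.
docs = ["The cat sat on the mat.", "The dog sat on the log.", "A cat and a dog."]
stop = {"the", "on", "a", "and"}

vocab, rows = term_frequency_table(docs, stop)
sel_vocab, sel_rows = select_features(vocab, rows)
print(sel_vocab)
print(sel_rows)
```

The point of the sketch is only that feature selection operates on the table of term counts (the analogue of an ExampleSet), so it has to come after the text has been turned into vectors; the reduced rows in `sel_rows` are what a clustering step would then receive.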

Here is what my XML looks like right now:

<operator name="Root" class="Process" expanded="yes">
    <operator name="TextInput" class="TextInput" expanded="yes">
        <list key="texts">
          <parameter key="blogs" value="D:\Text-files"/>
        </list>
        <parameter key="vector_creation" value="TermFrequency"/>
        <operator name="StringTokenizer" class="StringTokenizer">
        </operator>
        <operator name="StopwordFilterFile" class="StopwordFilterFile">
            <parameter key="file" value="D:\stop.txt"/>
        </operator>
        <operator name="StopwordFilterFile (2)" class="StopwordFilterFile">
            <parameter key="file" value="D:\punctuations.txt"/>
        </operator>
    </operator>
    <operator name="KMeans" class="KMeans">
        <parameter key="k" value="8"/>
    </operator>
</operator>


When I try to add the following:

<operator name="BackwardElimination" class="FeatureSelection" expanded="yes">
    <parameter key="selection_direction" value="backward"/>
</operator>

The following error occurs:

Error in: TextInput (TextInput) Error in experiment setup: com.rapidminer.operator.MissingIOObjectException: The operator needs some input of type com.rapidminer.example.ExampleSet which is not provided


Can anyone please suggest something to help me do this? Thank you very much. :-*