"About text clustering."

Updated by Jocelyn
Hi,
I am working on KMedoids clustering of text data. I have loaded 10 different texts with the TextInput operator, but RM splits each text file into 10 rows and applies the clustering to the split data. Is there any way to treat each whole text as a single row?
Thanks
Maria
Hi Simon,
Thanks for your help.
Here is the XML of my process.
<operator name="Root" class="Process" expanded="yes">
  <operator name="TextInput" class="TextInput" expanded="yes">
    <list key="texts">
      <parameter key="fcontact" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
      <parameter key="odiary" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
      <parameter key="diary" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
      <parameter key="clipr" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
      <parameter key="updated" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
      <parameter key="crts" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
      <parameter key="field" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
      <parameter key="aplan" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
      <parameter key="subrogation" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
      <parameter key="closing" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
    </list>
    <list key="namespaces">
    </list>
    <operator name="StringTokenizer" class="StringTokenizer">
    </operator>
    <operator name="EnglishStopwordFilter" class="EnglishStopwordFilter">
    </operator>
    <operator name="TokenLengthFilter" class="TokenLengthFilter">
    </operator>
    <operator name="PorterStemmer" class="PorterStemmer">
    </operator>
  </operator>
  <operator name="KMedoids" class="KMedoids">
    <parameter key="k" value="3"/>
  </operator>
</operator>
Thanks
Maria
I guess there is something wrong with your process setup. RM does not normally split texts into lines. Can you post your process?
Cheers,
Simon
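For reference, the behaviour asked for in the question (each whole text becomes exactly one row before clustering) can be sketched outside RapidMiner in plain Python. This is only an illustration under assumptions, not how RM works internally: the document names reuse a few keys from the posted process, the texts themselves are made up, and `term_vector`, `cosine_distance` and `k_medoids` are hypothetical helpers written for this example.

```python
import re
from collections import Counter

def term_vector(text):
    """Count the lower-cased word tokens of ONE whole document."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_distance(a, b):
    """1 - cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def assign(rows, medoids):
    """Label each row with the index of its nearest medoid."""
    return [
        min(range(len(medoids)),
            key=lambda c: cosine_distance(row, rows[medoids[c]]))
        for row in rows
    ]

def k_medoids(rows, k, max_iter=100):
    """Simple alternating k-medoids; the first k rows seed the medoids."""
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(rows, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance
            # to all other members of its cluster.
            best = min(
                members,
                key=lambda i: sum(cosine_distance(rows[i], rows[j])
                                  for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(rows, medoids), medoids

# Made-up example texts; each one spans several lines on purpose.
documents = {
    "diary": "claim diary note\nadjuster called the client",
    "odiary": "diary note about the claim\nclient called back",
    "closing": "file closed\npayment issued and file closed",
    "subrogation": "payment recovered\nfile closed after payment",
}

# One row per document, no matter how many lines the text contains.
rows = [term_vector(text) for text in documents.values()]
labels, medoids = k_medoids(rows, k=2)

for name, label in zip(documents, labels):
    print(name, "-> cluster", label)
```

The point of the sketch is the `rows` line: a multi-line text is tokenized as a whole, so the clustering step sees as many rows as there are documents.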