Find Similarities in documents and group them into clusters

Rapid_Ibrahim New Altair Community Member
edited November 5 in Community Q&A
Hi all,

I am new to RapidMiner and to data mining in general. I run the support team in my organisation, and we have so much data from previously resolved cases that could be used to find similar issues and present the solution to people encountering the same problem. Each case has a free-text field where the engineer writes the RCA (a summary of the issue) and, of course, a Product field. My question is: how can I use RapidMiner to achieve this?

Example:
RCA column contains:
1) Client wanted new product key for setting up new environment.
2) customer wanted new appliance product keys for their dev environment
3) new key request
4) new product key
5) Request for IB Product keys
6) The new product key for test appliance was requested.
7) appliance low space
8) The disk was out of space.
9) no space on appliance
10) Appliance went down with 100% disk space filled.

Once processed through RapidMiner, I would like the output to be:


1) Client wanted new product key for setting up new environment.  group 1
2) customer wanted new appliance product keys for their dev environment  group 1
3) new key request  group 1
4) new product key  group 1
5) Request for IB Product keys  group 1
6) The new product key for test appliance was requested.  group 1


7) appliance low space  group 2
8) The disk was out of space.  group 2
9) no space on appliance  group 2
10) Appliance went down with 100% disk space filled.  group 2

Also, if anyone has used RapidMiner for support case analysis, examples would be much appreciated.
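
For illustration only, and outside RapidMiner: the grouping asked for above can be sketched in a few lines of Python. This is a minimal sketch, assuming each RCA is reduced to a set of keywords, two RCAs are linked when their Jaccard overlap reaches a threshold, and linked RCAs end up in the same group. The stopword list, the 0.2 threshold, the crude plural stripping, and the function names are all assumptions made for this example, not RapidMiner operators or settings.

```python
import re

CASES = [
    "Client wanted new product key for setting up new environment.",
    "customer wanted new appliance product keys for their dev environment",
    "new key request",
    "new product key",
    "Request for IB Product keys",
    "The new product key for test appliance was requested.",
    "appliance low space",
    "The disk was out of space.",
    "no space on appliance",
    "Appliance went down with 100% disk space filled.",
]

# Hypothetical, hand-picked stopword list for this tiny sample.
STOPWORDS = {"the", "for", "was", "of", "on", "no", "with", "their", "up", "out"}


def tokens(text):
    """Return the set of lowercase keywords, with a crude plural strip (keys -> key)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {
        w[:-1] if len(w) > 3 and w.endswith("s") else w
        for w in words
        if w not in STOPWORDS
    }


def jaccard(a, b):
    """Share of keywords two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group(texts, threshold=0.2):
    """Link texts whose Jaccard overlap is >= threshold; linked texts share a group."""
    sets = [tokens(t) for t in texts]
    parent = list(range(len(texts)))

    def find(i):
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if jaccard(sets[i], sets[j]) >= threshold:
                parent[find(j)] = find(i)

    # Number the groups 1, 2, ... in order of first appearance.
    numbering = {}
    return [numbering.setdefault(find(i), len(numbering) + 1) for i in range(len(texts))]


groups = group(CASES)
for number, (text, g) in enumerate(zip(CASES, groups), start=1):
    print(f"{number}) {text}  group {g}")
```

On these ten sample lines the sketch puts the six product key requests in group 1 and the four disk space cases in group 2. The word "appliance" appears in both groups, so a much lower threshold would merge them; on real case data the threshold and stopword list would need tuning.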


Answers

  • MartinLiebig
    Altair Employee
    Hi,
    have a look at the Extract Topics from Documents (LDA) operator, which is part of the Operator Toolbox extension. That may do the trick.

    Best,
    Martin
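
To give a feel for what an LDA topic model does with text like this, here is a minimal Python sketch of LDA fitted by collapsed Gibbs sampling. It is not the RapidMiner operator and makes no claim about how that operator is implemented; the stopword list, the choice of two topics, the `alpha` and `beta` priors, the iteration count, and all function names are assumptions made for this example.

```python
import random
import re

CASES = [
    "Client wanted new product key for setting up new environment.",
    "customer wanted new appliance product keys for their dev environment",
    "new key request",
    "new product key",
    "Request for IB Product keys",
    "The new product key for test appliance was requested.",
    "appliance low space",
    "The disk was out of space.",
    "no space on appliance",
    "Appliance went down with 100% disk space filled.",
]

# Hypothetical, hand-picked stopword list for this tiny sample.
STOPWORDS = {"the", "for", "was", "of", "on", "no", "with", "their", "up", "out"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return dominant topic per doc and top words."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]       # tokens of doc d in topic k
    topic_word = [[0] * V for _ in range(num_topics)]  # occurrences of word v in topic k
    topic_total = [0] * num_topics                     # tokens assigned to topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    labels = [row.index(max(row)) for row in doc_topic]
    top_words = []
    for k in range(num_topics):
        ranked = sorted(range(V), key=lambda v: -topic_word[k][v])
        top_words.append([vocab[v] for v in ranked[:3]])
    return labels, top_words


labels, top_words = lda_gibbs([tokenize(c) for c in CASES])
for text, k in zip(CASES, labels):
    print(f"topic {k}: {text}")
for k, words in enumerate(top_words):
    print(f"topic {k} top words: {words}")
```

Each case is labelled with its dominant topic, which plays the role of the "group" in the question. With so few and such short texts the topics can be unstable from one seed to the next, so the result should be checked by eye; a topic model generally behaves better on a larger export of resolved cases.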