Hi,
I am trying to detect expense claim fraud using RapidMiner. I am not sure which modelling technique is suitable, so I tried k-means clustering.
I have a large dataset. Only amount is numeric, and from my understanding k-means can only be used to analyze numeric attributes. The attributes are:
- date
- employee
- amount
- expense type
etc
I have built the process, and the output is as below. Basically, I filter one employee at a time and select only the amount attribute.
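In case it helps to see the setup outside RapidMiner, here is a rough Python sketch of what I understand the process to be doing: group the claims by employee, then run k-means on the amounts alone. It is not the actual RapidMiner process. The rows are made-up values, and `claims`, `kmeans_1d` and `k=2` are just names and settings I chose for illustration.

```python
from collections import defaultdict

# Made-up example rows (employee, amount); not real claim data.
claims = [
    ("alice", 20.0), ("alice", 25.0), ("alice", 22.0), ("alice", 400.0),
    ("bob", 50.0), ("bob", 55.0), ("bob", 52.0), ("bob", 48.0),
]


def kmeans_1d(values, k=2, max_iter=100):
    """Plain Lloyd's k-means on a list of numbers; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: mean of each cluster (keep the old centroid if a cluster is empty).
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# "Filter one employee at a time and select the amount attribute".
by_employee = defaultdict(list)
for employee, amount in claims:
    by_employee[employee].append(amount)

results = {emp: kmeans_1d(amounts, k=2) for emp, amounts in by_employee.items()}
for emp, (centroids, labels) in results.items():
    print(emp, centroids, labels)
```

With these made-up numbers, the single large claim ends up alone in its own cluster for one employee, while the other employee's claims are simply split into two similar groups. That is the kind of output I am unsure how to interpret.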


Qn: How can I analyze the output to detect whether there are any fraudulent claims?
Thanks.