First steps. Need help in clustering

Antonios1
Antonios1 New Altair Community Member
edited November 2024 in Community Q&A

Hi,

I created a fictitious dataset using Excel's RANDBETWEEN function. The dataset has 18000 rows and two columns. Column A contains IDs with values ranging from 1 to 100. Column B contains a hypothetical expense amount between 0 and 50000 for each ID, except for ID 100, whose expense amounts fall in a narrower range, between 48000 and 50000.
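For readers who want to reproduce a dataset like this outside Excel, here is a minimal Python sketch of the description above. The function name `make_dataset`, the fixed seed, and the choice to draw the ID at random for each row are assumptions for illustration; the original sheet may have been built differently.

```python
import random


def make_dataset(n_rows=18000, seed=0):
    """Return a list of (id, amount) rows mimicking the described sheet."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    rows = []
    for _ in range(n_rows):
        id_ = rng.randint(1, 100)  # inclusive bounds, like RANDBETWEEN(1, 100)
        if id_ == 100:
            amount = rng.randint(48000, 50000)  # narrow range for ID 100
        else:
            amount = rng.randint(0, 50000)  # full range for every other ID
        rows.append((id_, amount))
    return rows


rows = make_dataset()
print(len(rows), rows[:3])
```

`random.randint` includes both endpoints, which matches how RANDBETWEEN behaves.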

Let’s suppose I don’t know how the dataset is composed and I want to see if there are one or more IDs with an anomalous concentration of values (that is, I would like the analysis to spot ID 100, whose amounts are concentrated between 48000 and 50000). What kind of analysis should I perform? I tried clustering (k-means), but without success; I probably do not know the steps to follow to perform the analysis. Could somebody help me?

Best Answer

  • Telcontar120
    Telcontar120 New Altair Community Member
    Answer ✓
    Try some of the operators in the free Anomaly Detection extension. LOF (Local Outlier Factor) might be particularly useful in this type of context.
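To give a feel for what LOF computes, here is a rough plain-Python sketch. It is not the RapidMiner operator and does not show how the extension works internally. It makes two assumptions that are not in the thread: the data is regenerated with Python's `random` module, and LOF is applied to one summary point per ID (mean and standard deviation of its amounts) rather than to the 18000 raw rows. The helper name `lof_scores` and the neighbour count `k=5` are illustrative choices.

```python
import math
import random
import statistics

# Hypothetical stand-in for the Excel sheet: (id, amount) rows.
rng = random.Random(0)
rows = []
for _ in range(18000):
    id_ = rng.randint(1, 100)
    low = 48000 if id_ == 100 else 0  # ID 100 gets the narrow range
    rows.append((id_, rng.randint(low, 50000)))

# Summarise each ID as one point: (mean amount, std. dev. of amounts).
by_id = {}
for id_, amount in rows:
    by_id.setdefault(id_, []).append(amount)
ids = sorted(by_id)
points = [(statistics.mean(by_id[i]), statistics.pstdev(by_id[i])) for i in ids]


def lof_scores(points, k=5):
    """Local Outlier Factor for each point, using exactly k neighbours."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbours, k_dist = [], []
    for i in range(n):
        # Indices of the other points, nearest first.
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])  # distance to the k-th neighbour
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: average density of the neighbours relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


scores = lof_scores(points)
top_score, top_id = max(zip(scores, ids))
print(top_id, round(top_score, 1))
```

Scores close to 1 mean a point is about as densely surrounded as its neighbours; a much larger score marks a point that sits far from everything else. Both summary features are in the same currency units, so no rescaling is applied here.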

Answers

  • Antonios1
    Antonios1 New Altair Community Member
    Thanks for helping, Brian. I am really new to RapidMiner and AI, so forgive me if I do not use the right terms. Anyway, I am sorry, I was unable to test the LOF operator. I downloaded the Anomaly Detection extension and used the LOF operator. I connected my file's out port to the exa port on the LOF operator, and connected the operator's exa port to the res port. The process seemed to take a long time to produce an output, so I stopped it after a few hours. I ran it again this morning before going to work, and when I came back at one I found the software had crashed. I have launched it again to see how it proceeds. It has now been running for about 1 hour and is still going. The PC is an i7 with 16GB RAM.
  • Antonios1
    Antonios1 New Altair Community Member
    Thank you, Brian. It works. I had the chance to run the operator on a different PC and it worked correctly. The result also seems quite straightforward to interpret.
