-
How do I use the optimal k when applying the model to classify a market category?
I’m supposed to split the dataset into 70% training and 30% test datasets, apply the k-Nearest Neighbor algorithm using the optimal k, interpret the results, and discuss the performance of the predictive-type or classification-type learning. I tried using KNN clustering but I just ended up analysing the financial ratios. Please…
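For readers following this thread, here is a minimal sketch of the workflow the question describes: a 70/30 split, choosing k, then scoring on the test part. It is plain Python rather than a RapidMiner process, the data is a made-up toy set, and the helper names (`split_70_30`, `best_k`, `accuracy`) are hypothetical. One assumption worth stating: k is chosen on a validation part carved out of the training data, so the 30% test set is only used once for the final estimate.

```python
import math
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def accuracy(train_X, train_y, eval_X, eval_y, k):
    """Share of evaluation points whose predicted label matches the true one."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(eval_X, eval_y))
    return hits / len(eval_X)

def split_70_30(X, y, seed=0):
    """Shuffle the row indices and cut them 70% / 30%."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(0.7 * len(idx))
    first, second = idx[:cut], idx[cut:]
    return ([X[i] for i in first], [y[i] for i in first],
            [X[i] for i in second], [y[i] for i in second])

def best_k(train_X, train_y, candidates, seed=1):
    """Pick k by accuracy on a validation part taken from the training data only."""
    fit_X, fit_y, val_X, val_y = split_70_30(train_X, train_y, seed)
    return max(candidates, key=lambda k: accuracy(fit_X, fit_y, val_X, val_y, k))

# Toy data (an assumption for illustration): two well separated classes.
rng = random.Random(42)
X = ([(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
     + [(rng.gauss(6, 1), rng.gauss(6, 1)) for _ in range(50)])
y = ["A"] * 50 + ["B"] * 50

train_X, train_y, test_X, test_y = split_70_30(X, y)
k = best_k(train_X, train_y, [1, 3, 5, 7, 9])
test_acc = accuracy(train_X, train_y, test_X, test_y, k)
print("chosen k:", k, "test accuracy:", test_acc)
```

Note that k-NN is a supervised classifier, not a clustering method, so the market category has to be the label attribute for this to apply.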
-
Clustering k-means
Hello everyone, I am looking for a way to cluster data. With the tools I am using, I cannot directly find the right number of k, so the data is put into the number of clusters I have set k to. Is there any way/tool I can find the right number of clusters without knowing it beforehand? And what kind of function should I use…
-
DBSCAN performance evaluation
Hi! I'm facing the following problem. I have to compare the performance of K-Means and DBSCAN on a given dataset. I can easily do it with K-Means, but I can't fit the DBSCAN block in place of the K-Means block. The outputs are different and there is no "Cluster Model" to feed the Performance block. The old version of…
-
Why is the DBI the same with Local Random Seed and Determine Good Start Values in K-Means?
Hi, I'm working on text clustering using K-Means and Singular Value Decomposition (SVD). I'm using the parameters Local Random Seed and Determine Good Start Values to show the difference. But the DBI value generated using Local Random Seed is always the same as the one with Determine Good Start Values, even though I have tried to enter a…
-
How to use Optimize Parameters (Grid) for Text Clustering (K-Means)?
I tried using the Optimize Parameters (Grid) operator for K-Means, but it's not working. I don't know how to use it properly. I searched online, but all I can find is for classification algorithms.
-
How to make K-Means Clustering using Month/Date
Hello, I am writing a paper named "A Clustering Analysis: Political Violence Events and Fatalities in the Philippines" and I would like to make clusters based on month and date if possible. I am relatively new to RapidMiner, so here's what I have so far: I mapped the nominal months to make them numeric, but now I have…
-
Applying a Cluster Model to a Different Path
Hello, I'm trying to use a cluster analysis to group regions within Texas that are not receiving cable spend with similar regions in Texas that are receiving cable spend to create a test & control experiment and then analyze web traffic among the regions with spend vs without to measure traffic lift. I used the multiply…
-
How do I create balanced clusters?
Hi guys, I'm pretty new to the community, so sorry if my question seems quite elementary, but how do I create balanced clusters (k-means), meaning that each cluster has the same number of items in it? Or is there a way to force a minimum cluster size to anything other than 1? (What I am trying to do is to create…
-
Cluster comparison
Hi, in clustering, is there a way to produce cluster evaluation scores, such as the silhouette score, or other scores that tell you the within-cluster variability?
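As a rough illustration of what such a score computes, here is a small plain-Python sketch of the mean silhouette coefficient with Euclidean distance. It is not a RapidMiner operator; the function name and the toy points are assumptions made for the example. For each point, `a` is the mean distance to the other members of its own cluster and `b` is the smallest mean distance to the members of any other cluster; the point's score is `(b - a) / max(a, b)`.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance of p to itself is 0, so only the divisor changes)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy example: two tight pairs that are far apart.
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(pts, [0, 0, 1, 1])  # sensible grouping, close to 1
bad = silhouette_score(pts, [0, 1, 0, 1])   # mixed grouping, below 0
print(good, bad)
```

Values near 1 indicate compact, well separated clusters; values near 0 or below suggest overlapping or mis-assigned points.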
-
Clustering reviews into three categories
Hello everybody, I have a quick question. I have reviews (they are all stored inside one attribute) that I need to split into 3 categories (so I assume clustering?): positive, neutral, and negative reviews. With everything I tried (different k-means clusterings, different parameters) I can't get the right results,…
-
RFM scores - milliseconds
Hello, I am trying to calculate RFM and I am stuck. 1) I saw that before the Aggregate operator (which calculates R = max of InvoiceDate or min of InvoiceDateMillisecond, F = count of InvoiceNo, M = sum of TotalPrice) I first need to calculate the milliseconds from the InvoiceDate. Then I should also calculate the difference…
-
Analyzing Categorical Data or Polynomial Data in RapidMiner
Hi! I need help. I am currently tasked with analyzing and clustering the data that I've got from our Learning Management System. But I have a problem: I am really new to R and I do not know how to perform analysis on the polynomial data. It doesn't allow me to normalize the data or apply k-means clustering to it. Please help the…
-
How to convert k-means clusters into a prediction with labeled data
Hi, for an assignment in the form of a Kaggle competition (called 2nd Assignment DMT, 2022 VU Data Mining Techniques Cup) I have a very big labeled dataset with data about customers that search and book hotels on a website. Each row has a search ID (so one customer can have multiple searches). A search is a hotel which has…
-
How do I change the color of my clustering result in RapidMiner?
-
Cannot find the Expectation Maximization Operator
I have been through the documentation and know where the operator should be, but it's missing from the list. Also, oddly, the documentation can only be reached through a direct search and is not listed on the operation ToC. I also searched the marketplace for extensions but found nothing useful. Am I missing something simple…
-
How to compute the Davies-Bouldin Index for the DBSCAN algorithm?
Hello... I have a problem. I want to try two different algorithms, K-Means and DBSCAN, on my dataset and compare the results using the Davies-Bouldin Index (DBI) to find the best clustering. I can already run K-Means and get the DBI, but I can't build a process design for DBSCAN that yields the DBI. Does anyone know how to create…
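One point that may help here: the Davies-Bouldin Index only needs the points and their cluster assignments, with centroids computed from the members, so it can be calculated for DBSCAN output even though DBSCAN produces no centroid model. Below is a minimal plain-Python sketch under stated assumptions: Euclidean distance, a hypothetical function name, and noise points marked with the label `-1` and excluded (pass whatever label your tool uses for noise). It is not the RapidMiner implementation, and a tool may report the value with a different sign or scaling.

```python
import math

def davies_bouldin(points, labels, noise_label=-1):
    """Davies-Bouldin Index from points and cluster labels (lower is better)."""
    clusters = {}
    for p, label in zip(points, labels):
        if label != noise_label:  # assumption: noise is left out of the score
            clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two non-noise clusters")

    # Centroid of each cluster: coordinate-wise mean of its members.
    centroids = {label: tuple(sum(coord) / len(pts) for coord in zip(*pts))
                 for label, pts in clusters.items()}
    # Scatter of each cluster: mean distance of its members to the centroid.
    scatter = {label: sum(math.dist(p, centroids[label]) for p in pts) / len(pts)
               for label, pts in clusters.items()}

    total = 0.0
    for i in clusters:
        # Worst-case similarity of cluster i to any other cluster j.
        total += max((scatter[i] + scatter[j])
                     / math.dist(centroids[i], centroids[j])
                     for j in clusters if j != i)
    return total / len(clusters)

# Toy example: two clusters plus one noise point labelled -1.
pts = [(0, 0), (0, 2), (10, 0), (10, 2), (50, 50)]
dbi = davies_bouldin(pts, [0, 0, 1, 1, -1])
print(dbi)
```

Because the same function works on any labeling, the K-Means and DBSCAN results can be scored the same way, as long as noise is handled consistently in both.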
-
evaluating clustering algorithms?
We are working on text clustering for a data science project. We found a few algorithms that can work with text, like K-means and K-medoids. These two are centroid-based clustering, and we use the Davies-Bouldin evaluation metric to evaluate them. Agglomerative clustering and top-down clustering are hierarchical clustering, but we…
-
Cluster Parameter
Hi! I have a question about k-Means clustering. I entered 40 parameters in the clustering and got a cluster result. Is there any way to know the effect of these parameters on the results? That is, which parameters are most relevant and which parameters are not so important to the clustering result? Thank you!
-
Some Questions around Clustering
Dear all, since I'm neither a mathematician nor a computer scientist, the answer to the following question might be an easy one for you guys here, but for me it doesn't make sense at the moment. So I would kindly ask for your support on the following questions: My goal is to do a clustering with this data a…
-
Replace missing values with average in each cluster
Hello, I'm new to RapidMiner and I would like to replace missing values based on clustering. That is, I have used k-means on the columns which have no missing values and divided the original ExampleSet into 5 clusters. Now I would like to know how to replace each row's missing values by the averages of the cluster it belongs to…
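To make the intended computation concrete, here is a small plain-Python sketch of per-cluster mean imputation. It only shows the logic, not a RapidMiner process; the function name is hypothetical, missing values are assumed to be represented as `None`, and the cluster assignment is assumed to be already available as a parallel list.

```python
def impute_by_cluster_mean(rows, cluster_ids):
    """Replace None in each column by the mean of that column within the row's cluster."""
    sums, counts = {}, {}
    # First pass: per (cluster, column) sum and count of the known values.
    for row, cluster in zip(rows, cluster_ids):
        for col, value in enumerate(row):
            if value is not None:
                key = (cluster, col)
                sums[key] = sums.get(key, 0.0) + value
                counts[key] = counts.get(key, 0) + 1

    # Second pass: fill the gaps; a value stays None if its cluster has
    # no known value at all in that column.
    filled = []
    for row, cluster in zip(rows, cluster_ids):
        new_row = []
        for col, value in enumerate(row):
            key = (cluster, col)
            if value is None and key in counts:
                value = sums[key] / counts[key]
            new_row.append(value)
        filled.append(new_row)
    return filled

# Toy example: the second column has gaps, the cluster ids come from k-means.
rows = [[1.0, 10.0], [3.0, None], [100.0, None], [200.0, 50.0]]
result = impute_by_cluster_mean(rows, [0, 0, 1, 1])
print(result)
```

The input rows are left unchanged and a new list is returned, which makes it easy to compare the data before and after imputation.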
-
k-means and fp-growth
Hello to all friends, I intend to use FP-Growth with k-means clustering and compare the result with what I get when I use only FP-Growth. I have done this implementation, but the results of both implementations are the same while they should be different. Can anyone tell me where I went wrong and what I need to change? I have used the…
-
Use k-means and FP-Growth together
Hi everybody, my dataset is MovieLens 100k. I want to use k-means clustering on this dataset and then use FP-Growth on each cluster, but I don't know how to do this. I hope you can help. Thank you.
-
Computations for Cluster Distance Performance operator
I am having trouble replicating the computations of the "avg. within cluster distance" metrics produced by the Performance (Cluster Distance Performance) operator. The operator documentation states - "avg._within_centroid_distance: The average within cluster distance is calculated by averaging the distance between the…
-
How to get performance after k-Means Clustering?
Hi everybody, I have a problem. I want to use a performance operator after k-Means Clustering. For this I have to use Map Clustering on Labels after the clustering, and when I run the process I get an error saying I must change the value of k, while I am not allowed to change the value of k because I am doing a thesis and it not…
-
How to deploy a non-predictive model such as an outlier detection or clustering model using Custom Deploy
I have an RM process that contains the K-Means and CMGOS models for clustering and outlier detection, but when I try to deploy these models using Custom Deploy, it prompts me to upload either a predictive model or a Group model containing a predictive model as the last model. Since I do not have a predictive model in my use case…