Right values for k and max runs in k-Means, and for epsilon and min points in DBSCAN
Elu
Hi All,
Please, I would like to know how one can tell the right values for k and max runs when using the k-Means algorithm. How do I evaluate and interpret the results to know when the value of k is right? Is there a comprehensive video or other material showing this? Also, how would I know the right values for epsilon and min points in DBSCAN, and how do I evaluate and interpret those results? Is there a comprehensive video or other material showing this as well? Thanks
Accepted answers
rfuentealba
Hello @Elu,
Well, this is a tough question: although k-Means is popular, choosing the value of k is a frequent topic of discussion, and it depends on your experience. I can share two things with you today, though. One is that you may want to use x-Means, which works like k-Means but determines k with a heuristic method rather than taking a manually entered value. The other is that you may want to use the elbow method to determine k, which is reasonable. A good tutorial on this can be found here.
Calculating epsilon and the min points for DBSCAN follows the same principle, but uses the k-NN distances in a matrix of points. Calculate the average distance from every point to its k nearest neighbors, sort those values in ascending order, plot the result and find the knee of the curve; the Y value at the knee is your epsilon setting. The knee is the threshold where the k-distance curve changes sharply. Now, I don't know how to determine k for this, as I've mostly used the same k as in k-Means.
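The steps above can be sketched in plain Python roughly as follows. This is only a sketch under assumptions: the sample points, the helper names `mean_knn_distances` and `knee_index`, and the choice of k = 3 are all made up for illustration. The knee is picked as the point that sits farthest below the straight line joining the two ends of the sorted curve, which is one common heuristic and not the only way to read the plot.

```python
import math

def mean_knn_distances(points, k):
    """Average distance from each point to its k nearest neighbours, sorted ascending."""
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(sum(dists[:k]) / k)
    return sorted(result)

def knee_index(values):
    """Index where the sorted curve lies farthest below the line joining its end points."""
    n = len(values)
    slope = (values[-1] - values[0]) / (n - 1)
    return max(range(n), key=lambda i: values[0] + slope * i - values[i])

# Hypothetical toy data: two tight groups plus two isolated points
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
    (5, 20), (20, 5),
]

k = 3  # assumed neighbour count, not a recommendation
dists = mean_knn_distances(points, k)
eps = dists[knee_index(dists)]  # candidate epsilon read off at the knee

for rank, d in enumerate(dists):
    print(rank, round(d, 3))
print("candidate epsilon:", round(eps, 3))
```

Points inside the dense groups end up on the flat part of the curve and the isolated points on the steep part, so the value at the knee is a starting guess for epsilon that you would still want to check against the plot.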
Hope this helps,
Rodrigo.
All comments
Elu
Where do you then input the k-Means value and the squared errors value? I do not fully understand the elbow method.
rfuentealba
Hi @Elu,
I didn't read this before. The elbow method is just a way to determine the value of k graphically. Basically, you set a value for k, run the algorithm, and note the squared error, that is, the sum of squared distances from each example to the center of its cluster. You repeat this for several values of k and plot the error against k. The point where the error stops dropping dramatically is the elbow, and that is your best k. But it's trial and error, as there is no way for us to make sense of each k value beforehand.
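To make the loop concrete, here is a minimal plain-Python sketch of the idea. Everything in it is an assumption for illustration: the toy points, the function name `kmeans_sse`, the `max_iterations` parameter, and the deterministic farthest-point start (most implementations use random starts instead). The elbow is chosen as the k where the curve bends the most, measured by the second difference of the squared error, which is just one simple way to pick it instead of reading the plot by eye.

```python
import math

def kmeans_sse(points, k, max_iterations=100):
    """Run a basic k-means and return the within-cluster sum of squared errors."""
    # Assumed deterministic start: first point, then repeatedly the point
    # farthest from the centers chosen so far
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(max_iterations):
        # Assign every point to its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty)
        new_centers = [
            tuple(sum(coord) / len(c) for coord in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)

# Hypothetical toy data: three small, well separated groups
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 9), (5, 10), (6, 9), (6, 10),
]

# Squared error for each candidate k
sse = {k: kmeans_sse(points, k) for k in range(1, 7)}

# Second difference of the curve: large where the drop suddenly flattens
bend = {k: sse[k - 1] - 2 * sse[k] + sse[k + 1] for k in range(2, 6)}
best_k = max(bend, key=bend.get)

for k in sorted(sse):
    print(k, round(sse[k], 2))
print("elbow at k =", best_k)
```

So you never type the squared error in anywhere: k is the input you vary, and the squared error is the output you read back and plot for each run.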
All the best,
Rodrigo.