Optimizing the number of topics in an LDA topic analysis
Studentul_86
New Altair Community Member
I am a fairly new user of RapidMiner.
I am currently trying to run an LDA topic analysis. Following the comments posted so far, I used the Optimize Parameters (Grid) operator with the number of topics as the parameter to optimize, based on the results of the LDA analysis.
However, I get this error: "The setup does not seem to contain any obvious error, but you should check the log messages or activate the debug mode in the settings dialog in order to get more information about this problem." I also tried configuring the optimization operator to ignore errors that occur while the process runs, but that did not help.
I am attaching the error message, the process, and the sub-process.
I would highly appreciate any help with this issue. Thanks.
Best Answer
-
Hi @Studentul_86,
Strange indeed...
Can you share your process and your dataset so that we can reproduce the problem and understand what is going on?
EDIT:
I see that you have not connected the performance output port of the LDA operator to the performance output port of the Optimize Parameters subprocess. Please connect these two ports.
Regards,
Lionel
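To illustrate why that connection matters, here is a minimal Python sketch of a grid search, written outside RapidMiner and not a description of how the Optimize Parameters operator is implemented. The names `grid_search_min`, `toy_score`, and `broken_inner_process` are hypothetical, and `toy_score` is only a stand-in for whatever performance value the LDA operator reports. The point is simply that the search can compare candidates only if the inner process hands a value back for each of them.

```python
from typing import Callable, Iterable, Optional, Tuple


def grid_search_min(
    candidates: Iterable[int],
    evaluate: Callable[[int], Optional[float]],
) -> Tuple[int, float]:
    """Return the candidate with the lowest score, together with that score."""
    best: Optional[Tuple[int, float]] = None
    for k in candidates:
        score = evaluate(k)
        if score is None:
            # Comparable to an unconnected performance port:
            # the search has nothing to compare for this candidate.
            raise ValueError(f"no performance value returned for k={k}")
        if best is None or score < best[1]:
            best = (k, score)
    if best is None:
        raise ValueError("no candidates were supplied")
    return best


def toy_score(k: int) -> float:
    """Hypothetical stand-in for an LDA quality measure; lowest at k = 8."""
    return (k - 8) ** 2 + 100.0


def broken_inner_process(k: int) -> Optional[float]:
    """Computes a score but never returns it to the caller."""
    _ = toy_score(k)
    return None


best_k, best_score = grid_search_min(range(2, 21, 2), toy_score)
print(best_k, best_score)  # prints "8 100.0"
```

Calling `grid_search_min([2, 4], broken_inner_process)` raises a `ValueError`, which is the sketch's counterpart to an optimization loop that receives no performance value.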
Answers
-
Hello Lionel,
Thank you for your feedback.
I have made the modification you suggested.
Is there a limit on the amount of text this operator can handle? For instance, with a sample of 70 PDFs averaging 50 pages each, it gives me the same error again. I tried to upload some of the files I am using for the analysis, but it seems zip files are not allowed. Or should the maximum number of topics be set lower?
Thank you.
Best regards,
Valentin.
-
Hello colleagues,
I really need your support this weekend, as my paper is due at the end of next week.
Can somebody help me set up this parameter optimization? Or is there an alternative RapidMiner tool for this purpose?
Thank you for your support.
Valentin.
-
@Studentul_86,
Can you please share some of your .pdf files so that we can run your process...?
Regards,
Lionel
-
Hello Lionel,
This is just one of the samples for which I need to find the optimal number of topics for further LDA topic modelling.
Thank you for your support.
Valentin.
-
This is the second part of the files I have been struggling to analyze in order to optimize the number of topics.
-
I tried to upload a zip file, but it seems the platform does not support that format.
-
@Studentul_86,
I think the problem is caused by the large number of .pdf files you want to analyze.
The process works just fine with 2 .pdf files.
Try reducing the number of documents you analyze at one time, or run your process on a more powerful machine (more memory).
Regards,
Lionel
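One way to act on this advice is to rerun the analysis on progressively larger subsets of the documents and note the largest size that still completes. The sketch below shows that idea in plain Python under stated assumptions: `growing_sizes` and `largest_working_size` are hypothetical helpers, and `run` is a placeholder for whatever launches the analysis on a list of documents. It is not part of RapidMiner.

```python
from typing import Callable, List, Optional, Sequence


def growing_sizes(total: int, start: int = 2, factor: int = 2) -> List[int]:
    """Subset sizes start, start*factor, ... ending with the full total."""
    if total < 1 or start < 1 or factor < 2:
        raise ValueError("total and start must be >= 1, factor must be >= 2")
    sizes: List[int] = []
    n = min(start, total)
    while n < total:
        sizes.append(n)
        n *= factor
    sizes.append(total)  # always finish with the complete collection
    return sizes


def largest_working_size(
    docs: Sequence[str],
    run: Callable[[Sequence[str]], object],
) -> Optional[int]:
    """Run the analysis on growing prefixes of docs.

    Returns the largest subset size that completed, or None if even the
    smallest subset failed or there are no documents.
    """
    if not docs:
        return None
    last_ok: Optional[int] = None
    for n in growing_sizes(len(docs)):
        try:
            run(docs[:n])
        except Exception:
            # Any failure (out of memory or otherwise) stops the probe.
            break
        last_ok = n
    return last_ok
```

For a collection of 70 documents, `growing_sizes(70)` gives `[2, 4, 8, 16, 32, 64, 70]`, so at most seven runs are needed to bracket the size at which the process starts to fail.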