Hi,
I really need the community's help. I have already tried all the solutions suggested in other community posts about the Filter Stopwords operator, but nothing has worked so far.
I have reviews from which I want to extract topics with LDA. I followed tutorials on how to pre-process the data, filter stopwords, etc., but unfortunately it does not seem to work. Despite using Transform Cases to convert everything to lowercase, I still have words with capital letters in my output, and the stopwords I listed in the attached .txt file are not filtered out. The Replace Tokens operator does not seem to work either.
Because the Filter Tokens by POS operator takes a lot of time, I used a sample of only 100 reviews (the sampling can be enabled or disabled at any time). I also tried the process without Filter Tokens by POS and with the whole data set, with the same result. I have attached all my files and processes. Could you please help me with my process? Thank you so much!
I am not sure if this goes too far for one post, but could someone also tell me how to determine the ideal number of topics for LDA?
Thank you, Larissa