Stuck at LDA process. No results are coming

lambamanika07
lambamanika07 New Altair Community Member
edited November 2024 in Community Q&A
I updated RapidMiner, and since then I cannot get any results from my LDA process. I am attaching screenshots of the process and the sub-processes I have been trying for the last 2-3 days, but the results only show 'NA'. Kindly help.


Answers

  • MartinLiebig
    MartinLiebig
    Altair Employee
    Hi,
    can you please check whether the collection of documents contains proper documents? I.e., does it contain items, and do those items actually contain text?

    Best,
    Martin
  • lambamanika07
    lambamanika07 New Altair Community Member
    Hi Martin

    Yes, I have checked many times. I tried with both text files and PDF files. I even tried different text samples, but had no luck. The results show NA, as in the screenshot.
  • MartinLiebig
    MartinLiebig
    Altair Employee
    edited September 2019
    Hi,
    is this 'western' text? LDA uses a default tokenization that splits on characters like spaces and so on. This may fail completely if the text is not in the Latin alphabet.

    Best,
    Martin
  • lambamanika07
    lambamanika07 New Altair Community Member
    Hi Martin

    The text is in English. I ran the same samples a few weeks ago for testing and it worked fine. At that time I was using version 8 of RapidMiner. I have been facing this problem since I upgraded to the latest version 9. I do not think the upgrade would be causing any problem, but I am mentioning it just in case.
  • MartinLiebig
    MartinLiebig
    Altair Employee
    Hi,
    can you maybe share the data and processes via private message? I would love to have a look at this.

    BR,
    Martin
  • lambamanika07
    lambamanika07 New Altair Community Member
    Hi Martin

    I have sent you a private message with the sample text and the process. Thank you in advance for your help.
  • MartinLiebig
    MartinLiebig
    Altair Employee
    Answer ✓

    Your file is encoded in UTF-8. If you are using Windows, you need to set the encoding of Read Document to UTF-8. Otherwise strange things happen with characters like é.

    Further, you should use a Tokenize operator before your text mining operators. Operators like 'Stem' or 'n-grams' work on tokens. This may have duplicated your data.

    Lastly: can you quickly confirm that the number of topics you search for is smaller than the number of documents? Searching for 5 topics in 2 documents is doomed to fail.

    Best,
    Martin
  • lambamanika07
    lambamanika07 New Altair Community Member
    It worked! Thank you so much. 
  • MartinLiebig
    MartinLiebig
    Altair Employee
    What was the problem here? The UTF-8 encoding or the tokenization?

    BR,
    Martin
  • lambamanika07
    lambamanika07 New Altair Community Member
    Hey Martin

    I made both changes to the process as suggested, selecting UTF-8 as the encoding and adding a Tokenize operator, and then it worked.

    With regards
    Manika
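
For readers who want to see the three points from the accepted answer outside of RapidMiner, here is a minimal Python sketch. It is plain Python for illustration only, not RapidMiner code, and it does not reproduce what the Read Document, Tokenize or LDA operators do internally. All function names are hypothetical, and cp1252 is only assumed here as a typical Windows default encoding.

```python
from typing import List


def decode_bytes(raw: bytes, encoding: str) -> str:
    """Decode file bytes with the given encoding (hypothetical helper)."""
    return raw.decode(encoding)


def whitespace_tokenize(text: str) -> List[str]:
    """Very simple tokenizer: lowercase the text and split on whitespace."""
    return text.lower().split()


def bigrams(tokens: List[str]) -> List[str]:
    """Build word bigrams; this needs a token list, not a raw string."""
    return [a + "_" + b for a, b in zip(tokens, tokens[1:])]


def topics_are_plausible(num_topics: int, num_documents: int) -> bool:
    """Rule of thumb from the answer: fewer topics than documents."""
    return 0 < num_topics < num_documents


# 1) Encoding: the same UTF-8 bytes read with the wrong encoding.
raw = "café résumé".encode("utf-8")
good = decode_bytes(raw, "utf-8")   # text as intended
bad = decode_bytes(raw, "cp1252")   # each 'é' turns into two wrong characters

# 2) Tokenize first, then apply token-level steps such as n-grams.
tokens = whitespace_tokenize("Topic models need tokens")
pairs = bigrams(tokens)

# 3) Sanity check before running the topic model.
too_many = topics_are_plausible(5, 2)    # 5 topics in 2 documents
reasonable = topics_are_plausible(5, 50)

print(good, "|", bad)
print(pairs)
print(too_many, reasonable)
```

In this sketch the wrongly decoded string is longer than the original, because every two-byte character is read as two separate characters. That is the kind of damage that can make downstream text processing behave strangely.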
