"text mining"

Deepu
Deepu New Altair Community Member
edited November 5 in Community Q&A
Hi, I tried to do some text mining but had a lot of trouble understanding the processes. Here is what I want to do.
I have about 10 PDF files. They are research articles on gene studies. I want to extract the gene names mentioned in the articles. How do I do it? Please help. Thank you in advance.

Answers

  • land
    land New Altair Community Member
    Hi,
    what you are describing is Named Entity Recognition (NER). RapidMiner itself does not support this yet. Please have a look at the extensions from the AI chair of the Technical University of Dortmund, where an NER extension has been published.
    Unfortunately, there is no golden button that you can click to make RapidMiner do what you want, even with this extension.
    So you need a lot of experience with RapidMiner and/or NER; without it, this will be a nearly impossible task for you. You will have to familiarize yourself with RapidMiner first before tackling such a hard field, and that will take some time.

    I would recommend watching the RapidMiner tutorial videos available on YouTube, then working through the sample processes and taking a deeper look at them. After that you will have more orientation and can ask detailed questions that can be answered. General questions like "How do I do it", where "it" is a complete field of techniques, can't be answered, I'm sorry.

    Greetings,
      Sebastian
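
To make the Named Entity Recognition idea from the answer above a little more concrete, here is a minimal sketch in plain Python. It is not RapidMiner and not the Dortmund NER extension; it is a simple dictionary lookup, which is only the crudest form of NER. It assumes the text has already been extracted from the PDF files into strings. The names `KNOWN_GENES`, `extract_gene_mentions`, and `extract_from_documents` are hypothetical, and the four gene symbols are only placeholders for a real, curated gene lexicon.

```python
import re

# Hypothetical placeholder lexicon; a real one would come from a curated gene resource.
KNOWN_GENES = {"BRCA1", "BRCA2", "TP53", "EGFR"}

# Candidate tokens: an uppercase letter followed by one or more uppercase letters or digits.
TOKEN_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+\b")


def extract_gene_mentions(text, lexicon=KNOWN_GENES):
    """Return the sorted, de-duplicated lexicon entries that occur as tokens in text."""
    candidates = TOKEN_PATTERN.findall(text)
    return sorted({token for token in candidates if token in lexicon})


def extract_from_documents(documents, lexicon=KNOWN_GENES):
    """Map each document name to the gene symbols found in its (already extracted) text."""
    return {name: extract_gene_mentions(text, lexicon) for name, text in documents.items()}


if __name__ == "__main__":
    # Assumption: these strings stand in for text extracted from the PDF articles.
    docs = {
        "article1": "Mutations in BRCA1 and TP53 were observed; DNA was sequenced.",
        "article2": "EGFR expression was elevated in EGFR-mutant samples.",
    }
    for name, genes in extract_from_documents(docs).items():
        print(name, genes)
```

A lookup like this only finds symbols that are already in the lexicon and cannot tell a gene symbol from an identical abbreviation, which is part of why real NER on research articles is as hard as described above.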