Information for a bachelor's thesis

hoetzels
hoetzels New Altair Community Member
edited November 5 in Community Q&A

Hello everybody,

At the moment I am writing my bachelor's thesis for a German company.

My topic is to show some ways in which huge amounts of data can be summarized. The data isn't stored in a database; it arrives, for example, in an email inbox as PDF or Office (Word/Excel) files. The person who sends the data shouldn't have to do any work to change it or fit it into a special format.

Is it possible to use a RapidMiner program to get the crucial information out of a mass of data? And can I trace information back to the document it came from?

I would be very grateful for any information.

Thanks

Answers

  • land
    land New Altair Community Member
    Hi,

    Yes, this is possible in general. All you need is to design a process that can extract the important content from the text documents. If you then install RapidAnalytics, it can automatically listen to an email inbox and retrieve and process each incoming mail.
    The real problem lies in finding a good data mining process for the content extraction...

    Greetings,
      Sebastian
  • hoetzels
    hoetzels New Altair Community Member
    Thanks for the response,


    What do you mean by a good data mining process (just in a few words)?
  • land
    land New Altair Community Member
    Hi,
    That's easy: a good process is one that fulfills all goals of a given task with low memory consumption and runtime. Non-functional properties, such as a simple process setup that is easy to maintain, can be added, too.

    Greetings,
    Sebastian
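
To make the idea of tracing a result back to its document more concrete, here is a minimal sketch in plain Python. It is not a RapidMiner or RapidAnalytics process and does not reflect how either product works internally. It assumes the mail is already available as raw bytes, handles only plain-text attachments (PDF and Office files would need a dedicated extraction library), and uses simple word counts as a stand-in for real content extraction. The names `extract_records`, `top_terms`, and `build_demo_mail` are hypothetical.

```python
import re
from collections import Counter
from email import message_from_bytes
from email.message import EmailMessage
from email.policy import default


def extract_records(raw_mail: bytes):
    """Yield one record per plain-text attachment, keeping its origin."""
    msg = message_from_bytes(raw_mail, policy=default)
    for part in msg.iter_attachments():
        if part.get_content_type() != "text/plain":
            # Assumption: PDF/Office parts are skipped here; they would
            # need a separate text extractor.
            continue
        yield {
            "message_id": str(msg["Message-ID"]),
            "filename": part.get_filename(),
            "text": part.get_content(),
        }


def top_terms(records, n=3):
    """Map the n most frequent terms to the files they occur in."""
    counts = Counter()
    sources = {}
    for rec in records:
        for word in re.findall(r"[a-z]{4,}", rec["text"].lower()):
            counts[word] += 1
            sources.setdefault(word, set()).add(rec["filename"])
    return {word: sorted(sources[word]) for word, _ in counts.most_common(n)}


def build_demo_mail() -> bytes:
    """Build a small in-memory mail with two text attachments (demo data)."""
    msg = EmailMessage()
    msg["Message-ID"] = "<demo-1@example.invalid>"
    msg["Subject"] = "reports"
    msg.set_content("see attachments")
    msg.add_attachment("pump failure pump", filename="a.txt")
    msg.add_attachment("pump maintenance", filename="b.txt")
    return msg.as_bytes()


if __name__ == "__main__":
    records = list(extract_records(build_demo_mail()))
    print(top_terms(records, n=1))
```

Each summarized term carries the list of attachment file names it was found in, which is one simple way to keep the link from an extracted piece of information to its source document.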