Information for bachelor thesis
hoetzels
New Altair Community Member
Hello everybody,
At the moment I am writing my bachelor thesis for a German company.
My subject is to show some ways in which huge amounts of data can be summarized. The data are not stored in a database; they arrive, for example, in an email inbox as PDF or Office (Word/Excel) files. The person who sends the data should not have to do any work to change the data or fit them into a special format.
Is it possible to use a RapidMiner program to get the crucial information out of a mass of data? And can I trace information back to the source document?
I would be very grateful for any information.
Thanks
Answers
Hi,
Yes, this is possible in general. All you need to do is design a process that can extract the important content from the text documents. If you then install RapidAnalytics, it can automatically monitor an email inbox and retrieve and process each incoming mail.
The real problem lies in finding a good data mining process for the content extraction...
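To make the idea of content extraction with traceability a bit more concrete, here is a minimal sketch in plain Python. It is not a RapidMiner process, only an illustration of the principle. It assumes the text has already been extracted from the PDF or Office files; the file names, the sample texts, the stop word list and the function names are all hypothetical. It counts the most frequent terms across all documents and remembers which document each term came from, so every summarized term can be traced back to its source.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stand-ins for text already extracted from PDF/Word/Excel files.
documents = {
    "mail_001.pdf": "Invoice total overdue. Payment overdue since March.",
    "mail_002.docx": "Meeting notes: budget approved, payment schedule fixed.",
}

# Tiny illustrative stop word list; a real process would use a fuller one.
STOPWORDS = {"the", "a", "and", "since", "is", "of"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(docs, top_n=3):
    """Return the top_n most frequent terms with their counts and source documents."""
    counts = Counter()
    sources = defaultdict(set)  # term -> names of the documents containing it
    for name, text in docs.items():
        for word in tokenize(text):
            counts[word] += 1
            sources[word].add(name)
    return [(term, n, sorted(sources[term])) for term, n in counts.most_common(top_n)]


for term, n, where in summarize(documents):
    print(term, n, where)
```

Plain term frequency is only the simplest possible choice here; the hard part remains deciding what counts as "important content" for the task at hand.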
Greetings,
Sebastian
Thanks for the response,
what do you mean by a good data mining process (just in a few words)?
Hi,
That's easy: a good process is one that fulfills all goals of a given task with low memory consumption and runtime. Some non-functional properties, such as a simple process setup that makes the process easy to maintain, can be added too.
Greetings,
Sebastian