[SOLVED] Information Extraction Plugin TreeCreatorAndPreprocessor Problem

Question

Hi,

I'd like to use the Information Extraction Plugin 1.0.2 for RapidMiner, but I have a problem. I'd like to create a parse tree with the TreeCreatorAndPreprocessor operator and visualize it with the ParseTreeVisualizer operator. I'm using only one sentence, stored in a text-type attribute (the attribute is named textattribute in my workflow).

I have two workflows:

1. The tree is already given by its string representation, as produced by the Stanford Parser: (ROOT (S (NP (NNP Felix)) (VP (VBD went) (PP (TO to) (NP (NNP New) (NNP York))) (S (VP (TO to) (VP (VB visit) (NP (NP (DT the) (NN statue)) (PP (IN of) (NP (NN liberty)))))))) (. .)))
2. A plain sentence (Felix went to New York to visit the statue of liberty.) is contained in the attribute selected by the parameter valueAttribute.

The TreeCreatorAndPreprocessor operator generates an empty (?) structID, the object attribute that is used to store the parse tree cannot be created, and the ParseTreeVisualizer prints nothing.

In case 1, the "needParsing" option does not need to be selected, because my sentence is already parsed. In case 2, the "needParsing" option needs to be selected, because I only have a plain sentence without parsing. But where can I download the model file (MODELFILE) that is needed to create the parse tree? I found the Stanford Parser model file englishPCFG.ser (it is included in the archive linked below). Is this the correct model file? If not, where can I download the correct one?

Could you help me by answering, by correcting my workflow so that it creates and visualizes the parse tree, or by sending me a sample process in which the TreeCreatorAndPreprocessor operator works? I have tried several versions of RapidMiner (5.0, 5.1, 5.2 and the newest one too).

I have attached my two RapidMiner workflow files together with my two input text files, the downloaded model file, and the plugin: http://dobi.web.elte.hu/rapidminer_workflow.rar

Thank you in advance for your help.

Best regards,
Hadobás András

chaitanya_live · Answer

I have the same issue with TreeCreatorAndPreprocessor on a text document. I've tried many versions of the Stanford parser to see whether the older versions produce models that can be used here, but the result is always the same: the ID and structID attributes contain only missing ("?") values.
Please advise on a solution.

dobiHUN · Answer

mahdilashkari wrote:
Hi
How did you solve your problem? I also have this problem.

Hi,

This is Felix Jungermann's answer:

Dear Andras,
the first rmp-file is almost correct: the 'bracketed' string is converted into a special attribute type, which is stored in the structID attribute. That attribute is used only by certain tree-structure operators. To visualize the tree, you should use the Visualizer as I did in the attached rmp-file.
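
As a rough illustration of what the bracketed string encodes, the following Python sketch reads such a string into nested tuples and recovers the words. It is not the plugin's code: the function names parse_bracketed and leaves are made up for this example, and the plugin's internal representation may differ. It can also serve as a quick check that the brackets are balanced before the string is stored in textattribute.

```python
import re

# Example tree from the first workflow (stray space in "(IN of )" removed).
TREE = (
    "(ROOT (S (NP (NNP Felix)) "
    "(VP (VBD went) (PP (TO to) (NP (NNP New) (NNP York))) "
    "(S (VP (TO to) (VP (VB visit) "
    "(NP (NP (DT the) (NN statue)) (PP (IN of) (NP (NN liberty)))))))) "
    "(. .)))"
)


def parse_bracketed(text):
    """Return the tree as nested (label, children) tuples; words stay strings.

    Hypothetical helper for illustration only, not part of any plugin API.
    """
    # Tokens are "(", ")" or any run of characters without spaces/brackets.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def next_token():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets: input ended too early")
        token = tokens[pos]
        pos += 1
        return token

    def node():
        # Called right after an opening bracket has been consumed.
        label = next_token()
        if label in ("(", ")"):
            raise ValueError("expected a label after '('")
        children = []
        while True:
            token = next_token()
            if token == ")":
                return (label, children)
            children.append(node() if token == "(" else token)

    if next_token() != "(":
        raise ValueError("tree must start with '('")
    tree = node()
    if pos != len(tokens):
        raise ValueError("unbalanced brackets: text after the closing bracket")
    return tree


def leaves(tree):
    """Collect the words (leaf strings) of the tree from left to right."""
    _label, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


if __name__ == "__main__":
    parsed = parse_bracketed(TREE)
    print(parsed[0])                 # label of the root node
    print(" ".join(leaves(parsed)))  # the words of the original sentence
```

If parse_bracketed raises a ValueError, the string is not a well-formed bracketing, which would be worth ruling out before looking for problems in the process itself.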

For the second experiment, you should use the attached model. Unfortunately, the Stanford parser used by the plugin is not up to date, which means that only older models can be used.

Additionally, I do not work at the university any more so I am not really able to keep the plugin up to date.
But I will try to help you if you have further questions.

Best,
Felix

You can download both archives:
the old one: http://dobi.web.elte.hu/rapidminer_workflow.rar
the new one: http://dobi.web.elte.hu/Andras.rar
Use the model file that is in Andras.rar, and it will work.

Bye,
Andras

lashkari_ma · Answer

Hi
How did you solve your problem? I also have this problem.