"Java integration - web mining: Get Page"
Dear All,
I'm interested in using RM's capability in my own Java program. I want to use the 'Web Mining' extension available in RM in order to retrieve web pages and do further processing afterwards.
I can already initialize RM from my program. I then created the 'Get Page' operator and set all the required parameters. My problem is: how can I retrieve the result of the 'Get Page' operator? In RM5 I can see that the output of 'Get Page' is a Document, but I cannot find Document anywhere in the javadoc. Below is my code so far:
Can anyone help me obtain the result from the 'Get Page' operator?
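// Needs: com.rapidminer.RapidMiner, com.rapidminer.operator.Operator,
// com.rapidminer.operator.IOContainer, com.rapidminer.tools.OperatorService
RapidMiner.init();
try
{
    // Create the operator and set its parameters
    Operator op = OperatorService.createOperator("web:extract_html_text_content");
    op.setEnabled(true);
    op.setExpanded(true);
    op.setParameter("random_user_agent", "true");
    op.setParameter("url", "http://news.google.com");
    // Run the operator with an empty input container
    IOContainer container = op.apply(new IOContainer());
    // Open question: how to get the Document out of 'container'
}
catch (Exception e)
{
    // Creating or running the operator can fail
    e.printStackTrace();
}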
Thank you!