Hello RapidMiner Community,
I started working with RapidMiner about two weeks ago for educational purposes, and I really like the tool. For the past two days, however, I have been completely stuck. My problem relates to using the crawling, extract information, or cutting operators. To be precise, it is about using XPath queries (at the moment I have to use workarounds with regex, but they are not really consistent).
My problem is the following:
Consider, for example, the famous IMDb reviews, e.g. http://www.imdb.com/title/tt0307901/reviews?start=0. Whenever I try to extract a specific element, I get stuck with my query. For example, to extract the individual review texts, I tried the selector " .//*[@id='tn15content']/p[1] ", as suggested by Chrome's developer tools, but when I use it in RapidMiner I do not get a single result.
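To make clear what I expect this selector to return, here is a minimal sketch in Python using only the standard library. The markup is a hypothetical, heavily simplified stand-in for the review page (the real page has far more structure and is not well-formed XML), and the names select_texts, PLAIN, and NAMESPACED are made up for illustration. The second half shows a general XPath behaviour that might be related: if the elements sit in the XHTML namespace, an unprefixed "p" matches nothing. Whether RapidMiner's parsing does this is only my assumption, not something I have confirmed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the review page markup.
PLAIN = """<html><body><div id="tn15content">
<p>First review text.</p>
<p>Second review text.</p>
</div></body></html>"""

# Same markup, but with every element placed in the XHTML namespace.
XHTML_NS = "http://www.w3.org/1999/xhtml"
NAMESPACED = PLAIN.replace("<html>", '<html xmlns="%s">' % XHTML_NS)

# The selector from the post, and a variant with a namespace prefix on "p".
QUERY = ".//*[@id='tn15content']/p[1]"
PREFIXED_QUERY = ".//*[@id='tn15content']/h:p[1]"


def select_texts(markup, query, namespaces=None):
    """Parse the markup and return the text of every element the query matches."""
    root = ET.fromstring(markup)
    return [element.text for element in root.findall(query, namespaces)]


# Without namespaces, the selector finds the first paragraph.
print(select_texts(PLAIN, QUERY))
# With a default namespace, the unprefixed "p" no longer matches.
print(select_texts(NAMESPACED, QUERY))
# Binding a prefix (here "h", an arbitrary choice) to the namespace matches again.
print(select_texts(NAMESPACED, PREFIXED_QUERY, {"h": XHTML_NS}))
```

In this sketch only the first and third calls return the first review text. The second returns an empty list, which looks a lot like the "no result" I see in RapidMiner.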
As you can see, I am a total beginner with XPath (sorry for that; data science in general is a completely new area for me, and I know I am getting old for learning new stuff). I really could not find an answer to this kind of question in previous threads that I could understand; they always seem too sophisticated for my limited understanding. So if you could give me some hints, examples, or resources on how to use and practice XPath queries in RapidMiner, that would be very nice.
Kind regards
Morgan!