Hello. I am new to RapidMiner and I am wondering whether it is suitable for my project.
My project needs to analyze the similarity or dissimilarity between the meta keywords contained in web pages.
My basic questions for this type of analysis are:
- Can RapidMiner take a list of URLs and crawl those sites, grabbing ONLY the meta keywords? I am not interested in analyzing the full content of the websites, only in categorizing and analyzing the meta keywords they contain.
- Can RapidMiner do some standard categorization on the meta keywords, producing frequency lists and themes of words?
- Can it then produce a graph of that analysis?
- Can RapidMiner be configured to give certain words more weight? For example, the word "employment", if it appears in a page's meta keywords, would "weigh" more heavily in the results than any other word in this analysis. If so, how is that feature accomplished?
- What would be the general steps to take to import the data and provide this analysis?
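To make the intended analysis concrete, here is a rough sketch in plain Python (not RapidMiner) of what I have in mind. It is only an illustration under my own assumptions: the sample pages, the weight of 3.0 for "employment", and the names MetaKeywordParser, extract_keywords and weighted_frequencies are all made up for this example, and real pages would be fetched from the URL list rather than hard-coded.

```python
from collections import Counter
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collects the content of <meta name="keywords"> tags and ignores the rest of the page."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            # Meta keywords are conventionally comma-separated.
            self.keywords.extend(
                k.strip().lower() for k in content.split(",") if k.strip()
            )


def extract_keywords(html):
    """Return the list of meta keywords found in one HTML document."""
    parser = MetaKeywordParser()
    parser.feed(html)
    return parser.keywords


def weighted_frequencies(pages, weights):
    """Count words across all pages; words listed in `weights` count extra."""
    counts = Counter()
    for html in pages:
        for keyword in extract_keywords(html):
            for word in keyword.split():
                counts[word] += weights.get(word, 1.0)
    return counts


# Hypothetical sample data standing in for crawled pages.
SAMPLE_PAGES = [
    '<html><head><meta name="keywords" content="Employment, jobs, careers"></head></html>',
    '<html><head><meta name="keywords" content="jobs, training"></head></html>',
]
# Assumed weighting: "employment" counts three times as much as other words.
WEIGHTS = {"employment": 3.0}

if __name__ == "__main__":
    for word, score in weighted_frequencies(SAMPLE_PAGES, WEIGHTS).most_common():
        print(word, score)
```

The resulting word/score table is the kind of frequency list I would like to graph, ideally produced by RapidMiner itself.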
Thank you.