-
Process Mining in RapidMiner
Hey! I'm currently working on my Master's thesis, which focuses on improving Process Mining visualizations to enhance user interaction within Moodle, the Learning Management System. I'm using RapidMiner for this project and have been experimenting with the following operators: Attached are a) the visualizations I've…
-
Get all links from li tags on a web page
Hi, I'm trying to get all the links inside HTML li tags. Any help, please? I tried a few approaches but am not getting any result. Process Documents and Extract Information did not work. Here is the sample data. Need link in a tag in each * <div><div class="container"></div><div><div class="catproducttitle"></div><div><h2>All Furniture…
-
Can't use Crawl Web with RapidMiner Studio 10.3
Crawl Web throws an error as soon as I click Run: "org/apache/tika/parser/html/HtmlParse Exception: java.lang.NoClassDefFoundError". Some people said it worked fine in version 9.x, but now something is off and RapidMiner can't access the HTML parser API.
-
Get Page operator stalls RapidMiner (SOLVED)
When running the process below (with web mining and text mining extensions loaded) RapidMiner stalls when trying to display the results. It eventually shows the results but something seems to be running in the background and it makes RapidMiner very sluggish. I've been using this for years. Also tried version 10 and I'm…
-
Operator Crawl: Process failed
Hi, I have installed the latest version 10.1.001 and I have a problem with the Crawl operator. The process fails, and here is the error message. I have checked the version of Java and the version is 1.8.0_361 * Exception: java.lang.NoClassDefFoundError * Message: org/apache/tika/parser/html/HtmlParser * Stack trace: *…
-
Currently training with Text Mining, always receive Process Failed during Extract Content
In every single training example, this happens whenever it gets to Extract Content from data. I always disabled it so that I could continue with the tutorial, as I was plugging in my own data sets. Now that it's a little complicated, I decided to use the samples, but it errors here. I've also encountered this on another machine.…
-
Operator 'Get Pages' not running on AI Hub
Hi, I have a process running on an AI Hub where I have the operator 'Get Pages' (ext. Web Mining) embedded. When I run the process in RM Studio, everything is fine. When I run the process on AI Hub but start it from RM Studio ('Run Process on AI Hub'), everything is fine. But when I kick off the web service I created,…
-
Problems with processing the answer from a GET request
Hi guys, I want to mine performance data of footballers for an essay. As a source I found Goaloo1 (I can't post links yet). The problem is that they don't provide the information in a file, so I want to use the Web Mining Extension instead. I managed to identify the GET request URL that provides all the data for a given…
-
RapidMiner Go
I have a problem activating my RapidMiner Go license. I used the RapidMiner Go trial before, and it expired (Tue, Jun 15th 2021) (Education account). I want to start using it again, but I can't click the "Subscribe Now" button and I don't know why. So I tried to fill in the Billing Information for subscription…
-
How to web scrape and process data?
Hi Community. I am new to RapidMiner. I'm trying to scrape income tax rates from https://www.gov.uk/government/publications/rates-and-allowances-income-tax/income-tax-rates-and-allowances-current-and-past#tax-rates-and-bands into a table format containing all the various rates on this page. I want guidance on which…
-
Twitter Sentiment Analysis from RapidMiner Template
Dear community, I am a beginner in the machine learning field and I have just started my first project on sentiment analysis. I decided to use the template provided by RapidMiner -> It works perfectly, but I am unsure whether I have chosen the right input data. In the 1st box "Retrieve Historical Sentiment", I put my…
-
Read Excel Table with 300+ URLs and get Page Information
I would like to get information such as the Response Code, Response Message, Content Type, etc. of the URLs in my Excel table. I used - Read Excel -> Store -> Handle Exception (Get Pages) -> Store - as my process chain. For some reason I only get the URL as my result instead of all the information I want. Hopefully someone…
-
Not getting any results for "Process Documents from Web"
I'm trying to perform web scraping on a URL by using the "Process Documents from Web" operator, and have set an XPath query using the "Extract Information" operator. I tested the XPath query with the Google spreadsheet "importxml" function and it seemed to work fine. However, when I run the process in RapidMiner, it does not return…
-
Read RSS Feed Operator: Adding multiple URLs to the operator
Hey there RM community, I'm working on a project for school. The idea is to use the RSS feed reader operator and extract different pieces of information from it. The problem I'm encountering is that I don't know how to input multiple URLs into the operator. I've searched this up, and the results have practically flown over my…
-
Supervised Sentiment Analysis - Removing @
Hi there, I'm currently working on doing a supervised sentiment analysis with Instagram comments. One of the issues I'm having is that there are a lot of comment replies, which start by mentioning the name of the person that the reply is directed at. So one person comments on something and another person replies to this…
-
Webmining: need help for webcrawling with
Hello community members, I am looking for a way to do web crawling. I have read in the forums that HTTPS websites cannot easily be crawled using the operator "Web Crawl". You would have to use a combination of "Get Pages" and "Loop", as described (from Telconstar), but I haven't found anything about this approach…
-
Comparing texts: is Cross Distances the right approach?
Hello community, we have a project in which we want to compare learning contents of our university course (script) with different Udemy courses. Reference set: We have read in the script of our professor and a book of the lecture as a .PDF document and generated a list of words with the Text Processing Extension, which are…
-
Issue with Web Mining
Hi, for the purpose of web mining, I couldn't change the website links to "file_path". I guess it was straightforward in the older version, but I don't know how to get it done in the new version. Can you please help? Can Auto Model help me with that? I have the link for a video and a screenshot as well.…
-
University Project: Compare similarity of two data sets
Hello everyone, for a university project in the 1st semester we want to match data from lecture notes with appropriate Udemy courses. We have already crawled the lecture contents and the Udemy courses. Now the question is which procedure would be best for us. How can the "best" or most suitable…
-
How to enrich my data using DOI as a feature and generating a new feature with the abstract?
Hi my RM friends, how do I generate a new attribute with the PubMed abstract text when using the DOI? In short: from (ID, DOI) to (ID, DOI, ABSTRACT). Thanks and stay healthy!!!
-
What Do The Pink and Blue colors mean in a Web Mining Extract Content operator?
After running a basic WebMining model in RapidMiner, using the Get Page Operator, what is the distinction between the pink highlighted and blue highlighted text in the results section?
-
Can I get a solution for job-related web crawling from SEEK, Indeed?
Hi, I am doing research on job skills assessment as an academic project. I am looking for a web crawling script or solution for job post details like job roles, location, skills and knowledge from job portal websites like Indeed and Seek. Kindly help me in this matter.
-
How do I use
I'm new to RapidMiner and am currently using it for my bachelor thesis. I want to analyze the websites of specific companies. If I use the operator "Get Page" I receive just the first page, but I want to use every page of the website. How can I solve this problem? Thank you very much for your help in advance!
-
Web Crawling for contact directory
I'm trying to crawl this site to create an Excel document containing the names, locations, phone numbers, and specialty type of individual practitioners on https://www.psychologytoday.com/us/therapists The link above has links underneath for each state, and each state has about 50 pages or so of contacts. I'm just…
-
Keep ID in Extract Structured Data output objects + combine in one ExampleSet in Object Collectio
I have tried many different threads to solve what seems like it should be simple - (https://community.rapidminer.com/discussion/18154/solved-joining-examplesets-of-a-collection / https://community.rapidminer.com/discussion/38582/problem-with-combining-all-example-set-from-io-object-collection seeming close) The basic…