-
Can Process Documents calculate term occurrences of all words without being given a word list?
I want Process Documents to calculate term occurrences for ALL the words in the document I send it, but I don't want to have to write them all out manually. If someone has a solution, I would gladly take it!
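For illustration only, here is a minimal sketch in plain Python (outside RapidMiner) of what "term occurrences without a word list" means: the vocabulary is derived from the text itself rather than supplied up front. The function name `term_occurrences` and the sample sentence are hypothetical, and the simple lowercase letters-only tokenization is an assumption, not how the Process Documents operator works internally.

```python
import re
from collections import Counter


def term_occurrences(text):
    """Count every word in `text`; no predefined word list is needed."""
    # Assumption: a token is any run of ASCII letters, compared case-insensitively.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


counts = term_occurrences("The cat sat. The cat ran!")
for word, n in counts.items():
    print(word, n)
```

Every distinct word found in the input becomes a key of the counter, so nothing has to be typed in by hand.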
-
Use PDF file name as attribute
Hello everyone :smile: I want to do some simple text mining using PDF files in RM, but I'm a little stuck right now. I created a process using the Loop Files and Process Documents operators to read in several PDF files. As I have a lot of files to analyze, which I also want to compare, I would like to create an attribute…
-
Boilerplate text analysis - text mining
Dear community, I'm new to text mining with RM and would like to know if it's even possible to build a process in RM that suits my research question. I would like to create a process that searches for boilerplate language in documents. In detail, I'd like to input management reports from different companies (PDF files)…
-
Getting a Sentiment Score after Processing Documents operator?
Hello RapidMiner Community, I'm trying to read PDF documents and then perform a sentiment analysis using the Process Documents operator. However, I'm having trouble connecting the Process Documents operator to the Apply Model operator. I have tried using the Data to Documents operator, but without success. Here is a…
-
RapidMiner process as document
Hey everyone, I created a RapidMiner process and I want to get it as an image to show in a paper. Is it possible to export the process directly as a PDF or PNG, or do I have to take screenshots of the process?
-
Tokenize operator issue - help request
I have to process some documents where the double exclamation !! when followed by a word should be an individual token by itself (e.g., sentence!! as a token, not 'sentence' and '!!' separate). Similarly, the smiley character : ) is expected to be a separate token. When I use the non-letters mode in Tokenize, the words get…
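As a rough illustration of the tokenization being asked for, the sketch below uses a plain Python regular expression, not RapidMiner's Tokenize operator. It assumes the goal is to keep a word together with a trailing `!!` as one token and to keep the smiley (written `:)` or `: )`) as its own token. The function name `tokenize` and the sample sentence are hypothetical.

```python
import re

# Alternatives are tried left to right, so the more specific patterns come first:
#   \w+!!    a word immediately followed by a double exclamation mark
#   :\s?\)   a smiley, with or without a space between ':' and ')'
#   \w+      any other word
TOKEN_PATTERN = re.compile(r"\w+!!|:\s?\)|\w+")


def tokenize(text):
    """Return tokens, keeping 'word!!' and smileys as single tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("Great sentence!! I like it : )"))
```

Other punctuation is simply dropped in this sketch. Whether the same pattern can be carried over to an operator inside RapidMiner is not something this example establishes.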
-
process documents operators report an error
What’s the difference between a "data table" and a document??? Other CSV files I imported before ran fine. I don't know why this CSV file is not a "document". What am I going to do :'( :'(
-
Problems with processing the answer from a GET request
Hi guys, I want to mine performance data of footballers for an essay. As a source I found Goaloo1 (I can't post links yet). The problem is that they don't provide the information in a file, so I want to use the Web Mining Extension instead. I managed to identify the GET request URL that provides all the data for a given…
-
Process Help: Correlation and Regression
Hello! I have the following data (74 observations) and am trying to identify the process and operators to conduct an exploratory analysis (Correlation and Regression) of the relationship between the president's political party affiliation and this consumer good's Producer Price Index value which is averaged over each year…
-
process documents from files
Hello. I used the "Process Documents from Files" operator from the text processing library. In the results view, if I check the text column, it seems that there is only a portion of each file. Is this some sort of abbreviated view, with the entire file actually stored there, or is it likely that I have not entered the parameters…
-
process documents error the example set must contain at least one text attribute
Hi, I am new to RapidMiner. I am getting an error that says the example set must contain at least one text attribute. However, I set those as text attributes in the Set Role operator. By the way, there was no 'Text' type in the Import Configuration Wizard. RapidMiner version: 9.8.001 OS: Windows 10
-
Regarding Semantic Analysis
Can someone tell me how to implement or add semantic analysis to my process? That is, I need to compare not only similar words but also whether the meanings of two words are the same within a single document.
-
How can I plot the frequency of words?
Hello everyone! I'm trying to use the Generate Gaussian operator to plot the frequency of words, but its results are really different from the ones I calculated manually. I need this operation to understand which values to discard through pruning. What's the formula that RapidMiner uses to create…
-
The role of dimensionality reduction with regard to Clustering approaches
Hello Community, I plan to evaluate several Clustering techniques on a TF-IDF bag of words representation where I've previously executed a feature selection to efficiently reduce the number of dimensions of my vector space. In this sense, I've read that Feature Extraction/Transformation approaches get better results with…
-
Interpretation of ROC Analysis
Hello Community, I have derived the following ROC curves by considering four classification models. As you can see, SVM and k-NN each generate a curve with a shaded area. Would it be correct to infer from the graph that only k-NN and SVM were able to learn from the given dataset and the remaining two…
-
Error: I can't import a process that I created from an education license
I am currently integrating RapidMiner into my Java application. I already have a process, but when I import it through the open source version of RapidMiner, I get error messages. Can someone please help me? Thanks
-
Matching item codes
I have the following issue: I have a list of products (incl. item name, liter size and degree of alcohol) from a supplier and have to match it with the ones from our company, using a list that shows comparable product attributes (exact product names can vary, but will be similar). Currently it's a manual process,…
-
Concatenate words in comments
Hello there! We are currently writing a research project on microtransactions using natural language processing. We have an Excel file containing 450,000 comments. To capture as many comments related to microtransactions as possible, we would like to concatenate some variations of the spelling, e.g. Microtransactions = "micro…
-
How to get Meta Data with the Data to Documents and Process Documents operators
I'm performing text analytics and am struggling with Meta Data. In the toy process below, there should be meta data available to the Data to Documents operator, but if you want to specify weights and click on Edit List, the source attribute doesn't populate, so you have to type it manually. Seems an unnecessary chore if…
-
Error with Process Documents from Data when attempting text analysis
Hello, I am very new to RapidMiner, so please forgive any basic ignorance on my part. I am trying to do basic text processing of an Excel spreadsheet of tweets, and I cannot get the "Process Documents from Data" operator to work correctly. I watched the video tutorial, but I am still having problems. What I did: 1) Read…