-
How to convert polynominal data into numerical
Hi all, I am trying to read a CSV file that has a polynominal data type for the currency amount. Is there any operator that converts the polynominal data type to real or numerical? For example: the polynominal value "$12,052.00" to the numeric value "12,052.00", so that I can do some calculations. I tried the operator "Nominal to…
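For illustration only, the conversion described in this question can be sketched outside RapidMiner in plain Python. This is a minimal sketch, assuming every value looks like "$12,052.00" (leading dollar sign, comma as thousands separator, dot as decimal separator); the helper name `currency_to_number` is hypothetical and is not a RapidMiner operator.

```python
from decimal import Decimal


def currency_to_number(text: str) -> Decimal:
    """Parse a string such as "$12,052.00" into a Decimal.

    Assumption: "$" is the only currency symbol and "," is only ever
    used as a thousands separator.
    """
    cleaned = text.strip().replace("$", "").replace(",", "")
    return Decimal(cleaned)


amount = currency_to_number("$12,052.00")
print(amount)  # 12052.00
```

`Decimal` is used here instead of `float` so that currency amounts keep their exact cents; `float(amount)` works if a plain real number is enough.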
-
How do I sort out various unstandardised data from a single cell?
Hello everyone, For context, I'm trying to find out which marketing medium is the most effective to be used by Starbucks. Attached is the dataset used, and the column pertaining to my part is question 19. As you can see, the data are retrieved from survey forms; as such, the format of the answers varies (e.g. some people…
-
How to approach data exploration and data preparation with a given dataset
I would like to present my first steps in data exploration and data preparation using presentation slides and a provided dataset. My issues are: - lack of exploration knowledge (how to spot problematic data in order to optimize data quality) - lack of data preparation knowledge (creating a baseline model "linear regression" with…
-
Add identifier attribute for joining datasets
Data set A: a1,b1,c1 / a2,b2,c2. Data set B: x1,y1,z1 / x2,y2,z2. Right now, after exporting, I am manually adding a final column that records which dataset the data came from, i.e. data set A: a1,b1,c1,setA / a2,b2,c2,setA and data set B: x1,y1,z1,setB / x2,y2,z2,setB, before joining them together. I'm wondering if there is a way to do that…
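The manual step described in this question (appending a source column to each data set before combining them) can be sketched in plain Python. This is only an illustration under stated assumptions: the rows are the placeholder values from the post, and the helper name `tag_rows` is hypothetical rather than anything provided by RapidMiner.

```python
def tag_rows(rows, source):
    """Return a copy of each row with the source name appended as a last column."""
    return [row + [source] for row in rows]


# Placeholder rows taken from the post.
set_a = [["a1", "b1", "c1"], ["a2", "b2", "c2"]]
set_b = [["x1", "y1", "z1"], ["x2", "y2", "z2"]]

# Tag each data set, then stack them into one table.
combined = tag_rows(set_a, "setA") + tag_rows(set_b, "setB")

for row in combined:
    print(",".join(row))
```

The original lists are left untouched, so the tagging can be repeated with a different label without side effects.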
-
How to filter out examples with missing values!
Hello everyone! I am using the data set "Titanic" and am wondering how to filter out the examples with missing values. I see how to filter and see the ones with missing values, but not the other way around!
-
Community repository error, cannot connect to AI Hub repository.
Hi, I'm new to RapidMiner and I've not been able to access the RapidMiner community datasets. First off, it takes very long to open the drop-down, and afterwards I get an error. I tried reinstalling the program but I got the same error: "Unable to fetch folder content from repository" "Cannot connect to the RapidMiner AI Hub…
-
Dummy-Encoding of Movie Genres
Hello Community, I am new to RapidMiner and currently trying to prepare a dataset through Turbo Prep, and I am stuck at the movie genres. The genres are given like: ['Action', 'Drama', ...]. I would like to preserve the genres as an attribute. To do this I tried to dummy-encode them, but then each combination of genres gets…
-
Solved: Data check wizard does not find malformatted data that would allow processing of data files
Hello, I have a csv file with over 200 columns and thousands of rows. Very few of the numerical values are malformatted like 20n7ž¾=5â038525 in single, not adjacent cells. How to perform automatic data cleaning, or labeling, or handling at all? Solved: I forgot to check the option box "Replace errors with missing values"…
-
Changing Data in RapidMiner
Hello everyone, I am currently working with RapidMiner and I have a question for which I haven't found any helpful information yet. For preparing my data I have a huge Excel file which I uploaded to RapidMiner. Since I want to analyse the data, I need some new rows which I would like to create in RapidMiner. For example I have…
-
Cannot retrieve md data
For a university project we have downloaded data with an existing project with the extension .md, but unfortunately I can't retrieve this data. The following error message always appears:
-
Text Mining Extracting Data from text
Dear all, I have a small problem concerning Text Mining with RapidMiner. I have a bunch of press releases, all structured the same way. Now I want to extract the headline of the press releases (1st line), the date it was published (2nd line) and the coloured parts of the releases, as well as the whole paragraph where the…
-
Algorithms are "cheating" and copying right label from other instances
Hi everyone, I have a problem with my model. It should predict a monthly product volume from some given attributes. My (training) data consists of data from ~60 past months. Each instance in the dataset represents one day. Two given attributes are the "month" and the "year". The label is the product volume at the end of the…
-
Turning JSON Data into Spreadsheet
Hi there RapidMiner Community, I'm currently trying to load JSON data into RapidMiner to use it for a sentiment analysis I'm working on. Browsing through a lot of the forum content on here regarding the work with JSON data and trying out a lot of the solutions, I sadly haven't found any that worked for me. The data should…
-
Removing mentions with "@" and emojis from Excel Data
Hello RapidMiner Community, I am currently working on a supervised sentiment analysis. I had success doing the sentiment analysis itself, but I'm not quite happy with the data it uses. As part of the data preparation, I want to remove mentions (thus names following an "@") and I have tried out some suggestions. The…
-
Can I import a Kinect file to RapidMiner?
Hello, My name is Maulana. I currently live in Indonesia and am a master's degree student. I want to ask a question because I am a newbie in RapidMiner. Can I import a Kinect sensor file to RapidMiner? Or is there maybe some extension that can help solve this case? Thank you for your answer, I appreciate it. Stay healthy everyone!
-
How to resolve 100% data accuracy in RapidMiner? [Urgent]
Hello everyone, The aim is to catch and predict fraud cases with optimum accuracy based on the dataset provided. For example, cases that are flagged as fraudulent and turn out to be non-fraudulent are not as critical as cases which are predicted to be non-fraud and turn out to be fraud. For this, I wanted to use the…
-
How to make a football ranking?
Hi guys, I'm stuck making a football ranking with my data. I did the data processing, but now I don't have any idea how to group each team with its points. The rules are: home team wins: +3; home team loses: +0; scoreless: +1; away team wins: +3; away team loses: +0. It would help me a lot :)
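The point rules listed in this question can be sketched as a small aggregation in plain Python. This is a minimal sketch rather than a RapidMiner process, and it rests on assumptions: "scoreless" is read as any draw, a draw gives one point to both teams, and the team names and scores below are made up for the example.

```python
from collections import defaultdict


def build_ranking(matches):
    """Sum points per team: win = 3, draw = 1, loss = 0.

    `matches` is a list of (home, away, home_goals, away_goals) tuples.
    Returns (team, points) pairs sorted by points (descending), then name.
    """
    points = defaultdict(int)
    for home, away, home_goals, away_goals in matches:
        if home_goals > away_goals:
            points[home] += 3
            points[away] += 0  # makes sure the losing team is still listed
        elif home_goals < away_goals:
            points[away] += 3
            points[home] += 0
        else:
            points[home] += 1
            points[away] += 1
    return sorted(points.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results, for illustration only.
matches = [
    ("Team A", "Team B", 2, 1),
    ("Team B", "Team C", 0, 0),
    ("Team C", "Team A", 1, 3),
]

for team, total in build_ranking(matches):
    print(team, total)
```

The same idea carries over to any tool with a group-and-sum step: first turn each match into one points row per team, then aggregate by team and sort.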
-
Data Set for Tutorial
Hi, can somebody please share with me the Passenger data set that Dr. Ingo Mierswa is using in his tutorials?
-
How to select the right data for prediction?
Hi all, I have about 2 years of historical data which I can probably use to predict responses. For example, if I have to predict my response rate for Jan 2020, how can I tell how much data would be enough to come close to the actual rate? Should I look at how my data performed in Jan 2018, Jan 2019 and maybe the last 4 months…
-
Combine two files
Hi, is it possible to combine two data sets (train + test) into a new data set in which the train part has a label and the test part has no label? Thank you.
-
RapidMiner vs R Decision Tree Classification
Are there any differences between the decision tree models generated in RapidMiner and R? Is there a comparison between them?
-
Performance Comparison between Naïve Bayes, and Decision Tree?
Both the Naïve Bayesian and the decision trees algorithms are classification algorithms. A Naïve Bayesian predictive model serves as a good benchmark for comparison to other models, while the decision trees algorithm is the most intuitive and widely applied algorithm. Which one has the best accuracy?
-
Creating an attribute with reference values from another ExampleSet
Hi, everyone. I've been stuck for some days on the creation of an attribute that must be filled with the values from another example set, the result of an aggregation operation. There is a "CNPJ" attribute in the main example set that has its values repeated over the 25,000+ rows. The aggregation set is made of 700+ rows of…
-
How to invert the order of data?
Hello, I am sure this is quite simple but I just do not know how to do it because I am completely new to RapidMiner. This is a graph of XRP (digital asset) price: ( I was not allowed to post it because I am a new account ) But basically: The CSV file that I downloaded goes in the DESCENDING ORDER (from 2018 down to 2013)…
-
Nominal to Binominal
Why is Nominal to Binominal used in a correlation matrix? What significance can this new data type have? Currently, within the process for this matrix, I have the following operators: Retrieve, Set Role, Correlation Matrix