How to enrich a data set with columns from other data sets? Merger of three data sets
Mike0985
New Altair Community Member
Hello RM community,
First of all, I'm an absolute beginner with RapidMiner, so please be patient with me. I took a basketball data set from Kaggle to get into RapidMiner. I have three data sets: "games_raw", "teams_raw", and "ranking_raw" (the ranking of the teams). I would like to work with the games data set, but there are some columns in the teams and ranking data sets that I would like to use to enrich it (see "games_adj" as the target data set). I built a process, but it seems too clumsy.
Do you have an idea how to build the RM process in a smarter and faster way?
Thank you in advance!
Regards
Mike
Best Answer
Hi @Mike0985,
thanks for sharing your use case! Sounds cool. The Join operator is useful for data blending and merges, but it only takes two inputs at a time, so you need several “Join” operators to combine more than two data sets. The snapshot of your workflow looks fine to me.
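For readers who think in code, the same "two inputs per join" idea can be sketched in pandas. This is only an analogy to the RapidMiner process, not a replacement for it, and every table and column name below (`game_id`, `home_team_id`, `team_id`, `team_name`, `win_pct`) is a made-up stand-in rather than the actual Kaggle schema.

```python
import pandas as pd

# Toy stand-ins for the three data sets; all column names are hypothetical.
games = pd.DataFrame({
    "game_id": [1, 2, 3],
    "home_team_id": [10, 20, 10],
    "home_points": [101, 95, 110],
})
teams = pd.DataFrame({
    "team_id": [10, 20],
    "team_name": ["Hawks", "Bulls"],
})
ranking = pd.DataFrame({
    "team_id": [10, 20],
    "win_pct": [0.61, 0.48],
})

# First join: games + teams. A left join keeps every game row.
step1 = games.merge(teams, left_on="home_team_id", right_on="team_id", how="left")

# Second join: intermediate result + ranking. Each merge takes exactly
# two inputs, so three data sets need two chained joins.
games_adj = step1.merge(ranking, on="team_id", how="left").drop(columns="team_id")

print(games_adj)
```

The left joins keep the games table as the base, so the result has one row per game with the extra team and ranking columns attached.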
If you have several data sets with the same structure (same column names, same column types), you can use the “Append” operator for a quick merge. Your input data sets have different columns, though, so appending does not fit here. Another code-free option is Turbo Prep. For beginners, I strongly recommend the online documentation and the Academy pages: https://academy.rapidminer.com/learn/video/turbo-prep-introduction
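The difference between appending and joining can be illustrated with a small pandas sketch. It is an analogy only: `pd.concat` is not the Append operator, and the tables and column names are invented for the example. Note that pandas does not refuse mismatched inputs; it pads the gaps with missing values, which shows why row-wise stacking is the wrong tool when the columns differ.

```python
import pandas as pd

# Two hypothetical game tables with identical columns: stacking works cleanly.
season_a = pd.DataFrame({"game_id": [1, 2], "home_points": [101, 95]})
season_b = pd.DataFrame({"game_id": [3], "home_points": [110]})
stacked = pd.concat([season_a, season_b], ignore_index=True)

# A table with a different structure does not line up row-wise:
# columns that are not shared are filled with NaN.
teams = pd.DataFrame({"team_id": [10, 20], "team_name": ["Hawks", "Bulls"]})
mismatched = pd.concat([season_a, teams], ignore_index=True)

print(stacked)
print(mismatched)
```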
cheers,
YY
Answers
Hello YY,
Thanks for looking into my case and for confirming that my workflow looks fine. I saw the Append operator in RapidMiner, but as you said, it only works when the data sets have the same columns, so it does not work in my case.
Regards,
Mike