How to enrich a data set with columns from other data sets? Merger of three data sets

Mike0985 New Altair Community Member
edited November 2024 in Community Q&A
Hello RM community,
First of all, I'm an absolute beginner with RapidMiner, so please be patient with me. I took a basketball data set from Kaggle to get into RapidMiner. I have three data sets: "games_raw" for the games, "teams_raw" for the teams, and "ranking_raw" for the team rankings. I would like to work with the games data set, but there are some columns in the teams and ranking data sets that I would like to use to enrich it (see "games_adj" as the target data set). I built a process, but it seems too clumsy.

Do you have any ideas on how to build the RapidMiner process in a smarter and faster way?

Thank you in advance!
Regards
Mike




Best Answer

  • YYH
    Altair Employee
    Answer ✓
    Hi @Mike0985,

    Thanks for sharing your use case! Sounds cool. The “Join” operator is useful for data blending and merges, but it only takes two inputs at a time, so you need several “Join” operators to combine multiple data sets. The screenshot of your workflow looks fine to me.
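
    For readers who think in code, the chained-join idea can be sketched outside RapidMiner. The following is only a pandas analogue, not RapidMiner code, and every column name and value in it (`game_id`, `home_team_id`, `away_team_id`, `team_id`, `team_name`, `win_pct`) is a made-up assumption rather than the actual Kaggle schema. Each `merge` call combines exactly two tables, so three data sets need more than one join:

```python
import pandas as pd

# Hypothetical miniature stand-ins for games_raw, teams_raw and ranking_raw.
# All column names and values are assumptions made for illustration only.
games = pd.DataFrame({
    "game_id": [1, 2],
    "home_team_id": [10, 20],
    "away_team_id": [20, 10],
})
teams = pd.DataFrame({
    "team_id": [10, 20],
    "team_name": ["Hawks", "Bulls"],
})
ranking = pd.DataFrame({
    "team_id": [10, 20],
    "win_pct": [0.60, 0.45],
})

# Join 1: combine the two team-level tables on their shared key.
team_info = teams.merge(ranking, on="team_id", how="left")

# Join 2: attach the team columns for the home team.
# add_prefix renames every column, e.g. team_id -> home_team_id.
home = team_info.add_prefix("home_")
games_adj = games.merge(home, on="home_team_id", how="left")

# Join 3: attach the same columns again for the away team.
away = team_info.add_prefix("away_")
games_adj = games_adj.merge(away, on="away_team_id", how="left")

print(games_adj)
```

    A left join keeps every row of the games table even if a team is missing from the other tables, which is usually what you want when enriching one main data set.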

    If you have several data sets with the same structure (same column names, same column types), you can use the “Append” operator for a quick merge. Your input data sets do not share a structure, though, so they are not suited to appending. Another code-free option is Turbo Prep. For beginners, I strongly recommend the online documentation and the academy pages: https://academy.rapidminer.com/learn/video/turbo-prep-introduction
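
    To illustrate when appending is the right tool, here is a rough pandas analogue (again not RapidMiner code; the tables and the column names `game_id` and `home_points` are hypothetical). Appending stacks rows, so it only makes sense when both tables have the same columns:

```python
import pandas as pd

# Two hypothetical tables with an identical structure (assumed column names).
season_a = pd.DataFrame({"game_id": [1, 2], "home_points": [101, 95]})
season_b = pd.DataFrame({"game_id": [3], "home_points": [110]})

# Check that the structures match before stacking the rows.
same_structure = list(season_a.columns) == list(season_b.columns)

if same_structure:
    # Row-wise append: the result has the same columns and all rows of both inputs.
    all_games = pd.concat([season_a, season_b], ignore_index=True)
    print(all_games)
```

    Enriching games with team and ranking columns adds columns rather than rows, which is why a join is needed there instead.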

    cheers,
    YY


Answers

  • Mike0985
    Mike0985 New Altair Community Member
    Hello YY,

    Thanks for having a look at my case and for confirming that my workflow looks fine. I saw the Append operator in RapidMiner, but as you said, it only works when the data sets have the same columns, so it does not work in my case.

    Regards,
    Mike
