Splitting output into multiple (many) csv

MichaelWall
MichaelWall New Altair Community Member
edited November 2024 in Community Q&A

Hi

Question from a newbie. I have a process built in RapidMiner Studio that creates an output containing anywhere between 100 and 5000 rows (depending on the starting input). I want to write the output as one CSV per row. At the moment I can get the full data set using the Write CSV operator, but that gives me a single file with everything, when I want one CSV per record. I've tried doing this in post-processing by adding a new section to the Python script that handles the data after it's been through the process, but the formatting of the CSV is causing problems. I really want it to come out of RapidMiner in separate files to maintain the integrity of the results.

Any thoughts appreciated.

Thanks

Best Answer

  • bhupendra_patil
    bhupendra_patil New Altair Community Member
    Answer ✓

    Hi @MichaelWall

    Welcome to the RapidMiner community.

    See if the attached process helps you. You can open it via File >> Import Process.

    You may need to change the path of the CSV output location.

    Here is what it does:

    It loops over the examples (rows), one row at a time. Inside the loop, it filters to the current row number and then writes that single row to its own CSV.

    Each file is named after its row number, i.e. <row number>.csv.

    If you need to name the files differently, that should be possible with an additional operator, but hopefully this will get you started.
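
For anyone who would rather do the split in the Python post-processing step, the same loop-and-write-one-row idea can be sketched with the standard csv module. This is only an illustration under assumptions, not the attached RapidMiner process: the function name split_csv_per_row, its parameters, and the choice to repeat the header in every file are all hypothetical. Parsing with csv.reader rather than splitting on line breaks keeps quoted fields that contain separators or newlines intact, which may be relevant to the formatting problems mentioned in the question.

```python
import csv
from pathlib import Path


def split_csv_per_row(source, out_dir, delimiter=","):
    """Write each data row of `source` to its own CSV named <row number>.csv.

    Hypothetical helper: the header line is repeated at the top of every file.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # newline="" lets the csv module handle line endings and quoted newlines.
    with open(source, newline="", encoding="utf-8") as src:
        reader = csv.reader(src, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return written  # empty input, nothing to write
        for row_number, row in enumerate(reader, start=1):
            target = out_dir / f"{row_number}.csv"
            with open(target, "w", newline="", encoding="utf-8") as dst:
                writer = csv.writer(dst, delimiter=delimiter)
                writer.writerow(header)
                writer.writerow(row)
            written.append(target)
    return written
```

If the export uses a different column separator, pass it as delimiter, for example split_csv_per_row("output.csv", "rows", delimiter=";"), where both paths are placeholders.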

Answers

  • MichaelWall
    MichaelWall New Altair Community Member

    Thanks for this. It works really well and is much faster than the existing process I am replicating. The key thing was to set the iteration macro on the Loop Examples operator to row_number so that it indexes through each row.