Grouping lines in input file

User: "Joeyd"
New Altair Community Member
Updated by Jocelyn
All,

I'm new to RapidMiner and the forum, so apologies if I'm asking the obvious or posting in the wrong place. I've looked around on the forum and on the net and could not find what I'm looking for.

The issue I have is this.
One of our suppliers sends us reporting on processed transactions where the data is spread over several lines in the file. It looks a bit like this:
HDR;somedata
DAT1;somedata
DAT2;somedata
DAT3;somedata
DAT1;somedata
DAT3;somedata
DAT1 - etc.
The information I need is in the DAT1 and DAT3 records; DAT2 is optional. There is no identifier in the data that binds the three records together. The fact that DAT1, DAT2 and DAT3 appear sequentially in the file defines them as a group, and each DAT1 starts a new group.
What I do now is preprocess the file with a bash script that concatenates the DAT lines of each group into a single line. This is an annoying extra step on a remote server, which complicates and slows down reading in the report file.
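To make the grouping rule concrete, here is a minimal Python sketch of what the preprocessing step does, assuming semicolon-separated lines whose first field is the record type. The function name `group_records` and its parameters are hypothetical, chosen only for illustration; this is not the actual bash script and not a RapidMiner feature.

```python
from typing import Iterable, Iterator, List, Sequence


def group_records(
    lines: Iterable[str],
    start_tag: str = "DAT1",
    tags: Sequence[str] = ("DAT1", "DAT2", "DAT3"),
    sep: str = ";",
) -> Iterator[str]:
    """Yield one joined line per group; a group begins at each start_tag line.

    Assumption: the record type is the first sep-delimited field of a line.
    Lines with any other record type (e.g. HDR) are skipped, as are DAT
    lines that appear before the first start_tag line.
    """
    current: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        tag = line.split(sep, 1)[0]
        if tag not in tags:
            continue  # header or unknown record type: not part of any group
        if tag == start_tag:
            if current:
                yield sep.join(current)  # close the previous group
            current = [line]
        elif current:
            current.append(line)  # DAT2/DAT3 belongs to the open group
    if current:
        yield sep.join(current)  # flush the last group at end of input


if __name__ == "__main__":
    sample = [
        "HDR;somedata",
        "DAT1;a",
        "DAT2;b",
        "DAT3;c",
        "DAT1;d",
        "DAT3;e",
    ]
    for row in group_records(sample):
        print(row)
    # prints:
    # DAT1;a;DAT2;b;DAT3;c
    # DAT1;d;DAT3;e
```

Because DAT2 is optional, the joined lines can have a different number of fields per group, so whatever reads them afterwards needs to look at the record-type fields rather than fixed column positions.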

Is there a process I can set up in RapidMiner that can group data in the input file by record type so I can process each group as a single line? Or am I asking for something the tool isn't meant for? What I do with the data is match it against the transactions we expect to receive and create some overviews.

Regards,

Joe
