How to extract a specific part (section) from a large text (txt format)?

Enthusiast21
New Altair Community Member
Updated by Jocelyn
Dear RM Friends,

I have 500 txt files containing large reports, and I need to extract only one section from each report. As the reports all differ slightly, the only common pattern I can recognise is that the section's headline always starts with the same 3 words; the rest of the headline differs from report to report, and the section that follows is not the same either. My question is how, in general, I can extract part of a large text in RapidMiner. (I think I need to use regular expressions, but so far I could not find anything suitable for my task.)

Thank you very much for your support in advance! :smile:

    kayman
    New Altair Community Member
    Accepted Answer
    Hi @Enthusiast21, as discussed, please find attached an alternative approach to your problem. It first splits each report by page (double sided), then keeps only the pages containing your term (REPORT ON THE ANNUAL), and finally uses a looser rule to work out which content belongs to the left page and which to the right. It seems to work reasonably well this way, and maybe you can take it further from there.
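
For readers without the attached process, here is a rough Python sketch of the same idea outside RapidMiner: split by page, keep the pages that contain the term, then cut the section out at its headline. It is not the attached process. It assumes pages are separated by form-feed characters and that the next section starts with an all-caps headline; both are guesses about the report layout, and the function names are made up for illustration. The left/right page step is not covered.

```python
HEADING_START = "REPORT ON THE ANNUAL"  # term taken from the answer above
PAGE_BREAK = "\f"  # assumption: pages are separated by form feeds


def pages_with_term(text, term=HEADING_START, page_break=PAGE_BREAK):
    """Split the text into pages and keep only the pages containing the term."""
    return [page for page in text.split(page_break) if term in page]


def looks_like_heading(line):
    """Assumption: a headline is a line of at least 5 characters with no lowercase letters."""
    stripped = line.strip()
    return (
        len(stripped) >= 5
        and stripped == stripped.upper()
        and any(ch.isalpha() for ch in stripped)
    )


def extract_section(text, term=HEADING_START):
    """Return the lines from the headline starting with `term` up to,
    but not including, the next headline. Returns None if no headline matches."""
    lines = text.split("\n")
    start = next(
        (i for i, line in enumerate(lines) if line.strip().startswith(term)),
        None,
    )
    if start is None:
        return None
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if looks_like_heading(lines[i]):
            end = i
            break
    return "\n".join(lines[start:end]).strip()
```

To run this over all 500 files, read each file's text, call `pages_with_term` on it, and pass each returned page to `extract_section`. If the section runs over a page break, joining the matching page with the following one before extracting may work better.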