Text Processing - Cut Document - Similar entries separated by a number

User: "exmenace"
New Altair Community Member
Updated by Jocelyn
I'm having trouble finding the operator documentation for string matching or for cutting documents in general.
I have several types of documents (.xml, .csv, .docx, .html) that list records in order, separated by *Record (n)* markers numbered in ascending order, starting with 1.
Each record has similar attributes, but the content is unformatted apart from the asterisks that separate the records and attributes.
My hope was to cut the document by record, which I assumed I could do with a string matching query. I'm not sure how to do that, though: each record is different, and the only thing they have in common is the record marker. Since its number varies, I don't know how to write that expression.

    User: "kayman"
    New Altair Community Member
    Accepted Answer
    Is each record on its own line?
    Like this:
    Record 1*something*something else*and again something else
    Record 2*something*something else*and again something else
    Record 3*something*something else*and again something else

    Or is it more like this:

    Record 1*something*something else*and again something else*Record 2*something*something else*and again something else*Record 3*something*something else*and again something else

    In the first case you can simply use the Read CSV operator with * as the separator. Beware that * is a special character that needs to be escaped, so to use it correctly you need to enter \* instead of just *.

    You could also use the Split operator, and the same applies there: use \* to make clear that you want to split on the literal asterisk.
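
To illustrate why the escape matters, here is a minimal sketch in Python rather than in the operators themselves. It assumes the separator is interpreted as a regular expression, which is what the escaping advice above implies. The sample text and the name `split_fields` are made up for illustration.

```python
import re

# Hypothetical sample in the "one record per line" layout.
SAMPLE = """Record 1*something*something else*and again something else
Record 2*something*something else*and again something else
Record 3*something*something else*and again something else"""


def split_fields(line: str) -> list[str]:
    """Split one record line into its fields on literal asterisks."""
    # r"\*" matches a literal asterisk. A bare "*" is a regex quantifier
    # ("repeat the previous item") and is not a valid pattern on its own.
    return re.split(r"\*", line)


# One list of fields per record line.
rows = [split_fields(line) for line in SAMPLE.splitlines()]

# Check that the unescaped asterisk really is rejected as a pattern.
try:
    re.compile("*")
    bare_star_is_valid = True
except re.error:
    bare_star_is_valid = False
```

Each entry in `rows` then holds the record label followed by its attributes, which is roughly what a separator-based import would produce as columns.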

    If everything is on one line, I recommend using the Split Document into Collection operator from the Toolbox extension.
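
For the single-line layout, the varying record number can be covered by `\d+`, which matches any run of digits. The sketch below is plain Python, not the operator itself, and it assumes every record starts with the literal text `Record `, a number, and an asterisk. Whether the same pattern can be used unchanged as the operator's split expression is an assumption to verify. The names `RECORD_BOUNDARY` and `split_records` are hypothetical.

```python
import re

# Hypothetical sample in the "everything on one line" layout.
ONE_LINE = (
    "Record 1*something*something else*and again something else*"
    "Record 2*something*something else*and again something else*"
    "Record 3*something*something else*and again something else"
)

# Split on an asterisk only when it is directly followed by "Record <n>*".
# The lookahead (?=...) checks for the marker without consuming it, so each
# piece keeps its own "Record <n>" label.
RECORD_BOUNDARY = re.compile(r"\*(?=Record \d+\*)")


def split_records(text: str) -> list[list[str]]:
    """Cut the text into records, then each record into its fields."""
    chunks = RECORD_BOUNDARY.split(text)
    return [re.split(r"\*", chunk) for chunk in chunks]


records = split_records(ONE_LINE)
```

The two-step approach (records first, fields second) keeps the record boundary separate from the attribute separator, so attribute values are only split on the asterisks inside their own record.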

    I've attached some samples to play around with, hope they get you started.