Replace Token Solution for abbreviations that are part of other words

New Altair Community Member

Nov 20, 2020

Updated Nov 5, 2024 by Jocelyn

I am working on a text analysis. My original text contains abbreviations such as "cust" or "cust." for "customer". I can put the Replace Tokens operator before the Tokenize operator and enter multiple replacements (for example, replace "cust" followed by a space with "customer", and "cust." with "customer"), but I am curious whether there is a way to do it after tokenization, since tokenization has already "grouped" the cust abbreviations together. I did try placing the Replace Tokens operator after tokenization, but it replaced every occurrence of "cust" with "customer", including the "cust" inside the full word "customer". Any thoughts or ideas? Thank you for your help.

Image: https://us.v-cdn.net/6030995/uploads/editor/60/biqvhffabaul.png

kayman

New Altair Community Member

Accepted Answer

Nov 21, 2020

@lionelderkrikor probably forgot the dot. There are two ways to deal with this, using either + or *:

(cust).+$ matches cust followed by at least one more character, so a token that is exactly cust is not matched.
(cust).*$ matches cust followed by zero or more additional characters, so it also matches cust on its own.

So the second one is probably safer to use.
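To make the difference between the two patterns concrete, here is a minimal sketch in Python. It assumes the replacement is applied to each token separately, much like a regex substitution; Python's `re` module stands in for the operator's regex engine, and `replace_tokens` and the sample tokens are hypothetical names used only for illustration. The third pattern, `^cust\.?$`, is not from the answer above; it is a stricter alternative that only touches tokens that are exactly cust or cust.

```python
import re

# Hypothetical sample tokens, as they might look after tokenization.
tokens = ["cust", "cust.", "customer", "custody"]


def replace_tokens(tokens, pattern, replacement):
    """Apply a regex substitution to each token independently (illustrative helper)."""
    return [re.sub(pattern, replacement, token) for token in tokens]


# "+" needs at least one character after "cust", so the bare token "cust" is left alone.
plus_result = replace_tokens(tokens, r"(cust).+$", "customer")

# "*" allows zero characters after "cust", so the bare token "cust" is replaced too.
star_result = replace_tokens(tokens, r"(cust).*$", "customer")

# Assumed stricter variant: only the exact tokens "cust" and "cust." are replaced.
strict_result = replace_tokens(tokens, r"^cust\.?$", "customer")

print(plus_result)    # ['cust', 'customer', 'customer', 'customer']
print(star_result)    # ['customer', 'customer', 'customer', 'customer']
print(strict_result)  # ['customer', 'customer', 'customer', 'custody']
```

Note that both patterns from the answer rewrite any token beginning with cust, such as custody, to customer. If the text contains other words with that prefix, the anchored variant may be the safer choice.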
