Remove all HTML tags in a message field

User: "bea11005"
New Altair Community Member
Updated by Jocelyn

Hi everyone!

I want to delete all HTML tags in a message field so that I can count the characters in the message without them, using the length operator.

How can I do it?

 

    User: "Thomas_Ott"
    New Altair Community Member

    You'll need the Web Mining extension for that. It has the ability to get rid of HTML tags. 

    User: "bea11005"
    New Altair Community Member
    OP

    I need to remove the HTML tags from an attribute of a dataset.

    Which operator should I use?

    User: "Thomas_Ott"
    New Altair Community Member

    It depends on how your data is set up, but I would look at the Extract Content, Unescape HTML, or Unescape HTML Document operators.

    User: "bea11005"
    New Altair Community Member
    OP

    I will try. Can I do it with a regular expression that deletes everything between the < and > symbols?

    User: "Thomas_Ott"
    New Altair Community Member

    Yes, you can use a RegEx. Just use the Replace operator.

    User: "bea11005"
    New Altair Community Member
    OP

    What RegEx can I use?

    User: "Thomas_Ott"
    New Altair Community Member

    Without seeing your data, I would guess something like this: \<.*\>

    and replace it with a space or something else.

    User: "kayman"
    New Altair Community Member

    That's a greedy regex, so it would match everything from the first < to the last > in one go, eating the text between your tags and leaving you with not much content.

     

    Remove tags either with <.*?> (note the question mark, which makes it a non-greedy regex) or with <[^>]+>
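To make the difference concrete, here is a minimal sketch in Python's `re` module (not the RapidMiner Replace operator itself, though these three patterns should behave the same way in most regex engines). The sample message is made up for illustration.

```python
import re

# Hypothetical sample message; real data may look different.
message = "<p>Hello <b>world</b>!</p>"

# Greedy: matches from the first "<" to the last ">", so the text goes too.
greedy = re.sub(r"<.*>", "", message)

# Non-greedy: each match stops at the nearest ">", so only the tags go.
lazy = re.sub(r"<.*?>", "", message)

# Negated character class: "<", then one or more non-">" characters, then ">".
negated = re.sub(r"<[^>]+>", "", message)

print(repr(greedy))   # ''
print(repr(lazy))     # 'Hello world!'
print(repr(negated))  # 'Hello world!'
print(len(negated))   # 12
```

One difference worth knowing: by default the dot does not match a line break, so <.*?> would miss a tag that is split across two lines, while <[^>]+> would still catch it.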