"[SOLVED] Two near words"

zahrahnnx New Altair Community Member
edited November 2024 in Community Q&A
Hi everyone

I have an Excel file with 20 rows. Each row contains a description related to business analysis.
The words "problem" and "solving" are among the most common words, but they appear in a different order from document to document, e.g. "solving the problems", "problem solving skills", "solving technical problems", etc.

I want to capture all of these combinations of "problem" and "solving" in one attribute. For example, I'll add an attribute called "problem-solving". If a document contains the words "problem" and "solving" next to each other or with 1 to 4 words in between, the attribute "problem-solving" is set to 1, otherwise 0.

I did something similar for "Database"-related words, e.g. if a document contains sql or mysql, the value of "Database" is 1. That works, but I don't know how to do it when there are two words.

Please let me know if you have any ideas. Thanks
Zahrahnnx

Answers

  • MartinLiebig
    Altair Employee
    Hi,

    my first idea would be to generate n-grams and then use Select Attributes to keep the ones for problem and solving. Maybe use Generate Aggregation afterwards.

    Cheers,
    Martin
  • zahrahnnx
    zahrahnnx New Altair Community Member
    Martin Schmitz wrote:

    Hi,

    my first idea would be to generate n-grams and then use Select Attributes to keep the ones for problem and solving. Maybe use Generate Aggregation afterwards.

    Cheers,
    Martin
    Thanks for the response. Yes, n-grams work :)
    I also came up with another solution. I'll share it in case someone else faces the same problem.

    Use the "Extract Information" operator inside "Process Documents from Data", with the following regular expression:
    (problem\W+(?:\w+\W+){0,5}?solving)|(solving\W+(?:\w+\W+){0,5}?problem)

    It adds a new attribute, which I called "Problem_Solving". Then, in the main process, I used the "Select Attributes" operator to check "Problem_Solving".

    Both ways work ;)
    Thanks again
  • MartinLiebig
    Altair Employee
    Hi,

    I think I like your idea a bit more. It seems to be a bit faster :)

    Thanks for the message!

    Martin
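
For anyone who wants to try the regular expression from this thread outside RapidMiner, here is a minimal Python sketch. It is not the RapidMiner process itself. The function name problem_solving_flag and the case-insensitive flag are my own assumptions; the pattern is copied unchanged from the post above.

```python
import re

# Pattern copied as posted in the thread. re.IGNORECASE is an added
# assumption so that "Problem Solving" is also matched.
PATTERN = re.compile(
    r"(problem\W+(?:\w+\W+){0,5}?solving)|(solving\W+(?:\w+\W+){0,5}?problem)",
    re.IGNORECASE,
)


def problem_solving_flag(text: str) -> int:
    """Return 1 if "problem" and "solving" occur close together, else 0.

    Hypothetical helper for illustration only.
    """
    return 1 if PATTERN.search(text) else 0


if __name__ == "__main__":
    docs = [
        "solving the problems",
        "problem solving skills",
        "solving technical problems",
        "experience with sql and mysql",
    ]
    for doc in docs:
        print(doc, "->", problem_solving_flag(doc))
```

Two things to keep in mind. As written, {0,5} allows up to five words in between; changing it to {0,4} would match the "up to 4 words" requirement from the question. Also, the pattern expects a non-word character right after "problem" in the first alternative, so "problems solving" (plural first) is not matched, while "solving the problems" is.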