n-Grams with a length of 3-6
Kathi
New Altair Community Member
Hi everyone,
I am currently using the n-Grams operator. If I set the length to 6, all n-Grams will be displayed with a word count of 1-6. I just want to see the n-Grams with a length of 3-6 words. Is that possible?
Thanks!
Best Answer
Hi @Kathi
Nice challenge!
Yes, it is possible. You have to duplicate your text processing:
1. In the first copy, set max length = 6.
2. In the second copy, set max length = 2.
3. Transpose the two resulting example sets, so that each n-gram becomes a row instead of an attribute.
4. Use a Set Minus operator to remove the n-grams of the second set from the first. Only the n-grams with a length of 3 to 6 words remain.
5. Finally, Transpose the resulting example set again to turn the n-grams back into attributes.
The process is in the attached file.
Regards,
Lionel
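
The set-difference idea behind the steps above can be sketched outside of the tool in a few lines of Python. This is only an illustration, not the attached process: the helper name ngrams_up_to and the sample sentence are made up for the example, and it assumes that n-grams are joined with underscores and that single tokens contain no underscores themselves.

```python
def ngrams_up_to(tokens, max_length):
    """Return the set of all n-grams with 1..max_length tokens, joined by '_'."""
    grams = set()
    for n in range(1, max_length + 1):
        for start in range(len(tokens) - n + 1):
            grams.add("_".join(tokens[start:start + n]))
    return grams


# Hypothetical sample document, already tokenized.
tokens = "the quick brown fox jumps over the lazy dog".split()

up_to_6 = ngrams_up_to(tokens, 6)  # first copy: max length = 6
up_to_2 = ngrams_up_to(tokens, 2)  # second copy: max length = 2

# Set difference plays the role of Set Minus: only 3- to 6-word n-grams remain.
kept = up_to_6 - up_to_2

for gram in sorted(kept, key=lambda g: (g.count("_"), g)):
    print(gram)
```

Subtracting the short n-grams works because every n-gram of 1 or 2 words produced by the first run is also produced by the second run.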
Answers
Use Filter Tokens (by Content), select match as the condition, and use a regex like ^.*?_.*?_.*
This keeps only the n-grams that contain at least 2 underscores, that is, n-grams of 3 words or more.
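
To see what this regex keeps, here is a small Python sketch. It is not the operator itself: the function name has_three_or_more_words and the sample tokens are invented for the example, and it assumes that the whole token must match the pattern and that n-grams are joined with underscores. The regex only enforces the lower bound of 3 words. The upper bound of 6 still comes from the max length setting of the n-grams operator.

```python
import re

# Same pattern as in the answer: at least two underscores anywhere in the token.
PATTERN = re.compile(r"^.*?_.*?_.*")


def has_three_or_more_words(token):
    """True if the whole token matches, i.e. it contains at least 2 underscores."""
    return PATTERN.fullmatch(token) is not None


# Hypothetical tokens as they might come out of an n-gram step with max length 6.
sample = ["text", "text_mining", "text_mining_is", "text_mining_is_a_lot_of"]
kept = [t for t in sample if has_three_or_more_words(t)]
print(kept)
```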
Hi,
I first tried the Set Minus process. It works and looks fantastic! Thank you both!
Regards,
Kathi