Prefixing attribute names produced by WVT?
kirke
Hi,
I work with a text collection containing a great many words, and some of them, like "label" and "id", are already used as attribute names in an example set. I am getting warnings like the one below from TextInput, and I wonder whether there is an easy way to prefix all attribute names that originate from words (similar to the -P option of StringToWordVector in Weka).
Right now I don't believe it causes much trouble, but maybe I just missed some option in WVT TextInput to fix it.
[Warning] TextInput: The original example example set already contains an attribute named "label".
This is likely to cause trouble. Please rename the attribute in the original example set.
Thank you very much!
/kirke
Answers
-
Hi,
One way to add a prefix is to put a "TokenReplace" operator inside "TextInput" as a child and define the replacement with a regular expression. For example, if your words consist only of the letters a to z (upper or lower case), you can define "([A-Za-z]+)" as the word pattern and "word_$1" as the replacement, where "word_" is the prefix.
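To see what this pattern and replacement do to individual tokens, here is a minimal sketch outside RapidMiner, in plain Python. It only illustrates the regular expression and is not how TokenReplace is implemented. The function name `prefix_token` and the sample tokens are made up for illustration, and Python writes the backreference as `\1` where the replacement string above uses `$1`.

```python
import re

# Pattern and prefix taken from the answer above. Python's re module writes
# the backreference to group 1 as \1 where the post uses $1.
WORD_PATTERN = re.compile(r"([A-Za-z]+)")
REPLACEMENT = r"word_\1"


def prefix_token(token: str) -> str:
    """Put 'word_' in front of every run of ASCII letters in the token."""
    return WORD_PATTERN.sub(REPLACEMENT, token)


# Hypothetical tokens; "label" and "id" would clash with existing attributes.
tokens = ["label", "id", "mining"]
prefixed = [prefix_token(t) for t in tokens]
print(prefixed)  # ['word_label', 'word_id', 'word_mining']
```

Note that the pattern matches each run of letters separately, so in this sketch a token such as "e-mail" would become "word_e-word_mail". If your tokens can contain other characters, the pattern would need to be widened accordingly.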
I hope it helps a bit.
-
This worked fine, thank you!