[SOLVED] Splitting a nominal attribute that has no separator
Ugo
New Altair Community Member
Hello.
I have attributes whose values have the form:
AK234
A112
In other words, we have a set of alphabetical characters
followed by a set of digits. The question is: how can I split
the attribute into two attributes, one containing the
alphabetical characters and the other the digits?
I have tried the Split operator, but I seem to be
able to select only one of the parts, not both.
Is there any way I can do this with an operator?
TIA,
Hugo F.
Answers
-
Hello,
I have just realized that even though the split via regexp
seems to work in the dialogue box that allows testing, the
output does _not_ split the attribute value.
Any help will be appreciated.
TIA
0 -
Salut Ugo,
Does your attribute have a special role, e.g. is it defined as label or as id? In that case, you have to check the "include_special_attributes" parameter of the Split operator.
Otherwise, please post your regular expression so that we can have a look at it.
Best regards,
Marius
0 -
All those attributes are marked regular and polynominal.
I think I am close to a solution.
My current attempt uses the expression:
[^a-z]+
so for a value such as k100, the 100 is highlighted.
This means I can generate attributes such as:
att_
att_g
att_k
att_l
att_s
etc.
This looks ok. For the second part I have:
[a-z]+
which results in attributes such as:
att_
att_100
att_985
etc.
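To see why each pattern keeps only one half of the value, the same two splits can be reproduced outside RapidMiner. This is only a rough Python sketch, not RapidMiner code: `split_on` is a hypothetical helper, and Python's `re.split` may treat empty tokens differently from the Split operator.

```python
import re

def split_on(pattern, value):
    """Split value on a regex and drop empty tokens (hypothetical helper)."""
    return [token for token in re.split(pattern, value) if token]

# The digits act as the separator, so only the letters survive.
letters_only = split_on(r"[^a-z]+", "k100")
# The letters act as the separator, so only the digits survive.
digits_only = split_on(r"[a-z]+", "k100")

print(letters_only, digits_only)
```

Whichever part the pattern matches is consumed as the separator, which is why a single split never returns both halves.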
I cannot seem to do the split with a single expression
(note that I was able to do this for another attribute whose
values were separated by '/'). I am now looking at how I can
use a "multiply" and then combine those attributes back into
a single example set. The problem is that I now have duplicate
attributes.
This seems way too convoluted. Is there an easier way to do this?
TIA
0 -
Hi,
In fact, the Split operator is useful if you have two values in an attribute that are separated by a fixed string. In your examples, such as A100 or AK200, there is no separator to split on. The case would be different if you had A_100, AK_200 etc. Since you don't, you should use Generate Attributes and create two new attributes with the following definitions:
a1: replaceAll(a, "[a-zA-Z]", "")
a2: replaceAll(a, "[^a-zA-Z]", "")
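This assumes that your original attribute is called a. The first expression removes all letters, so a1 holds the digits; the second removes everything except letters, so a2 holds the alphabetical part. The process below does what I described above.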
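For anyone who wants to check the two patterns outside RapidMiner, here is a minimal Python sketch of the same idea. It assumes that Python's `re.sub` behaves like `replaceAll` for these simple character classes; `split_letters_digits` is a hypothetical name and not part of RapidMiner.

```python
import re

def split_letters_digits(value):
    """Return (letters, rest) by deleting the other kind of character."""
    rest = re.sub(r"[a-zA-Z]", "", value)      # like a1: strip the letters
    letters = re.sub(r"[^a-zA-Z]", "", value)  # like a2: strip everything else
    return letters, rest

for value in ["AK234", "A112"]:
    print(value, split_letters_digits(value))
```

Note that the first pattern keeps anything that is not a letter, so a value such as AK-234 would give -234 rather than 234.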
Best regards,
Marius
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.005">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.005" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="generate_data_user_specification" compatibility="5.3.005" expanded="true" height="60" name="Generate Data by User Specification" width="90" x="45" y="30">
<list key="attribute_values">
<parameter key="a" value="&quot;AK100&quot;"/>
</list>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.3.005" expanded="true" height="76" name="Generate Attributes" width="90" x="179" y="30">
<list key="function_descriptions">
<parameter key="a1" value="replaceAll(a, &quot;[a-zA-Z]&quot;, &quot;&quot;)"/>
<parameter key="a2" value="replaceAll(a, &quot;[^a-zA-Z]&quot;, &quot;&quot;)"/>
</list>
</operator>
<connect from_op="Generate Data by User Specification" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
0 -
Ok, I am going to try this.
Thanks.
0 -
Ok. Worked fine.
Thanks once again.
0