Replace (dictionary)
abevensee
New Altair Community Member
I have a data set with 30+ attributes. Each data row has numerical codes in each column that correspond to a classification. For example, Gender is an attribute with codes 1-3 meaning Male, Female, and Not provided, respectively. There are similar code structures for ethnicity, race, etc. I have set up a dictionary for each of these attributes so that my process can reference the specific dictionary and convert the codes to meaningful data. I have 2 questions:
1): The codes mean something different for each attribute I'm converting, so I set up a separate dictionary for each. For instance, 1 means male for gender, but it also means white for race and single for marital status. Is there a way to use the Loop operator to have RM run all 30+ conversions using the different dictionaries, or do I need 30 separate "Replace (Dictionary)" operators in my process?
2): In some dictionaries the codes overlap, for instance in my use case:
1 = Latino/Hispanic
4 = N/A
14 = Other Hispanic or Latino
Instead of returning "Other Hispanic or Latino" for codes equal to 14, the operator returns "Latino/HispanicN/A". I've seen that the regular expression option could prevent this. However, since I have the operator set up to run on a subset (the various ethnicity-related attributes) and I do not want it applied to the whole population, I'm not sure that would work. How can I go about fixing this?
Best Answer
That's because it first replaces 1 and then 4, rather than matching 14 as a whole.
Changing the order (putting 14 before 1 and 4) may already fix it. With regex you can force a full match, but the standard mode replaces anything that matches, so 14 is treated as a 1 followed by a 4.
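To make the behaviour concrete, here is a minimal Python sketch of the three variants described above: plain substring replacement in dictionary order, the same with longer codes first, and a full-match replacement. It only imitates the behaviour reported in the question and is not the operator's actual implementation; the dictionary contents come from the example codes, and the function names are hypothetical.

```python
import re

# Example codes from the question; insertion order mimics the dictionary order.
ETHNICITY = {"1": "Latino/Hispanic", "4": "N/A", "14": "Other Hispanic or Latino"}


def replace_substrings(value, mapping):
    """Apply every entry in order, replacing any occurrence of the code."""
    for code, label in mapping.items():
        value = value.replace(code, label)
    return value


def replace_longest_first(value, mapping):
    """Same substring replacement, but with longer codes applied first."""
    ordered = sorted(mapping.items(), key=lambda kv: len(kv[0]), reverse=True)
    for code, label in ordered:
        value = value.replace(code, label)
    return value


def replace_full_match(value, mapping):
    """Replace only when the whole value equals a code (anchored regex)."""
    for code, label in mapping.items():
        if re.fullmatch(re.escape(code), value):
            return label
    return value


print(replace_substrings("14", ETHNICITY))     # Latino/HispanicN/A
print(replace_longest_first("14", ETHNICITY))  # Other Hispanic or Latino
print(replace_full_match("14", ETHNICITY))     # Other Hispanic or Latino
```

Reordering works here only because none of the labels contain digits. The full-match variant does not depend on the order of the entries, which makes it the safer choice if labels could ever contain a code.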