Count UPPER CASE Tokens

Question

I have a spreadsheet with a text column and a label column. I would like to represent the text values with some token metadata. I'm using "process documents", and inside "process documents" I'm tokenizing the text value. I would like to achieve the following:
1. Add an attribute to the exampleset which contains a count of the number of tokens which were UPPER CASE.
2. Add an attribute to the exampleset which is a count of the number of adjective tokens.
On point (2) I have made some progress by using "filter tokens by pos tag". This doesn't give me quite what I want, though. I want a count of the number of adjectives, not just a bag-of-words filtered to contain only adjectives.
On point (1) I have no idea how to proceed.
Thank you.

kayman · Accepted Answer

@nfridge1, the count is fairly easy. Use Generate Attributes with an expression like this: length(replaceAll([Text],"[^A-Z]",""))

Basically this means: replace every character that's not an uppercase letter (A-Z) with nothing, and then take the length of the remainder.

So if your original Text were "JusT FoR FuN", the replacement would return JTFRFN, and the length would be 6, which is the value that gets returned.
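For anyone who wants to check the logic outside RapidMiner, here is a minimal Python sketch of the same idea. The function names are hypothetical. Note that the expression above counts uppercase characters; the second function is only one possible reading of "tokens which were UPPER CASE", assuming whitespace-separated tokens.

```python
import re

def count_uppercase_chars(text: str) -> int:
    # Same idea as length(replaceAll([Text],"[^A-Z]","")):
    # drop every character that is not A-Z, then measure what is left.
    return len(re.sub(r"[^A-Z]", "", text))

def count_uppercase_tokens(text: str) -> int:
    # Assumption: tokens are separated by whitespace, and a token counts
    # as upper case when it contains at least one cased character and
    # all of its cased characters are upper case (str.isupper semantics).
    return sum(1 for token in text.split() if token.isupper())

print(count_uppercase_chars("JusT FoR FuN"))        # 6
print(count_uppercase_tokens("THIS is VERY loud"))  # 2
```

If a per-token count is what is actually needed, the second variant is probably closer to the question, but it depends on how the text was tokenized.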

I'm not sure what you mean by the adjective tokens. Do you have an example of what you have and what you want to achieve?

Edit: If you just want to count the words, you could use something similar to the above, but now you remove everything that's not the separator (comma, space, or whatever you used) and add 1. That should give you the total number of tokens.

So if you have something like Token1, Token2, Token3, use length(replaceAll([MyTokens],"[^,]",""))+1
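The separator trick can be sketched in Python as well. This is only an illustration: the function name is hypothetical, and it assumes a single-character separator.

```python
import re

def count_tokens(tokens: str, separator: str = ",") -> int:
    # Same idea as length(replaceAll([MyTokens],"[^,]",""))+1:
    # keep only the separator characters, count them, and add 1.
    only_separators = re.sub(f"[^{re.escape(separator)}]", "", tokens)
    return len(only_separators) + 1

print(count_tokens("Token1, Token2, Token3"))  # 3
```

One thing to watch for: an empty string contains no separators, so this approach still reports 1 token for it.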