Meta data problem
Hey,
I'm using the "Dictionary Based Sentiment Analysis" extension and have a problem with some metadata output at the end. Everything else works fine, but I cannot see the token number. What I want to do: screen texts and score each one, so that the output is negative/positive plus the number of uncovered tokens. To make use of the "number of uncovered tokens", I need to know the total number of tokens in each text. I'm using "Extract Token Number", but its value isn't displayed at the end.
Thanks for your help
<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
<operator activated="true" class="text:tokenize" compatibility="7.5.000" expanded="true" height="68" name="Tokenize" width="90" x="246" y="34">
<parameter key="mode" value="non letters"/>
<parameter key="characters" value=".:"/>
<parameter key="language" value="English"/>
<parameter key="max_token_length" value="3"/>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
<operator activated="true" class="text:filter_stopwords_german" compatibility="7.5.000" expanded="true" height="68" name="Filter Stopwords (German)" width="90" x="447" y="34">
<parameter key="stop_word_list" value="Standard"/>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
<operator activated="true" class="text:filter_by_length" compatibility="7.5.000" expanded="true" height="68" name="Filter Tokens (by Length)" width="90" x="179" y="187">
<parameter key="min_chars" value="4"/>
<parameter key="max_chars" value="40"/>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
<operator activated="true" class="text:transform_cases" compatibility="7.5.000" expanded="true" height="68" name="Transform Cases (2)" width="90" x="313" y="187">
<parameter key="transform_to" value="lower case"/>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
<operator activated="true" class="text:extract_token_number" compatibility="7.5.000" expanded="true" height="68" name="Extract Token Number" width="90" x="514" y="187">
<parameter key="metadata_key" value="token_number"/>
<parameter key="condition" value="all"/>
<parameter key="case_sensitive" value="false"/>
<parameter key="invert_condition" value="false"/>
</operator>
</process>
Answers
-
Your process XML appears to be malformed and won't render. Are you sure this is the XML from a single complete process?
In the meantime, "Extract Token Number" is meant to be used inside "Process Documents", so you'll need to place it inside that operator in your workflow.
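To illustrate why the total token count matters for the original question, here is a minimal Python sketch of the same idea. It is not RapidMiner code: it only mimics the steps in the posted fragments (tokenize on non-letters, keep tokens of 4 to 40 characters, lowercase) and then reports the total next to the number of tokens not covered by a sentiment dictionary. The function names and the tiny dictionary are hypothetical, and the German stopword filter is left out.

```python
import re

def tokenize(text, min_chars=4, max_chars=40):
    """Split on non-letter characters, keep tokens within the length range, lowercase them."""
    # [^\W\d_]+ matches runs of word characters that are neither digits nor underscores,
    # which approximates "letters only".
    tokens = re.findall(r"[^\W\d_]+", text)
    return [t.lower() for t in tokens if min_chars <= len(t) <= max_chars]

def token_stats(text, dictionary):
    """Return the total token count and how many tokens the dictionary does not cover."""
    tokens = tokenize(text)
    uncovered = [t for t in tokens if t not in dictionary]
    total = len(tokens)
    return {
        "token_number": total,
        "uncovered": len(uncovered),
        # Guard against empty documents to avoid dividing by zero.
        "uncovered_share": len(uncovered) / total if total else 0.0,
    }

# Hypothetical two-word sentiment dictionary, for illustration only.
SENTIMENT_WORDS = {"great", "terrible"}

stats = token_stats(
    "The service was great, but the food was terrible and cold.",
    SENTIMENT_WORDS,
)
print(stats)
```

Without the total, an uncovered count of 3 says little; with it, the share of uncovered tokens per text can be compared across documents of different lengths.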
0 -
Sorry but I have a tangential question... @Telcontar120 - this thing seems to happen a lot. Any idea why people's XML gets corrupted in this way? @Benedict_von_Ah if you could help me understand how you pasted the XML, this would be helpful. Thanks!
Scott
0 -
@sgenzer I really don't know about the XML corruption. I remember discussing this at one point in the past with @Thomas_Ott, and I think he suspected some kind of problem with the Lithium site backend.
0 -
Yeah, that's what I'm worried about, but I don't see that problem when experienced users post code, only when new users do. I assume this is not corrupted for you?
<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve Iris" width="90" x="112" y="289">
<parameter key="repository_entry" value="//Samples/data/Iris"/>
</operator>
<operator activated="true" class="split_data" compatibility="7.6.001" expanded="true" height="103" name="Split Data" width="90" x="246" y="289">
<enumeration key="partitions">
<parameter key="ratio" value="0.9"/>
<parameter key="ratio" value="0.1"/>
</enumeration>
</operator>
<operator activated="true" class="keras:sequential" compatibility="1.0.003" expanded="true" height="166" name="Keras Model" width="90" x="447" y="187">
<parameter key="input shape" value="(4,)"/>
<parameter key="loss" value="categorical_crossentropy"/>
<parameter key="optimizer" value="Adam"/>
<parameter key="learning rate" value="0.001"/>
<enumeration key="metric"/>
<parameter key="epochs" value="128"/>
<enumeration key="callbacks">
<parameter key="callbacks" value="TensorBoard(log_dir='./logs', histogram_freq=0, write_graph=True, write_images=False, embeddings_freq=0, embeddings_layer_names=None, embeddings_metadata=None)"/>
</enumeration>
<process expanded="true">
<operator activated="true" class="keras:core_layer" compatibility="1.0.003" expanded="true" height="82" name="Add Core Layer" width="90" x="179" y="289">
<parameter key="no_units" value="8"/>
<parameter key="activation_function" value="'relu'"/>
<parameter key="target_shape" value="(1, 1)"/>
<parameter key="dims" value="1.1"/>
<parameter key="repetition_factor" value="2"/>
</operator>
<operator activated="true" class="keras:core_layer" compatibility="1.0.003" expanded="true" height="82" name="Add Core Layer (2)" width="90" x="313" y="289">
<parameter key="no_units" value="3"/>
<parameter key="activation_function" value="'softmax'"/>
<parameter key="target_shape" value="(1, 1)"/>
<parameter key="dims" value="1.1"/>
<parameter key="repetition_factor" value="2"/>
</operator>
<connect from_op="Add Core Layer" from_port="layers 1" to_op="Add Core Layer (2)" to_port="layers"/>
<connect from_op="Add Core Layer (2)" from_port="layers 1" to_port="layers 1"/>
<portSpacing port="sink_layers 1" spacing="0"/>
<portSpacing port="sink_layers 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="keras:apply" compatibility="1.0.003" expanded="true" height="82" name="Apply Keras Model" width="90" x="648" y="289">
<parameter key="batch_size" value="16"/>
</operator>
<connect from_op="Retrieve Iris" from_port="output" to_op="Split Data" to_port="example set"/>
<connect from_op="Split Data" from_port="partition 1" to_op="Keras Model" to_port="training set"/>
<connect from_op="Split Data" from_port="partition 2" to_op="Apply Keras Model" to_port="unlabelled data"/>
<connect from_op="Keras Model" from_port="model" to_op="Apply Keras Model" to_port="model"/>
<connect from_op="Apply Keras Model" from_port="labelled data" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Scott
0 -
@sgenzer yep, that one's fine for me (nice Keras model, btw) :smileyhappy:
1 -
Now that more people are posting XML, I think I might have to rethink my original hypothesis. It appears that it is mostly new users who post corrupted XML.
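For anyone who wants to check a paste before posting, a rough sketch in Python is below. It only tests whether the text is one well-formed XML document with a `<process>` root; it does not guarantee that RapidMiner will import it. The function name `check_process_xml` and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def check_process_xml(pasted):
    """Return (ok, message): is `pasted` a single well-formed <process> document?"""
    # Several XML declarations usually mean several fragments were pasted together.
    declarations = pasted.count("<?xml")
    if declarations > 1:
        return False, f"found {declarations} XML declarations, expected one process"
    try:
        # Parse as bytes so a declared UTF-8 encoding is handled by the parser.
        root = ET.fromstring(pasted.strip().encode("utf-8"))
    except ET.ParseError as err:
        return False, f"not well-formed: {err}"
    if root.tag != "process":
        return False, f"root element is <{root.tag}>, expected <process>"
    return True, "looks like a single well-formed process"

# Illustrative inputs: one complete document, the same document pasted twice,
# and a fragment with an unclosed <operator> tag.
good = ('<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">'
        '<operator class="process" name="Process"/></process>')
doubled = good + good
fragment = '<process version="7.6.001"><operator name="Tokenize"></process>'

for sample in (good, doubled, fragment):
    print(check_process_xml(sample))
```

A paste made of several operator fragments, each wrapped in its own `<?xml ...?><process>` header like the one in the question, would fail the first check.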