"Concatenate all text from Twitter feed"
I am collecting all of the text from a Twitter feed using the Twitter operator and storing it in a MySQL database. I then extract the tweets from the MySQL table to run them through a sentiment analysis engine. As part of that process I need to wrap all of the examples in the API's request format before submitting. I use the Generate Attributes operator to create the required API secret and key. However, when I join and then concatenate the data into a single file (using Select Attributes to drop the attributes I do not need), only the first tweet pulled from the DB is included in the API submission file; the rest of the tweets are missing.
What am I doing incorrectly? I have tried turning the data into documents and combining them, and I have tried creating collections and flattening them. I have tried everything I can think of, but I am unable to insert the roughly 1.5k tweets that I have pulled down into the file format.
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/hal+json' --header 'X-API-SECRET-KEY:XXXXXXXxXXXxXXXXXX' --header 'X-API-XXXXXXXXXXXXX' -d '{
"name": "account_name",
"gender": 0,
"content": {
"content_handle": "account_handle",
"content_source": 1,2
"content_date": "10/04/2018 22:00:46 PM SAST",
"language_content": "need to insert the free text examples from twitter in here."
},
"person_handle": "key_account_manager"
}'
In the code above I need to insert the Twitter text into the "language_content" field, but I can only ever insert a single line of Twitter data.
@RitualGym Hey guys, the gym here in Illovo was supposed to be opening in April, is it still happening? I’m keen to get stared. 💪🏼 @RitualGym Hey guys, the gym here in Illovo was supposed to be opening in April, is it still happening? I’m keen to get stared. 💪🏼
@OneDayOnlycoza House of Chards 🥦 https://t.co/ljZQD93t3i @OneDayOnlycoza House of Chards 🥦 https://t.co/ljZQD93t3i
@ThatDarnKitteh @Nick_Frost If this isn’t your handle by the end of the day I’m unfollowing 😂 @ThatDarnKitteh @Nick_Frost If this isn’t your handle by the end of the day I’m unfollowing 😂
@Nick_Frost Have you seen what happens when people try combine their names for their kids? 😣😖🤢🤮 @Nick_Frost Have you seen what happens when people try combine their names for their kids? 😣😖🤢🤮
This is the best thing on the internet right now. 😂 https://t.co/TchhxFUGqT This is the best thing on the internet right now. 😂 https://t.co/TchhxFUGqT
@OneDayOnlycoza UK size 7. 😉 https://t.co/DAIaK0V5oF @OneDayOnlycoza UK size 7. 😉 https://t.co/DAIaK0V5oF
Above is the text that I need to insert into the file. This text comes from a MySQL DB, so it has \r\n characters separating the fields (which I believe is causing the issue).
I'm at my wits' end; please help me see the wood for the trees.
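To make the goal concrete, here is a minimal Python sketch of what the submission body needs to look like once every tweet is joined into the single "language_content" field. It is only an illustration under assumptions: the function name `build_payload` is hypothetical, the `\r\n` inside the first sample tweet was added to mimic the line breaks coming out of MySQL, and "content_source" and "content_date" are left out because their exact format is not clear from the snippet above.

```python
import json
import re


def build_payload(tweets, name="account_name", handle="account_handle"):
    """Join every tweet into one string and wrap it in the request body."""
    # Collapse \r\n (and any other run of whitespace) inside a tweet to one space.
    cleaned = [re.sub(r"\s+", " ", t).strip() for t in tweets]
    # Skip rows that are empty after cleaning, then join the rest.
    joined = " ".join(t for t in cleaned if t)
    body = {
        "name": name,
        "gender": 0,
        "content": {
            "content_handle": handle,
            "language_content": joined,
        },
        "person_handle": "key_account_manager",
    }
    # json.dumps escapes embedded quotes; ensure_ascii=False keeps emoji readable.
    return json.dumps(body, ensure_ascii=False)


# Sample rows; the \r\n in the first one is an assumed stand-in for the DB line breaks.
tweets = [
    "@OneDayOnlycoza UK size 7. 😉\r\nhttps://t.co/DAIaK0V5oF",
    "This is the best thing on the internet right now. 😂",
]
payload = build_payload(tweets)
print(payload)
```

Building the body with a JSON serializer rather than by pasting strings together means quotes or stray line breaks inside a tweet cannot break the request.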
Best Answer
-
@robin I'm running out the door ATM, but this almost gets you there. You'll need to loop over the combined document to add in the header stuff.
<?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve RobinJSON" width="90" x="112" y="34">
<parameter key="repository_entry" value="//Community Answers/data/RobinJSON"/>
</operator>
<operator activated="true" class="extract_macro" compatibility="8.1.001" expanded="true" height="68" name="Extract Macro" width="90" x="246" y="34">
<parameter key="macro" value="Num"/>
<list key="additional_macros"/>
</operator>
<operator activated="true" class="concurrency:loop" compatibility="8.1.001" expanded="true" height="82" name="Loop" width="90" x="380" y="34">
<parameter key="number_of_iterations" value="%{Num}"/>
<parameter key="enable_parallel_execution" value="false"/>
<process expanded="true">
<operator activated="true" class="extract_macro" compatibility="8.1.001" expanded="true" height="68" name="Extract Macro (2)" width="90" x="112" y="34">
<parameter key="macro" value="insert"/>
<parameter key="macro_type" value="data_value"/>
<parameter key="attribute_name" value="Test"/>
<parameter key="example_index" value="%{iteration}"/>
<list key="additional_macros"/>
</operator>
<operator activated="true" class="text:create_document" compatibility="8.1.000" expanded="true" height="68" name="Create Document" width="90" x="246" y="34">
<parameter key="text" value="&quot;language_content&quot;: &quot;%{insert}&quot; }, "/>
</operator>
<operator activated="false" class="text:documents_to_data" compatibility="8.1.000" expanded="true" height="68" name="Documents to Data" width="90" x="313" y="187">
<parameter key="text_attribute" value="yumyum"/>
</operator>
<connect from_port="input 1" to_op="Extract Macro (2)" to_port="example set"/>
<connect from_op="Create Document" from_port="output" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="text:combine_documents" compatibility="8.1.000" expanded="true" height="82" name="Combine Documents" width="90" x="514" y="34"/>
<connect from_op="Retrieve RobinJSON" from_port="output" to_op="Extract Macro" to_port="example set"/>
<connect from_op="Extract Macro" from_port="example set" to_op="Loop" to_port="input 1"/>
<connect from_op="Loop" from_port="output 1" to_op="Combine Documents" to_port="documents 1"/>
<connect from_op="Combine Documents" from_port="document" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
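For readers who do not have RapidMiner open, the logic of the process above can be sketched in Python. This is only a rough, hypothetical equivalent: the function name `combine_documents`, the list-of-dicts row format and the newline separator between documents are assumptions made for illustration, not a description of what the operators do internally.

```python
def combine_documents(rows, attribute="Test", separator="\n"):
    """Mimic the process: count rows, loop over them, build one text per row, join."""
    num = len(rows)  # Extract Macro: Num = number of examples
    documents = []
    for iteration in range(1, num + 1):  # Loop: the iteration macro starts at 1
        # Extract Macro (2): read the value of the attribute at row `iteration`
        insert = rows[iteration - 1][attribute]
        # Create Document: the same text template as in the process, with the macro filled in
        documents.append(f'"language_content": "{insert}" }}, ')
    # Combine Documents: all per-row documents become one text (separator is assumed)
    return separator.join(documents)


rows = [{"Test": "first tweet"}, {"Test": "second tweet"}]
combined = combine_documents(rows)
print(combined)
```

The key point is that the row index comes from the loop counter, so every example is read once instead of only the first one.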
Answers
-
@robin do you have a Loop operator iterating over this?
-
Yes I did. I completely ignored the macro I set up at the beginning of the process to perform the loop and started doing some crazy things in the end.
Note to self: a macro can be used inside Create Document.
Thank you Thomas