How to Properly Use Loop Amazon S3
Community,
I am trying to extract data from S3 using the "Loop Amazon S3" operator. It is Twitter data and the data files are nested pretty deeply - for example: raw_data/2016/10/11/16/file_1.txt
I must not have it configured correctly, because RM tells me "Input Missing .... previous operator did not return any output". If I point the operator to a higher directory like "10", the process runs a long time before erroring. If I point it to a directory like "16" (i.e. the directory where all my files are located), it still gives an error.
I suspect I need to customize the "macro" fields, but the descriptions of the fields don't really make sense to me. Right now the "file name", "file path" and "parent path" macro fields contain the default values.
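To show what I would expect those three values to be for one of my files, here is a minimal Python sketch. The function name split_s3_key is hypothetical, and it is only my assumption that the macros correspond to these pieces of the object key; I have not confirmed that this is what the operator actually stores in them.

```python
import posixpath

def split_s3_key(key):
    """Split an S3 object key into (file name, full path, parent path).

    Assumption: this mirrors what the "file name", "file path" and
    "parent path" macros are meant to hold; not confirmed.
    """
    file_name = posixpath.basename(key)   # last segment of the key
    parent_path = posixpath.dirname(key)  # everything before the last "/"
    return file_name, key, parent_path

name, path, parent = split_s3_key("raw_data/2016/10/11/16/file_1.txt")
print(name)
print(path)
print(parent)
```

If that reading is right, each iteration of the loop would need to pass the full path (not just the file name) to whatever reads the file.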
My layout goes like: [Loop Amazon S3] -> [Read Document] -> [JSON to Data] -> results
Thanks for your help!