Read TSV & Read Multiple files from S3
Hi All,
I have three related questions. I'm trying to read in files from Amazon S3. The 'Read Amazon S3' operator works fine, but I have three problems:
1) I'm trying to read in tsv files, so I've connected 'Read Amazon S3' to 'Read CSV' but it results in no records. I also tried with 'Read Excel' but that just throws errors. Is there an operator that can handle tsv files?
2) Eventually I want to be able to read all files in an S3 bucket, rather than just selecting one. So is there a way of looping through all the files?
3) Are there any operators I can apply to filter multiple file types through the workflow, so they get routed to the appropriate 'Read' operator? A bucket may have different file formats in it, and the process will throw errors if it tries to read a file with the wrong extension.
Thanks
Mike
Answers
Dear Mike,
1) Have you tried Read CSV with tab as the delimiter? The default delimiter is ; so a tab-separated file will not be split into columns unless you change it.
2) Have you had a look at Loop Amazon S3?
3) Loop Amazon S3 has a filter option where you can use .+tsv as a regex to include only tsv files in the loop.
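To illustrate point 1) outside RapidMiner, here is a minimal Python sketch of what the delimiter setting does. The sample data and the helper name parse_delimited are made up for illustration, and Read CSV may react differently to a wrong delimiter than Python's csv module does.

```python
import csv
import io

# Hypothetical sample data: a header and one row, separated by tabs.
SAMPLE_TSV = "id\tname\n1\talice\n"


def parse_delimited(text, delimiter):
    """Split each line of text into fields using the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# With ';' as the delimiter, no line is split: each row is one long field.
rows_semicolon = parse_delimited(SAMPLE_TSV, ";")

# With tab as the delimiter, each row is split into its two fields.
rows_tab = parse_delimited(SAMPLE_TSV, "\t")
```

The only difference between the two calls is the delimiter, which is the same change suggested for Read CSV.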
Best,
Martin
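For questions 2) and 3), the idea of filtering file names with a regex and routing them by extension can be sketched in Python as follows. This is only an illustration, not how Loop Amazon S3 is implemented: the key names are invented, the helper names are hypothetical, and it assumes the filter has to match the whole file name.

```python
import re

# Hypothetical object keys standing in for the contents of an S3 bucket.
KEYS = ["data/a.tsv", "data/b.csv", "data/c.xlsx", "notes.tsv.bak"]

# The regex from the answer above. Assumption: it must match the full name.
TSV_PATTERN = re.compile(r".+tsv")


def filter_keys(keys, pattern):
    """Keep only the keys that the pattern matches in full."""
    return [key for key in keys if pattern.fullmatch(key)]


def route_by_extension(keys):
    """Group keys by lower-cased extension, one group per reader."""
    routes = {}
    for key in keys:
        extension = key.rsplit(".", 1)[-1].lower() if "." in key else ""
        routes.setdefault(extension, []).append(key)
    return routes


tsv_keys = filter_keys(KEYS, TSV_PATTERN)
routes = route_by_extension(KEYS)
```

Under the full-match assumption, notes.tsv.bak is excluded because it does not end in tsv. Running one filtered loop per file type, each with its own regex and its own 'Read' operator, would be one way to get the same routing in a process.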