Hi,
We use Amazon S3 to load data into an Amazon Redshift database. After the data is loaded, we want to clean up the S3 text files. I see the Read, Loop, and Write S3 operators. Does RM allow deleting files from S3, given that I have delete permission in S3? If not, are there any workaround suggestions?
Thank You
OK, it seems this is a case where RapidMiner does not have an operator for the task, but you can extend it with an external script. Here I used the Execute Python operator with the boto library.
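from boto.s3.connection import S3Connection

AWS_ACCESS_KEY = 'myAccesskey'
AWS_SECRET_KEY = 'mySecretKey'
S3_BUCKET_NAME = 'myBucketName'  # placeholder; not defined in the original snippet
path_to_file = 'mysubFolderPath'  # use a trailing '/' (e.g. 'folder/') so the files inside the folder are listed

# Execute Python calls rm_main() itself, so the script only needs to define it
def rm_main():
    # Create connection
    conn = S3Connection(AWS_ACCESS_KEY, AWS_SECRET_KEY)
    # Connect to my bucket
    bucket = conn.get_bucket(S3_BUCKET_NAME)
    # Get subdirectory info and delete files (except the subdirectory itself)
    for key in bucket.list(prefix=path_to_file, delimiter='/'):
        if key.name != path_to_file:
            bucket.delete_key(key.name)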
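In case boto is not available in your Python environment, here is a rough sketch of the same cleanup with boto3, the newer AWS SDK. This is not what I ran above, just an assumption of how it could look. The function name delete_files_under_prefix and the bucket/prefix values in the usage comment are made up for illustration; the client is passed in so the function can be tried without touching a real bucket.

```python
def delete_files_under_prefix(s3_client, bucket_name, prefix):
    """Delete objects directly under `prefix`, keeping the folder marker itself.

    `s3_client` is expected to behave like boto3.client("s3").
    Returns the number of keys submitted for deletion.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    deleted = 0
    # Delimiter="/" limits "Contents" to direct children of the prefix.
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter="/"):
        keys = [obj["Key"] for obj in page.get("Contents", []) if obj["Key"] != prefix]
        if keys:
            # A page holds at most 1000 keys, which is also the
            # per-request limit of delete_objects.
            s3_client.delete_objects(
                Bucket=bucket_name,
                Delete={"Objects": [{"Key": k} for k in keys]},
            )
            deleted += len(keys)
    return deleted

# Hypothetical usage (needs boto3 and valid AWS credentials):
#   import boto3
#   delete_files_under_prefix(boto3.client("s3"), "myBucketName", "mysubFolderPath/")
```

The response of delete_objects can contain an "Errors" list for keys that could not be removed, so it may be worth checking that before relying on the returned count.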
Thank You