How to delete a file from an S3 bucket folder

PBM New Altair Community Member
edited November 5 in Community Q&A
Hi, 
We use Amazon S3 to load data into an Amazon Redshift database. After the data is loaded, we want to clean up the text files in S3. I see the Read, Loop, and Write S3 operators. Does RapidMiner allow deleting files from S3, given that I have delete access in S3? If not, are there any workaround suggestions?
Thank You

Answers

  • PBM
    PBM New Altair Community Member
    Hi all,

    OK, it seems this is a case where RapidMiner does not provide an operator, but you can extend it with external scripts. In this case I used the Execute Python operator with the boto library.

    from boto.s3.connection import S3Connection

    S3_BUCKET_NAME = 'myBucket'
    AWS_ACCESS_KEY = 'myAccesskey'
    AWS_SECRET_KEY = 'mySecretKey'
    # Folder prefix; it should end with '/' so that the folder's contents are listed
    path_to_file = 'mysubFolderPath/'


    def rm_main():
        # Create the connection
        conn = S3Connection(AWS_ACCESS_KEY, AWS_SECRET_KEY)

        # Connect to the bucket
        bucket = conn.get_bucket(S3_BUCKET_NAME)

        # List the folder's direct contents and delete the files,
        # skipping the folder placeholder key itself
        for key in bucket.list(prefix=path_to_file, delimiter='/'):
            if key.name != path_to_file:
                bucket.delete_key(key.name)


    if __name__ == "__main__":
        rm_main()

    Thank You
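
For anyone on a newer setup: the script above uses the legacy boto library. Below is a rough sketch of the same cleanup with boto3, not something taken from the original post. The function name delete_files_in_folder and the optional s3_client parameter are made up for illustration, and it assumes boto3 is installed in the Python environment that Execute Python uses and that credentials come from the default AWS credential chain rather than being hard-coded.

```python
def delete_files_in_folder(bucket_name, folder_prefix, s3_client=None):
    """Delete the objects directly under folder_prefix, keeping the folder key itself.

    Hypothetical helper: the name and the s3_client parameter are illustrative only.
    """
    if s3_client is None:
        # Imported lazily so the function can also be exercised with a stub client.
        import boto3
        s3_client = boto3.client("s3")  # assumes default credential chain

    # A trailing '/' makes the prefix match the folder's contents only.
    if not folder_prefix.endswith("/"):
        folder_prefix += "/"

    deleted = []
    paginator = s3_client.get_paginator("list_objects_v2")
    pages = paginator.paginate(
        Bucket=bucket_name, Prefix=folder_prefix, Delimiter="/"
    )
    for page in pages:
        # "Contents" is absent when a page has no matching objects.
        keys = [
            {"Key": obj["Key"]}
            for obj in page.get("Contents", [])
            if obj["Key"] != folder_prefix  # skip the folder placeholder
        ]
        if keys:
            # One batch delete per listing page.
            s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": keys})
            deleted.extend(k["Key"] for k in keys)
    return deleted


def rm_main():
    # Entry point called by the Execute Python operator; values are placeholders.
    return delete_files_in_folder("myBucket", "mysubFolderPath/")
```

Because of Delimiter="/", files in nested subfolders are left alone, which matches the behaviour of the boto script. Dropping the Delimiter argument would remove everything under the prefix instead.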