AWS: Java S3 Upload

If you want to push data to AWS S3, there are a few different ways of doing it. I will show you two ways I have used.

Option 1: putObject

    import com.amazonaws.AmazonClientException;
    import com.amazonaws.ClientConfiguration;
    import com.amazonaws.auth.AWSCredentialsProvider;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;
    import com.amazonaws.services.s3.model.ObjectMetadata;

    import java.io.InputStream;

    ClientConfiguration config = new ClientConfiguration();
    config.setSocketTimeout(SOCKET_TIMEOUT);
    config.setMaxErrorRetry(RETRY_COUNT);
    config.setClientExecutionTimeout(CLIENT_EXECUTION_TIMEOUT);
    config.setRequestTimeout(REQUEST_TIMEOUT);
    config.setConnectionTimeout(CONNECTION_TIMEOUT);

    AWSCredentialsProvider credProvider = ...;
    String region = ...;

    AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
            .withCredentials(credProvider)
            .withRegion(region)
            .withClientConfiguration(config)
            .build();

    InputStream stream = ...;
    String bucketName = ...;
    String keyName = ...;
    String mimeType = ...;

    // You use metadata to describe the data.
    final ObjectMetadata metaData = new ObjectMetadata();
    metaData.setContentType(mimeType);

    // There are overloads available. Find the one that suits what you need.
    try {
        s3Client.putObject(bucketName, keyName, stream, metaData);
    } catch (final AmazonClientException ex) {
        // Log the exception
    }
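
One thing to note with this overload: because the content length is not set on the metadata, the client may have to buffer the stream contents in memory to work out the size before sending, which can be a problem for large files. If you know the length up front, set it with “setContentLength”.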

Option 2: Multipart Upload

    import com.amazonaws.AmazonClientException;
    import com.amazonaws.ClientConfiguration;
    import com.amazonaws.auth.AWSCredentialsProvider;
    import com.amazonaws.event.ProgressEvent;
    import com.amazonaws.event.ProgressEventType;
    import com.amazonaws.event.ProgressListener;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;
    import com.amazonaws.services.s3.model.ObjectMetadata;
    import com.amazonaws.services.s3.transfer.TransferManager;
    import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
    import com.amazonaws.services.s3.transfer.Upload;

    import java.io.InputStream;

    ClientConfiguration config = new ClientConfiguration();
    config.setSocketTimeout(SOCKET_TIMEOUT);
    config.setMaxErrorRetry(RETRY_COUNT);
    config.setClientExecutionTimeout(CLIENT_EXECUTION_TIMEOUT);
    config.setRequestTimeout(REQUEST_TIMEOUT);
    config.setConnectionTimeout(CONNECTION_TIMEOUT);

    AWSCredentialsProvider credProvider = ...;
    String region = ...;

    AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
            .withCredentials(credProvider)
            .withRegion(region)
            .withClientConfiguration(config)
            .build();

    InputStream stream = ...;
    String bucketName = ...;
    String keyName = ...;
    long contentLength = ...;
    String mimeType = ...;

    // You use metadata to describe the data. The content length is needed
    // so the multipart upload knows how big the object is.
    final ObjectMetadata metaData = new ObjectMetadata();
    metaData.setContentLength(contentLength);
    metaData.setContentType(mimeType);

    TransferManager tf = TransferManagerBuilder.standard().withS3Client(s3Client).build();
    tf.getConfiguration().setMinimumUploadPartSize(UPLOAD_PART_SIZE);
    tf.getConfiguration().setMultipartUploadThreshold(UPLOAD_THRESHOLD);
    final Upload xfer = tf.upload(bucketName, keyName, stream, metaData);

    ProgressListener progressListener = new ProgressListener() {
        @Override
        public void progressChanged(ProgressEvent progressEvent) {
            if (progressEvent.getEventType() == ProgressEventType.TRANSFER_FAILED_EVENT
                    || progressEvent.getEventType() == ProgressEventType.TRANSFER_PART_FAILED_EVENT) {
                // Log the failure
            }
        }
    };

    xfer.addProgressListener(progressListener);
    try {
        xfer.waitForCompletion();
    } catch (final AmazonClientException ex) {
        // Log the exception
    } catch (final InterruptedException ex) {
        Thread.currentThread().interrupt();
    }
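
Once all of your transfers have finished, shut the TransferManager down (for example with “tf.shutdownNow(false)”, where the “false” leaves the underlying S3 client open) so that its thread pool does not keep the JVM alive.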

AWS: Java S3 Lambda Handler

If you want to write a Lambda for AWS in Java that connects to S3, you need to have the handler.

Maven:

    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk</artifactId>
        <version>1.11.109</version>
    </dependency>
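
Note that the “Context” and “S3Event” types used below do not live in that SDK artifact; they come from the separate “aws-lambda-java-core” and “aws-lambda-java-events” artifacts, so you will most likely need those dependencies as well.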

This is the method that AWS Lambda will call; it will look similar to the one below.

    import com.amazonaws.services.lambda.runtime.Context;
    import com.amazonaws.services.lambda.runtime.events.S3Event;
    import com.amazonaws.services.s3.event.S3EventNotification.S3Entity;
    import com.amazonaws.services.s3.event.S3EventNotification.S3EventNotificationRecord;

    public void S3Handler(S3Event s3e, Context context) {
        // Useful details about the invocation are available from the context.
        final String awsRequestId = context.getAwsRequestId();
        final int memoryLimitMb = context.getMemoryLimitInMB();
        final int remainingTimeInMillis = context.getRemainingTimeInMillis();

        // Each record describes one S3 object that triggered the event.
        for (final S3EventNotificationRecord s3Rec : s3e.getRecords()) {
            final S3Entity record = s3Rec.getS3();
            final String bucketName = record.getBucket().getName();
            final String key = record.getObject().getKey();
        }
    }

The thing to note when you set up your Lambda is how to fill in the “Handler” field in the “Configuration” section on AWS. It is in the format “##PACKAGE##.##CLASS##::##METHOD##”.
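
For example, if the method above sat in a class called “MyLambda” in a package called “example.lambda” (both names purely for illustration), the field would be “example.lambda.MyLambda::S3Handler”.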

AWS: Python S3

If you haven’t already done so, please refer to the AWS setup section, which is part of this series. As time goes on I will continually update this section.

To work with S3 you need to utilise the “connection_s3” you already set up in the previous tutorial on setting up the connection.

To load a file from an S3 bucket you need to know the bucket name and the file name.

    connection_s3.Object(##S3_BUCKET##, ##FILE_NAME##).load()

If you want to check whether a file exists on S3, you can do something like the below. However, you will need to import botocore.

    import botocore

    def keyExists(key):
        file = connection_s3.Object(##S3_BUCKET##, key)
        try:
            file.load()
        except botocore.exceptions.ClientError:
            exists = False
        else:
            exists = True
        return exists, file

If you want to copy a file from one bucket to another bucket or sub-folder, you can do it like below.

    connection_s3.Object(##S3_DESTINATION_BUCKET##, ##FILE_NAME##).copy_from(CopySource=##S3_SOURCE_BUCKET## + '/' + ##FILE_NAME##)

If you want to delete the file, you can use the “keyExists” function above and then just call “delete”.

    ##FILE##.delete()
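
Putting the two together, a minimal sketch (the file name is whatever key you are after):

    exists, file = keyExists(##FILE_NAME##)
    if exists:
        file.delete()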

If you want to just get a bucket object, you just need to specify which bucket and utilise the S3 connection.

    bucket = connection_s3.Bucket(##S3_BUCKET##)

To upload a file to an S3 bucket, you need to set the body, content type and key name. Take a look at the example below.

    bucket.put_object(Body=##DATA##, ContentType="application/zip", Key=##FILE_NAME##)

If you want to loop over the objects in a bucket, it’s pretty straightforward.

    for key in bucket.objects.all():
        file_name = key.key
        response = key.get()
        data = response['Body'].read()

If you want to filter the objects in a bucket, you can use a prefix.

    for key in bucket.objects.filter(Prefix='##PREFIX##').all():
        file_name = key.key
        response = key.get()
        data = response['Body'].read()

AWS: Python Setup

When you want to work with S3 or a Kinesis stream, you first need to set up the connection. At the time of this writing I am using boto3 version 1.3.1, which you can install with pip.

Next we need to import the package.

    import boto3

Next we set up the session and specify which profile we will be using.

    profile = boto3.session.Session(profile_name='prod')

The profile name comes from the “credentials” file. You can set the environment variable “AWS_SHARED_CREDENTIALS_FILE” to specify which credentials file to use. You can set up the credentials file like below. You can change “local” to anything you want; I normally use “stage”, “dev” or “prod”.

    [local]
    aws_access_key_id=##KEY_ID##
    aws_secret_access_key=##SECRET_ACCESS_KEY##
    region=##REGION##

Next we need to set up the connection to S3. To do this we will use the profile we created above.

    connection_s3 = profile.resource('s3')
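
As a quick sanity check that the connection works, you can list the buckets the profile can see (assuming the credentials are allowed to list buckets):

    for bucket in connection_s3.buckets.all():
        print(bucket.name)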

If we also want to use a Kinesis stream, then we need to set up that connection as well. To do this we will need the profile we created above.

    connection_kinesis = profile.client('kinesis')
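
Similarly, a quick way to check the Kinesis connection is to list the streams the profile can see (again assuming the credentials allow it):

    response = connection_kinesis.list_streams()
    print(response['StreamNames'])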