Amazon S3 support

An AmazonS3FileRepository stores all its resources in an Amazon S3 bucket.

Configuring an AmazonS3FileRepository

To use the AmazonS3FileRepository, you must add the following dependencies to your project:

<dependencies>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-s3</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-aws-core</artifactId>
        <version>2.1.0.RELEASE</version>
    </dependency>
</dependencies>

You create a new repository by passing an AmazonS3 client and specifying a bucket name:

Configuration of a new AmazonS3 repository
@Bean
public FileRepository documentRepository( AmazonS3 amazonS3 ) {
    return AmazonS3FileRepository.builder()
                                 .repositoryId( "documents" ) (1)
                                 .amazonS3( amazonS3 ) (2)
                                 .bucketName( "my-app-documents" ) (3)
                                 .build();
}
1 the unique repository id; all file descriptors will have this repository id
2 the Amazon S3 client to use
3 the name of the S3 bucket in which to store all objects
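
If your application does not yet expose an AmazonS3 client bean, a minimal sketch using the standard AWS SDK client builder could look as follows. The region shown is an assumption and should match the region of your bucket.

Configuration of an AmazonS3 client (sketch, the region is an assumption)
@Bean
public AmazonS3 amazonS3() {
    // the region is an assumption: use the region in which your bucket lives
    return AmazonS3ClientBuilder.standard()
                                .withRegion( Regions.EU_WEST_1 )
                                .build();
}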

Uploading multipart files

The implementation uses infrastructure from Spring Cloud AWS to support multipart uploads and region-specific bucket redirects. By default, only one thread is used to upload files, but you can attach an org.springframework.core.task.TaskExecutor to the AmazonS3FileRepository to enable parallel uploads and improve overall throughput.

Configuration of a TaskExecutor
@Bean
public TaskExecutor s3UploadExecutor() {
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
    // allow at most 10 concurrent upload threads
    taskExecutor.setMaxPoolSize( 10 );
    // a queue capacity of 0 means uploads are never queued: they are handed
    // directly to a free thread or rejected when the pool is full
    taskExecutor.setQueueCapacity( 0 );
    // rejected uploads are executed synchronously in the calling thread
    taskExecutor.setRejectedExecutionHandler( new ThreadPoolExecutor.CallerRunsPolicy() );
    return taskExecutor;
}

@Bean
public FileRepository documentRepository( AmazonS3 amazonS3, TaskExecutor s3UploadExecutor ) {
    return AmazonS3FileRepository.builder()
                                 .repositoryId( "documents" )
                                 .amazonS3( amazonS3 )
                                 .bucketName( "my-app-documents" )
                                 .taskExecutor( s3UploadExecutor )
                                 .build();
}

In this example, the task executor does not queue uploads: when the thread pool is full, an upload is executed synchronously in the calling thread instead.

According to the Spring Cloud AWS documentation, parallel uploads consume at least 5 MB of memory per thread; with the executor above, 10 concurrent uploads can therefore claim at least 50 MB. Additional memory will be consumed for every item queued in the task executor.

Folder resource support

Amazon S3 uses a common key prefix as a folder concept for grouping objects. It also supports the optional creation of an actual folder object: an object whose key ends with a trailing slash (/). For example, the object key 2019/invoices/report.pdf implies the folder prefix 2019/invoices/, whereas an explicit folder object would have the key 2019/invoices/ itself.

From a FileManagerModule perspective, FolderResource is fully supported, with the following important caveats (a sketch illustrating them follows this list):

  • calls to FolderResource.exists() will always return true even if there is no explicit S3 object matching the folder name

    • querying such a folder will simply return the objects starting with the folder name as prefix

  • calls to FolderResource.create() will create an explicit S3 object matching the folder name

    • true will be returned if the object was created (did not yet exist)

  • calls to FolderResource.delete(false) will only delete the explicit folder object if it exists, but will keep all objects starting with that folder prefix

    • deleteChildren() will delete the objects starting with the folder prefix, but keep the folder object if it exists

    • delete(true) will delete both
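
To make these rules concrete, the sketch below walks through them. How the FolderResource is obtained is an assumption here (fileManager.getFolderResource(..) combined with FolderDescriptor.of(..) are used purely for illustration); the exists(), create() and delete() semantics are the ones listed above.

Folder resource behaviour (sketch)
// assume no explicit S3 object "2019/invoices/" exists yet
FolderResource folder = fileManager.getFolderResource( FolderDescriptor.of( "documents", "2019/invoices" ) );

folder.exists();         // always true, even without an explicit folder object
folder.create();         // true: creates the explicit S3 object "2019/invoices/"
folder.create();         // false: the folder object now already exists

folder.delete( false );  // removes only the explicit folder object,
                         // objects starting with the "2019/invoices/" prefix are kept
folder.deleteChildren(); // removes the objects with the folder prefix, keeps the folder object
folder.delete( true );   // removes both the folder object and all prefixed objects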

Path generation

Like the LocalFileRepository, the AmazonS3FileRepository supports a PathGenerator for automatic file resource creation.
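
A minimal configuration sketch, assuming the builder exposes a pathGenerator(..) method and that a date-based DateFormatPathGenerator is available, as with the LocalFileRepository:

Configuration of a PathGenerator (sketch)
@Bean
public FileRepository documentRepository( AmazonS3 amazonS3 ) {
    return AmazonS3FileRepository.builder()
                                 .repositoryId( "documents" )
                                 .amazonS3( amazonS3 )
                                 .bucketName( "my-app-documents" )
                                 // pathGenerator(..) and DateFormatPathGenerator are assumptions:
                                 // new file resources would get a date-based key prefix
                                 .pathGenerator( DateFormatPathGenerator.YEAR_MONTH_DAY )
                                 .build();
}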