Multipart Uploads
Introduction
When uploading large files to S3, a single PUT request can be slow and is prone to failure: if the connection is interrupted, the entire upload must be restarted. To solve this, S3 provides multipart upload, which lets you split a large file into multiple smaller parts, upload them independently, and then combine them into a single object.
Important
Multipart uploads are recommended for objects larger than 100 MB and required for objects larger than 5 GB.
The benefits of multipart uploads include:
- Resilience: If an upload fails due to network issues or other interruptions, only the affected part needs to be retried instead of restarting the entire upload.
- Pause & Resume: You may stop an upload manually and resume later without losing progress, as already uploaded parts remain stored in S3.
- Simultaneous Uploads: Multiple parts may be uploaded at the same time, speeding up the process.
To perform a multipart upload you may either use high-level commands, which abstract away part splitting, retries, and concurrency, or use the low-level aws s3api commands (or equivalent) for more control over the process.
Easy Multipart Upload
To upload an object simply run:
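With the AWS CLI, the high-level aws s3 cp command handles this; a minimal sketch using placeholder names:

```shell
# high-level upload: the CLI switches to multipart automatically
# for files above its multipart threshold (8 MB by default)
aws s3 cp <my_large_file> s3://<my_bucket>/<my_large_file>
```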
AWS will automatically split the file into parts if needed, choose the part sizes dynamically based on the total file size and merge all parts into a single object in S3 when the upload is complete.
Rclone handles the multipart upload process automatically. However, you may want to configure chunk sizes and parallel transfers to optimize performance.
To upload a file using Rclone's built-in multipart handling, run:
rclone copy <local_file> <s3_profile>:<my_bucket> \
--s3-upload-concurrency <number_of_concurrent_uploads> \
--s3-chunk-size <chunk_size> \
--progress \
--checksum
- --s3-upload-concurrency: Specifies how many parts are uploaded in parallel, improving speed for large files (default: 4).
- --s3-chunk-size: Defines the size of each part before upload (default: 64M, minimum: 5M). Larger sizes reduce the number of parts but require more RAM.
- --progress: Displays real-time upload statistics, including transfer speed, total transferred data, percentage progress, and estimated time remaining.
- --checksum: Verifies the integrity of the uploaded file by comparing checksums, ensuring the remote file matches the local one. This prevents data corruption during transfer.
Manual Multipart Upload
A manual multipart upload using AWS CLI consists of three steps:
- Initiate the multipart upload: Request an upload session from S3.
- Upload individual parts: Split the file into parts and upload them separately.
- Complete the multipart upload: Finalize the upload by merging all parts into a single object.
Initiate a Multipart Upload
Before uploading file parts, you must first initiate a multipart upload. This generates an UploadId, which is needed for all subsequent steps.
Initiate a multipart upload:
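With the AWS CLI, this step is create-multipart-upload; a sketch with placeholder names:

```shell
# start a multipart upload session; the response contains the UploadId
aws s3api create-multipart-upload --bucket <my_bucket> --key <my_large_file>
```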
Example output
Upload Parts
After initiating the multipart upload, you need to split your large file into parts and upload them separately. Each part must:
- Be at least 5 MB, except for the last part.
- Be uploaded with a unique part number (1, 2, 3, and so on).
- Reference the same UploadId from the initiation step.
To upload a part use:
aws s3api upload-part --bucket <my_bucket> --key <my_large_file> \
--part-number <part_number> --upload-id <upload_id> \
--body <part_file>
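The part files themselves can be produced with the standard split utility. A sketch that generates its own demo file, assuming GNU coreutils; with a real upload you would split your actual file instead:

```shell
# create a 12 MB demo file standing in for a real large upload
dd if=/dev/zero of=my_large_file bs=1M count=12 status=none
# cut it into 5 MB numbered pieces: part_00, part_01, part_02
split -b 5M -d my_large_file part_
ls part_*
```

Each resulting part_NN file is then passed as --body to a separate upload-part call.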
Important
Each part of a multipart upload gets a unique ETag. These ETags are required when completing the upload, as they allow verification that all parts were uploaded correctly.
Refer to list parts to learn how to check which parts are already uploaded.
Complete a Multipart Upload
To complete the upload you need to create a parts.json file containing the ETags and PartNumbers from your uploaded parts. For example:
{
"Parts": [
{ "ETag": "\"a444d3c39516fe905583987f3d970788\"", "PartNumber": 1 },
{ "ETag": "\"8f95a5c4d45cc73ec09c764a63cf0503\"", "PartNumber": 2 },
{ "ETag": "\"b97f3207eddc95e0ded55864b0bd6d7e\"", "PartNumber": 3 }
]
}
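Rather than assembling parts.json by hand, it can be generated from S3's own record of the upload with a JMESPath query; a sketch with placeholder names:

```shell
# ask S3 which parts it has and emit them in the completion format
aws s3api list-parts --bucket <my_bucket> --key <my_large_file> \
    --upload-id <upload_id> \
    --query '{Parts: Parts[*].{ETag: ETag, PartNumber: PartNumber}}' \
    > parts.json
```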
Then complete the multipart upload:
aws s3api complete-multipart-upload --bucket <my_bucket> --key <my_large_file> \
--upload-id <upload_id> \
--multipart-upload file://parts.json
Example output
Verification
You may now verify the final object exists in the bucket:
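For example, with the high-level listing command:

```shell
# the completed object should appear in the bucket listing
aws s3 ls s3://<my_bucket>/
```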
You may also check the metadata about the object:
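For example, with head-object:

```shell
# fetch the object's metadata without downloading the body
aws s3api head-object --bucket <my_bucket> --key <my_large_file>
```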
Example output
- ContentLength must match the original file size.
- ETag: if multipart upload was used, the ETag ends in a hyphen followed by the number of parts.
- LastModified should match the completion time of your upload.
To check data integrity you could download the file:
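For example, assuming the downloaded copy is saved under a .downloaded suffix:

```shell
# download the uploaded object next to the local original
aws s3 cp s3://<my_bucket>/<my_large_file> ./<my_large_file>.downloaded
```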
Then compute and compare the file hashes using MD5 - the hashes must match; otherwise the uploaded file was corrupted.
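A sketch, assuming the downloaded copy was saved as <my_large_file>.downloaded:

```shell
# print both digests side by side; they must be identical
md5sum <my_large_file> <my_large_file>.downloaded
```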
Example output
Abort a Multipart Upload
If you do not want to complete the multipart upload, you should abort it. This will delete all uploaded parts and free up storage in S3.
Warning
Incomplete multipart uploads consume storage space.
To abort a multipart upload run the following command (no output if successful):
aws s3api abort-multipart-upload --bucket <my_bucket> --key <my_large_file> \
--upload-id <upload_id>
List Multipart Uploads
You may list multipart uploads in a bucket:
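For example:

```shell
# show all in-progress multipart uploads in the bucket
aws s3api list-multipart-uploads --bucket <my_bucket>
```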
Example output
{
"Uploads": [
{
"UploadId": "2~qai2DvR4o8x5HrMGFWtc6SGIL0Dt5eb",
"Key": "my-large-file2",
"Initiated": "2025-02-03T13:21:54.743000+00:00",
"StorageClass": "STANDARD",
"Owner": {
"DisplayName": "user-example",
"ID": "01926c4c-6707-757c-9592-33ec706f9617"
},
"Initiator": {
"ID": "01926c4c-6707-757c-9592-33ec706f9617",
"DisplayName": "user-example"
}
}
],
"RequestCharged": null,
"Prefix": null
}
List Uploaded Parts
You may list uploaded parts of a specific multipart upload:
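For example:

```shell
# show the parts uploaded so far for one specific upload session
aws s3api list-parts --bucket <my_bucket> --key <my_large_file> \
    --upload-id <upload_id>
```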
Example output
{
"Parts": [
{
"PartNumber": 1,
"LastModified": "2025-02-03T13:54:26.070000+00:00",
"ETag": "\"a444d3c39516fe905583987f3d970788\"",
"Size": 104857600
}
],
"ChecksumAlgorithm": null,
"Initiator": null,
"Owner": {
"DisplayName": "user-example",
"ID": "01926c4c-6707-757c-9592-33ec706f9617"
},
"StorageClass": "STANDARD"
}
Best Practices
- Choose the Right Chunk/Part Size: Selecting the correct part size helps balance speed and resource consumption.

  | File Size | Recommended Chunk/Part Size |
  | --- | --- |
  | < 100 MB | Use a single PUT request |
  | 100 MB – 1 GB | 64 MB |
  | 1 GB – 10 GB | 128 MB |
  | 10 GB – 100 GB | 256 MB |
  | > 100 GB | 512 MB – 1 GB |

  Key Considerations:
  - Each part needs to be at least 5 MB (except the last one).
  - Larger part sizes reduce overhead but require more RAM.
  - Too many small parts increase API request overhead, slowing down uploads.
- Optimize Concurrency: Parallel uploads improve performance but use more system resources.

  | Network/Hardware | Recommended Concurrency |
  | --- | --- |
  | Slow connection (1-10 Mbps) | 1-2 |
  | Medium-speed connection (10-100 Mbps) | 4-8 |
  | High-speed network (100+ Mbps) | 8-16 |
  | Cloud/high-performance servers | 16-32 |

  Key Considerations:
  - Too many parallel uploads can saturate your network or cause rate limiting.
  - Balance concurrency based on CPU, RAM, and bandwidth availability.
  - The low-level aws s3api commands do not upload parts in parallel; you must run multiple upload-part commands concurrently yourself.
- Avoid Leaving Incomplete Uploads: Regularly clean up incomplete uploads. The easiest way to do this is to set up a lifecycle policy. The below policy will delete all incomplete multipart uploads after 1 day. Read more on Lifecycle Policies.
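A sketch of such a policy and how to apply it, using placeholder names and a hypothetical rule ID:

```shell
# write a lifecycle rule that aborts incomplete multipart uploads after 1 day
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "abort-incomplete-multipart-uploads",
      "Status": "Enabled",
      "Filter": {},
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 1 }
    }
  ]
}
EOF
# apply the policy to the bucket (requires valid credentials)
aws s3api put-bucket-lifecycle-configuration --bucket <my_bucket> \
    --lifecycle-configuration file://lifecycle.json
```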