Multipart Uploads

Introduction

When uploading large files to S3, using a single PUT request can be slow, unreliable, and prone to failure if the connection is interrupted. To solve this, S3 provides multipart upload, which allows you to split a large file into multiple smaller parts, upload them independently, and then combine them into a single object.

Important

Multipart uploads are recommended for objects larger than 100 MB and required for objects above 5 GB.

The benefits of multipart uploads include:

  • Resilience: If an upload fails due to network issues or other interruptions, only the affected part needs to be retried instead of restarting the entire upload.
  • Pause & Resume: You may stop an upload manually and resume later without losing progress, as already uploaded parts remain stored in S3.
  • Simultaneous Uploads: Multiple parts may be uploaded at the same time, speeding up the process.

To perform a multipart upload, you can either use high-level commands, which abstract away part splitting, retries, and concurrency, or use the low-level aws s3api command (or an equivalent) to gain more control over the process.

Easy Multipart Upload

To upload an object, simply run:

aws s3 cp <local_file_path> s3://<bucket_name>/<object_name>
Example output
upload: .\testfile.bin to s3://my-bucket/testfile.bin

The AWS CLI will automatically split the file into parts if needed, choose the part sizes dynamically based on the total file size, and merge all parts into a single object in S3 when the upload is complete.
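
If you want to influence when the CLI switches to multipart and how large the parts are, it exposes transfer settings through aws configure set. A minimal sketch (the values shown are illustrative, not recommendations):

aws configure set default.s3.multipart_threshold 100MB   # switch to multipart above this size
aws configure set default.s3.multipart_chunksize 64MB    # size of each uploaded part
aws configure set default.s3.max_concurrent_requests 10  # parts uploaded in parallel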

Rclone likewise handles the multipart upload process automatically. However, you may want to configure chunk sizes and parallel transfers to optimize performance.

To upload a file using Rclone's built-in multipart handling, run:

rclone copy <local_file> <s3_profile>:<my_bucket> \
    --s3-upload-concurrency <number_of_concurrent_uploads> \
    --s3-chunk-size <chunk_size> \
    --progress \
    --checksum
Explanation:

  • --s3-upload-concurrency : Specifies how many parts are uploaded in parallel, improving speed for large files (default: 4).
  • --s3-chunk-size : Defines the size of each part before upload (default: 5Mi, minimum: 5Mi). Larger sizes reduce the number of parts but require more RAM.
  • --progress: Displays real-time upload statistics, including transfer speed, total transferred data, percentage progress, and estimated time remaining.
  • --checksum: Verifies the integrity of the uploaded file by comparing checksums, ensuring the remote file matches the local one. This prevents data corruption during transfer.
Example output
Transferred:          250 MiB / 250 MiB, 100%, 490.420 KiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:      1m11.0s
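
For example, a concrete invocation might look like the following (the remote name s3_profile, the bucket, and the values are illustrative):

rclone copy ./testfile.bin s3_profile:my-bucket \
    --s3-upload-concurrency 8 \
    --s3-chunk-size 128M \
    --progress \
    --checksum

Note that 8 parallel transfers of 128M chunks hold roughly 1 GiB of buffers in memory at once, so size these options to your available RAM.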

Manual Multipart Upload

A manual multipart upload using AWS CLI consists of three steps:

  1. Initiate the multipart upload: Request an upload session from S3.
  2. Upload individual parts: Split the file into parts and upload them separately.
  3. Complete the multipart upload: Finalize the upload by merging all parts into a single object.

Initiate a Multipart Upload

Before uploading file parts, you must first initiate a multipart upload. This generates an UploadId, which is needed for all subsequent steps.

Initiate a multipart upload:

aws s3api create-multipart-upload --bucket <my_bucket> --key <my_large_file>
Example output
{
    "Bucket": "my-bucket",
    "Key": "my-large-file",
    "UploadId": "2~CWpOCL7vpbZBfZrcC0fXROmhfC6Lqc3"
}
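
For scripting the later steps, it can be convenient to capture the UploadId in a shell variable using the CLI's --query and --output options (bucket and key names here are placeholders):

UPLOAD_ID=$(aws s3api create-multipart-upload --bucket my-bucket --key my-large-file \
    --query UploadId --output text)
echo "$UPLOAD_ID"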

Upload Parts

After initiating the multipart upload, you need to split your large file into parts and upload them separately. Each part must:

  • Be at least 5 MB, except for the last part.
  • Be uploaded with a unique part number between 1 and 10,000.
  • Reference the same UploadId from the initiation step.
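
One way to produce such parts locally is the split utility (GNU coreutils). A sketch that cuts the file into 100 MB chunks named part-00, part-01, and so on:

split -b 100M -d my-large-file part-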

To upload a part use:

aws s3api upload-part --bucket <my_bucket> --key <my_large_file> \
    --part-number <part_number> --upload-id <upload_id> \
    --body <part_file>
Example output
{
    "ETag": "\"8f95a5c4d45cc73ec09c764a63cf0503\""
}

Important

Each part of a multipart upload gets a unique ETag. These ETags are required when completing the upload, as they allow verification that all parts were uploaded correctly.

Refer to list parts to learn how to check which parts are already uploaded.
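
Putting this together, a minimal sketch that uploads every chunk produced by split in order, assuming $UPLOAD_ID holds the id from the initiation step:

n=1
for part in part-*; do
    aws s3api upload-part --bucket my-bucket --key my-large-file \
        --part-number "$n" --upload-id "$UPLOAD_ID" --body "$part"
    n=$((n + 1))
done

To speed this up, you can run several upload-part commands concurrently (for example via xargs -P), as long as each invocation uses a distinct part number.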

Complete a Multipart Upload

To complete the upload, you need to create a parts.json file containing the ETags and PartNumbers from your uploaded parts. For example:

{
    "Parts": [
        { "ETag": "\"a444d3c39516fe905583987f3d970788\"", "PartNumber": 1 },
        { "ETag": "\"8f95a5c4d45cc73ec09c764a63cf0503\"", "PartNumber": 2 },
        { "ETag": "\"b97f3207eddc95e0ded55864b0bd6d7e\"", "PartNumber": 3 }
    ]
}
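
Rather than assembling this file by hand, you can generate it from the upload itself using list-parts with a JMESPath query (a sketch, with the same placeholders as above):

aws s3api list-parts --bucket my-bucket --key my-large-file --upload-id "$UPLOAD_ID" \
    --query '{Parts: Parts[].{ETag: ETag, PartNumber: PartNumber}}' > parts.json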

Then complete the multipart upload:

aws s3api complete-multipart-upload --bucket <my_bucket> --key <my_large_file> \
    --upload-id <upload_id> \
    --multipart-upload file://parts.json
Example output
{
    "VersionId": "wr4V1Cvb-LD-0H7jJxi6g61uT.zvyMG",
    "Location": "s3-a.zhw.cloud.switch.ch/my-bucket/my-large-file",
    "Bucket": "my-bucket",
    "Key": "my-large-file",
    "ETag": ""
}

Verification

You may now verify that the final object exists in the bucket:

aws s3 ls s3://<my_bucket>
Example output
2025-02-03 13:07:44  262144000 my-large-file

You may also check the object's metadata:

aws s3api head-object --bucket <my_bucket> --key <my_large_file>
Example output
{
    "AcceptRanges": "bytes",
    "LastModified": "2025-02-03T12:07:44+00:00",
    "ContentLength": 262144000,
    "ETag": "\"935318503c1d91e44157172bc29b4831-3\"",
    "VersionId": "wr4V1Cvb-LD-0H7jJxi6g61uT.zvyMG",
    "ContentType": "binary/octet-stream",
    "Metadata": {}
}
  • ContentLength must match the original file size.
  • ETag: If multipart upload was used, the ETag ends in a hyphen followed by the number of parts (here -3, for three parts).
  • LastModified should match the completion time of your upload.

To check data integrity, you could download the file:

aws s3 cp s3://<my_bucket>/<my_large_file> <downloaded_file>

Then compute and compare the MD5 hashes of both files; they must match, otherwise the uploaded file was corrupted.

md5sum <original_file> <downloaded_file>
Example output
c91eeeb4e71d092c49473736a91b6d39 *testfile.bin
c91eeeb4e71d092c49473736a91b6d39 *downloaded-file.bin

Abort a Multipart Upload

If you do not want to complete the multipart upload, you should abort it. This will delete all uploaded parts and free up storage in S3.

Warning

Incomplete multipart uploads consume storage space.

To abort a multipart upload run the following command (no output if successful):

aws s3api abort-multipart-upload --bucket <my_bucket> --key <my_large_file> \
    --upload-id <upload_id>

List Multipart Uploads

You may list multipart uploads in a bucket:

aws s3api list-multipart-uploads --bucket <my_bucket>
Example output
{
    "Uploads": [
        {
            "UploadId": "2~qai2DvR4o8x5HrMGFWtc6SGIL0Dt5eb",
            "Key": "my-large-file2",
            "Initiated": "2025-02-03T13:21:54.743000+00:00",
            "StorageClass": "STANDARD",
            "Owner": {
                "DisplayName": "user-example",
                "ID": "01926c4c-6707-757c-9592-33ec706f9617"
            },
            "Initiator": {
                "ID": "01926c4c-6707-757c-9592-33ec706f9617",
                "DisplayName": "user-example"
            }
        }
    ],
    "RequestCharged": null,
    "Prefix": null
}
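
Combined with abort-multipart-upload, this listing can drive a cleanup of every pending upload in a bucket. A sketch (use with care: it aborts all of them):

aws s3api list-multipart-uploads --bucket my-bucket \
    --query 'Uploads[].[Key,UploadId]' --output text |
while read -r key upload_id; do
    [ "$key" = "None" ] && continue   # text output prints "None" when there are no uploads
    aws s3api abort-multipart-upload --bucket my-bucket \
        --key "$key" --upload-id "$upload_id"
done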

List Uploaded Parts

You may list uploaded parts of a specific multipart upload:

aws s3api list-parts --bucket <my_bucket> --key <my_large_file> --upload-id <upload_id>
Example output
{
    "Parts": [
        {
            "PartNumber": 1,
            "LastModified": "2025-02-03T13:54:26.070000+00:00",
            "ETag": "\"a444d3c39516fe905583987f3d970788\"",
            "Size": 104857600
        }
    ],
    "ChecksumAlgorithm": null,
    "Initiator": null,
    "Owner": {
        "DisplayName": "user-example",
        "ID": "01926c4c-6707-757c-9592-33ec706f9617"
    },
    "StorageClass": "STANDARD"
}
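
Because list-parts reports each part's Size, you can also query how many bytes have been uploaded so far, which is handy before resuming an interrupted transfer (a sketch):

aws s3api list-parts --bucket my-bucket --key my-large-file --upload-id "$UPLOAD_ID" \
    --query 'sum(Parts[].Size)'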

Best Practices

  1. Choose the Right Chunk/Part Size: Selecting the correct part size helps balance speed and resource consumption.

    File Size           Recommended Chunk/Part Size
    < 100 MB            Use a single PUT request
    100 MB – 1 GB       64 MB
    1 GB – 10 GB        128 MB
    10 GB – 100 GB      256 MB
    > 100 GB            512 MB – 1 GB

    Key Considerations:

    • Each part needs to be at least 5 MB (except the last one).
    • Larger part sizes reduce overhead but require more RAM.
    • Too many small parts increase API request overhead, slowing down uploads.
  2. Optimize Concurrency: Parallel uploads improve performance but use more system resources.

    Network/Hardware                          Recommended Concurrency
    Slow connection (1-10 Mbps)               1-2
    Medium-speed connection (10-100 Mbps)     4-8
    High-speed network (100+ Mbps)            8-16
    Cloud/High-performance servers            16-32

    Key Considerations:

    • Too many parallel uploads can saturate your network or cause rate limiting.
    • Balance concurrency based on CPU, RAM, and bandwidth availability.
    • The high-level aws s3 cp command uploads parts in parallel automatically (configurable via max_concurrent_requests); with the low-level aws s3api upload-part command, you must run multiple upload commands in parallel yourself.
  3. Avoid Leaving Incomplete Uploads: Regularly clean up incomplete uploads. The easiest way to do this is to set up a lifecycle policy. The policy below deletes all incomplete multipart uploads after 1 day; the command to apply it follows the policy. Read more on Lifecycle Policies.

{
    "Rules": [
        {
            "ID": "incomplete-mpu-rule",
            "Status": "Enabled",
            "Filter": {
                "Prefix": ""
            },
            "AbortIncompleteMultipartUpload": {
                "DaysAfterInitiation": 1
            }
        }
    ]
}
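
Assuming the policy above is saved as lifecycle.json, it can be applied to a bucket with:

aws s3api put-bucket-lifecycle-configuration --bucket <my_bucket> \
    --lifecycle-configuration file://lifecycle.json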